From af59fabd34c0f8589e8368e9fd8ab80bd2c7781d Mon Sep 17 00:00:00 2001
From: Ilia Choly <ilia.choly@gmail.com>
Date: Sat, 30 May 2026 13:12:22 +0000
Subject: [PATCH 1/2] docs: rework channels proposal around local mcp bridge

Reframe the Claude Code channels proposal: the primary use case is
notifying a local user-driven Claude Code session that orchestrates
xagent tasks through the user-facing MCP server, not pushing into
in-container agents.

Replace the new task_events table and PollEvents RPC with reuse of
the existing apiserver -> notifyserver SSE pipeline. Introduce a
local stdio "xagent mcp" bridge that proxies the user-facing tools
and translates model.Notification events into
notifications/claude/channel. Update the channel facts to match
the current research preview, and re-weigh Go vs TS/Bun for the
bridge now that the in-container Node objection no longer applies.
---
 proposals/draft/claude-code-channels.md | 306 ++++++++++++++----------
 1 file changed, 179 insertions(+), 127 deletions(-)
diff --git a/proposals/draft/claude-code-channels.md b/proposals/draft/claude-code-channels.md
index e221a629..dbc7eaa8 100644
--- a/proposals/draft/claude-code-channels.md
+++ b/proposals/draft/claude-code-channels.md
@@ -1,203 +1,255 @@
-# Claude Code Channels support for xagent MCP server
+# Claude Code Channels for a local xagent MCP bridge
 
 Issue: https://github.com/icholy/xagent/issues/466
 
 ## Problem
 
-xagent agents running inside containers interact with the orchestrator through a request-response MCP tool model. The agent must explicitly call `get_my_task` to discover new events, status changes, or instructions. There is no mechanism for xagent to proactively push information into a running Claude Code session.
+A common way to drive xagent is from a local Claude Code session: a developer runs `claude` on their workstation, and that session creates and supervises xagent tasks through xagent's user-facing MCP server. Today, after creating a task, the local Claude has no way to know when something changes — it must poll `get_task` or `list_tasks` to discover new logs, new instructions, status transitions, or completion. Polling wastes turns, delays reactions, and bloats the model's context with repeated reads.
 
-Claude Code Channels (research preview, v2.1.80+) allow an MCP server to push events into a session via `notifications/claude/channel`. This would let xagent notify agents about child task completions, new parent instructions, external webhook events, and child task log output — without polling.
+Claude Code Channels (research preview, v2.1.80+) provide exactly the primitive that's missing: an MCP server can push `notifications/claude/channel` events into a running session as `<channel>` tags in Claude's context, so the model reacts on the next turn without polling. The C2 server already publishes structured change notifications for every task mutation; the gap is the transport that delivers them to the local Claude.
 
 ## Background: How Channels Work
 
-A channel is an MCP server that declares `claude/channel` in its experimental capabilities and emits `notifications/claude/channel` notifications. Key contract:
+Sources: [code.claude.com/docs/en/channels](https://code.claude.com/docs/en/channels), [code.claude.com/docs/en/channels-reference](https://code.claude.com/docs/en/channels-reference).
 
-- **Capability declaration**: `capabilities.experimental["claude/channel"] = {}` in the MCP server constructor
-- **Notification format**: `method: "notifications/claude/channel"`, `params: { content: string, meta: Record<string, string> }` — content becomes the body of a `<channel>` tag, meta entries become tag attributes
-- **Reply tools**: Optional. Standard MCP tools exposed alongside the channel for two-way communication
-- **Permission relay**: Optional. `claude/channel/permission` capability allows forwarding tool approval prompts remotely
-- **Transport**: stdio only (Claude Code spawns the server as a subprocess)
-- **Instructions**: A string added to Claude's system prompt describing what events to expect
+- **Status**: research preview. Requires Claude Code **v2.1.80+** (one-way + tools); permission relay needs **v2.1.81+**.
+- **Capability declaration**: an MCP server registers as a channel by setting `capabilities.experimental["claude/channel"] = {}`. The value is always an empty object — its presence is the signal. Two-way channels additionally declare `tools: {}`; permission relay adds `capabilities.experimental["claude/channel/permission"] = {}`.
+- **Notification format**: method `notifications/claude/channel`, params `{ content: string, meta: Record<string, string> }`. `content` becomes the body of a `<channel>` tag; each `meta` entry becomes a tag attribute. The `source` attribute is auto-populated from the server's configured name.
+- **`meta` keys must be identifiers**: letters, digits, and underscores only. Keys containing hyphens or other characters are **silently dropped**. Use `task_id`, not `task-id`.
+- **Transport is stdio-only**: a channel server must be a subprocess spawned by Claude Code. Streamable HTTP MCP servers cannot register as channels.
+- **Delivery is fire-and-forget**: the notification call resolves when the JSON-RPC frame is written to the transport, not when Claude processes it. If the session didn't load the server with `--channels`, or org policy blocks channels, events are dropped silently with no error returned. Guaranteed delivery requires a reply tool that the model can call back through.
+- **Queuing**: events are delivered in order, and multiple events arriving while Claude is mid-turn are batched onto the next turn.
+- **Allowlist constraint**: during the preview, `--channels` only accepts plugins on an Anthropic-curated allowlist. A custom server like `xagent` is not on it, so the session must launch with `--dangerously-load-development-channels server:xagent` (which prompts for confirmation per entry), or the org must add the server to the `allowedChannelPlugins` managed setting. Being listed in `.mcp.json` is necessary but not sufficient — the server must also be named in `--channels`.
+- **Auth/platform constraints**: channels require Anthropic auth via claude.ai or a Console API key. They are not available on Amazon Bedrock, Google Vertex AI, or Microsoft Foundry. Team/Enterprise orgs must enable `channelsEnabled` in managed settings.
+- **Channels are a notification layer, not a data layer**. The event says "task 42 updated" with small `meta` attributes; Claude then calls `get_task` for the full payload.
+
+An event arrives in Claude's context as:
 
-Events arrive in Claude's context as:
 ```
-<channel source="xagent" task_id="42" event_type="child_completed">
-Child task 43 (fix auth bug) completed successfully.
+<channel source="xagent" action="updated" resource="task" id="42">
+Task 42 was updated.
 </channel>
 ```
 
 ## Design
 
-### Architecture
+### Two MCP servers already exist
 
-The xagent MCP server (`xagent mcp`) already runs as a stdio subprocess inside each agent container. The channel capability would be added to this same process rather than introducing a separate channel server.
+The proposal hinges on distinguishing the two xagent MCP servers in the tree today:
 
-```
-Claude Code agent
-    ↕ stdio (MCP tools + channel notifications)
-xagent mcp process
-    ↕ HTTP over unix socket
-xagent server
-```
+1. **User-facing MCP server** (`internal/server/mcpserver/mcpserver.go`, backed by package `mcpserver`). Served as MCP **Streamable HTTP** via `mcp.NewStreamableHTTPHandler` with `Stateless: true`, mounted on the C2 HTTP API at `/mcp`. Exposes `list_workspaces`, `create_task`, `get_task`, `list_tasks`, `update_task`. This is what the developer's local Claude Code talks to today.
 
-The `xagent mcp` process gains a background goroutine that polls the xagent server for new events and pushes them as channel notifications to Claude Code.
+2. **In-container agent MCP server** (`internal/command/mcp.go` — `McpCommand` — backed by `internal/agentmcp`). stdio transport. Spawned by the runner inside each task's container. Exposes `get_my_task`, `update_my_task`, `report`, `create_link`, and the child-task tools. (A separate task is moving this command out of the top-level `mcp` slot to `xagent tool agent-mcp`, freeing `xagent mcp` for the new bridge described below.)
 
-### Go MCP SDK Gap
+This proposal affects only server (1): how the local Claude that drives the user-facing server receives push notifications. Pushing into in-container agents is explicitly out of scope (see "Future work").
 
-The Go MCP SDK v1.2.0 (`github.com/modelcontextprotocol/go-sdk`) supports setting experimental capabilities via `ServerOptions.Capabilities.Experimental`, but **does not expose a public API to send arbitrary notifications**. The internal `handleNotify` function and `ServerSession.getConn()` are both unexported.
+### The hard constraint
 
-The existing public notification methods (`NotifyProgress`, `Log`, `ResourceUpdated`) only send predefined MCP notification types — none support the custom `notifications/claude/channel` method.
+The user-facing server is stateless Streamable HTTP. **Channels require stdio.** We cannot simply add `claude/channel` to the experimental capabilities in `mcpserver.go` and have it push notifications into a session — the session does not have a long-lived bidirectional connection to that handler, and `--channels` does not accept HTTP MCP servers. Any push delivery has to happen over a stdio subprocess that Claude Code spawns.
 
-**Resolution options (in order of preference):**
+### The bridge
 
-1. **Contribute upstream**: Add a `ServerSession.Notify(ctx, method string, params any) error` method to the Go SDK. This is a small, well-scoped change — it just needs to call `conn.Notify()` on the underlying jsonrpc2 connection. This would unblock all Go-based channel implementations.
+Introduce a new top-level subcommand:
 
-2. **Use the jsonrpc2 layer directly**: The `internal/jsonrpc2.Connection` type has a public `Notify(ctx, method, params)` method. We could create a custom `mcp.Transport` wrapper that intercepts the connection before it's passed to the SDK, retaining a reference for direct notification sending. This is hacky but avoids forking.
+```
+xagent mcp [--server URL] --token TOKEN
+```
 
-3. **Separate TypeScript channel process**: Run a small Node/Bun process as the channel server alongside `xagent mcp`. The TypeScript MCP SDK has native channel support. The channel process would communicate with the Go `xagent mcp` process (or directly with the xagent server) to get events. This adds operational complexity (Node runtime in containers).
+A local stdio MCP server that the developer's `.mcp.json` launches. It does two things:
 
-4. **Fork the Go SDK temporarily**: Add the `Notify` method in a fork, use it until upstream merges the change.
+1. **Re-exposes the user-facing tools** (`list_workspaces`, `create_task`, `get_task`, `list_tasks`, `update_task`) over stdio, proxying each call to the C2 server via `xagentclient.New(...)` (the existing Connect RPC client). For a CLI-driven setup this replaces the remote HTTP MCP entry, so the developer only needs **one** MCP entry instead of an HTTP endpoint plus a separate channel process.
 
-Option 1 is strongly preferred. The change is minimal and benefits the broader Go MCP ecosystem.
+2. **Declares the `claude/channel` capability** and pushes `notifications/claude/channel` events for task changes by translating an SSE subscription to the existing notification stream.
 
-### Event Polling
+The user-facing HTTP MCP endpoint at `/mcp` stays in place for hosted/web-driven Claude clients that cannot spawn local subprocesses.
 
-The `xagent mcp` process would start a background goroutine after connecting:
+#### Updated architecture
 
-```go
-// After server.Run starts (requires refactoring to use server.Connect directly)
-go s.pollEvents(ctx, session)
+```
+Local Claude Code session
+    ↕ stdio (MCP tools + notifications/claude/channel)
+xagent mcp  (NEW local bridge — proxies tools, translates SSE → channel)
+    ↕ HTTP: Connect RPC (tools)  +  SSE subscription (notifyserver)
+xagent C2 server  (already publishes task notifications on every change)
 ```
 
-The poll loop would call a new RPC endpoint on the xagent server:
+### Reusing the existing notification pipeline
 
-```protobuf
-message PollEventsRequest {
-  int64 task_id = 1;
-  int64 after_event_id = 2;  // cursor for incremental polling
-}
+This is a translator, not a new event system. The pieces are already in place:
 
-message PollEventsResponse {
-  repeated TaskEvent events = 1;
-}
+- `internal/server/apiserver/apiserver.go` calls `s.publish(model.Notification{...})` on every mutating RPC (`task.go`, `event.go`, `log.go`, `link.go`, `workspace.go`, `key.go`, `org.go`, `runner.go`). Every task create / update / status change / log append / link append already produces a notification.
+- `internal/server/notifyserver/sse.go` fans these notifications out per-org over an SSE endpoint mounted at `/events`. The web UI is already a consumer of this stream for live updates.
+- `internal/model/notification.go` defines the payload:
 
-message TaskEvent {
-  string type = 1;           // "child_completed", "child_failed", "instruction_added", "external_event", "child_log"
-  string content = 2;        // human-readable description
-  map<string, string> meta = 3;  // routing attributes
-  int64 id = 4;              // monotonic ID for cursor
-}
-```
+  ```go
+  type Notification struct {
+      Type      string                 `json:"type"`       // "ready" | "change"
+      Resources []NotificationResource `json:"resources,omitempty"`
+      Time      time.Time              `json:"timestamp"`
+      OrgID     int64                  `json:"org_id"`
+      UserID    string                 `json:"user_id,omitempty"`
+      ClientID  string                 `json:"client_id,omitempty"`
+      Runner    string                 `json:"for_runner,omitempty"`
+  }
+
+  type NotificationResource struct {
+      Action string `json:"action"` // created | updated | appended
+      Type   string `json:"type"`   // task | event | log | link | task_logs
+      ID     int64  `json:"id"`
+  }
+  ```
+
+  The keys that end up as channel `meta` attributes (`action`, `type`, `id`) are already identifier-safe (letters, digits, and underscores only), so they satisfy the channel `meta` key rule; the only changes are the `type` → `resource` rename and the stringified `id` described below.
+- `internal/x/sse` is an existing SSE Reader/Writer the bridge can consume.
+
+The bridge:
 
-This is a new endpoint because the existing `GetTaskDetails` returns the full task state — we need an incremental, cursor-based stream of changes.
+1. Connects to the C2 SSE endpoint (`GET /events`, `Accept: text/event-stream`) using the same auth token configured for the RPC client.
+2. Reads `model.Notification` JSON payloads via `internal/x/sse.Reader`.
+3. Filters down to task-relevant resources (`type` in `{task, log, link, task_logs, event}`).
+4. For each surviving `NotificationResource`, emits one `notifications/claude/channel`:
 
-### Event Types
+   ```jsonc
+   {
+     "method": "notifications/claude/channel",
+     "params": {
+       "content": "Task 42 was updated.",
+       "meta": {
+         "action":   "updated",
+         "resource": "task",
+         "id":       "42"
+       }
+     }
+   }
+   ```
 
-| Type | Trigger | Content | Meta |
-|------|---------|---------|------|
-| `child_completed` | Child task status → COMPLETED | "Child task {id} ({name}) completed" | `task_id`, `child_id` |
-| `child_failed` | Child task status → FAILED | "Child task {id} ({name}) failed" | `task_id`, `child_id` |
-| `instruction_added` | Parent adds instruction via `update_child_task` | The instruction text | `task_id`, `source` |
-| `external_event` | Webhook routed via subscribed link | Event description + data | `task_id`, `event_id`, `url` |
-| `child_log` | Child task uploads a log with type "llm" | The log message | `task_id`, `child_id` |
+   Channel `meta` keys must be identifiers and its values must be strings, so we stringify `ID`. We also rename `Type` → `resource` (since `type` is also reserved in some contexts). `OrgID`, `UserID`, and `ClientID` are not forwarded to the model.
 
-### Capability Declaration
+5. Reconnects the SSE stream on transport errors with backoff, mirroring the web UI's behavior.
 
-In `internal/command/mcp.go`, change the server constructor:
+The bridge does **not** open new RPCs or read full task payloads. Claude does that itself by calling `get_task` through the same bridge after the channel event arrives.
+
+### `mcpserver.AddTools` refactor
+
+To avoid duplicating the tool schemas between the HTTP handler and the new stdio bridge, extract the tool registrations currently inline in `mcpserver.Server.Handler()` (the five `mcp.AddTool(server, ...)` calls plus the input/output types) into a reusable function on the `mcpserver` package, roughly:
 
 ```go
-server := mcp.NewServer(&mcp.Implementation{
-    Name:    "xagent",
-    Version: "1.0.0",
-}, &mcp.ServerOptions{
-    Capabilities: &mcp.ServerCapabilities{
-        Experimental: map[string]any{
-            "claude/channel": map[string]any{},
-        },
-    },
-    Instructions: "Events from the xagent channel arrive as <channel source=\"xagent\" ...>. " +
-        "They notify you about task status changes, new instructions, and external events. " +
-        "You do not need to reply to these events — they are informational. " +
-        "Use the existing xagent MCP tools to take action based on them.",
-})
+// AddTools registers the user-facing xagent tools on the given MCP server.
+// Both the HTTP handler and the local stdio bridge call this so they share
+// schemas, descriptions, and behavior.
+func AddTools(server *mcp.Server, service xagentv1connect.XAgentServiceHandler, baseURL string) {
+    s := &Server{service: service, baseURL: cmp.Or(baseURL, xagentclient.DefaultURL)}
+    mcp.AddTool(server, &mcp.Tool{Name: "list_workspaces", /* ... */}, s.listWorkspaces)
+    mcp.AddTool(server, &mcp.Tool{Name: "create_task",     /* ... */}, s.createTask)
+    mcp.AddTool(server, &mcp.Tool{Name: "get_task",        /* ... */}, s.getTask)
+    mcp.AddTool(server, &mcp.Tool{Name: "list_tasks",      /* ... */}, s.listTasks)
+    mcp.AddTool(server, &mcp.Tool{Name: "update_task",     /* ... */}, s.updateTask)
+}
 ```
 
-### Server.Run Refactoring
+`mcpserver.Server.Handler()` calls `AddTools(server, s.service, s.baseURL)` after constructing the server. The bridge calls the same function with a Connect-client-backed `service`: the existing `xagentclient.Client` type already satisfies the same `XAgentServiceHandler` interface used by `apiserver.Server`, so each tool call simply forwards to the corresponding RPC.
 
-Currently `xagent mcp` calls `server.Run(ctx, &mcp.StdioTransport{})` which blocks. To start the poll goroutine, we need the `ServerSession`:
+The handler keeps its `Stateless: true` Streamable HTTP wrapper; the bridge wraps the same server with `mcp.StdioTransport` and additionally sets `Capabilities.Experimental["claude/channel"] = map[string]any{}` plus channel-specific `Instructions`.
+
+### `xagent mcp` skeleton
 
 ```go
-session, err := server.Connect(ctx, &mcp.StdioTransport{}, nil)
-if err != nil {
-    return err
+var McpCommand = &cli.Command{
+    Name:  "mcp",
+    Usage: "Local stdio MCP bridge: re-exposes xagent tools and pushes task change notifications as Claude Code channel events",
+    Flags: []cli.Flag{
+        &cli.StringFlag{Name: "server", Value: xagentclient.DefaultURL, Usage: "C2 server URL"},
+        &cli.StringFlag{Name: "token",  Required: true,                 Usage: "API token"},
+    },
+    Action: func(ctx context.Context, cmd *cli.Command) error {
+        client := xagentclient.New(xagentclient.Options{
+            BaseURL: cmd.String("server"),
+            Token:   cmd.String("token"),
+        })
+
+        server := mcp.NewServer(&mcp.Implementation{
+            Name:    "xagent",
+            Version: version.String(),
+        }, &mcp.ServerOptions{
+            Capabilities: &mcp.ServerCapabilities{
+                Experimental: map[string]any{
+                    "claude/channel": map[string]any{},
+                },
+            },
+            Instructions: "Events from the xagent channel arrive as " +
+                "<channel source=\"xagent\" action=... resource=... id=...>. " +
+                "They notify you that an xagent task, log, link, or event " +
+                "changed. Call get_task with the id for details before acting.",
+        })
+
+        mcpserver.AddTools(server, client, cmd.String("server"))
+
+        session, err := server.Connect(ctx, &mcp.StdioTransport{}, nil)
+        if err != nil {
+            return err
+        }
+        go pushTaskChannels(ctx, session, client, cmd.String("server"), cmd.String("token"))
+        return session.Wait()
+    },
 }
-go pollEvents(ctx, session, client, task)
-return session.Wait()
 ```
 
-### Runner Integration
+`server.Run` is replaced by the lower-level `server.Connect` + `session.Wait` pair so the bridge can hold a reference to the `ServerSession` and emit notifications from a background goroutine.
 
-The runner (`internal/runner/runner.go`) injects the `xagent` MCP server config into each container. For channels, Claude Code needs a separate `--channels` flag. The runner would need to:
+### The Go MCP SDK gap (relocated, not eliminated)
 
-1. Detect if the Claude Code version supports channels (v2.1.80+)
-2. Pass `--dangerously-load-development-channels server:xagent` during research preview, or `--channels server:xagent` once allowlisted
-3. Add the xagent MCP server to the channels config section
+The Go MCP SDK v1.2.0 (`github.com/modelcontextprotocol/go-sdk`) supports declaring experimental capabilities via `ServerOptions.Capabilities.Experimental`, but **does not expose a public API to send arbitrary notifications**. The internal `handleNotify` and `ServerSession.getConn()` are unexported, and the public notification methods (`NotifyProgress`, `Log`, `ResourceUpdated`) only cover predefined MCP types — none can send `notifications/claude/channel`.
 
-This requires changes to `internal/agent/config.go` and `internal/agent/claude.go` to support the channels CLI flag.
+Resolution options:
 
-### Database Changes
+1. **Upstream `ServerSession.Notify(ctx context.Context, method string, params any) error`.** The smallest possible change: a public method that delegates to the underlying jsonrpc2 connection's existing `Notify`. Strongly preferred because it unblocks the whole Go MCP ecosystem, not just xagent.
+2. **A jsonrpc2-layer wrapper.** The underlying `internal/jsonrpc2.Connection` has a public `Notify(ctx, method, params)`. Build a thin `mcp.Transport` wrapper that retains a reference to the connection before it is handed to the SDK, then call `Notify` directly. Avoids a fork but is hacky.
+3. **Temporary fork.** Add `Notify` in a fork and pin to it until upstream merges.
 
-Add an `events_log` table to store the incremental event stream:
+Note: the **TypeScript/Bun bridge alternative discussed below sidesteps this gap entirely** — the TS SDK supports arbitrary notifications natively. The Go-vs-TS choice is therefore upstream of this list (see Trade-offs and Open Questions).
 
-```sql
-CREATE TABLE task_events (
-    id BIGSERIAL PRIMARY KEY,
-    task_id BIGINT NOT NULL REFERENCES tasks(id),
-    type TEXT NOT NULL,
-    content TEXT NOT NULL,
-    meta JSONB NOT NULL DEFAULT '{}',
-    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
-);
+### What the original draft proposed and why we're dropping it
 
-CREATE INDEX idx_task_events_task_id_id ON task_events(task_id, id);
-```
-
-Events are written by the server when task state changes occur (status updates, new instructions, webhook events). The `xagent mcp` process polls `task_events WHERE task_id = ? AND id > ?`.
+The earlier draft of this proposal called for:
 
-### Permission Relay (Future)
+- A new `task_events` table for an incremental, channel-shaped event log.
+- A new `PollEventsRequest` / `PollEventsResponse` Connect RPC the agent process would poll every few seconds with a cursor.
+- A `claude/channel` capability bolted onto the in-container `xagent mcp` process so it could push events into the *in-container* Claude Code agent.
 
-Permission relay (`claude/channel/permission`) is not in scope for the initial implementation. It would require a trusted sender path (e.g., the web UI or a chat integration) and adds significant complexity around authentication and UX. This can be added later once the one-way channel is proven.
+All three are superseded:
 
-## Trade-offs
+- The C2 server **already publishes a `model.Notification` on every relevant mutation** and **already fans them out over SSE** at `/events`. A new table and a new polling RPC would duplicate that pipeline.
+- The use case has moved to the local-developer Claude that drives the user-facing MCP server, not to in-container agents. The original framing of "replace the in-container agent's `get_my_task` polling with channels" is preserved as future work but is no longer the primary motivation.
 
-### Why modify the existing `xagent mcp` process vs. a separate channel server?
+The "Capability Declaration", "Server.Run Refactoring", and "Runner Integration" sections from the prior draft are replaced by the bridge command and `AddTools` refactor described above; "Database Changes" and "Event Polling" are removed entirely.
 
-A separate channel server (e.g., in TypeScript) would sidestep the Go SDK gap but adds:
-- Node.js runtime dependency in Docker containers
-- IPC between the channel process and the Go MCP process or xagent server
-- Two separate MCP server configs to manage
-- More failure modes
+## Trade-offs
 
-Modifying the existing process keeps everything in one binary, reuses the existing auth/transport, and is architecturally simpler. The Go SDK gap is the only blocker and is solvable.
+### Go bridge vs. TypeScript/Bun bridge
 
-### Why polling vs. server-push (WebSocket/SSE)?
+The prior draft rejected a TypeScript channel server because it would have dragged a Node runtime *into container images*. That objection no longer applies: the bridge runs on the developer's local machine, and **Bun is already a prerequisite for Claude Code Channels** — every official channel plugin in the preview ships as a Bun script. So the runtime cost of TS is essentially zero on a machine that already has channels working.
 
-The `xagent mcp` process communicates with the server over a unix socket HTTP transport. Adding WebSocket or SSE support to the unix socket proxy (`internal/runner/proxy.go`) and the Connect RPC API is significant work. Polling every 2-5 seconds with a cursor is simple, efficient (small payloads), and consistent with the runner's existing polling pattern for task assignment.
+- **Go bridge** (`xagent mcp` subcommand): reuses `xagentclient`, `internal/x/sse`, `internal/server/mcpserver` tool definitions, and `model.Notification` directly. No code duplication; lives in the existing repo, ships with the existing release pipeline (single static binary). Cost: the Go MCP SDK has no public API for arbitrary notifications, so option 1, 2, or 3 above must be picked.
+- **TS/Bun bridge**: the official `@modelcontextprotocol/sdk` server has first-class `notification()` support, so the channel-side problem is trivial. Cost: re-implements the user-facing tool proxying, Connect-client transport, auth token handling, and SSE parsing in TypeScript, and adds a second release artifact in a new language for the project to maintain.
 
-### Why a new `task_events` table vs. reusing existing events?
+This is the central open question (see below).
 
-The existing `events` table stores webhook payloads routed via subscribed links. Channel events are broader — they include task status changes and log forwarding which aren't external webhooks. A dedicated table with a simple monotonic ID makes cursor-based polling trivial and avoids complicating the existing event routing system.
+### Push into the bridge vs. point Claude at the existing HTTP endpoint
 
-## Open Questions
+We could leave `/mcp` as the only entry point and ship a separate, minimal stdio "channel-only" subprocess whose sole job is to translate SSE → notifications. Pros: smaller surface; Claude can keep talking to the proven HTTP endpoint for tools. Cons: developers configure two MCP entries; the channel server still needs auth, an SSE client, and Bun-or-Go runtime — most of the bridge's complexity — without the simplicity win of "one stdio entry replaces everything for local use." We propose the bundled bridge as the default, but the split layout remains a valid alternative.
 
-1. **Go SDK upstream appetite**: Would the `modelcontextprotocol/go-sdk` maintainers accept a `ServerSession.Notify` method? This should be validated before starting implementation.
+### Reusing the existing SSE stream vs. building a per-task subscription
 
-2. **Poll interval**: What's the right balance between responsiveness and load? 2 seconds seems reasonable but should be configurable. Long-polling would be better but requires more transport work.
+The notify SSE stream is per-org: a bridge subscribes once and sees every notification the user's org generates. We could instead build a per-task channel-shaped endpoint and have the bridge open one subscription per task it has touched. Cons: more state on the bridge, more reconnect/lifecycle handling, more endpoints on the server. Pro of going per-task: trivially scoped — no risk of leaking another user's activity into the local Claude. We propose reusing the existing org-scoped stream and filtering in the bridge, but the filtering policy is itself an open question (next section).
 
-3. **Event retention**: How long should `task_events` rows be kept? Options: delete when task is archived, TTL-based cleanup, or keep indefinitely. Archival cleanup is simplest and aligns with existing patterns.
+## Open Questions
 
-4. **Research preview constraints**: Channels require `--dangerously-load-development-channels` for custom servers. Should xagent wait for a stable release, or ship with the development flag during the preview? The flag requires user confirmation on each launch.
+1. **Go or TypeScript/Bun for the bridge?** The container-runtime objection that drove the original Go preference is gone. Decide deliberately whether to absorb the upstream-Notify work (Go) or maintain a second small TypeScript artifact (TS). Both costs are real, and neither option is obviously right.
+2. **Scope of forwarded notifications.** Should the bridge push every task notification on the org's SSE stream, or filter to tasks created by the same user (`Notification.UserID`) or even the same session (`Notification.ClientID`)? The `model.Notification` envelope carries both, so a filter is cheap; the policy choice (and the UX of "I created task 42 from this terminal — only tell me about it" vs. "tell me about everything in my org") is the question.
+3. **Bridge-as-everything vs. channel-only bridge.** Should `xagent mcp` re-expose the user-facing tools alongside the channel (one MCP entry for the local user, as proposed), or stay channel-only and let the user keep pointing Claude at the HTTP `/mcp` endpoint for tools (two entries, sharper layering)?
+4. **How rich should `content` be?** Channel `content` is the `<channel>` tag body. We could send a minimal `"Task 42 updated."` and rely on Claude calling `get_task`, or we could embed a short human-readable summary (status transition, instruction author) to save a round-trip. Richer `content` means the bridge fetches details before emitting, which costs an RPC per change.
+5. **Permission relay.** Two-way channels and the `claude/channel/permission` capability would let xagent prompt the local Claude for approvals (e.g. before running a destructive task action). Out of scope for v1; flagged so we don't paint ourselves into a corner on transport/auth choices.
 
-5. **Which events to start with**: The full set (child status, instructions, external events, child logs) may be too ambitious for v1. Starting with just `external_event` (webhook forwarding) would deliver the highest value with the least new infrastructure, since events already exist in the database.
+## Future work: pushing into in-container agents
 
-6. **Channel vs. tool hybrid**: Should channel events supplement or replace the polling done by `get_my_task`? If Claude receives a channel event about a child completing, it may still need to call `get_my_task` or `list_child_tasks` to get the full details. The channel serves as a notification layer, not a data layer.
+The original framing — replacing the in-container `xagent mcp` process's reliance on the agent polling `get_my_task` with pushed channel events — is still achievable on top of this work once the Go SDK gap (resolution 1, 2, or 3 above) is closed. The in-container agent server already runs over stdio, so adding the capability is mechanically straightforward; the design question is which agent-side state changes (child completions, parent instructions, routed external events, child logs) deserve a push. That work is deferred to a follow-up proposal so this one can ship the local-developer use case first.

From 23b8ea2e27cbea43fe60e903edc071c08cc3c21d Mon Sep 17 00:00:00 2001
From: Ilia Choly <ilia.choly@gmail.com>
Date: Sat, 30 May 2026 13:34:36 +0000
Subject: [PATCH 2/2] docs: address review on channels proposal

Fold in review findings from PR #706:

- Correct vendored Go MCP SDK version: v1.4.1 (v1.6.1 latest); gap
  persists in all of them.
- Split the prior "SDK gap" section into two: declaring the
  claude/channel capability is already supported by the public
  ServerOptions.Capabilities.Experimental and needs no patch;
  sending notifications/claude/channel is handled by a ~30-line
  transport wrapper over the public mcp.Transport, mcp.Connection,
  and jsonrpc.Request APIs.
- Drop upstream ServerSession.Notify as the preferred path: the
  send-only design was rejected by maintainer @jba on go-sdk #898;
  combined send+receive design is tracked in go-sdk #745.
- Drop the temporary-fork option entirely.
- Resolve the Go-vs-TS/Bun open question toward Go now that the
  transport wrapper eliminates Bun's only structural advantage.
- Update the xagent mcp skeleton to thread the wrapped transport
  to the SSE->channel goroutine.
---
 proposals/draft/claude-code-channels.md | 59 ++++++++++++++++---------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/proposals/draft/claude-code-channels.md b/proposals/draft/claude-code-channels.md
index dbc7eaa8..981fca2f 100644
--- a/proposals/draft/claude-code-channels.md
+++ b/proposals/draft/claude-code-channels.md
@@ -184,29 +184,45 @@ var McpCommand = &cli.Command{
 
         mcpserver.AddTools(server, client, cmd.String("server"))
 
-        session, err := server.Connect(ctx, &mcp.StdioTransport{}, nil)
+        // channelTransport wraps StdioTransport and exposes a public
+        // Notify(method, params) for the SSE→channel goroutine. See
+        // "Sending notifications/claude/channel" below.
+        transport := newChannelTransport(&mcp.StdioTransport{})
+        session, err := server.Connect(ctx, transport, nil)
         if err != nil {
             return err
         }
-        go pushTaskChannels(ctx, session, client, cmd.String("server"), cmd.String("token"))
+        go pushTaskChannels(ctx, transport, client, cmd.String("server"), cmd.String("token"))
         return session.Wait()
     },
 }
 ```
 
-`server.Run` is replaced by the lower-level `server.Connect` + `session.Wait` pair so the bridge can hold a reference to the `ServerSession` and emit notifications from a background goroutine.
+`server.Run` is replaced by the lower-level `server.Connect` + `session.Wait` pair so the bridge can hold a reference both to the `ServerSession` (for shutdown) and to the transport wrapper (for sending `notifications/claude/channel` from the background goroutine).
 
-### The Go MCP SDK gap (relocated, not eliminated)
+### The Go MCP SDK: capability vs. notification
 
-The Go MCP SDK v1.2.0 (`github.com/modelcontextprotocol/go-sdk`) supports declaring experimental capabilities via `ServerOptions.Capabilities.Experimental`, but **does not expose a public API to send arbitrary notifications**. The internal `handleNotify` and `ServerSession.getConn()` are unexported, and the public notification methods (`NotifyProgress`, `Log`, `ResourceUpdated`) only cover predefined MCP types — none can send `notifications/claude/channel`.
+The repo currently vendors `github.com/modelcontextprotocol/go-sdk` **v1.4.1**; latest is **v1.6.1**. The relevant API surface is identical across those versions. The prior draft's "SDK gap" was really two separate problems, treated individually below:
 
-Resolution options:
+**Advertising `claude/channel` — already public API.** `ServerOptions.Capabilities.Experimental` is a public `map[string]any` (`protocol.go` ~ L1547 in v1.4.1) and is plumbed into the InitializeResult the server returns to Claude Code. Setting `Experimental: map[string]any{"claude/channel": map[string]any{}}` on stock SDK is sufficient to register the listener; **no patch, fork, or wrapper is needed** for the capability declaration shown in the `xagent mcp` skeleton above.
 
-1. **Upstream `ServerSession.Notify(ctx context.Context, method string, params any) error`.** The smallest possible change: a public method that delegates to the underlying jsonrpc2 connection's existing `Notify`. Strongly preferred because it unblocks the whole Go MCP ecosystem, not just xagent.
-2. **A jsonrpc2-layer wrapper.** The underlying `internal/jsonrpc2.Connection` has a public `Notify(ctx, method, params)`. Build a thin `mcp.Transport` wrapper that retains a reference to the connection before it is handed to the SDK, then call `Notify` directly. Avoids a fork but is hacky.
-3. **Temporary fork.** Add `Notify` in a fork and pin to it until upstream merges.
+**Sending `notifications/claude/channel` — chosen path is a transport wrapper.** The SDK's public notification methods (`NotifyProgress`, `Log`, `ResourceUpdated`) only cover predefined MCP types, and there is no exported general-purpose `Server.Notify(method, params)`. However, the transport-level types are fully exported and sufficient:
 
-Note: the **TypeScript/Bun bridge alternative discussed below sidesteps this gap entirely** — the TS SDK supports arbitrary notifications natively. The Go-vs-TS choice is therefore upstream of this list (see Trade-offs and Open Questions).
+- `mcp.Transport` and `mcp.Connection` are public interfaces (`transport.go` L37–67). `Connection` exposes `Write(context.Context, jsonrpc.Message) error`; its contract documents that "Write may be called concurrently."
+- `jsonrpc.Request`, `jsonrpc.Message`, and `jsonrpc.EncodeMessage` are public re-exports from `github.com/modelcontextprotocol/go-sdk/jsonrpc`. A `&jsonrpc.Request{Method: "notifications/claude/channel", Params: raw}` **with no ID** is, by JSON-RPC 2.0 definition, a notification (see `Request.IsCall()`, `messages.go:110`).
+
+The bridge therefore ships a ~30-line wrapper that:
+
+1. Wraps `mcp.StdioTransport` (`type channelTransport struct { inner mcp.Transport; conn *channelConn }`).
+2. On `Connect`, calls `inner.Connect(ctx)`, retains the returned `Connection`, and returns its own wrapper that delegates `Read`/`Close`/`SessionID` straight through.
+3. Exposes a public `Notify(ctx, method string, params any) error` that JSON-marshals `params` to `json.RawMessage`, constructs `&jsonrpc.Request{Method: method, Params: raw}` (no ID), and calls the wrapped `Connection.Write`.
+4. Holds a `sync.Mutex` around `Write` so injected notification frames cannot interleave with the SDK's own writes. (The `Connection` contract already promises concurrent-safe `Write`, but the lock keeps the framing easy to reason about and matches the reviewer's recommendation.)
+
+The bridge constructs the wrapper with `newChannelTransport(&mcp.StdioTransport{})`, passes it to `server.Connect`, and keeps a handle to it so the background SSE→channel goroutine can call `transport.Notify(ctx, "notifications/claude/channel", params)` directly. The wrapper uses only public API: no fork and no internal-package access.
+
+**Why not upstream `ServerSession.Notify`?** This was the prior draft's preferred option, but it is the wrong bet for now. It was proposed in [`go-sdk` PR #898](https://github.com/modelcontextprotocol/go-sdk/pull/898) (which explicitly cites `notifications/claude/channel` as motivation) and rejected by maintainer @jba: "A send-only solution isn't sufficient. There must be a story on the receive side… let's not write more code until we understand the solution." The unified send/receive design is tracked in [`go-sdk` #745](https://github.com/modelcontextprotocol/go-sdk/issues/745), with competing PRs [#844](https://github.com/modelcontextprotocol/go-sdk/pull/844) and [#956](https://github.com/modelcontextprotocol/go-sdk/pull/956) still in flight. The plan is therefore to ship the transport wrapper now, add a `TODO` referencing #745, and delete the wrapper once upstream lands a combined design. A temporary fork is unnecessary and is dropped from consideration.
+
+The receive-side concern @jba raised matters only if we add **permission relay** (Claude Code → bridge → user → response). That is explicitly out of scope for v1; the send-only one-way "task updated" push is fully covered by the wrapper.
 
 ### What the original draft proposed and why we're dropping it
 
@@ -225,14 +241,16 @@ The "Capability Declaration", "Server.Run Refactoring", and "Runner Integration"
 
 ## Trade-offs
 
-### Go bridge vs. TypeScript/Bun bridge
+### Go bridge vs. TypeScript/Bun bridge — resolved toward Go
+
+The prior draft rejected a TypeScript channel server because it would have dragged a Node runtime *into container images*. The bridge runs locally — Bun is already a Claude Code Channels prerequisite — so the runtime cost of TS is no longer an objection. That reopened the choice.
 
-The prior draft rejected a TypeScript channel server because it would have dragged a Node runtime *into container images*. That objection no longer applies: the bridge runs on the developer's local machine, and **Bun is already a prerequisite for Claude Code Channels** — every official channel plugin in the preview ships as a Bun script. So the runtime cost of TS is essentially zero on a machine that already has channels working.
+The transport-wrapper path described above closes it again, this time on its own merits:
 
-- **Go bridge** (`xagent mcp` subcommand): reuses `xagentclient`, `internal/x/sse`, `internal/server/mcpserver` tool definitions, and `model.Notification` directly. No code duplication; lives in the existing repo, ships with the existing release pipeline (single static binary). Cost: the Go MCP SDK has no public API for arbitrary notifications, so option 1, 2, or 3 above must be picked.
-- **TS/Bun bridge**: the official `@modelcontextprotocol/sdk` server has first-class `notification()` support, so the channel-side problem is trivial. Cost: re-implements the user-facing tool proxying, Connect-client transport, auth token handling, and SSE parsing in TypeScript, and adds a second release artifact in a new language for the project to maintain.
+- **Go bridge** (`xagent mcp` subcommand): reuses `xagentclient`, `internal/x/sse`, the `mcpserver` tool definitions, and `model.Notification` directly. No code duplication, single static binary, ships through the existing release pipeline. The "no public arbitrary-notify API" cost that previously offset these gains is paid by ~30 lines of transport-wrapper code with 100% public-API surface.
+- **TS/Bun bridge**: the official `@modelcontextprotocol/sdk` server has first-class `notification()` support, so the channel-side problem is trivial — but the bridge would re-implement Connect-RPC tool proxying, auth token handling, and SSE parsing in TypeScript and introduce a second release artifact in a new language for the project to maintain.
 
-This is the central open question (see below).
+Once `Notify` is no longer a real engineering cost on the Go side, the TS bridge's only remaining argument is "native channel support," which the wrapper provides for free. **Go wins.** This trade-off is resolved here rather than left as an open question.
 
 ### Push into the bridge vs. point Claude at the existing HTTP endpoint
 
@@ -244,12 +262,11 @@ The notify SSE stream is per-org: a bridge subscribes once and sees every notifi
 
 ## Open Questions
 
-1. **Go or TypeScript/Bun for the bridge?** The container-runtime objection that drove the original Go preference is gone. Decide deliberately whether to absorb the upstream-Notify work (Go) or maintain a second small TypeScript artifact (TS). Both are real, neither is obviously right.
-2. **Scope of forwarded notifications.** Should the bridge push every task notification on the org's SSE stream, or filter to tasks created by the same user (`Notification.UserID`) or even the same session (`Notification.ClientID`)? The `model.Notification` envelope carries both, so a filter is cheap; the policy choice (and the UX of "I created task 42 from this terminal — only tell me about it" vs. "tell me about everything in my org") is the question.
-3. **Bridge-as-everything vs. channel-only bridge.** Should `xagent mcp` re-expose the user-facing tools alongside the channel (one MCP entry for the local user, as proposed), or stay channel-only and let the user keep pointing Claude at the HTTP `/mcp` endpoint for tools (two entries, sharper layering)?
-4. **How rich should `content` be?** Channel `content` is the `<channel>` tag body. We could send a minimal `"Task 42 updated."` and rely on Claude calling `get_task`, or we could embed a short human-readable summary (status transition, instruction author) to save a round-trip. Richer `content` means the bridge fetches details before emitting, which costs an RPC per change.
-5. **Permission relay.** Two-way channels and the `claude/channel/permission` capability would let xagent prompt the local Claude for approvals (e.g. before running a destructive task action). Out of scope for v1; flagged so we don't paint ourselves into a corner on transport/auth choices.
+1. **Scope of forwarded notifications.** Should the bridge push every task notification on the org's SSE stream, or filter to tasks created by the same user (`Notification.UserID`) or even the same session (`Notification.ClientID`)? The `model.Notification` envelope carries both, so a filter is cheap; the policy choice (and the UX of "I created task 42 from this terminal — only tell me about it" vs. "tell me about everything in my org") is the question.
+2. **Bridge-as-everything vs. channel-only bridge.** Should `xagent mcp` re-expose the user-facing tools alongside the channel (one MCP entry for the local user, as proposed), or stay channel-only and let the user keep pointing Claude at the HTTP `/mcp` endpoint for tools (two entries, sharper layering)?
+3. **How rich should `content` be?** Channel `content` is the `<channel>` tag body. We could send a minimal `"Task 42 updated."` and rely on Claude calling `get_task`, or we could embed a short human-readable summary (status transition, instruction author) to save a round-trip. Richer `content` means the bridge fetches details before emitting, which costs an RPC per change.
+4. **Permission relay.** Two-way channels and the `claude/channel/permission` capability would let xagent prompt the local Claude for approvals (e.g. before running a destructive task action). Out of scope for v1; flagged because permission relay would need the receive-side story that is still being designed upstream in [`go-sdk` #745](https://github.com/modelcontextprotocol/go-sdk/issues/745), so picking it up later is bounded by that upstream design.
 
 ## Future work: pushing into in-container agents
 
-The original framing — replacing the in-container `xagent mcp` process's reliance on the agent polling `get_my_task` with pushed channel events — is still achievable on top of this work once the Go SDK gap (resolution 1, 2, or 3 above) is closed. The in-container agent server already runs over stdio, so adding the capability is mechanically straightforward; the design question is which agent-side state changes (child completions, parent instructions, routed external events, child logs) deserve a push. That work is deferred to a follow-up proposal so this one can ship the local-developer use case first.
+The original framing — replacing the in-container `xagent mcp` process's reliance on the agent polling `get_my_task` with pushed channel events — is still achievable on top of this work. The transport wrapper used by the local bridge is reusable as-is inside the agent server, since the agent already runs over stdio. The design question is which agent-side state changes (child completions, parent instructions, routed external events, child logs) deserve a push, and whether the agent should subscribe to its own per-task slice of `model.Notification`s or get a curated stream. That work is deferred to a follow-up proposal so this one can ship the local-developer use case first.