diff --git a/reason/260525-0055-ccxray-auth-design/candidate-A.md b/reason/260525-0055-ccxray-auth-design/candidate-A.md
new file mode 100644
index 0000000..318e4ff
--- /dev/null
+++ b/reason/260525-0055-ccxray-auth-design/candidate-A.md
@@ -0,0 +1,420 @@
+# Candidate A: Bearer-only for machines, HttpOnly session cookie minted via one-shot redemption for browsers, with strict Origin/Host pinning
+
+## 1. Stance / one-sentence summary
+
+**Keep `AUTH_TOKEN` as the single shared secret; require machine clients (Claude Code, Codex, curl, CI) to present it as `Authorization: Bearer ` on every request; for browsers, mint a short-lived `HttpOnly; SameSite=Strict; Path=/` session cookie via a single one-shot redemption endpoint (`GET /_auth?token=` → 302 + `Set-Cookie` + scrub token from URL), then enforce one unified middleware that accepts *either* the bearer header *or* the cookie, validates `Origin`/`Host` against a server-side allowlist on every state-changing request and every upgrade, and treats absence of `AUTH_TOKEN` as "127.0.0.1-only, anonymous OK".**
+
+This is the only design that simultaneously fixes the four real bugs (subresource 401, URL leak, missing CSRF defense, DNS-rebinding exposure) without adding a runtime dependency, without monkey-patching client transports, and without forking the codepath per client class.
+
+---
+
+## 2. Component-level architecture
+
+### 2.1 Modules touched
+
+| Module | Change | LOC impact |
+|---|---|---|
+| `server/auth.js` | Becomes the auth core: token check + cookie issuance + cookie verification + Origin/Host validation + DNS-rebind guard. Single export `authGate(req, res, { kind })`. | Grows from 35 → ~180. Still one file. |
+| `server/index.js` | Replace the single `authMiddleware(...)` call with `authGate(req, res, { kind: classifyRequest(req) })` and wire the upgrade handler. The upgrade handler now calls `authGate` before invoking `handleWebSocketUpgrade`. | ~15 lines changed.
| +| `server/hub.js` | Hub IPC routes (`/_hub/register`, `/_hub/unregister`, `/_hub/health`) bypass `authGate` only when bound to `127.0.0.1` AND the request carries the hub shared secret already written to `~/.ccxray/hub.json` (mode `0600`). | ~10 lines. | +| `server/ws-proxy.js` | Reads `req.ccxrayAuth` set by upgrade gate; rejects upgrade if absent. | ~5 lines. | +| `public/index.html` | No change. | 0 | +| `public/app.js`, `public/miller-columns.js`, `public/sse.js` | **No change.** Cookies attach automatically; SSE and `fetch` calls are already same-origin relative paths. | 0 | +| `README.md` / `CLAUDE.md` | Document the new bootstrap URL and the cookie behavior. | docs only | + +**Auth logic stays in `server/auth.js` (one file) plus the three call sites that compose it. The maintainability rubric is satisfied with margin.** + +### 2.2 Request classifier (`classifyRequest`) + +``` +if req.headers.upgrade?.toLowerCase() === 'websocket' → 'upgrade' +elif url.pathname.startsWith('/_hub/') → 'hub-ipc' +elif url.pathname === '/_auth' || url.pathname === '/_logout' → 'auth-endpoint' +elif method === 'GET' && (pathname === '/' || + pathname.endsWith('.html') || + pathname.endsWith('.css') || + pathname.endsWith('.js') || + pathname === '/favicon.ico') → 'static' +elif pathname === '/_events' → 'sse' +elif pathname.startsWith('/_api/') → 'api' +elif pathname.startsWith('/v1/') || + pathname.startsWith('/v0/') || + pathname === '/anthropic' || + /known upstream prefixes/ → 'upstream-proxy' +else → 'api' // safe default +``` + +The classifier is deterministic and lives in `auth.js`. No per-route bespoke handling — `authGate` reads the kind and applies the right policy. + +### 2.3 Request flows by client class + +#### A. LLM client (Claude Code / Codex CLI) + +``` +Claude Code → POST /v1/messages + Authorization: Bearer (ccxray's gate token) + x-api-key: sk-ant-... 
(Anthropic's own key, untouched) + host: localhost:5577 + +ccxray: + classifyRequest → 'upstream-proxy' + authGate: + - if AUTH_TOKEN unset: require remote == 127.0.0.1; allow + - else: require Authorization: Bearer ; cookie ignored on this path + - Origin check skipped for upstream-proxy (no browser issues these) + - Host check: enforced (rebind guard) + → forward to Anthropic via forwardRequest() +``` + +The LLM client uses the bearer **exclusively**. We never set a cookie on these responses (Codex/Claude Code don't have a cookie jar that would matter, and we don't want one). + +Implementation note: the CLI launcher (`server/providers.js`) already injects `ANTHROPIC_BASE_URL`. We extend it: when `AUTH_TOKEN` is set, inject `ANTHROPIC_AUTH_TOKEN=` as an *additional* header via the launcher's env (Anthropic SDK supports a custom auth header; Codex supports `-c request_headers`). Users who run their CLI outside the launcher set the header themselves with one `export` line. This keeps the "≤ 1 step beyond setting `AUTH_TOKEN`" rubric. + +#### B. 
Dashboard browser + +``` +First visit: + User opens http://localhost:5577/?token= (this is the only URL form documented) + + ccxray sees pathname='/' with ?token=...: + classifyRequest → 'static' (after a redirect step, see below) + authGate (static, no cookie, has ?token): + → 302 to /_auth?token=&next=/ + + GET /_auth?token=&next=/ + classifyRequest → 'auth-endpoint' + authGate: + - constant-time compare token to AUTH_TOKEN + - if match: + mint random 32-byte session id `S` (crypto.randomBytes(32).toString('base64url')) + store sha256(S) in in-memory Set with expiry now+8h + response: 302 / + Set-Cookie: + ccxray_session=; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800 + (Secure attribute added if req.headers['x-forwarded-proto']==='https' OR server is bound to non-loopback AND CCXRAY_FORCE_SECURE_COOKIE=1) + - if mismatch: 401, no cookie set + +Subsequent requests (HTML, .css, .js, /_api/*, /_events): + Browser auto-sends Cookie: ccxray_session= + authGate: + - extract cookie, sha256, lookup in valid-set + - if valid AND request kind in {static, sse, api}: + - if kind in {api, sse} AND method != GET: enforce Origin/Host CSRF check + - allow + - else: 401 + a tiny 'reauth needed' JSON for /_api/*, redirect to /_login for HTML +``` + +The `/style.css`, `/app.js`, etc. subresource 401 bug is fixed for free because the cookie applies to `Path=/`. SSE works because `EventSource` sends cookies. No `fetch` patching, no `EventSource` patching, no client-side changes. + +#### C. 
CLI / scripts / curl
+
+```
+curl -H 'Authorization: Bearer ' http://localhost:5577/_api/entries?limit=10
+
+  classifyRequest → 'api'
+  authGate:
+    - bearer present and equals AUTH_TOKEN → allow
+    - cookie path not taken
+    - CSRF Origin check: SKIPPED when authenticated by bearer (bearer cannot be sent by a browser victim cross-origin without explicit JS; that JS must be running on a page that already has the bearer, which is the attacker's problem, not ours)
+    - Host check: enforced
+```
+
+Bearer-authenticated requests are **exempt from the Origin CSRF check** (the Host check still applies, as shown above) because the cross-site forgery class (browser ambient credentials) does not apply to a header an attacker page cannot add to a request without already having the secret. This is the same reasoning the OWASP CSRF cheat sheet uses to justify the "custom header" pattern.
+
+#### D. WebSocket upgrade (Codex `/v1/responses`, `/v1/realtime`)
+
+```
+server.on('upgrade', (req, socket, head) => {
+  if (!authGate.forUpgrade(req, socket)) { // writes 401 to socket and destroys it
+    return;
+  }
+  handleWebSocketUpgrade(req, socket, head);
+});
+```
+
+The upgrade gate accepts **only** `Authorization: Bearer ` (codex sets this when launched via ccxray's launcher; users running codex by hand set it explicitly). It rejects cookie auth on upgrades to prevent the cookie-CSRF-over-WS class: a browser `new WebSocket()` does send cookies but cannot set a bearer header, so accepting the cookie here would re-open the same hole closed for state-changing API calls.
+
+---
+
+## 3. Concrete protocol details
+
+### 3.1 Endpoints added
+
+| Method | Path | Purpose | Auth |
+|---|---|---|---|
+| `GET` | `/_auth?token=&next=` | Token-to-cookie redemption. 302 + `Set-Cookie` on success. | constant-time compares `T` to `AUTH_TOKEN` |
+| `POST` | `/_logout` | Invalidate current session. 204 + `Set-Cookie: ccxray_session=; Max-Age=0`. | cookie OR bearer |
+
+That's it. Two endpoints. Everything else uses the existing route table.
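The redemption step behind `GET /_auth` can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `server/auth.js`: the helper names `safeEqual`, `redeemToken`, and `isValidSession` and the module-level `sessions` map are hypothetical, chosen only to mirror the flow described above (constant-time compare, random 32-byte session id, `sha256(S)` stored with an 8-hour expiry).

```javascript
const crypto = require('node:crypto');

const SESSION_TTL_MS = 8 * 60 * 60 * 1000; // 8 hours, matching Max-Age=28800
const sessions = new Map(); // hex sha256(sessionId) -> expiry timestamp in ms (hypothetical store)

// Constant-time string comparison: hash both sides to equal-length digests
// first, because crypto.timingSafeEqual throws on length mismatch.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a), 'utf8').digest();
  const hb = crypto.createHash('sha256').update(String(b), 'utf8').digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Returns a Set-Cookie header value on success, or null on token mismatch.
function redeemToken(presented, authToken, now = Date.now()) {
  if (!authToken || !safeEqual(presented, authToken)) return null;
  const sessionId = crypto.randomBytes(32).toString('base64url');
  const key = crypto.createHash('sha256').update(sessionId).digest('hex');
  sessions.set(key, now + SESSION_TTL_MS); // only the hash is kept server-side
  return `ccxray_session=${sessionId}; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800`;
}

// Checks a session id taken from the Cookie header; expired entries are dropped.
function isValidSession(sessionId, now = Date.now()) {
  const key = crypto.createHash('sha256').update(String(sessionId)).digest('hex');
  const exp = sessions.get(key);
  if (exp === undefined) return false;
  if (exp <= now) { sessions.delete(key); return false; }
  return true;
}
```

Storing only the hash means a memory dump of the session table does not directly yield usable cookie values; the trade-off, as with any in-memory store, is that sessions do not survive a process restart.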
+
+### 3.2 Cookie
+
+```
+Set-Cookie: ccxray_session=; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800
+```
+
+Attribute rationale:
+
+| Attribute | Value | Why |
+|---|---|---|
+| `HttpOnly` | yes | Blocks XSS-in-conversation-content exfil. Conversation rendering already escapes, but defense in depth. |
+| `SameSite=Strict` | Strict, not Lax | We never need top-level cross-site navigation to authenticate. Strict kills the entire form-POST/cross-origin-fetch CSRF class at the browser layer. |
+| `Path=/` | yes | Fixes the subresource 401 bug: `.css`, `.js`, `/_api/*`, `/_events` all share the same path scope. |
+| `Domain` | **unset** | Makes the cookie host-only: the browser sends it only to the exact hostname that set it (`localhost`), never to a parent or sibling domain, which removes the parent-domain surface some DNS rebind variants rely on. Cookies are not port-scoped, so the port is not part of this binding. |
+| `Secure` | conditional | Set when serving over TLS (proxy in front) or when `CCXRAY_FORCE_SECURE_COOKIE=1`. Loopback HTTP requires omitting it. |
+| `Max-Age=28800` | 8 hours | One working day. Re-redemption with the bookmarked `/_auth?token=...` URL is one click. |
+
+### 3.3 Wire-format examples
+
+**Browser bootstrap (success):**
+
+```
+GET /_auth?token=hunter2&next=/ HTTP/1.1
+Host: localhost:5577
+
+HTTP/1.1 302 Found
+Location: /
+Set-Cookie: ccxray_session=2k9wQ7sZk-7vJjP1AaBb-zQyXxRrTt9LpKkMnB0qHcU; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800
+Cache-Control: no-store
+Vary: Cookie
+Content-Length: 0
+```
+
+**Authenticated dashboard request (subsequent):**
+
+```
+GET /_api/entries?limit=10 HTTP/1.1
+Host: localhost:5577
+Cookie: ccxray_session=2k9wQ7sZk-7vJjP1AaBb-zQyXxRrTt9LpKkMnB0qHcU
+Origin: http://localhost:5577
+
+HTTP/1.1 200 OK
+Content-Type: application/json
+Vary: Cookie, Origin
+...
+```
+
+**CLI / curl (unchanged):**
+
+```
+GET /_api/entries?limit=10 HTTP/1.1
+Host: localhost:5577
+Authorization: Bearer hunter2
+
+HTTP/1.1 200 OK
+Content-Type: application/json
+...
+``` + +**WS upgrade from codex:** + +``` +GET /v1/responses HTTP/1.1 +Host: localhost:5577 +Upgrade: websocket +Connection: Upgrade +Sec-WebSocket-Key: ... +Sec-WebSocket-Version: 13 +Authorization: Bearer hunter2 +openai-beta: responses_websockets=* +``` + +**State-changing API call from dashboard (CSRF-protected):** + +``` +POST /_api/intercept/abc123/approve HTTP/1.1 +Host: localhost:5577 +Origin: http://localhost:5577 +Cookie: ccxray_session=2k9wQ7sZk-7vJjP1AaBb-zQyXxRrTt9LpKkMnB0qHcU +Content-Type: application/json +Content-Length: 0 + +HTTP/1.1 200 OK +``` + +### 3.4 The Origin/Host check (rebind + CSRF in one shot) + +`authGate` builds the allowlist once at boot from the bind address(es) and the `CCXRAY_PUBLIC_ORIGINS` env (comma-separated, optional): + +``` +allowedHosts = new Set([ + `localhost:${PORT}`, `127.0.0.1:${PORT}`, `[::1]:${PORT}`, + ...envSplit('CCXRAY_PUBLIC_ORIGINS') // e.g. "ccxray.devbox.tail-abc.ts.net:5577" +]) +``` + +For every cookie-authenticated request AND every state-changing request regardless of auth method: + +``` +if (!allowedHosts.has(req.headers.host)) → 421 Misdirected Request +if (req.method !== 'GET' && req.method !== 'HEAD') { + const origin = req.headers.origin + if (!origin) → 403 // non-GET without Origin = block (browsers always send Origin on POST since 2020) + const u = new URL(origin) + if (!allowedHosts.has(u.host)) → 403 +} +``` + +This kills DNS rebinding (attacker `evil.com` re-resolves to 127.0.0.1, but `Host: evil.com` is rejected) and kills cross-origin form-POST CSRF (Origin won't match). + +--- + +## 4. Threat-by-threat mitigation table + +| # | Threat | Defense | Layer | Residual risk | +|---|---|---|---|---| +| 1 | **Malicious website CSRF** (form POST or `fetch({credentials:'include'})` against `http://localhost:5577`) | (a) `SameSite=Strict` on cookie blocks the cookie from being sent on any cross-site request, full stop. (b) **Origin check** on all non-GET requests as defense-in-depth. (c) `
` with default `enctype` cannot set `Content-Type: application/json` — and the API only honors JSON for state-changing endpoints — but we don't rely on that. | Cookie + middleware | None. Three independent gates. | +| 2 | **DNS rebinding** (attacker domain re-resolves to 127.0.0.1) | **Host header validation against `allowedHosts`**. Browser sends `Host: attacker.com` after rebind; server returns 421. Cookie also wouldn't be sent because cookie is bound to `localhost:5577`/`127.0.0.1:5577` exactly (no `Domain` attribute). | Middleware + cookie scope | None for the default loopback case. | +| 3 | **Token exfiltration via URL surface** (history, Referer, logs, paste) | The `?token=` URL is used **exactly once**, at first dashboard load, then 302s to `/` with `Cache-Control: no-store` and the cookie set. The token never appears in any URL the browser navigates to after that — history shows `/`, Referer header points at `/`. CLI users continue to use header-only (already best practice). | Bootstrap protocol | The bookmark/URL the user pasted into their address bar still contains the token until they re-bookmark `/`. Documented and acceptable: same risk profile as any "magic link" auth. | +| 4 | **XSS-in-conversation-content** (LLM/tool output rendered in dashboard, attacker injects ` +``` + +`POST /_auth/redeem` reads `X-Ccxray-Bootstrap`, verifies (a) `Sec-Fetch-Site: same-origin`, (b) `Origin` in allowlist, (c) token in `pendingBootstraps` (single-use), then mints the HMAC cookie via `Set-Cookie` and returns 204. + +#### B'. Dashboard browser (steady state) + +``` +GET /_api/entries?limit=10 HTTP/1.1 +Host: localhost:5577 +Cookie: ccxray_s=. 
+Sec-Fetch-Site: same-origin +Sec-Fetch-Mode: cors +Origin: http://localhost:5577 + +dispatch → dashboard → verifyDashboard(req): + - Host in allowedHosts (rebind) → §3.4 + - parse cookie, split payload.hmac + - constant-time HMAC verify → §3.2 + - parse payload {v, n, exp}; reject if exp < now + - CSRF gate (cookie-authenticated only): Sec-Fetch-Site ∈ {same-origin, none} + or fallback to Origin match on state-changing requests +→ allow, route to handler +``` + +Subresources (`style.css`, `app.js`) are explicitly **not gated by the cookie** — they're non-sensitive static assets. This solves the "subresource 401 after cookie clear" UX A introduces. The sensitivity boundary is `/_api/*` and `/_events`, not the static shell. + +#### C. CLI / scripts / curl + +Three accepted forms, all unchanged in their ergonomics: + +```bash +# 1. The unchanged, primary CLI form. Backward-compatible with existing scripts. +curl -H 'Authorization: Bearer ' http://localhost:5577/_api/entries?limit=10 + +# 2. Custom header (recommended for new code; symmetric with upstream domain). +curl -H 'X-Ccxray-Auth: ' http://localhost:5577/_api/entries?limit=10 + +# 3. Upstream domain (always X-Ccxray-Auth only — never Bearer on /v1). +curl -H 'X-Ccxray-Auth: ' \ + -H 'x-api-key: sk-ant-...' \ + http://localhost:5577/v1/messages +``` + +`verifyDashboard` accepts in this order: cookie, `X-Ccxray-Auth` against `K_upstream`, `Authorization: Bearer` against `AUTH_TOKEN`. Bearer-authenticated and `X-Ccxray-Auth`-authenticated requests are exempt from the Sec-Fetch/Origin CSRF gate (the cross-site forgery class requires browser-ambient credentials; a header attacker JS can't add cross-origin without preflight is not browser-ambient). Cookie-authenticated requests are always Sec-Fetch/Origin-gated. + +`K_upstream` is retrievable via `ccxray secret upstream` for piping into CI env files; `AUTH_TOKEN` is the user's choice and known to them already. + +#### D. 
WebSocket upgrade
+
+```js
+server.on('upgrade', (req, socket, head) => {
+  // All current upgrades are on upstream paths (codex /v1/responses, /v1/realtime).
+  // Invariant: no browser opens a WS against ccxray. The dashboard uses SSE.
+  if (!verifyUpstream(req)) {
+    socket.write('HTTP/1.1 401 Unauthorized\r\nConnection: close\r\n\r\n');
+    socket.destroy();
+    return;
+  }
+  handleWebSocketUpgrade(req, socket, head);
+});
+```
+
+The upgrade gate accepts `X-Ccxray-Auth` only. `?token=` on WS URLs is rejected (the leak channel is closed by construction). The browser-can't-set-headers-on-`new WebSocket()` problem does not arise because no browser path opens a WS.
+
+If a future dashboard feature requires browser→ccxray WS, it lives on a dashboard-domain path (e.g. `/_ws/...`) with its own upgrade gate that requires the cookie + Origin/Sec-Fetch. Two gates is honest; one gate trying to handle both is the trap A fell into.
+
+---
+
+## 3. Concrete protocol details
+
+### 3.1 Boot-time secret derivation (HKDF, stateless)
+
+```js
+const root =
+  process.env.AUTH_TOKEN
+    ? crypto.createHash('sha256').update(process.env.AUTH_TOKEN, 'utf8').digest()
+    : readOrCreateEphemeralSecret(); // 32 random bytes in ~/.ccxray/local-secret (0600)
+
+function hkdf(root, label, len = 32) {
+  return Buffer.from(crypto.hkdfSync('sha256', root, Buffer.alloc(0), Buffer.from(label), len));
+}
+
+const K_upstream = hkdf(root, 'ccxray/v1/upstream'); // injected into spawned CLIs
+const K_session = hkdf(root, 'ccxray/v1/session-hmac'); // signs cookies
+const K_bootstrap = hkdf(root, 'ccxray/v1/bootstrap'); // hashes pending bootstrap tokens
+```
+
+Restart with the same `AUTH_TOKEN` re-derives identical keys, so **browser cookies survive restart and hub recycle**. Rotating `AUTH_TOKEN` invalidates everything in one shot.
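The restart and rotation properties follow directly from HKDF being deterministic in its inputs. The snippet below is a small self-contained check of that claim; `deriveKeys` is a hypothetical wrapper around the derivation shown above, and the token strings are placeholders.

```javascript
const crypto = require('node:crypto');

// Hypothetical wrapper: same derivation as above, packaged so it can be
// called with different AUTH_TOKEN values for comparison.
function deriveKeys(authToken) {
  const root = crypto.createHash('sha256').update(authToken, 'utf8').digest();
  const hkdf = (label) =>
    Buffer.from(crypto.hkdfSync('sha256', root, Buffer.alloc(0), Buffer.from(label), 32));
  return {
    upstream: hkdf('ccxray/v1/upstream'),
    session: hkdf('ccxray/v1/session-hmac'),
    bootstrap: hkdf('ccxray/v1/bootstrap'),
  };
}

const first = deriveKeys('example-token');
const afterRestart = deriveKeys('example-token');  // simulates a process restart
const afterRotation = deriveKeys('rotated-token'); // simulates AUTH_TOKEN rotation

console.log(first.session.equals(afterRestart.session));  // true: cookies stay verifiable
console.log(first.session.equals(afterRotation.session)); // false: rotation invalidates them
console.log(first.session.equals(first.upstream));        // false: labels separate the keys
```

The third comparison is the reason for distinct labels: a leaked `K_upstream` does not let its holder forge session cookies, because the two keys are independent outputs of the same root.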
+
+`verifyDashboard` also accepts `Authorization: Bearer ` directly (constant-time compare against the env value); this is the CLI back-compat path, independent of `K_upstream`. The two CLI paths share *capability* (full dashboard access) but use distinct token material so that scripts hard-coded to `AUTH_TOKEN` continue to work indefinitely.
+
+### 3.2 The stateless HMAC session cookie
+
+```
+Cookie value: ccxray_s = base64url(payload) "." base64url(hmac)
+
+payload = JSON.stringify({ v: 1, n: <16B random>, exp: })
+hmac = HMAC-SHA256(K_session, payload_bytes)
+```
+
+Verification (sketch, constant-time):
+
+```js
+function verifyCookie(raw) {
+  const dot = raw.indexOf('.');
+  if (dot <= 0) return null;
+  const payload = Buffer.from(raw.slice(0, dot), 'base64url');
+  const provided = Buffer.from(raw.slice(dot + 1), 'base64url');
+  const expected = crypto.createHmac('sha256', K_session).update(payload).digest();
+  // Always do the same work regardless of length parity:
+  const probe = Buffer.alloc(expected.length);
+  provided.copy(probe, 0, 0, Math.min(provided.length, probe.length));
+  const ok = crypto.timingSafeEqual(probe, expected) && provided.length === expected.length;
+  if (!ok) return null;
+  let obj; try { obj = JSON.parse(payload.toString('utf8')); } catch { return null; }
+  if (!obj || obj.v !== 1) return null;
+  if (typeof obj.exp !== 'number' || obj.exp < Date.now() / 1000) return null;
+  return obj;
+}
+```
+
+Set-Cookie:
+
+```
+Set-Cookie: ccxray_s=; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800
+  [; Secure if CCXRAY_FORCE_SECURE_COOKIE=1 or req.headers['x-forwarded-proto']==='https']
+```
+
+Attributes:
+
+| Attribute | Value | Why |
+|---|---|---|
+| `HttpOnly` | yes | XSS-in-conversation cannot read the cookie (defense in depth over HTML escaping). |
+| `SameSite=Strict` | Strict | We never need cross-site top-level navigation to authenticate. Strict kills the cross-site cookie attach class at the browser layer.
|
+| `Path=/` | yes | Subresources share path scope, so there is no subresource 401 bug. |
+| `Domain` | unset | Host-only cookie: sent only to the exact hostname that set it (`localhost`), never to a parent or sibling domain, which removes the parent-domain attach surface some rebind variants rely on. Cookies are not port-scoped, so the port is not part of this binding. |
+| `Secure` | conditional | Set when behind TLS terminator (`CCXRAY_FORCE_SECURE_COOKIE=1`) or when the upstream `X-Forwarded-Proto: https` is trusted. Omitted on loopback HTTP. |
+| `Max-Age=28800` | 8h | One working day. Survives restarts with same `AUTH_TOKEN`. |
+
+Why stateless HMAC over A's in-memory `Set`:
+
+- **Cookies survive hub idle-shutdown (5s after last client) and crash-recovery.** A's design wipes the set on every hub recycle, forcing a re-redemption every time the hub idles, which happens constantly in normal use. B is correct here; we adopt it.
+- **No sweep required.** Expiry is in the payload; the verifier rejects stale.
+- **Trade-off: no per-session revocation.** Accepted. Revocation primitive is "rotate `AUTH_TOKEN`", which invalidates everything at once. This is the right semantics for a single-secret binary-trust model.
+
+### 3.3 One-time bootstrap token
+
+`ccxray open` mints a token by connecting to the hub's Unix socket and asking. The hub stores it as `HMAC(K_bootstrap, token)` in a small `Map` capped at 8 entries (oldest dropped on insert) with 60-second TTL:
+
+```js
+const tok = crypto.randomBytes(24).toString('base64url'); // 192 bits
+const hashHex = crypto.createHmac('sha256', K_bootstrap).update(tok).digest('hex');
+pendingBootstraps.set(hashHex, Date.now() + 60_000);
+return tok; // returned to CLI over Unix socket
+```
+
+`POST /_auth/redeem`:
+
+```
+POST /_auth/redeem HTTP/1.1
+Host: localhost:5577
+Origin: http://localhost:5577
+Sec-Fetch-Site: same-origin
+X-Ccxray-Bootstrap:
+Content-Type: application/json
+Content-Length: 2
+
+{}
+```
+
+Server checks, in order:
+1. `Host` in allowedHosts (rebind).
+2. `Sec-Fetch-Site === 'same-origin'`. If absent, require `Origin` matches allowedHosts.
+3.
Compute `HMAC(K_bootstrap, tok)`; constant-time lookup in `pendingBootstraps`; delete on match.
+4. On success: mint HMAC session cookie, `Set-Cookie`, `204`.
+5. On failure: `401`, no cookie, log one line.
+
+The bootstrap token is **single-use** and never appears in any URL the browser navigates to: the fragment is scrubbed within milliseconds by `history.replaceState`, and the value travels server-bound only in a custom header on the POST request. Result: not in access logs, not in Referer, not in browser sync of URL bar, not in shell history (it's in the terminal output of `ccxray open`, but that terminal is the user's).
+
+### 3.4 Host & CSRF defense (universal Host, Sec-Fetch primary, Origin fallback)
+
+Boot-time:
+
+```js
+const allowedHosts = new Set([
+  `localhost:${PORT}`, `127.0.0.1:${PORT}`, `[::1]:${PORT}`,
+  ...envSplit('CCXRAY_PUBLIC_ORIGINS') // e.g. "ccxray.dev.tail-abc.ts.net:443"
+]);
+```
+
+**Host check is universal across both domains, with no carve-out.** This is the principled disagreement with A (see §6 below), where A skipped Origin for "upstream-proxy". We do not skip Host *or* CSRF gating asymmetrically per domain; instead we use the right mechanism per domain:
+
+- Upstream domain: CSRF is structurally prevented by the custom-header credential (browsers cannot attach `X-Ccxray-Auth` cross-origin without preflight; we never grant the preflight). Host check still applies → rebind defense.
+- Dashboard domain (cookie path): explicit Sec-Fetch / Origin gate on top of `SameSite=Strict`.
+
+```js
+function checkHostOrReject(req) {
+  if (!allowedHosts.has(req.headers.host)) {
+    return reject(421, 'Misdirected Request');
+  }
+}
+
+function checkCsrfForCookie(req) {
+  const sfs = req.headers['sec-fetch-site'];
+  if (sfs !== undefined) {
+    if (sfs !== 'same-origin' && sfs !== 'none') {
+      return reject(403, 'CSRF: cross-origin with cookie');
+    }
+    return;
+  }
+  // Older browser / non-browser fallback:
+  if (req.method !== 'GET' && req.method !== 'HEAD') {
+    const origin = req.headers.origin;
+    if (!origin) return reject(403, 'CSRF: state-changing without Origin');
+    let u; try { u = new URL(origin); } catch { return reject(403, 'CSRF: bad Origin'); }
+    if (!allowedHosts.has(u.host)) return reject(403, 'CSRF: Origin mismatch');
+  }
+}
+```
+
+Sec-Fetch is the primary gate because it is **forbidden for JavaScript to set** (Fetch Metadata spec). It cleanly distinguishes address-bar nav (`none`) from same-origin fetch (`same-origin`) from cross-site fetch (`cross-site`). Origin is the fallback for clients that don't set Sec-Fetch.
+
+### 3.5 Launcher header injection: never overwrites upstream credentials
+
+The launcher (`server/providers.js`) derives `K_upstream` from the root secret and injects it into the spawned CLI's outbound requests via the provider's first-class per-request-header mechanism. Critically, **it never touches `ANTHROPIC_AUTH_TOKEN` or `ANTHROPIC_API_KEY`**:
+
+| Provider | Mechanism | Notes |
+|---|---|---|
+| Claude Code (Anthropic SDK) | `ANTHROPIC_CUSTOM_HEADERS="X-Ccxray-Auth: "` env (SDK reads this as a documented extension point). Verify against the SDK version at launch; on unsupported versions, fall back to mechanism below. | Does not collide with `ANTHROPIC_AUTH_TOKEN` (a distinct env var the SDK uses for its own auth). |
+| Codex CLI | `-c request_headers='X-Ccxray-Auth='` via the codex per-request-header config. Verified to propagate to the WebSocket upgrade HTTP request.
| Already integrated with ccxray's `-c openai_base_url=…` injection. | +| Generic curl / scripts | Documented one-liner: `curl -H "X-Ccxray-Auth: $(ccxray secret upstream)" …` | `ccxray secret upstream` prints `K_upstream` to stdout once. | +| Provider unable to header-inject reliably | Documented downgrade: `CCXRAY_LOOPBACK_ONLY_FOR_UPSTREAM=1` allows unauthenticated upstream from same-UID loopback peers. | Explicit, opt-in, cost spelled out. | + +The launcher reading `AUTH_TOKEN` (or `~/.ccxray/local-secret`) and deriving `K_upstream` means users never set `K_upstream` themselves. Rotation of `AUTH_TOKEN` automatically rotates `K_upstream` on next spawn. No separate rotation step. + +### 3.6 The `/` GET special case + +`GET /` returns `index.html` regardless of cookie state. `index.html` does not contain any sensitive data — it's the shell. The inline bootstrap script gates everything else: if there is a fragment, it redeems; if there's no fragment and no cookie, it shows the static "No session. Run `ccxray open`" message. + +`/_api/*` and `/_events` are the sensitivity boundary. Without a cookie (or other valid credential), they return `401 {"error":"no_session","hint":"run `ccxray open` in your terminal"}`. `public/app.js` already has reconnect-on-error for SSE; we add one line: on 401 from `/_events`, show the banner and stop reconnect storms. + +Static subresources (`/style.css`, `/app.js`, fonts, icons) return 200 unconditionally — they are not sensitive, and gating them creates the bookmark-with-cleared-cookies broken-page UX without security benefit. + +### 3.7 Hub IPC over Unix domain socket + +``` +~/.ccxray/ (mode 0700) +├── hub.json (mode 0600) ← {pid, sockPath, version, startedAt}, NO secrets +├── hub.sock (mode 0600) ← Unix domain socket +├── hub.log (mode 0600) +├── local-secret (mode 0600) ← present iff AUTH_TOKEN unset +└── logs/ +``` + +Discovery flow (unchanged in intent): +1. Client reads `hub.json` for `{pid, sockPath, version}`. +2. 
Client connects to `sockPath`.
+3. Hub verifies the peer UID (`getpeereid(2)` on macOS/BSD, `SO_PEERCRED` on Linux) and rejects if peer UID ≠ server UID. Node core does not expose peer credentials on `net.Socket`, so this step needs a native addon or an equivalent helper.
+4. Client sends framed messages: `register`, `unregister`, `health`, `bootstrap-token`.
+
+`bootstrap-token` is how `ccxray open` retrieves the one-time URL. No HTTP path serves this.
+
+**Windows fallback:** peer-credential checks of this kind are not available. Use a named pipe at a per-user path (`\\.\pipe\ccxray-`) + a one-time secret in `hub.json` at file ACL = current user only. Documented as a different trust model; equivalent in practice for single-user Windows boxes.
+
+### 3.8 The "no `AUTH_TOKEN`" posture: ephemeral mode
+
+When `AUTH_TOKEN` is unset:
+
+1. At first start, generate 32 random bytes, write to `~/.ccxray/local-secret` (mode `0600`).
+2. Derive `K_upstream`, `K_session`, `K_bootstrap` from that secret via HKDF as usual.
+3. The launcher reads the secret from this file (not env) and computes `K_upstream` for injection.
+4. `verifyDashboard` accepts the cookie path and `X-Ccxray-Auth` against `K_upstream`. The Bearer compat path is disabled in ephemeral mode (there is no env value to compare against); this is documented.
+5. Multi-UID localhost is **not** a privileged source. A request from another UID cannot read `local-secret`, cannot mint a cookie via `ccxray open` (peer-UID gated socket), and cannot present `X-Ccxray-Auth`. It gets 401.
+
+For single-user-laptop developer convenience: `CCXRAY_LOOPBACK_NO_AUTH=1` enables anonymous loopback access (matching the old default), with a loud startup banner. **This is an explicit opt-in, not the silent default.** A's "127.0.0.1 = anonymous OK" was a multi-UID footgun; we close it by default and require a flag to re-open.
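Step 1 of the ephemeral-mode list (create the secret once, then reuse it) can be sketched like this. It is one possible shape for the `readOrCreateEphemeralSecret` helper that §3.1 references, not a confirmed implementation; the `dir` parameter is an assumption added so the function can be pointed at a scratch directory.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical sketch: returns the 32-byte root secret, creating it on first use.
function readOrCreateEphemeralSecret(dir = path.join(os.homedir(), '.ccxray')) {
  fs.mkdirSync(dir, { recursive: true, mode: 0o700 });
  const file = path.join(dir, 'local-secret');
  try {
    // The 'wx' flag makes the write fail with EEXIST when the file already
    // exists, so two processes starting together cannot both install a secret.
    const secret = crypto.randomBytes(32);
    fs.writeFileSync(file, secret, { mode: 0o600, flag: 'wx' });
    return secret;
  } catch (err) {
    if (err.code !== 'EEXIST') throw err;
    const secret = fs.readFileSync(file);
    // A short read can happen if another process has created the file but not
    // finished writing; failing loudly is safer than deriving keys from it.
    if (secret.length !== 32) throw new Error('local-secret has unexpected length');
    return secret;
  }
}
```

Note that the `mode` options are subject to the process umask on POSIX systems and are largely ignored on Windows, so a real implementation may want to verify the resulting permissions after creation.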
+ +Tabular summary: + +| Configuration | Upstream | Dashboard | Hub IPC | +|---|---|---|---| +| `AUTH_TOKEN=` | `X-Ccxray-Auth` required | Cookie OR `Authorization: Bearer ` OR `X-Ccxray-Auth` | Unix socket peer-UID | +| `AUTH_TOKEN` unset (default) | `X-Ccxray-Auth` required (from `~/.ccxray/local-secret`) | Cookie via `ccxray open` OR `X-Ccxray-Auth` | Unix socket peer-UID | +| `AUTH_TOKEN` unset + `CCXRAY_LOOPBACK_NO_AUTH=1` | loopback unauth permitted | loopback unauth permitted | Unix socket peer-UID still required | + +In every mode, hub IPC is gated by peer-UID. There is no configuration where a same-machine other-UID process reaches another UID's ccxray data. + +--- + +## 4. Threat-by-threat mitigation table + +| # | Threat | Defense | Layer | Residual risk | +|---|---|---|---|---| +| 1 | **Malicious website CSRF** (form POST, `fetch({credentials:'include'})`, ``, etc., against `http://localhost:5577`) | (a) **Upstream domain unreachable by browser ambient credential**: `X-Ccxray-Auth` is a non-CORS-simple header → preflight required → no `Access-Control-Allow-Origin` granted → browser blocks. Structurally impossible regardless of cookie. (b) Dashboard cookie has `SameSite=Strict` — not sent on any cross-site request. (c) `Sec-Fetch-Site` enforcement on cookie path — rejects cross-site even if a future browser bug permits the cookie. (d) Origin/Referer fallback for older browsers and non-browser test harnesses. | Architecture + cookie + middleware (three independent gates) | None. There is no "skip CSRF for upstream-proxy" carve-out as in A. | +| 2 | **DNS rebinding** (attacker domain re-resolves to 127.0.0.1) | **Universal Host allowlist** on every request, both domains, no exemption. `Host: evil.com` → 421. Cookie has no `Domain` attribute, so it would not be sent to a rebound host even if Host were spoofed at a higher layer. `CCXRAY_PUBLIC_ORIGINS` provides explicit opt-in for legitimate non-loopback hostnames. 
| Middleware (universal) + cookie scope | None for loopback. For remote deploys, the operator must add the public hostname to `CCXRAY_PUBLIC_ORIGINS`; documented. | +| 3 | **Token exfiltration via URL surface** (history, Referer, logs, paste) | The credential never appears in any URL the server sees. Bootstrap token lives in the URL **fragment** (`#k=…`), which is not sent to the server, not in access logs, not in Referer, not in browser-bar sync (excluded since 2014). Fragment is scrubbed by `history.replaceState` within milliseconds. Bootstrap travels server-bound via `POST` with the token in a custom header (`X-Ccxray-Bootstrap`). Token is **one-time** (60s TTL, single-use). After redemption, cookie carries auth — no token persists. **Legacy `?token=` is removed in Phase 3** (deprecation log in Phase 1, restricted to `/` in Phase 2 with a soft redirect to the new flow). | Bootstrap protocol + URL hygiene | Bookmarking the bootstrap URL is moot (single-use, 60s). Users bookmark `http://localhost:5577/` and re-bootstrap with `ccxray open` as needed. The bookmarked URL contains no secret. | +| 4 | **XSS-in-conversation-content** (LLM/tool output injects ` +``` + +`POST /_auth/redeem` reads the bootstrap token from the `X-Ccxray-Bootstrap` header (not the body, not the URL), validates it against the one-time-use set, mints an HMAC cookie, sets it via `Set-Cookie`, and returns 204. + +**Subsequent requests:** + +``` +GET /_api/entries?limit=10 HTTP/1.1 +Host: localhost:5577 +Cookie: ccxray_s=. 
+Sec-Fetch-Site: same-origin +Sec-Fetch-Mode: cors +Sec-Fetch-Dest: empty + +ccxray dispatcher → dashboard domain → verifyDashboard(req): + - parse cookie, split payload.hmac + - constant-time verify hmac == HMAC(K_session, payload) + - parse payload {nonce, exp}; reject if exp < now + - CSRF gate: require Sec-Fetch-Site ∈ {same-origin, none} + (none = direct address-bar nav; only legal for safe top-level GET) + if absent (legacy/non-browser): require Origin header match OR + require X-Ccxray-Auth header (CLI case) + - Host check enforced +→ allow +``` + +#### C. CLI / scripts / curl + +Two options, both work: + +``` +# 1. Custom header (recommended; symmetric with upstream domain) +curl -H 'X-Ccxray-Auth: ' http://localhost:5577/_api/entries?limit=10 + +# 2. Existing bearer-style header for backward compat +curl -H 'Authorization: Bearer ' http://localhost:5577/_api/entries?limit=10 +``` + +`verifyDashboard` accepts a valid `X-Ccxray-Auth` OR `Authorization: Bearer` OR a valid cookie. CSRF gating is conditional: **only cookie-authenticated requests are checked against `Sec-Fetch-Site` / `Origin`.** Bearer-authenticated requests are exempt — but the upstream domain's CSRF risk is structurally eliminated (see Threat 1 in §4) so this exemption is safe here too: the dashboard's state-changing endpoints are not financially expensive operations, and they require a credential a victim browser cannot present cross-origin. + +Crucially, *the upstream domain and the dashboard domain are completely independent surfaces*. A cookie cannot authenticate `/v1/messages` (the dispatcher routes `/v1/*` to `verifyUpstream`, which only accepts the header). This is the structural fix for WEAKNESS-6 in the critique of A. + +#### D. 
WebSocket upgrade (Codex `/v1/responses`) + +``` +server.on('upgrade', (req, socket, head) => { + // upgrades only occur on upstream paths + if (!verifyUpstream(req)) { + socket.write('HTTP/1.1 401 Unauthorized\r\nConnection: close\r\n\r\n'); + socket.destroy(); + return; + } + handleWebSocketUpgrade(req, socket, head); +}); +``` + +The upgrade gate accepts `X-Ccxray-Auth` only. Codex's launcher integration sets this header on the upstream HTTP base URL it uses for WS (the upgrade request is a normal HTTP request before the protocol switch; arbitrary headers are settable). The dashboard does not open WebSockets — it uses SSE (`EventSource('/_events')`) which works seamlessly with cookies. **The "browser cannot set headers on `new WebSocket()`" problem (WEAKNESS-1 in A's critique) does not exist in B because no browser ever opens a WebSocket against ccxray.** This is by design, documented as an invariant. + +If a future feature requires browser→ccxray WebSocket: it lives on a dashboard-domain path (e.g. `/_ws/dashboard`) and the upgrade gate for that path uses cookie auth + Origin/Sec-Fetch-Site validation, independent from the upstream upgrade gate. Two upgrade gates, one per domain, is honest; one gate that tries to handle both with conflicting rules is not. + +--- + +## 3. Concrete protocol details + +### 3.1 Boot-time secret derivation + +At server start, after reading `AUTH_TOKEN` from env: + +```js +const root = AUTH_TOKEN + ? 
crypto.createHash('sha256').update(AUTH_TOKEN, 'utf8').digest()
+  : crypto.randomBytes(32); // ephemeral if no AUTH_TOKEN; see §3.8
+
+// Plain RFC 5869 HKDF from the Node stdlib: empty salt, label passed as `info`.
+// hkdfSync returns an ArrayBuffer, so wrap it in a Buffer for encoding and comparison.
+function hkdf(root, label, len = 32) {
+  return Buffer.from(crypto.hkdfSync('sha256', root, Buffer.alloc(0), Buffer.from(label), len));
+}
+
+const K_upstream  = hkdf(root, 'ccxray/v1/upstream');     // 32B
+const K_session   = hkdf(root, 'ccxray/v1/session-hmac'); // 32B
+const K_bootstrap = hkdf(root, 'ccxray/v1/bootstrap');    // 32B
+```
+
+`K_upstream` is what the launcher injects as `X-Ccxray-Auth`. **It is not `AUTH_TOKEN` itself**: it is a per-label derived bearer, so one domain's credential can be rotated or revoked on its own (presumably by bumping that label's version, e.g. `ccxray/v2/upstream`) without changing `AUTH_TOKEN`, and ccxray's credential is never coupled to any upstream credential.
+
+`K_session` signs HMAC cookies (no server-side state).
+
+`K_bootstrap` keys the HMAC under which pending one-time bootstrap tokens are stored (a short list of unredeemed tokens lives in memory; see §3.3).
+
+Restart with the same `AUTH_TOKEN` re-derives the same keys, so cookies survive restart. Rotating `AUTH_TOKEN` invalidates everything in one shot.
+
+### 3.2 The stateless HMAC session cookie
+
+Cookie value:
+
+```
+ccxray_s = base64url(payload) "." base64url(hmac)
+
+payload = JSON.stringify({
+  v:   1,                          // version, for future-proofing
+  n:   <16-byte nonce>,            // random per-session
+  exp: <expiry, Unix seconds>
+})
+
+hmac = HMAC-SHA256(K_session, payload)
+```
+
+Verification (constant-time):
+
+```js
+function verifyCookie(raw) {
+  if (typeof raw !== 'string') return null;
+  const [pB64, hB64] = raw.split('.', 2);
+  if (!pB64 || !hB64) return null;
+  const payload = Buffer.from(pB64, 'base64url');
+  const provided = Buffer.from(hB64, 'base64url');
+  const expected = crypto.createHmac('sha256', K_session).update(payload).digest();
+  if (provided.length !== expected.length) {
+    crypto.timingSafeEqual(expected, expected); // fixed work
+    return null;
+  }
+  if (!crypto.timingSafeEqual(provided, expected)) return null;
+  // The HMAC is valid here, so we minted this payload; still guard the parse.
+  let obj;
+  try { obj = JSON.parse(payload.toString('utf8')); } catch { return null; }
+  if (!obj || obj.v !== 1) return null;
+  if (typeof obj.exp !== 'number' || obj.exp < Date.now() / 1000) return null;
+  return obj;
+}
+```
+
+Cookie attributes:
+
+```
+Set-Cookie: ccxray_s=<value>; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800
+            [; Secure if served behind TLS terminator (set CCXRAY_FORCE_SECURE_COOKIE=1)]
+```
+
+Why this design choice over A's `Set`:
+
+- **No server-side state.** Hub idle-shutdown (5s after last client) and crash-recovery do not invalidate cookies *as long as `AUTH_TOKEN` is the same*. The next hub instance derives the same `K_session` and recognizes the cookie. This directly answers WEAKNESS-4 and WEAKNESS-5 in A's critique.
+- **No sweep needed.** The set never exists. `exp` is in the payload; the verifier rejects stale cookies.
+- **Revocation:** changing `AUTH_TOKEN` and restarting invalidates every cookie at once. Per-session revocation is not provided (intentional: the single-secret model has binary trust).
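The section above specifies verification but not minting. A minimal sketch of the minting side, paired with a verifier that follows the same checks, might look like the following. `mintCookie` and `checkCookie` are hypothetical names, the nonce encoding is an assumption, and a random key stands in for the `K_session` derived in §3.1.

```javascript
const crypto = require('node:crypto');

// Assumption: a random 32-byte key stands in for the derived K_session.
const K_session = crypto.randomBytes(32);

// Hypothetical helper: build the "payload.hmac" cookie value.
function mintCookie(ttlSeconds = 28800, nowMs = Date.now()) {
  const payload = Buffer.from(JSON.stringify({
    v: 1,
    n: crypto.randomBytes(16).toString('base64url'), // nonce encoding is an assumption
    exp: Math.floor(nowMs / 1000) + ttlSeconds,
  }), 'utf8');
  const hmac = crypto.createHmac('sha256', K_session).update(payload).digest();
  return `${payload.toString('base64url')}.${hmac.toString('base64url')}`;
}

// Verifier with the same checks as the section's verifyCookie; nowMs is injectable for testing.
function checkCookie(raw, nowMs = Date.now()) {
  if (typeof raw !== 'string') return null;
  const parts = raw.split('.');
  if (parts.length !== 2 || !parts[0] || !parts[1]) return null;
  const payload = Buffer.from(parts[0], 'base64url');
  const provided = Buffer.from(parts[1], 'base64url');
  const expected = crypto.createHmac('sha256', K_session).update(payload).digest();
  if (provided.length !== expected.length) return null;
  if (!crypto.timingSafeEqual(provided, expected)) return null;
  let obj;
  try { obj = JSON.parse(payload.toString('utf8')); } catch { return null; }
  if (!obj || obj.v !== 1 || typeof obj.exp !== 'number') return null;
  return obj.exp < nowMs / 1000 ? null : obj;
}

const cookie = mintCookie();
// Forge a cookie: attacker-chosen payload, HMAC copied from a real cookie.
const forgedPayload = Buffer.from(JSON.stringify({ v: 1, n: 'x', exp: 9999999999 }));
const forged = `${forgedPayload.toString('base64url')}.${cookie.split('.')[1]}`;

console.log(checkCookie(cookie) !== null);                      // fresh cookie verifies
console.log(checkCookie(forged));                               // payload/HMAC mismatch
console.log(checkCookie(cookie, Date.now() + 9 * 3600 * 1000)); // checked 9h later, past Max-Age
```

The forged case changes the payload rather than the signature so that the rejection comes from the HMAC comparison itself, which is the property the stateless design relies on.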
+
+### 3.3 The one-time bootstrap token
+
+When `ccxray open` is invoked locally (via the same Unix socket used for hub IPC, see §3.7), the running ccxray process:
+
+```js
+// pendingBootstraps: Map from hex HMAC digest to expiry (ms since epoch)
+function mintBootstrapToken() {
+  const tok = crypto.randomBytes(24).toString('base64url'); // 192 bits
+  const tokHash = crypto.createHmac('sha256', K_bootstrap).update(tok).digest('hex');
+  pendingBootstraps.set(tokHash, Date.now() + 60_000);
+  return tok; // returned over the Unix socket to the CLI
+}
+```
+
+`pendingBootstraps` is a small `Map`, keyed by the token's HMAC digest and swept on insert. Maximum 8 entries at a time (oldest entries dropped on insert). At 60-second TTL and one bootstrap per `ccxray open`, contention is non-existent.
+
+`POST /_auth/redeem`:
+
+```
+POST /_auth/redeem HTTP/1.1
+Host: localhost:5577
+X-Ccxray-Bootstrap: <token>
+Content-Type: application/json
+Content-Length: 2
+
+{}
+```
+
+Server:
+
+1. Reject unless `Sec-Fetch-Site: same-origin` (defense: prevents `evil.com` POSTing a stolen bootstrap token from a phishing message).
+2. Reject unless `Origin` matches an allowlisted host (defense in depth).
+3. Hash the incoming token, look up the digest in `pendingBootstraps`, check its expiry, and delete the entry on success. The lookup operates on a fixed-length keyed digest, never on the raw token.
+4. If valid: mint the HMAC session cookie via `Set-Cookie`, respond 204.
+5. If invalid: 401, no cookie.
+
+The bootstrap token is **single-use**: redemption removes the entry.
+
+### 3.4 Host & Sec-Fetch CSRF / rebind defense
+
+`verifyDashboard` builds the allowlist at boot:
+
+```js
+const allowedHosts = new Set([
+  `localhost:${PORT}`, `127.0.0.1:${PORT}`, `[::1]:${PORT}`,
+  ...envSplit('CCXRAY_PUBLIC_ORIGINS') // e.g. "ccxray.devbox.tail-abc.ts.net"
+]);
+```
+
+For every dashboard request:
+
+```js
+if (!allowedHosts.has(req.headers.host)) {
+  return reject(421, 'Misdirected Request'); // DNS rebinding defense
+}
+```
+
+For cookie-authenticated dashboard requests:
+
+```js
+const sfs = req.headers['sec-fetch-site'];
+const safeMethod = req.method === 'GET' || req.method === 'HEAD';
+if (sfs !== undefined) {
+  // modern browser path
+  if (sfs !== 'same-origin' && sfs !== 'none') {
+    return reject(403, 'CSRF: cross-origin request with cookie');
+  }
+  // 'none' is a user-initiated navigation; only legal for safe methods
+  if (sfs === 'none' && !safeMethod) {
+    return reject(403, 'CSRF: state-changing request without initiator');
+  }
+} else if (!safeMethod) {
+  // older browser or non-browser; fall back to Origin
+  const origin = req.headers.origin;
+  if (!origin) return reject(403, 'CSRF: state-changing request without Origin');
+  let u;
+  try { u = new URL(origin); } catch { return reject(403, 'CSRF: malformed Origin'); }
+  if (!allowedHosts.has(u.host)) return reject(403, 'CSRF: Origin mismatch');
+}
+```
+
+Why `Sec-Fetch-*` as the *primary* gate (with Origin as fallback) rather than the reverse:
+
+- `Sec-Fetch-Site` is a **forbidden** header name, so JavaScript cannot set it (Fetch Metadata Request Headers spec). The browser sets it itself, with values that distinguish a user-initiated navigation such as the address bar (`none`) from `same-origin`, `same-site`, and `cross-site` requests. This is *exactly* what CSRF defense wants to know.
+- `Origin` has known edge cases (some browsers omit it on same-origin GET; Safari has historical quirks; reflection by intermediaries). Fine as a fallback but not as the primary gate.
+- Fetch Metadata headers are sent by current releases of all major browser engines (this design assumes ≥ 95% coverage by 2026); the fallback path is only hit by stripped-down embeds and non-browser clients (curl, which we want to treat differently anyway).
+
+### 3.5 Launcher header injection: *not* `ANTHROPIC_AUTH_TOKEN`
+
+This is the structural fix for WEAKNESS-7 in A's critique.
+
+The launcher's job is to make the spawned CLI add `X-Ccxray-Auth: <K_upstream>` to its outbound HTTP requests, **without overwriting `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, or any upstream credential.**
+
+Per-provider mechanism:
+
+| Provider | Mechanism | Verified path |
+|---|---|---|
+| Claude Code (Anthropic SDK) | Set env `ANTHROPIC_CUSTOM_HEADERS="X-Ccxray-Auth: <K_upstream>"` (an environment variable read by Claude Code itself, not by the SDK) | The SDK has `defaultHeaders` per-client; the env-var bridge is provided by Claude Code. Verified for the current version; if support is dropped, use the fallback below. |
+| Codex CLI | `-c request_headers='X-Ccxray-Auth=<K_upstream>'` (codex per-request header config). If unavailable: write `~/.codex/request_headers.toml` with the header before spawn, restore on exit. | Codex does support `request_headers` for HTTP transport. For the WebSocket upstream upgrade, `request_headers` must also propagate to the upgrade HTTP request; this is to be verified empirically before adopting. |
+| Generic curl / scripts | User sets the header themselves; documented one-liner in README. | Explicit. |
+| Fallback for any client we can't header-inject reliably | Run ccxray with `CCXRAY_LOOPBACK_ONLY_FOR_UPSTREAM=1` to allow unauthenticated upstream from same-UID loopback peers (see §3.8). | Documented downgrade with cost spelled out. |
+
+**Critically: the launcher reads `AUTH_TOKEN` and computes `K_upstream` via HKDF.** Users do not set `K_upstream` themselves. They set `AUTH_TOKEN` once; ccxray derives the rest. If `AUTH_TOKEN` is rotated, `K_upstream` changes automatically and the launcher re-derives it, so there is no separate rotation step. There is no collision with `ANTHROPIC_AUTH_TOKEN` because we never touch that env var.
+
+### 3.6 The `/` GET special case
+
+Loading `/` *without* a session cookie returns the static `index.html` (200). Together with the static assets described below, this is the only dashboard content served without a cookie, and it does not reveal any sensitive data: `index.html` is the same shell for everyone, and the inline script gates everything else.
+
+If a request to `/_api/*` or `/_events` arrives without a cookie, return 401 with a JSON body ``{"error":"no_session","hint":"run `ccxray open` in your terminal"}``. The app.js handler converts this to a banner instead of attempting reconnect storms.
+
+Loading `/style.css`, `/app.js`, etc. without a cookie returns 200 (these are not sensitive). Loading them *with* an expired cookie also returns 200 (we don't want to break asset caching). The sensitivity boundary is the API and event stream, not the static shell.
+
+This deliberately differs from A: A gates static assets via cookie, which means a bookmarked `/` that has lost its cookie shows a broken page (subresources 200 but data calls 401). B shows a clean "no session" message in `index.html` itself, because `index.html` is always reachable.
+
+### 3.7 Hub IPC over Unix domain socket
+
+This is the structural fix for the HTTP-IPC surface area in A.
+
+Hub now listens on:
+
+- `~/.ccxray/hub.sock` (Unix domain socket)
+- `~/.ccxray/` directory permission: `0700`
+- Socket file permission: `0600` (auto-applied; parent dir restriction provides defense in depth)
+
+Discovery flow:
+
+1. Client process reads `~/.ccxray/hub.json` for `{ pid, sockPath, version }` (no shared secrets in the lockfile anymore).
+2. Client connects to `sockPath`.
+3. Server obtains the peer's credentials from the OS (`getpeereid(2)` on macOS/BSD, `SO_PEERCRED` on Linux) and verifies that the peer UID matches its own `process.getuid()`. Reject otherwise. Node's `net` module has no documented peer-credential API, so how the server reads them (an internal `_handle` hook or a small native helper) is an open implementation detail to settle before adopting.
+4. Client sends framed messages: `register`, `unregister`, `health`, `bootstrap-token`.
+
+`bootstrap-token` is how `ccxray open` retrieves the one-time URL: the CLI connects to the hub's socket, requests a token, the hub mints one and returns it, the CLI prints the URL.
No HTTP involved. + +On platforms where peer UID is unavailable (Windows): fall back to a `0700` mode named pipe + a one-time secret in `hub.json` at file mode `0600`. Documented platform-specific footnote. + +**Effect:** the HTTP listener no longer serves any privileged IPC surface. There is no `/_hub/*` path. The HTTP layer's only customers are the LLM clients (upstream domain) and the dashboard (dashboard domain). The attack surface from "HTTP request to `/_hub/foo` from a same-machine other-UID process" is eliminated, not just defended. + +### 3.8 The "no AUTH_TOKEN" posture — multi-UID localhost stance + +This is the structural fix for WEAKNESS-8 in A's critique. A's "127.0.0.1 = anonymous OK" is a real security regression on multi-tenant dev hosts. + +When `AUTH_TOKEN` is unset, ccxray operates in **ephemeral mode**: + +1. At startup, ccxray generates an ephemeral `AUTH_TOKEN` equivalent (random 32 bytes) and writes it to `~/.ccxray/local-secret` at mode `0600`. This is a *file*, not an env var (so it doesn't show up in `ps eww` of other UIDs). +2. The bootstrap CLI (`ccxray open`) reads `~/.ccxray/local-secret` (which only the owning UID can read) and uses it to mint the URL. +3. The upstream domain also requires `X-Ccxray-Auth` derived from this ephemeral secret. The launcher reads it from the same file at spawn time. +4. Multi-UID localhost is *not* a privileged source. A request from another UID without the header is rejected with 401. + +For users who genuinely want anonymous loopback (developer convenience on a single-user laptop): `CCXRAY_LOOPBACK_NO_AUTH=1` enables it, with a startup banner warning and a `~/.ccxray/local-secret` that is world-readable to document the choice. This makes the footgun explicit, opt-in, and visible — opposite of A's silent default. 
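As an illustration of step 1 of ephemeral mode, the following is a minimal sketch of creating the secret file with owner-only permissions. `loadOrCreateLocalSecret` is a hypothetical name, and reusing an existing file across restarts is an assumption (the text only says the secret is generated at startup). The demo writes to a temporary directory rather than `~/.ccxray`.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: create the secret if absent, otherwise read the existing one.
function loadOrCreateLocalSecret(dir) {
  fs.mkdirSync(dir, { recursive: true, mode: 0o700 });
  const file = path.join(dir, 'local-secret');
  try {
    // 'wx' fails with EEXIST if the file is already there, so two racing
    // starts cannot both write; mode 0o600 keeps other UIDs from reading it.
    fs.writeFileSync(file, crypto.randomBytes(32).toString('base64url'),
      { mode: 0o600, flag: 'wx' });
  } catch (err) {
    if (err.code !== 'EEXIST') throw err;
  }
  return fs.readFileSync(file, 'utf8');
}

// Demo against a throwaway directory instead of the real ~/.ccxray.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ccxray-demo-'));
const first = loadOrCreateLocalSecret(demoDir);
const second = loadOrCreateLocalSecret(demoDir);
console.log(first === second); // the second call reads the file the first one created
```

The value read here would play the role of `AUTH_TOKEN` in the §3.1 derivation, which is why the launcher and `ccxray open` can both recover the same keys from the file.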
+ +Tabular summary: + +| Configuration | Upstream domain | Dashboard domain | Hub IPC | +|---|---|---|---| +| `AUTH_TOKEN=` | requires `X-Ccxray-Auth` | requires cookie or `X-Ccxray-Auth` | Unix socket peer-UID | +| `AUTH_TOKEN` unset (default) | requires `X-Ccxray-Auth` (from ~/.ccxray/local-secret) | requires cookie minted via `ccxray open` | Unix socket peer-UID | +| `AUTH_TOKEN` unset + `CCXRAY_LOOPBACK_NO_AUTH=1` | allows loopback unauthenticated | allows loopback unauthenticated | Unix socket peer-UID still required | + +In every mode, hub IPC is gated by peer-UID. There is no configuration where "shared dev box, other UID" can reach the dashboard data of another UID's ccxray process. + +--- + +## 4. Threat-by-threat mitigation table + +| # | Threat | Defense | Layer | Residual risk | +|---|---|---|---|---| +| 1 | **Malicious website CSRF** | Multi-layered, with the structural fix being domain segregation. (a) Upstream domain (`/v1/*`) cannot receive browser-ambient credentials at all — `X-Ccxray-Auth` is a custom header, triggers CORS preflight, no allowed origin → browser blocks. The state-changing `POST /v1/messages` CSRF described in A's WEAKNESS-6 is structurally impossible. (b) Dashboard domain cookie has `SameSite=Strict` → not sent cross-site. (c) `Sec-Fetch-Site` enforcement → reject cross-site cookie use even if a browser bug sets the cookie. (d) Origin/Host fallback for older browsers. | Architecture + cookie + middleware | None at any layer. | +| 2 | **DNS rebinding** | Host header allowlist enforced on every dashboard request (no exemption for any auth method). Cookie has no `Domain` attribute (locked to exact host). Upstream domain also enforces Host — even with `X-Ccxray-Auth` valid, a `Host: evil.com` request is rejected. **Unlike A, there is no "Origin check skipped for upstream-proxy" carve-out.** Host validation is universal across both domains. | Middleware (universal) + cookie scope | None for loopback. 
For remote deploys, `CCXRAY_PUBLIC_ORIGINS` must list the exact public hostname; documented. | +| 3 | **Token exfiltration via URL surface** | Bootstrap token lives in URL **fragment** (`#k=…`), not query string. Fragments are never sent to the server, never logged, not in Referer, excluded from URL sync. The fragment is scrubbed by `history.replaceState` within milliseconds of page load. Server-side bootstrap is a `POST` with the token in a custom header (`X-Ccxray-Bootstrap`). The token is **one-time** (60s TTL, single-use). After redemption, the session cookie carries auth — the token does not persist anywhere. **A's "?token= appears in URL twice" issue (WEAKNESS-3) is structurally absent: B never puts a token in the query string.** | Bootstrap protocol | Bookmarking the bootstrap URL is futile (token expires in 60s, single-use). User would bookmark `http://localhost:5577/` (no fragment) and re-bootstrap via `ccxray open` when needed. Documented. | +| 4 | **XSS-in-conversation-content** | (a) `HttpOnly` cookie cannot be read by attacker JS → no credential exfiltration. (b) `K_upstream` is **not** present in any browser-accessible location — it lives in the server process and in `~/.ccxray/local-secret` (file mode `0600`). Browser-side XSS cannot reach it. (c) `Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline'` (the inline bootstrap script is the only inline JS; CSP nonces if we want to remove that residual) prevents script-src exfil channels. (d) Existing HTML escaping in `entry-rendering.js` is the primary preventive layer. The XSS-to-active-session residual (attacker JS making authenticated fetches) is accepted, as in A, but B additionally prevents the *worse* outcome of credential theft (upstream key never accessible from JS context). | Cookie + key isolation + CSP | Same residual as A on the "active-session" axis; strictly stronger on credential isolation. 
| +| 5 | **Plain-HTTP eavesdropping** | Constraint-limited; same posture as A. ccxray does not terminate TLS. For remote deploy, document the TLS-terminator setup; set `CCXRAY_FORCE_SECURE_COOKIE=1`. B additionally documents: when behind a TLS terminator, set the terminator to strip `X-Ccxray-Auth` from incoming requests if they originate outside the trusted network — this protects the upstream domain credential from accidental exposure in logs/dumps. | Documentation + Secure flag + ops guidance | Accepted by constraint. | +| 6 | **Token leak via WebSocket upgrade URL** | The WebSocket upgrade gate accepts `X-Ccxray-Auth` header only — same as the upstream HTTP path. Query-string `?token=` is rejected. The dashboard never opens a WebSocket against ccxray (it uses SSE), so the "browser cannot set headers on `new WebSocket()`" problem (A's WEAKNESS-1) does not arise. **This is structural, not aspirational.** | Upgrade middleware + invariant (no browser WS) | None. | + +Additional threats addressed (not in the original 6 but raised by A's critique): + +| # | Threat | Defense | +|---|---|---| +| 7 | **Hub HTTP IPC reachable cross-UID** | Hub IPC moved to Unix domain socket with peer-UID check. The HTTP listener no longer serves `/_hub/*`. (Architectural fix for the surface that produced WEAKNESS-8 in A.) | +| 8 | **Credential collision with `ANTHROPIC_AUTH_TOKEN`** | ccxray's credential is `X-Ccxray-Auth`, never `ANTHROPIC_AUTH_TOKEN`. The launcher uses provider-specific custom-header injection (`ANTHROPIC_CUSTOM_HEADERS`, codex `-c request_headers`) that does not touch upstream credentials. (Fix for A's WEAKNESS-7.) | +| 9 | **Bootstrap token replay across browser-sync devices** | Token in fragment never reaches Chrome Sync of the URL bar. Even if a user copy-pastes the URL to another device within 60 seconds: token is single-use; after redemption on device A, device B's POST `/_auth/redeem` fails. | + +--- + +## 5. 
Migration path + +The migration is structurally larger than A's (because B replaces the HTTP IPC surface and adds CLI bootstrap), but staged so no working setup breaks mid-version. + +### 5.1 Backward-compatibility surface + +| Old behavior | New behavior | Phase | +|---|---|---| +| `Authorization: Bearer ` on `/_api/*` | Accepted as alias for `X-Ccxray-Auth: ` during one deprecation cycle (server detects bearer == AUTH_TOKEN, treats as `K_upstream` match). Deprecation log line. Removed in vN+2. | Phase 1 | +| `Authorization: Bearer ` on `/v1/*` | Same as above, with deprecation log. | Phase 1 | +| `?token=` query parameter | Phase 1: accepted with `X-Ccxray-Deprecation: query-token` warning header. Phase 2: rejected on all paths except `/` (where it is silently converted into a `Set-Cookie` redemption for compatibility with bookmark-style URLs from old versions). Phase 3: removed entirely. | 1, 2, 3 | +| `AUTH_TOKEN` unset → allow all loopback | Phase 1: behavior unchanged + boot warning "AUTH_TOKEN unset; default-allow loopback will require explicit CCXRAY_LOOPBACK_NO_AUTH=1 in vN+1". Phase 2: requires explicit flag. | 1, 2 | +| `/_hub/*` HTTP routes | Phase 1: HTTP routes still serve, but also start the Unix socket. Both work. Phase 2: HTTP routes 410 Gone, with a log line "use Unix socket at ~/.ccxray/hub.sock". | 1, 2 | +| Browser-bookmarked `/?token=` URL | Phase 1: still works (Set-Cookie redemption path, with deprecation warning rendered in the page). Phase 2: still works but documented as legacy. Phase 3: removed; `ccxray open` is the only path. | 1, 2, 3 | + +### 5.2 Phased rollout + +**Phase 1 (additive, no breakage):** + +- Add Unix socket hub IPC + `ccxray open` subcommand. +- Add `/_auth/redeem` endpoint, HMAC cookie minting. +- Add `verifyUpstream` / `verifyDashboard` split, but both accept legacy bearer and query-token forms with deprecation headers. +- Add Host check + Sec-Fetch-Site check in *warn-only* mode. +- All existing setups keep working. 
Logs surface what would break in Phase 2. + +**Phase 2 (enforcement, minor breakage with one-flag opt-out):** + +- Sec-Fetch-Site / Origin / Host checks flip from warn to block. +- `AUTH_TOKEN` unset requires `CCXRAY_LOOPBACK_NO_AUTH=1` for anonymous access. +- HTTP `/_hub/*` routes return 410. +- Legacy `?token=` accepted only on `/`. +- Launcher starts injecting `X-Ccxray-Auth` for spawned children. + +**Phase 3 (cleanup):** + +- Remove legacy bearer-as-`AUTH_TOKEN` path; only `X-Ccxray-Auth` and cookie are recognized. +- Remove `?token=` redemption on `/`. +- Auth surface settles at its final shape (~150 LOC `server/auth.js` + 25 LOC dispatcher + 30 LOC CLI bootstrap). + +### 5.3 Communication + +Each phase ships with a CHANGELOG entry that says exactly: "If you were doing X, now do Y". The README's "Setup" section has one quickstart (`ccxray open`) and one "I'm scripting CI" section (`X-Ccxray-Auth` header with the value of `K_upstream`, obtained by `ccxray secret upstream` — a CLI subcommand that prints `K_upstream` to stdout once, for piping into env files). + +--- + +## 6. Explicitly rejected alternatives + +### 6.1 Pure cookie (cookie required for CLI too) + +**Rejected.** Same reasoning as A — breaks curl ergonomics. B keeps a CLI-friendly header (`X-Ccxray-Auth`) and additionally derives it from `AUTH_TOKEN` via HKDF, so the user still sets one env var. + +### 6.2 Pure bootstrap-injection into `window.__CCXRAY_TOKEN__` + +**Rejected.** Same as A — token in JS-readable memory loses to XSS, requires `fetch`/`EventSource`/`WebSocket` patching. B's inline-script-reads-fragment pattern is bounded: the fragment lives in JS *for milliseconds* before being scrubbed and converted into an HttpOnly cookie. Steady-state JS has no credential at all. + +### 6.3 OAuth / OIDC + +**Rejected.** Same as A — violates "no external IdP" and adds enormous failure surface for negligible benefit. 
+ +### 6.4 mTLS + +**Rejected.** Same as A — requires HTTPS in ccxray and per-client cert config that the SDKs don't support cleanly. + +### 6.5 Server-side session set (A's approach) + +**Rejected for B.** A's `Set` is correct in spirit but loses to hub idle-shutdown (5s) and crash-recovery (clients fork new hub). HMAC-signed stateless cookies derived from `K_session = HKDF(AUTH_TOKEN, "session")` reproduce the same security property without the lifetime contradiction: same `AUTH_TOKEN` → same key → same cookies validate. Rotation by changing `AUTH_TOKEN` is automatic (no shared mutable set to wipe). + +The one feature lost is per-session revocation. We accept this: in a single-secret binary-trust model, the revocation primitive is "rotate the secret", which invalidates everything. Per-cookie revocation would require server-side state, which is what we deliberately don't have. + +### 6.6 Bearer on `Authorization` (A's mechanism) + +**Rejected as the primary credential.** B uses `X-Ccxray-Auth` because: + +- It disambiguates from upstream credentials that also use `Authorization`. A user running curl with both `Authorization: Bearer ` and `x-api-key: sk-...` works, but a user proxying an existing tool that already sets `Authorization` for upstream has a collision. Custom header sidesteps this. +- Custom non-CORS-simple headers force browser preflight, making the upstream domain unreachable from browsers by construction. `Authorization: Bearer` is also non-simple, but `X-Ccxray-Auth` makes the intent visible in code and grep-able in logs. +- B does accept `Authorization: Bearer` on the dashboard domain as a convenience for users with existing curl scripts (no breakage). On the upstream domain, only `X-Ccxray-Auth` — the SDKs that auto-attach `Authorization` headers will not accidentally double-auth. + +### 6.7 Pure `Sec-Fetch-Site` CSRF defense (no Origin fallback) + +**Rejected.** Coverage is ≥ 95% but not 100%. 
Older browsers + non-browser clients (curl, some test harnesses) don't set Sec-Fetch headers. Falling back to Origin/Referer keeps the fallback path coherent. The primary path is Sec-Fetch; the fallback is Origin. + +### 6.8 Synchronizer CSRF token / double-submit cookie + +**Rejected.** Same as A — `SameSite=Strict` + `Sec-Fetch-Site` covers the threat without per-page-render token state. Synchronizer tokens are 2014-era; B is 2026. + +### 6.9 HTTPS-only / Let's-Encrypt cert from ccxray itself + +**Rejected.** Out of scope by constraint. For remote deploy, terminator-in-front is the documented pattern. + +### 6.10 Hub IPC over HTTP with a shared secret (A's approach) + +**Rejected.** Even with the secret file at `0600`, the HTTP listener becomes a privileged surface that has to be defended against same-machine other-UID attackers, against bugs in our path classifier, against header-smuggling proxy intermediaries, etc. Unix domain socket with peer-UID check moves the trust root to the OS kernel, which is the right place. On Windows we accept a small downgrade (named pipe + secret), documented. + +--- + +## 7. Failure modes & operational notes + +### 7.1 Token rotation + +Change `AUTH_TOKEN` (or delete and recreate `~/.ccxray/local-secret`) and restart ccxray. All cookies invalidate (new `K_session` produces different HMACs); `K_upstream` rotates automatically so the next spawned CLI re-derives correctly. Browser users run `ccxray open` again. CLI users update `AUTH_TOKEN`. + +Difference from A: B's cookies survive *restart* (same `AUTH_TOKEN` → same keys → cookies still valid). A's cookies do not (in-memory set wiped). B is strictly more convenient here. + +### 7.2 Cookie clearing + +User clears cookies → next dashboard request 401 → `index.html` script detects absence of cookie, shows "No session. Run `ccxray open` in your terminal." Single-line static message; no thrashing reconnects. + +### 7.3 Multiple browser tabs + +Cookie is per-origin. 
All tabs share the session: open a fifth tab and it inherits the session. Closing tabs doesn't invalidate it. Same as A.
+
+### 7.4 Server restart mid-session
+
+**B handles this correctly where A does not.**
+
+- `AUTH_TOKEN` is the same → keys re-derive identically → existing cookies validate. No interruption.
+- `AUTH_TOKEN` changed → new keys → cookies fail → reauth via `ccxray open`.
+
+The SSE client in `public/app.js` already has reconnect-on-error. We add: on 401 from `/_events`, show banner "Session expired: run `ccxray open`" and stop reconnect attempts (avoid the storm A would experience). Because `EventSource` does not expose the HTTP status to script, detecting the 401 needs a follow-up `fetch` probe against an `/_api/*` endpoint, or an equivalent signal.
+
+### 7.5 Hub mode
+
+Hub holds the secret derivation (same `K_session`, `K_upstream`, `K_bootstrap`) for the life of the process. Cookies issued by hub instance N validate against hub instance N+1 *if `AUTH_TOKEN` is unchanged*.
+
+- Hub idle-shutdown (5s after last client): no impact on cookies; next hub re-derives from `AUTH_TOKEN`.
+- Hub crash-recovery (clients fork new hub): no impact on cookies, same reason.
+- The `pendingBootstraps` map is in-memory and ephemeral; if the hub restarts mid-bootstrap, the user re-runs `ccxray open`. The exposure is bounded by the 60-second TTL and in practice is only the brief interval between minting a token and redeeming it.
+
+This is the structural answer to WEAKNESS-4 and WEAKNESS-5 in A's critique: B's session lifetime is bounded by `AUTH_TOKEN` change, not by hub process lifetime.
+
+### 7.6 Non-local HTTP deployment
+
+Documented operational guide:
+
+> ccxray does not terminate TLS. If you expose it beyond loopback, put it behind a TLS terminator (Tailscale Serve, Caddy, nginx, an SSH `-L` tunnel). Set:
+>
+> - `CCXRAY_FORCE_SECURE_COOKIE=1`: adds `Secure` to the session cookie.
+> - `CCXRAY_PUBLIC_ORIGINS=ccxray.example.com`: adds the public host to the allowlist. Omit the port when the terminator listens on 443, because browsers leave the default port out of the `Host` header and the allowlist matches `Host` exactly.
+> - Configure the terminator to strip incoming `X-Ccxray-Auth` headers from untrusted networks (the header should only originate from trusted CLI clients, not from inbound browser users).
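A minimal sketch of how the two environment variables in the guide above could feed the host allowlist and the cookie attributes. `deploymentSettings` is a hypothetical helper, and the comma-separated format for `CCXRAY_PUBLIC_ORIGINS` is an assumption, since the document does not define `envSplit`.

```javascript
// Hypothetical helper: derive runtime settings from the environment.
function deploymentSettings(env, port) {
  // Assumption: CCXRAY_PUBLIC_ORIGINS is a comma-separated list of host[:port] values.
  const publicHosts = (env.CCXRAY_PUBLIC_ORIGINS || '')
    .split(',')
    .map((s) => s.trim().toLowerCase())
    .filter(Boolean);

  const allowedHosts = new Set([
    `localhost:${port}`, `127.0.0.1:${port}`, `[::1]:${port}`,
    ...publicHosts,
  ]);

  // Base attributes as in §3.2; Secure is added only when explicitly requested.
  const attrs = ['HttpOnly', 'SameSite=Strict', 'Path=/', 'Max-Age=28800'];
  if (env.CCXRAY_FORCE_SECURE_COOKIE === '1') attrs.push('Secure');

  return { allowedHosts, cookieAttributes: attrs.join('; ') };
}

const remote = deploymentSettings(
  { CCXRAY_PUBLIC_ORIGINS: 'ccxray.example.com', CCXRAY_FORCE_SECURE_COOKIE: '1' },
  5577,
);
console.log([...remote.allowedHosts]);
console.log(remote.cookieAttributes);
```

Keeping both decisions in one function makes it harder to add a public host without also turning on the `Secure` flag by review, though nothing in this sketch enforces that pairing.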
+ +Two env vars, one reverse-proxy rule. Same constraint posture as A; B adds the strip-rule recommendation because B's credential is in a header that intermediaries can reasonably filter. + +### 7.7 Shared-host multi-UID footgun + +`AUTH_TOKEN` unset = ephemeral mode (random secret in `~/.ccxray/local-secret`, file mode `0600`). Other UIDs cannot read the secret, cannot bootstrap, cannot reach the dashboard or upstream domain. The `ccxray open` CLI is the privileged interface and it reads the secret via the Unix socket (peer-UID gated), so even a co-tenant running their own `ccxray open` against your hub fails: their UID doesn't match. + +For genuine single-user-laptop anonymous mode: `CCXRAY_LOOPBACK_NO_AUTH=1`, with a startup banner. Explicit opt-in. + +### 7.8 Constant-time comparison + +All comparisons use `crypto.timingSafeEqual` on equal-length buffers obtained from `crypto.createHmac(...).digest()` (fixed 32 bytes). For inputs of unknown length: + +```js +function compareToken(provided, expected) { + // hash both to fixed-width digest; compare digests + const ph = crypto.createHash('sha256').update(provided || '').digest(); + const eh = crypto.createHash('sha256').update(expected || '').digest(); + return crypto.timingSafeEqual(ph, eh); +} +``` + +Both inputs are hashed unconditionally; comparison runs on fixed-length buffers; there is no early return on length mismatch and no "fake hash" hand-wave (A's pattern in WEAKNESS-10). This is the standard pattern. + +### 7.9 Logging + +- Successful auth: no log line. +- Failed auth: one line `{ts, ip, method, path, reason}` to `~/.ccxray/auth.log` with 10 MB rotation. +- `req.url` is logged with `?token=` scrubbed (defensive — Phase 1 still accepts `?token=`). +- `X-Ccxray-Auth` and `X-Ccxray-Bootstrap` headers are explicitly listed in `server/log-sanitize.js` as redacted-by-prefix. Cookie values are redacted as a class. We never log `Authorization` headers. 
+- The auth.log rotation file is created with mode `0600`.
+
+### 7.10 Test coverage matrix
+
+Not a line budget; a coverage matrix. Every cell must have at least one assertion.
+
+| Scenario / Concern | Upstream domain | Dashboard domain | Hub IPC |
+|---|---|---|---|
+| Valid credential accepted | `X-Ccxray-Auth` valid → 200 | Valid cookie → 200; valid `X-Ccxray-Auth` → 200 | Same-UID connect → ok |
+| Invalid credential rejected | Wrong header value → 401 | Wrong cookie HMAC → 401; expired cookie → 401; missing both → 401 | Different-UID connect → reject |
+| No credential | No header → 401 (loopback no-auth disabled) | Missing cookie on `/_api/*` → 401; on `/` → 200 (shell) | No socket connect possible without UID match |
+| `?token=` query (Phase 1) | Accepted + deprecation header | Accepted + deprecation header | n/a |
+| `?token=` query (Phase 2+) | Rejected | Rejected except on `/` (Phase 2) | n/a |
+| CSRF: cross-origin `fetch` with cookie | n/a (impossible by CORS) | `Sec-Fetch-Site: cross-site` → 403 | n/a |
+| CSRF: form POST cross-origin | n/a | `Sec-Fetch-Mode: navigate, Sec-Fetch-Site: cross-site` → 403 | n/a |
+| CSRF: `<img>` cross-origin | n/a | `Sec-Fetch-Dest: image, Sec-Fetch-Site: cross-site` → 403 | n/a |
+| CSRF: old browser w/o Sec-Fetch | n/a | Origin mismatch on POST → 403; missing Origin on POST → 403 | n/a |
+| DNS rebinding (Host: evil.com) | 421 regardless of auth | 421 regardless of auth | n/a |
+| WS upgrade auth | Valid header → 101; invalid → 401 socket close | n/a (no browser WS) | n/a |
+| Token URL leak (bootstrap) | n/a | Fragment never reaches server; `/` GET log has no token | n/a |
+| Server restart | `K_upstream` rederives correctly | Cookie validates after restart (same `AUTH_TOKEN`); invalidates on `AUTH_TOKEN` change | Reconnect after restart |
+| Hub idle shutdown + restart | Cookie survives | Cookie survives | New socket comes up at same path |
+| Hub crash recovery | Cookie survives | Cookie survives | Client reconnects
to new socket | +| Multi-UID localhost | Other UID can't read `~/.ccxray/local-secret`; 401 | Other UID can't get a cookie via redemption (no bootstrap token); 401 | Other UID rejected at socket | +| Constant-time | timing test: 1000 runs of wrong-byte-at-position-0 vs position-31; stddev within tolerance | same | n/a | +| Logging | Failed auth logs; cookies/headers redacted | Failed auth logs; cookies/headers redacted | Failed peer-UID logs | + +Each row is at least one test, and many rows are several tests. Implementation lives in `test/auth-upstream.test.js`, `test/auth-dashboard.test.js`, `test/auth-bootstrap.test.js`, `test/auth-hub-socket.test.js`, `test/auth-csrf.test.js`, `test/auth-rebind.test.js`, `test/auth-timing.test.js`, `test/auth-migration.test.js`. Total: ~50 test cases across 8 files. Coverage is asserted by the matrix above, not by LOC. + +### 7.11 What B does *not* solve + +To be honest about residual risk: + +- **XSS in conversation content** is mitigated for credential theft (HttpOnly + key isolation) but not for "attacker JS makes authenticated calls". The right fix is the existing escape layer, not the auth layer. +- **TLS** is not provided; plain-HTTP eavesdropping on a hostile network is unmitigable in-scope. +- **Per-session revocation** is not provided (stateless cookies trade per-session revocation for hub-lifetime resilience). Rotation by changing `AUTH_TOKEN` is the only revocation primitive. +- **Compromise of `~/.ccxray/local-secret`** (file permissions weakened by user error) compromises everything. The single-secret model has this inherent property. +- **Windows peer-UID gap**: Unix domain socket peer credentials are not available on Windows in Node ≤ 22. The fallback (named pipe + secret file) is functionally equivalent but explicitly documented as a different trust model. + +--- + +## 8. 
Summary scoring against the rubric + +| Rubric criterion | Status | +|---|---| +| Threat 1 (CSRF) | Mitigated structurally: upstream domain unreachable from browsers via CORS-preflight invariant; dashboard via `SameSite=Strict` + `Sec-Fetch-Site` + Origin fallback. **No "Origin check skipped" carve-out anywhere.** | +| Threat 2 (DNS rebind) | Mitigated: universal Host allowlist on both domains, no exemptions. | +| Threat 3 (Token in URL) | Mitigated structurally: token in fragment (`#k=`), never query string, never server-logged, never Referer'd. **Token is one-time, 60s TTL.** | +| Threat 4 (XSS exfil) | Mitigated for credential theft via HttpOnly + key isolation. Residual "active session" accepted as outside auth scope. CSP narrows further. | +| Threat 5 (plain-HTTP) | Accepted by constraint; ops guidance includes `X-Ccxray-Auth` strip at terminator. | +| Threat 6 (WS URL token) | Mitigated: WS upgrade requires `X-Ccxray-Auth` header; browser never opens a WS against ccxray (invariant). | +| First-time setup ≤ 1 step beyond `AUTH_TOKEN` | Yes: `ccxray open` is the one step. (And if `AUTH_TOKEN` is unset, ephemeral mode requires zero env vars.) | +| Browser works across reloads | Yes: cookie persists 8h; survives server restart if `AUTH_TOKEN` unchanged. | +| CLI/curl/CI trivial | Yes: `curl -H 'X-Ccxray-Auth: '`. `K_upstream` retrievable via `ccxray secret upstream` for piping into CI env files. | +| Auth logic ≤ 2 files | Yes: `server/auth.js` (verifyUpstream + verifyDashboard + deriveSecrets) and `server/index.js` (dispatcher). CLI bootstrap is a separate concern in `bin/ccxray`. | +| No per-route bespoke handling | Yes: routing is by path prefix → domain → verifier; no per-route policy switching. | +| No client-side monkey-patching | Yes: cookie attaches automatically; `EventSource`, `fetch`, no patching. The 20-line bootstrap script reads a fragment and POSTs once — not a patch. 
| +| Composes with future features | Yes: adding a new upstream prefix is one `UPSTREAM_PREFIXES` entry. Adding a new auth mode is a new `verifyXxx` function. No central enum to grow. | +| Hub mode preserved | Yes, structurally improved: IPC moves off HTTP entirely. Lockfile-based discovery preserved. | +| Multi-UID localhost defensible | Yes: ephemeral mode (default) restricts access to the owning UID; opt-in flag to relax. **Default is safe.** | + +--- + +## 9. Key structural differences from Candidate A + +For reviewers: + +1. **Two security domains, not one classifier.** Upstream and dashboard are separate enforcement surfaces with separate verifiers. A's single `authGate(kind)` produced WEAKNESS-2 and WEAKNESS-6; B's split eliminates them by construction. +2. **Stateless HMAC cookies, not a server-side `Set`.** Sessions survive hub idle-shutdown and crash-recovery. A's WEAKNESS-4 and WEAKNESS-5 do not apply. +3. **Fragment-based one-time bootstrap, not `?token=` query.** Token never reaches the server in any URL form. A's WEAKNESS-3 does not apply. +4. **Custom header `X-Ccxray-Auth`, not `ANTHROPIC_AUTH_TOKEN`.** No collision with upstream credentials. A's WEAKNESS-7 does not apply. +5. **Unix domain socket for hub IPC.** HTTP listener never serves privileged IPC. The cross-UID HTTP threat surface is eliminated, not defended. +6. **`Sec-Fetch-Site` as primary CSRF gate.** Browser-native, JS-unsettable, with Origin/Referer fallback for non-browser cases. +7. **Default-deny for unset `AUTH_TOKEN`.** Ephemeral mode uses a per-UID file-based secret; cross-UID localhost access requires explicit `CCXRAY_LOOPBACK_NO_AUTH=1`. A's "127.0.0.1 = anonymous OK" footgun does not apply. +8. **Test coverage as a matrix, not a line budget.** 8 test files structured around the threat × domain matrix in §7.10. + +This is a different bet from A, not a patch on A. 
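
To make difference 2 concrete, here is a rough sketch of a stateless cookie that keeps validating across a hub restart. It assumes Node's built-in `crypto.hkdfSync` for key derivation and a `base64url(payload).base64url(hmac)` cookie layout. The salt and info labels, the JSON payload shape, computing the HMAC over the encoded payload, and the function names are illustrative choices, not B's specified wire format.

```js
const crypto = require('node:crypto');

// Derive a 32-byte session key from the root secret.
// Assumption: HKDF-SHA256 with fixed, illustrative salt and info labels.
function deriveSessionKey(authToken) {
  return Buffer.from(
    crypto.hkdfSync('sha256', authToken, 'ccxray', 'session-cookie', 32)
  );
}

// Cookie value = base64url(JSON payload) + "." + base64url(HMAC over the encoded payload).
function signCookie(key, payload) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const mac = crypto.createHmac('sha256', key).update(body).digest('base64url');
  return `${body}.${mac}`;
}

// Returns the payload if the MAC is valid and `exp` (ms since epoch) is in the
// future, otherwise null. No server-side session store is consulted.
function verifyCookie(key, cookie, nowMs = Date.now()) {
  const [body, mac, ...rest] = String(cookie || '').split('.');
  if (!body || !mac || rest.length > 0) return null;

  const expected = crypto.createHmac('sha256', key).update(body).digest();
  const provided = Buffer.from(mac, 'base64url');
  // The digest length is public, so rejecting a wrong-length MAC leaks nothing secret.
  if (provided.length !== expected.length) return null;
  if (!crypto.timingSafeEqual(provided, expected)) return null;

  let payload;
  try {
    payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  } catch {
    return null;
  }
  if (!payload || typeof payload.exp !== 'number' || payload.exp <= nowMs) return null;
  return payload;
}
```

Because the key is a pure function of `AUTH_TOKEN`, a freshly forked hub that calls `deriveSessionKey` with the same token accepts cookies signed by its predecessor, and changing `AUTH_TOKEN` invalidates them all at once.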
+ +--- end of candidate B --- diff --git a/reason/260525-0055-ccxray-auth-design/critique-A.md b/reason/260525-0055-ccxray-auth-design/critique-A.md new file mode 100644 index 0000000..d906236 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/critique-A.md @@ -0,0 +1,23 @@ +# Critique of Candidate A + +WEAKNESS-1 [FATAL]: "WebSocket upgrade: Accepts ONLY `Authorization: Bearer`, never cookie, never `?token=`. Codex sets this when launched via ccxray's launcher." — Browsers cannot set arbitrary headers on WebSocket handshakes via the `WebSocket` constructor. The dashboard's own browser-side code therefore cannot authenticate any WS upgrade under this rule. The candidate hand-waves "Codex sets this" but says nothing about how the dashboard UI itself opens WS connections (e.g., for live streaming features), and the threat model explicitly distinguishes "browser" from "CLI." Either the dashboard has no WS (in which case why specify this gate at all?) or the rule is unimplementable for the dashboard — the design never resolves which. + +WEAKNESS-2 [FATAL]: "CLI/curl: Bearer authenticated. Exempt from Origin/Host CSRF check because bearer cannot be sent by a browser victim cross-origin without explicit JS that must already have the bearer (OWASP custom header pattern)." — This is wrong about the bearer header and wrong about the Host check. (a) The Host header IS sent by browsers and a DNS-rebind attacker controls it; exempting bearer-auth requests from the Host check leaves an open hole for any victim who has the bearer cached in a browser extension, password manager autofill, or a service worker — and more critically, exempts the very requests Codex/Claude Code send through localhost from rebind protection. 
(b) Saying "Host check enforced" for upstream-proxy in §2 directly contradicts "Origin check skipped for upstream-proxy" combined with the table in §4 claiming Host allowlist defends rebind — the candidate states both that Host is checked and that bearer requests are exempt from CSRF checks (which the §3 text bundles together as "for every cookie-auth'd request AND every non-GET regardless of auth method"). Which is it? + +WEAKNESS-3 [FATAL]: "User opens `http://localhost:5577/?token=`. Server 302s to `/_auth?token=&next=/`." — The token is now leaked into the server access log, the browser history, the Referer header of any resource the redirect target loads (Referrer-Policy on `/_auth` is too late — the leak happens on the inbound `/` request, and the redirect chain itself is logged), and any process listing observing the URL bar. The candidate claims "Token in URL exactly once at `/_auth`" but in fact the token appears at `/?token=`, then `/_auth?token=`, i.e., twice, and the first hop has no Referrer-Policy control because it's the entry point. The mitigation row #3 is materially false. + +WEAKNESS-4 [MAJOR]: "stores `sha256(S)` in in-memory `Set` with expiry now+8h" — Storing in a `Set` has no expiry mechanism; Sets don't evict. The text says "with expiry now+8h" but never describes the sweep. With an 8h window and unbounded entries (every reload of an expired bookmark redeems a new session), the Set grows monotonically until restart. Worse, "Server restart: in-memory sessions wiped" means every hub idle-shutdown (5s per the project notes) invalidates all browser sessions — this directly contradicts the smooth-UX claim and makes the cookie approach near-useless in hub mode where idle shutdown is the norm. + +WEAKNESS-5 [MAJOR]: "Hub mode: hub holds canonical session set, clients proxy through hub. New hub on crash recovery generates fresh ipcSecret and fresh session set." 
— This is incompatible with the project's documented hub model where `~/.ccxray/hub.json` is the discovery file and clients connect directly to the hub port. If the hub holds the session set, then a crash-recovery fork (per the project notes: clients monitor pid every 5s and fork a new hub) means every browser session dies every time any client restarts the hub. Combined with idle shutdown after 5s, the cookie has an effective lifetime measured in seconds, not 8 hours. The "Max-Age=28800" is theater. + +WEAKNESS-6 [MAJOR]: "SameSite=Strict + Origin check on non-GET (defense in depth)" + "Origin check skipped for upstream-proxy" — The upstream-proxy path is exactly where state-changing requests to Anthropic go (POST /v1/messages, etc.). If the classifier routes a request to `upstream-proxy` based on path, an attacker who controls a victim's browser and knows the bearer (or can ride a cookie session, since the unified middleware "accepts either the bearer header or the cookie") can POST to `/v1/messages` cross-origin with no Origin check. The candidate's own unified middleware undermines the CSRF claim: cookie auth + upstream-proxy classification + Origin skip = CSRF-able expensive API calls billed to the user. + +WEAKNESS-7 [MAJOR]: "The CLI launcher injects `ANTHROPIC_AUTH_TOKEN=` as a request header via the launcher's env." — `ANTHROPIC_AUTH_TOKEN` is the Anthropic API credential variable; conflating it with ccxray's `AUTH_TOKEN` means either (a) the proxy's shared secret is now identical to the user's Anthropic key, which is a credential-coupling disaster, or (b) the launcher is overwriting the user's real Anthropic token with ccxray's shared secret, breaking upstream auth. The candidate does not say which. 
Codex's `-c request_headers` claim is similarly hand-waved — no actual header name is specified for the bearer injection into Codex's HTTP layer, and Codex's WS transport (per the project notes, the main session uses WS on `/v1/responses`) cannot accept arbitrary headers from `-c` config in the way implied. + +WEAKNESS-8 [MAJOR]: "treats absence of `AUTH_TOKEN` as '127.0.0.1-only, anonymous OK'" combined with "`AUTH_TOKEN` unset + bound to 0.0.0.0 now requires explicit `CCXRAY_ALLOW_ANONYMOUS_REMOTE=1` opt-in." — Any local process on a multi-user machine (shared dev box, CI runner, devcontainer with other tenants) can hit 127.0.0.1:5577 and exfiltrate every recorded request/response, including the user's Anthropic key in captured headers. "127.0.0.1-only" is not a security boundary on shared hosts, and the candidate elsewhere acknowledges XSS-in-conversation as a vector — meaning conversation content can include hostile data — yet the default posture exposes everything to any local UID with zero auth. + +WEAKNESS-9 [MINOR]: "tests: 3 new files (auth-cookie, auth-csrf, auth-rebind), <80 lines each" — 240 lines of tests for a security middleware that handles bearer, cookie, Origin, Host, WS upgrade, hub IPC, anonymous mode, constant-time comparison, session expiry, and rotation is wildly under-scoped. The candidate offers a line budget instead of a coverage matrix; this is a tell that the design hasn't been mentally executed against its own threat table. + +WEAKNESS-10 [MINOR]: "mismatched length runs fake hash before short-circuit" — Performing a "fake hash" before short-circuit does not equalize timing meaningfully unless the fake work is calibrated to the real work's distribution. `timingSafeEqual` requires equal-length buffers; the actual safe pattern is to hash both sides to a fixed-width digest and compare those. The candidate's phrasing reveals a misunderstanding of where the timing channel lives. 
+ +VERDICT: The design's CSRF/rebind defense collapses because the unified middleware exempts bearer/upstream-proxy from Origin checks while accepting cookies on the same paths, and the cookie session lifetime is contradicted by the project's hub idle-shutdown and crash-recovery behavior — making both the threat-model claims and the UX claims false. diff --git a/reason/260525-0055-ccxray-auth-design/errata.md b/reason/260525-0055-ccxray-auth-design/errata.md new file mode 100644 index 0000000..e5e0101 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/errata.md @@ -0,0 +1,141 @@ +# Errata — review findings against candidate-AB.md + +**Reviewer:** Codex CLI v0.133.0-alpha.1 (gpt-5.5), independent pass on this PR. +**Date:** 2026-05-25. + +This file records concrete corrections to `candidate-AB.md` that surfaced during external review and empirical verification. The original `candidate-AB.md` is preserved as the historical record of the reason loop's winning synthesis; the deviations below are what the implementation will actually ship. + +--- + +## 1. Blocking corrections + +### 1.1 HttpOnly cookie + JS `document.cookie` check are mutually exclusive + +`candidate-AB.md` §2.3 Flow B (around L132) and §3.2 (around L261) describe: + +- `Set-Cookie: ccxray_s=...; HttpOnly; SameSite=Strict; Path=/; Max-Age=28800` +- Inline bootstrap script: `else if (!document.cookie.includes('ccxray_s=')) { ... }` + +These are incompatible. `HttpOnly` cookies are intentionally invisible to JavaScript; `document.cookie` will never contain `ccxray_s=`, so the "no session" branch fires regardless of whether a valid cookie exists. + +**Implementation deviation.** Replace the `document.cookie.includes(...)` probe with a server-side auth-status endpoint: + +```js +const status = await fetch('/_auth/status', { credentials: 'same-origin' }); +if (status.status === 401) document.body.textContent = 'No session. 
Run `ccxray open`.'; +``` + +The cookie remains `HttpOnly` (XSS-in-conversation defense preserved). Cost: one extra GET on cold load, no server-side state change. The new endpoint costs ~10 LOC and lives in `server/routes/auth.js`. + +Affected commits: **1.3** (bootstrap flow), **3.1** (final cleanup). + +### 1.2 `net.Socket._handle.getpeereid` is not a public Node API + +`candidate-AB.md` §3.7 (around L394) claims Node ≥ 18 on Linux/macOS exposes `socket._handle.getpeereid`. Verified locally on Node v22.22.2/darwin — the method does not exist on `pipe_wrap.Pipe.prototype` nor on the `net` public API. There is no public Node interface to read `SO_PEERCRED`/`getpeereid(2)` without a native addon. + +**Implementation deviation.** Defend peer identity at the **filesystem layer**, not the Node API: + +- `~/.ccxray/` mode `0700` (already in plan). +- `~/.ccxray/hub.sock` mode `0600`. +- Other UIDs receive `EACCES` from `connect(2)` at the kernel; the connection never reaches Node code. + +The peer-UID claim downgrades from "primary gate" to "belt-and-suspenders we cannot ship without a native addon (out of scope per zero-new-deps constraint)." The threat model is preserved because filesystem permissions are the actual access control on local Unix sockets — peer-credential checks would only catch a same-UID attacker, which is outside this design's scope. + +Affected commit: **2.3** (Unix socket hub IPC). + +### 1.3 Codex CLI key name is `http_headers`, not `request_headers` + +`candidate-AB.md` §3.5 (around L366) writes: + +> `-c request_headers='X-Ccxray-Auth='` via the codex per-request-header config. Verified to propagate to the WebSocket upgrade HTTP request. 
+
+Empirical verification against Codex v0.133.0-alpha.1:
+
+```
+$ codex exec --strict-config -c 'request_headers={X-Ccxray-Auth="test"}' "x"
+Error loading config.toml: unknown configuration field `request_headers` in -c/--config override
+```
+
+Codex does support header injection, but through `model_providers..http_headers` (plus `env_http_headers` for env-derived headers) and a top-level `model_provider = ""` selecting which provider applies. Verified by spy-server test: `X-Ccxray-Auth: test-value-123` did appear on the outbound HTTP request.
+
+**Implementation deviation.** ccxray's Codex launcher (Phase 1.4) should construct overrides like:
+
+```sh
+codex \
+  -c 'model_providers.ccxray={name="ccxray", base_url="http://localhost:5577/v1", wire_api="responses", http_headers={"X-Ccxray-Auth"=""}}' \
+  -c 'model_provider="ccxray"' \
+  ...args
+```
+
+**Spike resolved (2026-05-25).** Codex v0.133.0-alpha.1 hard-rejects any attempt to override builtin providers:
+
+```
+$ codex exec -c 'model_providers.openai.http_headers={"X-Test"="v"}' ...
+Error: model_providers contains reserved built-in provider IDs: `openai`.
+Built-in providers cannot be overridden. Rename your custom provider
+(for example, `openai-custom`).
+```
+
+Probed all plausible shortcut keys (`openai_http_headers`, `chatgpt_http_headers`, `default_http_headers`, etc.); none exist. Only `chatgpt_base_url` and `openai_base_url` are top-level builtin shortcuts; there is no header equivalent.
+
+**Decision: spawn-time auth-mode detection in `server/providers.js`.**
+
+ccxray's Codex launcher detects which Codex auth path the user is on and chooses:
+
+- **API-key Codex** (env `OPENAI_API_KEY` set): inject `-c 'model_providers.ccxray={…, http_headers={"X-Ccxray-Auth"="…"}}'` + `-c 'model_provider="ccxray"'`. Header enforcement active end-to-end.
+- **ChatGPT-OAuth Codex** (no `OPENAI_API_KEY`): skip the `model_provider` override entirely (would break OAuth login).
Continue with `-c 'openai_base_url=…'` and `-c 'chatgpt_base_url=…'` as today. **The upstream domain's `X-Ccxray-Auth` requirement does not apply to this path** — ccxray's upstream verifier additionally accepts loopback-unauth requests that look like Codex-on-ChatGPT (presence of `chatgpt-account-id` + JWT-shaped `Authorization`). + +Why this is acceptable for the threat model: + +- The only attacker class who can reach this path is a same-machine process on a different UID. Same-UID attackers are already trusted (they can read `~/.ccxray/local-secret`). +- Such an attacker cannot exfiltrate credentials: they would have to supply their own `Authorization` + `chatgpt-account-id`, which they cannot forge without having already compromised the ccxray user's ChatGPT session (a same-UID attack). +- Residual risk reduces to **cost amplification / log pollution by other-UID local attackers**, not credential theft. The mitigation primitive is to bind ccxray to a Unix socket (out of scope of this migration; deferred as a future hardening item) or run ccxray under a dedicated UID. + +Affected commits: +- **1.4** (warn-only launcher injection): detect API-key vs ChatGPT-OAuth mode and inject accordingly. WS gate accepts loopback-unauth for ChatGPT-shaped requests with a warn log. +- **2.1** (enforcement): block bearer + `?token=` on `/v1/*`; **except** retain loopback-unauth allowance for the ChatGPT-OAuth Codex path, gated by a single helper `isLoopbackChatGPTCodex(req)` that checks (a) `req.socket.localAddress` is loopback, (b) `Authorization` is JWT-shaped, (c) `chatgpt-account-id` header present. + +--- + +## 2. Non-blocking corrections + +### 2.1 Threat-table residual risks overstated + +`candidate-AB.md` §4 (around L429–L434) marks the residual risk for threats 1, 2, and 6 as "None." That phrasing is stronger than the architecture warrants: "no residual risk if Host allowlist, CORS denial, and header injection are implemented as specified" is more accurate. 
The mitigations are structural and high-quality, but calling them "None" reads as a guarantee, not a defense. + +**Implementation deviation.** None at the code level. README + commit messages use the more careful phrasing. + +### 2.2 Cookie name inconsistency between `overview.md` and `candidate-AB.md` + +- `overview.md` L19: `ccxray_session=.` +- `candidate-AB.md` L231: `ccxray_s = base64url(payload) "." base64url(hmac)` + +**Implementation deviation.** Pick **`ccxray_s`** (shorter, lower per-request header overhead; matches the more detailed spec). Update `overview.md` to match — single-line fix. + +### 2.3 Phase 1 "no breakage" claim relies on launcher injection working + +If Phase 1.4 launcher injection fails (e.g. user's Codex version doesn't take `model_providers.X.http_headers`), Phase 1 verifyUpstream still accepts legacy bearer/no-credential so spawned CLIs remain unbroken. The "no breakage" property holds during Phase 1 specifically because enforcement is deferred to Phase 2. This is correct in the design but worth restating in commit messages. + +**Implementation deviation.** None. + +--- + +## 3. Out-of-scope finding worth filing separately + +During the empirical Codex test, spawning Codex with `model_provider="ccxray"` and a `base_url` pointing at a localhost spy server caused Codex to attach the user's **ChatGPT OAuth JWT** (full bearer + `chatgpt-account-id`) to the outbound request, despite the provider config not naming `chatgpt`. + +This is a credential-leak surface in Codex itself, not in ccxray. It is unrelated to this migration but should be filed upstream and is a reminder that ccxray's own logging must always redact `Authorization` and `chatgpt-account-id` headers (already in plan). + +--- + +## 4. 
What changes in the implementation plan + +| Commit | Original plan | Revised plan | +|---|---|---| +| 1.3 | Inline `document.cookie.includes('ccxray_s=')` probe | `GET /_auth/status` probe (+10 LOC, new endpoint in `server/routes/auth.js`) | +| 1.4 | `-c request_headers='X-Ccxray-Auth=…'` for Codex | API-key Codex: `-c 'model_providers.ccxray={…, http_headers={…}}'` + `-c 'model_provider="ccxray"'`. ChatGPT-OAuth Codex: skip `model_provider` override; upstream verifier accepts loopback-unauth for that path. Spike completed; no further investigation needed before Commit 1.4 | +| 2.3 | peer-UID via `socket._handle.getpeereid` | Filesystem mode `0600` socket + `0700` parent dir as the real gate; no Node-API peer-credential check | +| Docs | Threat table claims "None" residuals | Use "low residual risk if X is implemented as specified" | +| Docs | `ccxray_session` (in `overview.md`) | Standardize on `ccxray_s` | + +No commit is added or removed. The total surface remains 8 commits across Phases 1–3. diff --git a/reason/260525-0055-ccxray-auth-design/judge-transcripts.md b/reason/260525-0055-ccxray-auth-design/judge-transcripts.md new file mode 100644 index 0000000..ea18574 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/judge-transcripts.md @@ -0,0 +1,62 @@ +# Round 1 — Judge Transcripts + +Label mapping (revealed post-vote): +- X = candidate-AB.md (synthesis) +- Y = candidate-A.md +- Z = candidate-B.md + +Vote tally: **X = 3, Y = 0, Z = 0** (unanimous for synthesis). + +--- + +## Judge 1 — Security architect + +WINNER: X + +SCORES (Security / Convenience / Maintainability, 1–10): +- X: S=9 C=9 M=9 +- Y: S=6 C=6 M=7 +- Z: S=8 C=7 M=8 + +RATIONALE: X wins because it adopts the structural improvements that close the real holes in Y while preserving the back-compat that Z sacrifices. 
On Security, X gets a universal Host allowlist with no carve-out, an upstream domain that browsers cannot reach by construction (`X-Ccxray-Auth` is non-CORS-simple so preflight is required and never granted), stateless HMAC cookies that survive hub recycle, fragment-based one-time bootstrap that never reaches the server, and peer-UID-gated Unix socket IPC — all six threats are mitigated structurally rather than by policy branches. On Convenience, X is the only candidate that keeps `Authorization: Bearer ` permanently valid on the dashboard domain (CI/CLI back-compat is load-bearing here), while still offering `ccxray open` as the one-step browser bootstrap. On Maintainability, X has a clean two-domain dispatcher with no `kind` enum to grow, two verifiers each ~50 LOC, and a crisp three-phase migration with explicit warn-only intermediate. + +Y's strongest weakness is its in-memory `Set` session store combined with the documented 5s hub idle-shutdown: every hub recycle invalidates every browser session, forcing constant re-redemption — the design's stated convenience claim ("cookie persists 8h") contradicts the system's actual lifetime. Y also routes the upstream policy through a single `authGate(kind)` that must remember to skip Origin for `upstream-proxy`, which is the kind of carve-out that re-breaks on the next refactor, and it puts the bootstrap token in the query string (Referer/log/bookmark leak surface). + +Z's strongest weakness is breaking the explicit constraint that `curl -H 'Authorization: Bearer X'` keeps working trivially — Z requires CI scripts to migrate to `X-Ccxray-Auth` with `K_upstream` (an HKDF-derived value users must extract via a new `ccxray secret upstream` subcommand) and only grants `Authorization: Bearer` as a one-deprecation-cycle alias. 
+ +--- + +## Judge 2 — Threat-model auditor + +WINNER: X + +THREAT WALKTHROUGH: +- T1 CSRF: X passes (structural CORS preflight on upstream + SameSite/Sec-Fetch on dashboard), Y passes-but-fragile (skip-Origin carve-out one refactor away from regression), Z passes (same structural approach as X) +- T2 DNS rebind: X passes (universal Host allowlist, no carve-out), Y passes (Host allowlist + cookie without Domain), Z passes (universal Host allowlist) +- T3 Token-in-URL: X passes (fragment-only, never server-visible, single-use 60s), Y partial (token still hits server on `/` and `/_auth` GETs; bookmark residual acknowledged), Z passes (fragment-only, never server-visible) +- T4 XSS: X passes (HttpOnly + key isolation + CSP), Y passes-with-residual (HttpOnly only; AUTH_TOKEN used directly), Z passes (HttpOnly + key isolation + CSP) +- T5 Plain-HTTP: X / Y / Z all accepted-by-constraint with Secure flag guidance +- T6 WS leak: X passes (custom-header-only upgrade + no-browser-WS invariant), Y passes (bearer-only on upgrade), Z passes (custom-header-only + invariant) + +RATIONALE: X and Z share the load-bearing structural choices; Y is materially weaker on three of six threats. On T1, Y's "skip Origin check for upstream-proxy" is functionally correct today but is a single-bug-away regression because policy is multiplexed through one polymorphic `authGate(kind)` — X/Z eliminate the branch by construction via CORS preflight on a non-simple custom header. On T3, Y's "?token= used exactly once at /_auth" still puts the secret in `req.url`, access logs, and the bookmarked address bar; X/Z move it to the fragment, which never reaches the server, plus enforce single-use 60s TTL — the leak channel is closed, not narrowed. 
Y also under-states its operational cost: A's in-memory `Set` session set is wiped on hub idle-shutdown (5s after last client) and crash-recovery, forcing constant browser re-redemption, while X/Z's stateless HMAC cookies validate identically across hub recycles as long as AUTH_TOKEN is unchanged. + +Between X and Z, both pass all six threats with equivalent structural rigor, but X keeps `Authorization: Bearer ` permanently valid on the dashboard domain — which directly satisfies the "CLI/scripts MUST still work" constraint without forcing users to fetch a derived `K_upstream` via a new subcommand — while still using `X-Ccxray-Auth` exclusively on the upstream domain to preserve CORS-by-construction CSRF prevention. + +--- + +## Judge 3 — Operational reviewer + +WINNER: X + +OPERATIONAL WALKTHROUGH: +- Hub idle/crash recovery: X=survives, Y=breaks (in-memory session set wiped every 5s idle), Z=survives +- Browser session across reload: X=survives, Y=breaks on server restart (sessions wiped), Z=survives +- Token rotation: X=clean (rotate AUTH_TOKEN, HKDF re-derives), Y=blunt but works (set wipe), Z=clean +- Multi-UID localhost: X=safe (ephemeral 0600 secret + peer-UID socket), Y=footgun (127.0.0.1 anonymous OK by default), Z=safe +- Remote behind TLS: X=works, Y=works, Z=works +- WS auth from browser: X=structurally N/A (invariant: no browser WS), Y=bearer-only but no invariant, Z=structurally N/A +- Files actually touched (auth surface): X≈9, Y≈4–5, Z≈8 + +RATIONALE: X wins because it survives the two operational realities that define ccxray's runtime — the hub's 5-second idle-shutdown and crash-fork recovery — without invalidating browser sessions, while keeping the curl ergonomics every CI script depends on. Y's in-memory `Set` is fundamentally incompatible with the hub lifecycle described in the brief: every idle window wipes the set, so the user's browser dashboard 401s constantly during normal use. 
Y also leaves the "AUTH_TOKEN unset = 127.0.0.1 anonymous" default in place, which is a silent multi-UID localhost disaster on any shared dev host. Z fixes all of Y's operational gaps but pays an ergonomic tax: it drops `Authorization: Bearer ` as the CLI credential in favor of `X-Ccxray-Auth: ` that users must retrieve via `ccxray secret upstream`, breaking every existing curl script. + +X takes Z's structural wins and explicitly keeps `Authorization: Bearer ` permanently valid on the dashboard domain. The convenience claim ("set AUTH_TOKEN + open one URL") is honest in X because the URL is a one-time fragment that auto-converts to a long-lived cookie surviving hub recycle. The footgun in X is the larger implementation surface (~9 files touched) but each piece composes cleanly and the auth-policy core still lives in 2 files. X is the only design where the convenience promise survives contact with the hub's actual idle/crash behavior. diff --git a/reason/260525-0055-ccxray-auth-design/lineage.md b/reason/260525-0055-ccxray-auth-design/lineage.md new file mode 100644 index 0000000..6fe2d41 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/lineage.md @@ -0,0 +1,27 @@ +# Reason loop lineage — ccxray auth design + +| Phase | Artifact | Cold-start context | Outcome | +|---|---|---|---| +| 1 Setup | `task.md` | — | Brief written: convergent, 3 judges, iterations cap 3, 6-threat model | +| 2 Generate-A | `candidate-A.md` (420 lines) | task brief only | Stance: bearer-for-machines + HttpOnly cookie via `/_auth?token=` + Origin/Host pinning | +| 3 Critic | `critique-A.md` | candidate A only | 4 FATAL, 4 MAJOR, 2 MINOR; verdict: CSRF/rebind collapses on upstream-proxy + cookie lifetime contradicted by hub idle | +| 4 Generate-B | `candidate-B.md` (703 lines) | task + A + critique | Departs structurally: path-segregated domains, stateless HMAC cookies via HKDF, fragment bootstrap, Unix-socket hub IPC, `X-Ccxray-Auth` for upstream | +| 5 Synthesize-AB | 
`candidate-AB.md` (658 lines) | task + A + B (no critique) | Takes B's structural separation + HKDF cookies + fragment bootstrap; keeps A's permanent `Authorization: Bearer` on dashboard for CLI back-compat; explicit two-domain asymmetric resolution | +| 6 Judge (blind, X=AB Y=A Z=B) | `judge-transcripts.md` | task + 3 candidates, no provenance | 3–0 for X (AB synthesis). All judges flagged Y's in-memory session set + hub idle incompatibility; all flagged Z's CLI ergonomics regression as a real but lesser cost. | + +## Convergence status + +- Iterations cap: 3 +- Convergence rule: 3 consecutive wins for same approach +- Rounds completed: 1 +- Round 1 winner: AB (synthesis), unanimous (3–0) +- Convergence: **not yet reached** — only 1 round; needs 3 consecutive synthesis wins + +## Recommendation + +The Round 1 sweep is a strong signal, but the convergence rule is not satisfied. Two paths forward: + +1. **Continue to Round 2:** Treat AB as incumbent, run Critic-on-AB → Generate-C → Synthesize-AB+C → blind judges. If AB-line wins again, continue Round 3 for full convergence. +2. **Stop here and implement AB:** The 3–0 sweep with concrete, non-correlated rationales from three different judge personas (architect / threat auditor / ops) is materially stronger evidence than a contested vote across more rounds would be. AB's design space is also already well-explored — A and B span the two natural extremes (in-memory session vs stateless HMAC; query-string vs fragment bootstrap; single-classifier vs path-segregated domains). Round 2 risks chasing diminishing returns. + +User decision pending. diff --git a/reason/260525-0055-ccxray-auth-design/overview.md b/reason/260525-0055-ccxray-auth-design/overview.md new file mode 100644 index 0000000..09827b5 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/overview.md @@ -0,0 +1,41 @@ +# ccxray auth scheme — reason loop overview + +**Status:** Round 1 complete. Synthesis (AB) won unanimously 3–0. 
+ +## Files in this directory + +| File | What it is | +|---|---| +| `task.md` | Original task brief: system context, 5 known problems, 6 threats, constraints, rubric, deliverable shape | +| `candidate-A.md` | Round 1 Generate-A. Stance: bearer for machines + HttpOnly cookie via one-shot `?token=` redemption + Origin/Host pinning. Single `authGate(kind)` middleware. | +| `critique-A.md` | Adversarial critique of A. Found 4 FATAL, 4 MAJOR, 2 MINOR. | +| `candidate-B.md` | Round 1 Generate-B. Departs structurally from A: path-segregated domains (upstream vs dashboard), stateless HMAC cookies via HKDF(AUTH_TOKEN), fragment bootstrap `#k=…`, Unix-socket hub IPC. | +| `candidate-AB.md` | Round 1 Synthesis. **Winner.** Takes B's structural separation + stateless HMAC + fragment bootstrap, keeps A's permanent `Authorization: Bearer` on the dashboard domain so existing CLI/CI scripts keep working. | +| `judge-transcripts.md` | Three blind judges' verdicts (architect / threat auditor / ops reviewer). 3–0 for AB. | +| `lineage.md` | Phase-by-phase chronicle of the reason loop, convergence status, and what to do next. | + +## Winning stance (one paragraph) + +> **Note:** This summary reflects the implementation deviations recorded in [`errata.md`](errata.md), not the original `candidate-AB.md` verbatim. Where this summary and `candidate-AB.md` disagree, `errata.md` is authoritative. + +Split the auth surface into two domains with one shared root secret. On the **upstream domain** (`/v1/*` plus the WS upgrades to `/v1/responses` and `/v1/realtime`), accept `X-Ccxray-Auth: ` — a custom header that browsers cannot send cross-origin without a CORS preflight, which is never granted, so browser-initiated CSRF on the expensive upstream proxy is structurally impossible. 
(One carve-out: ChatGPT-OAuth Codex traffic cannot carry the header because builtin Codex providers cannot be overridden; for that path the upstream verifier accepts loopback-origin requests matching the ChatGPT-Codex header signature — see errata §1.3.) On the **dashboard domain** (`/`, `/_api/*`, `/_events`), accept any of three credentials in unified order — (1) `Authorization: Bearer ` (preserved permanently for `curl` and CI), (2) stateless HMAC-signed session cookie `ccxray_s=.` derived via labeled HKDF from `AUTH_TOKEN` so sessions survive hub idle-shutdown and crash-recovery without any server-side session set, or (3) `X-Ccxray-Auth: ` for advanced use. Cookies carry `HttpOnly; SameSite=Strict; Path=/; Max-Age=28800` (no `Domain`); state-changing requests additionally pass a primary `Sec-Fetch-Site` check with `Origin` allowlist fallback, and every request passes a universal `Host` allowlist (no carve-outs). Bootstrap is a **single `ccxray open` command** that prints `http://localhost:5577/#k=`; the URL fragment never reaches the server, never enters access logs, never appears in `Referer`, and is scrubbed via `history.replaceState` immediately after `POST /_auth/redeem`. Hub IPC moves off HTTP entirely onto a Unix domain socket (`~/.ccxray/hub.sock` mode `0600`, parent dir mode `0700`); the access gate is filesystem permissions (kernel returns `EACCES` to other UIDs at `connect(2)`) rather than `SO_PEERCRED`, which is not exposed by Node's public API — see errata §1.2. When `AUTH_TOKEN` is unset, ephemeral mode auto-generates a per-process secret stored at mode `0600` and refuses all non-loopback requests by default; "anonymous OK on 127.0.0.1" is removed because same machine ≠ same user. 
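
The stateless cookie described above can be illustrated with a minimal sketch using only Node's built-in `crypto` module. This is not the implementation from `candidate-AB.md`: the HKDF label `ccxray/session-cookie/v1`, the `<expiry>.<mac>` cookie layout, and the function names `deriveSessionKey`, `mintCookieValue`, and `verifyCookieValue` are all assumptions made for illustration. The point it shows is that verification needs nothing but `AUTH_TOKEN`, so a hub restart does not invalidate a browser session.

```javascript
const crypto = require('node:crypto');

// Assumed values for illustration only; the real label and layout may differ.
const SESSION_LABEL = 'ccxray/session-cookie/v1';
const MAX_AGE_SECONDS = 28800; // matches the Max-Age quoted in the summary

// Derive a purpose-specific key from the root secret via labeled HKDF-SHA256.
function deriveSessionKey(authToken) {
  const okm = crypto.hkdfSync(
    'sha256',
    Buffer.from(authToken, 'utf8'),
    Buffer.alloc(0), // empty salt; the label in `info` separates key purposes
    SESSION_LABEL,
    32
  );
  return Buffer.from(okm);
}

// HMAC-SHA256 over the payload, keyed with the derived session key.
function macFor(authToken, payload) {
  return crypto
    .createHmac('sha256', deriveSessionKey(authToken))
    .update(payload)
    .digest();
}

// Cookie value is "<expiry-unix-seconds>.<base64url mac>" (assumed layout).
function mintCookieValue(authToken, nowMs = Date.now()) {
  const payload = String(Math.floor(nowMs / 1000) + MAX_AGE_SECONDS);
  return `${payload}.${macFor(authToken, payload).toString('base64url')}`;
}

// Stateless verification: recompute the MAC, compare in constant time,
// then check expiry. No server-side session set is consulted.
function verifyCookieValue(authToken, value, nowMs = Date.now()) {
  const dot = value.indexOf('.');
  if (dot <= 0) return false;
  const payload = value.slice(0, dot);
  if (!/^\d+$/.test(payload)) return false;
  const given = Buffer.from(value.slice(dot + 1), 'base64url');
  const expected = macFor(authToken, payload);
  if (given.length !== expected.length) return false;
  if (!crypto.timingSafeEqual(given, expected)) return false;
  return Number(payload) > Math.floor(nowMs / 1000);
}
```

Under these assumptions, rotating `AUTH_TOKEN` changes the derived key, so every outstanding cookie fails verification without any explicit revocation step. The trade-off is that an individual session cannot be revoked short of rotating the root secret.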
+ +## Threat coverage (winning design) + +| # | Threat | Mitigation | +|---|---|---| +| 1 | Malicious-site CSRF | Upstream domain: CORS preflight never granted (custom header is non-simple); ChatGPT-Codex loopback-unauth carve-out (errata §1.3) is still safe because browsers cannot forge the `chatgpt-account-id` + JWT-shaped `Authorization` signature without same-UID compromise. Dashboard: `SameSite=Strict` cookie + primary `Sec-Fetch-Site` check + `Origin` allowlist fallback. | +| 2 | DNS rebinding | Universal `Host` allowlist on both domains (no exemption). Cookie has no `Domain` attribute (bound to exact host). | +| 3 | Token URL exfil | Bootstrap token lives only in the URL fragment — never reaches the server. One-time use, 60s TTL, single-redemption via POST. Permanent CLI bearer was never in a URL to begin with. | +| 4 | XSS-in-conversation | Session cookie `HttpOnly` — JS cannot read it. `K_upstream` never reaches browser context. CSP `default-src 'self'; connect-src 'self'`. Residual: in-page authenticated `fetch` from XSS is accepted as an XSS bug, not an auth bug. | +| 5 | Plain-HTTP eavesdrop | Out of scope by stated constraint. Ops doc: terminate TLS in front (Caddy / Tailscale Serve / SSH `-L`); set `CCXRAY_FORCE_SECURE_COOKIE=1`; add hostname to `CCXRAY_PUBLIC_ORIGINS`; terminator strips `X-Ccxray-Auth` before logging. | +| 6 | WS upgrade URL leak | Upgrade handler accepts `X-Ccxray-Auth` header (and the ChatGPT-Codex loopback-unauth carve-out from row 1 for the WS path); query token rejected with 401 at upgrade. Invariant: dashboard browser never opens a WS to ccxray (all live updates use SSE on `/_events`). | + +## What's next + +Two paths — your call: + +- **Implement now.** AB swept 3–0 with non-correlated rationales from three different personas. Round 2 is unlikely to dethrone it; the design space's two natural extremes (A and B) are already explored. +- **Run Round 2** for full 3-round convergence (iterations cap = 3). 
Treats AB as incumbent, generates a fresh challenger, synthesizes again. + +See `lineage.md` for full reasoning. diff --git a/reason/260525-0055-ccxray-auth-design/task.md b/reason/260525-0055-ccxray-auth-design/task.md new file mode 100644 index 0000000..0455575 --- /dev/null +++ b/reason/260525-0055-ccxray-auth-design/task.md @@ -0,0 +1,81 @@ +# Task Brief — ccxray auth scheme design + +**Domain:** security (with software overlap) +**Mode:** convergent +**Judges:** 3 (security-leaning) +**Convergence:** 3 consecutive wins +**Iterations cap:** 3 (initial; may extend on user approval) + +--- + +## Task + +**Design an authentication scheme for `ccxray` that balances security, convenience, and maintainability — given that it is a local-first developer tool with optional remote deployment.** + +### Target system (concrete context) + +`ccxray` is a single-process Node.js HTTP proxy + dashboard: + +- Sits between Claude Code / Codex and the Anthropic / OpenAI APIs, records every request, serves a Miller-column dashboard at the **same port** (default 5577). +- One process serves **three classes of clients** on the same port: + 1. **The LLM client** (Claude Code / Codex CLI) — programmatic, sets `ANTHROPIC_BASE_URL=http://localhost:5577`, sends real upstream API requests with its own `x-api-key` for the upstream service. + 2. **The dashboard browser** — loads `index.html` + static assets (`/style.css`, `/app.js`, `/miller-columns.js`, …), then opens an `EventSource('/_events')` SSE stream and issues many `fetch('/_api/*')` calls. + 3. **CLI / scripts / curl** — humans and CI poking at `/_api/entries?limit=10` etc. +- Optional **hub mode**: multiple `ccxray` clients on the same machine share one hub process via lockfile; cross-process within the same user. +- Run modes: + - **Default:** purely local — bound to localhost, single user, single machine. + - **Optional remote:** some users run it on a shared dev box or in a container reachable over a corporate LAN / Tailscale. 
+- Current auth model (`server/auth.js`): + - Off by default (`AUTH_TOKEN` env unset → allow all). + - When `AUTH_TOKEN=` is set, every request must carry it via either `Authorization: Bearer ` header or `?token=` query param. + - The same middleware gates dashboard HTML, static assets, SSE, API, AND the upstream proxy path. + - WebSocket upgrades (`/v1/responses` and `/v1/realtime` from Codex) also flow through the same server. + +### Known problems in the current scheme + +1. **Query-string mode loses auth on browser subrequests.** `/index.html?token=X` authenticates, but the browser then issues `GET /style.css`, `GET /app.js`, `fetch('/_api/entries')`, and `new EventSource('/_events')` with **no token attached** — they 401. Browser-based AUTH_TOKEN mode is effectively broken. (`?token=` works fine for curl/scripts because they carry it explicitly per request.) +2. **Token in URL leaks.** Browser history, Referer header (if the dashboard ever loads external assets), HTTP access logs, shoulder surfing, copy-paste-to-Slack. +3. **No CSRF protection** if a future fix uses ambient credentials (cookies). The dashboard has state-changing endpoints: `POST /_api/intercept//approve`, `POST /_api/intercept//reject`, `POST /_api/intercept/toggle`, `POST /_api/settings`, `POST /_api/stars`, `POST /_api/intercept/timeout`. +4. **No `Host` header validation** — vulnerable to DNS rebinding attacks against the localhost service. +5. **No graceful auth for WebSocket upgrades** — query-string `?token=` on the WS URL works but inherits all the leakage problems above; header-based auth on the WS upgrade can be hard from browsers. + +### Constraints / non-goals + +- **Zero new runtime deps.** `ccxray` advertises "zero dependencies beyond Node.js". The design must respect that. +- **Single coherent codepath.** No 5-mode swamp of "old `?token=` + new cookie + bearer + …" that future maintainers have to keep all aligned. 
+- **CLI/scripts MUST still work.** `curl -H 'Authorization: Bearer X' http://localhost:5577/_api/entries` is a primary usage. +- **Hub mode must keep working** — the lockfile/registry endpoints are part of the same auth surface. +- **No external IdP / SSO / OAuth.** Single shared secret model is acceptable; the secret comes from `AUTH_TOKEN` env or equivalent. +- **No persistent user store / accounts / roles.** Auth is binary: have-secret or not. +- **Out of scope:** multi-user RBAC, encryption-at-rest of logs, audit log of access, enterprise IT integration. + +### Threat model to cover explicitly + +The design MUST explicitly address (state how each is mitigated, or justify why it is acceptable to leave unmitigated): + +1. **Malicious website CSRF.** User is logged into `ccxray` at `http://localhost:5577` and visits `evil.com` in another tab. `evil.com` issues `` or `` or `fetch('http://localhost:5577/_api/entries', {credentials: 'include'})`. Does the design prevent state changes? Data exfiltration? +2. **DNS rebinding.** `evil.com` resolves first to attacker IP, serves a page, then re-resolves to `127.0.0.1`. The browser thinks it is same-origin with `evil.com` and the localhost service. +3. **Token exfiltration via URL surface.** History, Referer header, server access logs, paste-into-Slack, browser sync to other devices. +4. **XSS-in-conversation-content.** The dashboard renders LLM output and tool-call content, which is high-risk surface. If a model response triggers XSS, can the attacker steal the auth credential? +5. **Plain-HTTP eavesdropping** when deployed non-locally. ccxray has no HTTPS — what is the design's stance for remote deployments? +6. **Token leak via WebSocket upgrade URL.** Codex WS uses long-lived connections; if the URL carries `?token=`, the token sits in connection logs and any debugging dump. + +### What "balance" means here (rubric) + +- **Security:** All 6 threats above either mitigated or explicitly accepted with rationale. 
+- **Convenience:** First-time setup ≤ 1 step beyond setting `AUTH_TOKEN`. Browser dashboard works without manual token re-entry across reloads. CLI/curl/CI usage trivial. +- **Maintainability:** Auth logic lives in ≤ 2 files, no per-route bespoke handling, no client-side monkey-patching of `fetch`/`EventSource`/`WebSocket`. Composes with future features (auth method swap, multi-tenant in the distant future) without rewrite. + +### Required deliverable shape (design-doc level) + +Each candidate must include: + +1. **Stance / one-sentence summary** of the chosen approach. +2. **Component-level architecture**: which modules change, what the request flow looks like for each of the 3 client classes (LLM client, browser, CLI). +3. **Concrete protocol details**: header / cookie names, attributes (`HttpOnly`, `SameSite`, `Secure`, `Max-Age`, `Path`, `Domain`), endpoints added or changed, exact wire-format examples. +4. **Threat-by-threat mitigation table** covering the 6 threats above with the specific mechanism that defends against each. +5. **Migration path** from the current `AUTH_TOKEN` + header/query model. +6. **Explicitly rejected alternatives** with reasons (at minimum: pure cookie, pure bootstrap-injection, OAuth, mTLS). +7. **Failure modes & operational notes**: what happens if the token is rotated, if the user clears cookies, if multiple browser tabs are open, if the server restarts mid-session, if hub mode is active, if deployed non-locally over HTTP. + +Length: appropriate for design-doc completeness — concise where possible, but not so terse that protocol details or threat mitigations are hand-waved.
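
As a non-normative illustration of known problem 4 and threat 2, a `Host` check can be sketched in a few lines with no dependencies. Candidates are not required to adopt it. The helper names `buildHostAllowlist` and `isAllowedHost` are hypothetical and do not exist in the current `server/auth.js`, and exact-match comparison against `host:port` strings is an assumed policy. It works against rebinding because, after the DNS record flips to `127.0.0.1`, the browser still sends the attacker's hostname in the `Host` header.

```javascript
// Hypothetical helpers for illustration; not part of the current codebase.

// Build the set of acceptable Host header values for a given listen port.
// `extraHosts` stands in for operator-configured remote hostnames.
function buildHostAllowlist(port, extraHosts = []) {
  const loopback = ['localhost', '127.0.0.1', '[::1]'].map((h) => `${h}:${port}`);
  return new Set([...loopback, ...extraHosts].map((h) => h.toLowerCase()));
}

// Exact, case-insensitive match. A missing or empty Host header is rejected.
function isAllowedHost(hostHeader, allowlist) {
  if (typeof hostHeader !== 'string' || hostHeader.length === 0) return false;
  return allowlist.has(hostHeader.trim().toLowerCase());
}
```

A request handler would call `isAllowedHost(req.headers.host, allowlist)` before any credential check and reject on `false`. The status code to return is left open here.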