|
| 1 | +# Plan: Provider-Neutral Runtime Determinism and Flake Elimination |
| 2 | + |
| 3 | +## Summary |
| 4 | +Replace timing-sensitive websocket and orchestration behavior with explicit typed runtime boundaries, ordered push delivery, and server-owned completion receipts. The cutover is broad and single-shot: no compatibility shim, no mixed old/new transport. The design must reduce flakes without baking Codex-specific lifecycle semantics into generic runtime code. |
| 5 | + |
| 6 | +## Implementation Status |
| 7 | + |
| 8 | +All 7 sections are implemented. CI passes (format, lint, typecheck, test, browser test, build). One deferred item remains: the shared `WsTestClient` helper from section 7 — tests use direct transport subscription and receipt-based waits instead. |
| 9 | + |
| 10 | +### New files |
| 11 | + |
| 12 | +| File | Purpose | |
| 13 | +|------|---------| |
| 14 | +| `packages/shared/src/DrainableWorker.ts` | Queue-based Effect worker with deterministic `drain` signal | |
| 15 | +| `packages/shared/src/schemaJson.ts` | Two-phase JSON→Schema decode helpers (`decodeJsonResult`, `formatSchemaError`) | |
| 16 | +| `apps/server/src/wsServer/pushBus.ts` | `ServerPushBus` — ordered typed push pipeline with auto-incrementing sequence | |
| 17 | +| `apps/server/src/wsServer/readiness.ts` | `ServerReadiness` — Deferred-based barriers for startup sequencing | |
| 18 | +| `apps/server/src/orchestration/Services/RuntimeReceiptBus.ts` | Receipt schema union: checkpoint captured, diff finalized, turn quiesced | |
| 19 | +| `apps/server/src/orchestration/Layers/RuntimeReceiptBus.ts` | PubSub-backed receipt bus implementation | |
| 20 | +| `apps/server/src/watchFileWithStatPolling.ts` | Stat-polling file watcher for containers where `fs.watch` is unreliable | |
| 21 | +| `apps/server/vitest.config.ts` | Server-specific test config (timeout bumps) | |
| 22 | +| `apps/server/src/wsServer/pushBus.test.ts` | Push bus serialization and welcome-gating tests | |
| 23 | +| `packages/shared/src/DrainableWorker.test.ts` | Drainable worker enqueue/drain lifecycle tests | |
| 24 | + |
| 25 | +### Key modifications |
| 26 | + |
| 27 | +| File | Change | |
| 28 | +|------|--------| |
| 29 | +| `packages/contracts/src/ws.ts` | Channel-indexed `WsPushPayloadByChannel` map, `WsPush` union schema, `WsPushSequence` | |
| 30 | +| `apps/server/src/wsServer.ts` | Integrated `ServerPushBus` and `ServerReadiness`; welcome gated on readiness | |
| 31 | +| `apps/server/src/keybindings.ts` | Explicit runtime with `start`/`ready`/`snapshot`; dual `fs.watch` + stat-polling watcher | |
| 32 | +| `apps/web/src/wsTransport.ts` | Connection state machine (`connecting`→`open`→`reconnecting`→`closed`→`disposed`); two-phase decode at boundary; cached latest push by channel | |
| 33 | +| `apps/web/src/wsNativeApi.ts` | Removed decode logic; delegates to pre-validated transport messages | |
| 34 | +| `apps/server/src/orchestration/Layers/CheckpointReactor.ts` | Uses `DrainableWorker`; publishes completion receipts | |
| 35 | +| `apps/server/src/orchestration/Layers/ProviderCommandReactor.ts` | Uses `DrainableWorker` for command processing | |
| 36 | +| `apps/server/src/orchestration/Layers/ProviderRuntimeIngestion.ts` | Uses `DrainableWorker` for event ingestion | |
| 37 | +| `apps/server/integration/OrchestrationEngineHarness.integration.ts` | Receipt-based waits replace polling loops | |
| 38 | + |
| 39 | +## Key Changes |
| 40 | +### 1. Strengthen the generic boundaries, not the Codex boundary — DONE |
| 41 | +- `ProviderRuntimeEvent` remains the canonical provider event contract; `ProviderService` remains the only cross-provider facade. |
| 42 | +- Raw Codex payloads and event ordering stay isolated in `CodexAdapter.ts` and `codexAppServerManager.ts`. |
| 43 | +- `ProviderKind` was not expanded. The runtime stays provider-neutral by contract. |
| 44 | + |
| 45 | +### 2. Replace loose websocket envelopes with channel-indexed typed pushes — DONE |
| 46 | +- `packages/contracts/src/ws.ts` now derives push messages from a `WsPushPayloadByChannel` channel-to-schema map. `WsPush` is a union schema replacing `channel: string` + `data: unknown`. |
| 47 | +- Every server push carries `sequence: number`, auto-incremented in `ServerPushBus`. |
| 48 | +- `packages/shared/src/schemaJson.ts` provides structured decode diagnostics via `formatSchemaError`. |
| 49 | +- `packages/contracts/src/ws.test.ts` covers typed push envelope validation and channel/payload mismatch rejection. |
| 50 | + |
| 51 | +### 3. Introduce explicit server readiness and a single push pipeline — DONE |
| 52 | +- `apps/server/src/wsServer/pushBus.ts`: `ServerPushBus` with `publishAll` (broadcast) and `publishClient` (targeted) methods, backed by one ordered path. All pushes flow through it. |
| 53 | +- `apps/server/src/wsServer/readiness.ts`: `ServerReadiness` with Deferred-based barriers for HTTP listening, push bus, keybindings, terminal subscriptions, and orchestration subscriptions. |
| 54 | +- `server.welcome` is emitted only after connection-scoped and server-scoped readiness is complete. |
| 55 | +- `wsServer.ts` no longer publishes directly from ad hoc background streams. |
| 56 | + |
| 57 | +### 4. Turn background watchers into explicit runtimes — DONE |
| 58 | +- `apps/server/src/keybindings.ts` refactored as explicit `KeybindingsShape` service with `start`, `ready`, `snapshot` semantics. |
| 59 | +- Initial config load, cache warmup, and dual watcher attachment (`fs.watch` + `watchFileWithStatPolling`) complete before `ready` resolves. |
| 60 | +- `watchFileWithStatPolling.ts` is the thin adapter for environments where `fs.watch` is unreliable. |
| 61 | + |
| 62 | +### 5. Replace polling-based orchestration waiting with receipts — DONE |
| 63 | +- `RuntimeReceiptBus` service defines three receipt types: `CheckpointBaselineCapturedReceipt`, `CheckpointDiffFinalizedReceipt` (with `status: "ready"|"missing"|"error"`), and `TurnProcessingQuiescedReceipt`. |
| 64 | +- `CheckpointReactor`, `ProviderCommandReactor`, and `ProviderRuntimeIngestion` use `DrainableWorker` and publish receipts on completion. |
| 65 | +- Integration harness and checkpoint tests await receipts instead of polling snapshots and git refs. |
| 66 | + |
| 67 | +### 6. Centralize client transport state and decoding — DONE |
| 68 | +- `apps/web/src/wsTransport.ts` implements an explicit connection state machine: `connecting`, `open`, `reconnecting`, `closed`, `disposed`. |
| 69 | +- Two-phase decode (JSON parse → Schema validate) happens at the transport boundary. `wsNativeApi.ts` receives pre-validated messages. |
| 70 | +- Cached latest welcome/config modeled as explicit `latestPushByChannel` state. |
| 71 | + |
| 72 | +### 7. Replace ad hoc test helpers with semantic test clients — MOSTLY DONE |
| 73 | +- `DrainableWorker` replaces timing-sensitive `Effect.sleep` with deterministic `drain()` across reactor tests. |
| 74 | +- Orchestration harness waits on receipts/barriers instead of `waitForThread`, `waitForGitRef`, and retry loops. |
| 75 | +- Behavioral assertions moved to deterministic unit-style harnesses; narrow integration tests kept for real filesystem/socket behavior. |
| 76 | +- **Deferred:** Shared `WsTestClient` helper (connect, awaitSemanticWelcome, awaitTypedPush, trackSequence, matchRpcResponseById). Tests use direct transport subscription instead. |
| 77 | + |
| 78 | +## Provider-Coupling Guardrails |
| 79 | +- No generic runtime API may depend on Codex-native event names, thread IDs, or request payload shapes. |
| 80 | +- No readiness barrier may be defined as "Codex has emitted X." Readiness is owned by the server runtime, not by provider event order. |
| 81 | +- No websocket channel payload may contain raw provider-native payloads unless the channel is explicitly debug/internal. |
| 82 | +- Any provider-specific divergence must be exposed through provider capabilities from `ProviderService.getCapabilities()`, not `if provider === "codex"` branches in shared runtime code. |
| 83 | +- Generic tests must use canonical `ProviderRuntimeEvent` fixtures. Codex-specific ordering and translation tests stay in adapter/app-server suites only. |
| 84 | +- Keep UI/provider-specific knobs such as Codex-only options scoped to provider UX code. Do not pull them into generic transport or orchestration state. |
| 85 | + |
| 86 | +## Test Plan |
| 87 | +- Contracts: |
| 88 | + - schema tests for typed push envelopes and structured decode diagnostics |
| 89 | + - ordering tests for `sequence` |
| 90 | +- Server: |
| 91 | + - readiness tests proving `server.welcome` cannot precede runtime readiness |
| 92 | + - push bus tests proving terminal/config/orchestration pushes are serialized and typed |
| 93 | + - keybindings runtime tests with fake watch source plus one real watcher integration test |
| 94 | +- Orchestration: |
| 95 | + - receipt tests proving checkpoint refs and projections are complete before completion signals resolve |
| 96 | + - replacement of polling-based checkpoint/integration waits with receipt-based waits |
| 97 | +- Web: |
| 98 | + - transport tests for invalid JSON, invalid envelope, invalid payload, reconnect queue flushing, cached semantic state |
| 99 | +- Validation gate: |
| 100 | + - `bun run lint` |
| 101 | + - `bun run typecheck` |
| 102 | + - `mise exec -- bun run test` |
| 103 | + - repeated full-suite run after cutover to confirm flake removal |
| 104 | + |
| 105 | +## Assumptions and Defaults |
| 106 | +- This remains a single-provider product during the cutover, but the runtime contracts must stay provider-neutral. |
| 107 | +- No backward-compatibility layer is required for old websocket push envelopes. |
| 108 | +- The goal is deterministic runtime behavior first; reducing retries and sleeps in tests is a consequence, not the primary mechanism. |
| 109 | +- If a completion signal cannot be expressed provider-neutrally, it does not belong in the shared runtime layer and must stay adapter-local. |
0 commit comments