Skip to content

test(scripts): cross-platform MCP handshake tests for first-install (#125)#129

Merged
aeneasr merged 4 commits intomainfrom
test/stdio-first-install-red
Apr 14, 2026
Merged

test(scripts): cross-platform MCP handshake tests for first-install (#125)#129
aeneasr merged 4 commits intomainfrom
test/stdio-first-install-red

Conversation

@aeneasr
Copy link
Copy Markdown
Member

@aeneasr aeneasr commented Apr 14, 2026

Summary

Adds a deliberately failing end-to-end test suite in scripts/test_run.sh and a new scripts/test_run_windows.ps1 that reproduce the first-install MCP failure from #125 on both POSIX and Windows launchers. Expect CI to go red on ubuntu, macos, and windows — that is the point of this PR.

Why the existing CI suite missed the bug

Two gaps, stacked:

  1. No end-to-end invocation of the launchers. The previous stdio early-exit guard tests block reimplemented the guard condition inline as a local helper and asserted "the guard fires for the stdio arg" as correct behaviour — codifying the bug as the spec. Nothing actually ran bash run.sh stdio (let alone run.bat stdio) with a missing binary.
  2. No Windows coverage at all for run.bat's first-install code path. run.bat has the identical fast-fail guard at lines 28–32. fix(scripts): download binary synchronously in stdio mode on first install #127 only fixes POSIX; a Windows-only fix still needs to land, and nothing in CI would have told us that.

What this PR adds

  • scripts/testdata/mock_mcp_server/main.go — a tiny pure-Go mock MCP server. Reads one line from stdin, and if it is a JSON-RPC initialize request it writes a valid MCP initialize response and exits 0. Cross-compiles on all three runners, no CGO.
  • Real MCP handshake in test_run.sh — replaces the old stub-curl-plus-exit 0 test. The test now builds the mock, stubs curl in PATH to drop the mock binary into place, pipes a real initialize request into bash run.sh stdio, and asserts the stdout response contains "jsonrpc":"2.0" and "name":"mock-lumen". A pass transitively proves the launcher didn't fast-exit, reached the download path, wrote the artefact where it would exec it, chmod'd it executable, and exec'd it with stdin/stdout inherited correctly.
  • scripts/test_run_windows.ps1 — same shape, against run.bat, via PowerShell. Uses a curl.bat stub on a PATH-prepended dir (cmd.exe's PATHEXT lookup finds the .bat first).
  • CI matrixscripts job expanded to ubuntu-latest + macos-latest + windows-latest, fail-fast: false, actions/setup-go added. New pwsh step runs the Windows launcher test.

Bugs this kind of test catches that the old one couldn't

Beyond the #125 fast-fail, a stubbed-curl / stubbed-binary test can pass even when the launcher has any of these real bugs:

  • Launcher closes stdin or stdout before exec
  • Launcher writes the binary to one path but execs from another
  • Launcher forgets to chmod +x
  • Binary actually runs but doesn't speak MCP (broken flag parsing, crash on startup, etc.)
  • CRLF / quoting mishaps in run.bat that prevent the process from ever starting

A real MCP handshake against a real process is the only way to catch these.

Red / green

Next steps

Merge #127 and add the matching run.bat change to flip both tests green, or fold these commits into that PR as its TDD red-phase.

🤖 Generated with Claude Code

aeneasr and others added 2 commits April 14, 2026 12:05
Demonstrates the bug in #125: invoking `run.sh stdio` without a binary
(the Claude Code first-install flow) fast-exits instead of downloading.
Because Claude Code does not retry failed MCP servers, this leaves the
`lumen` tools unavailable for the entire session.

The existing `stdio early-exit guard tests` block never executes run.sh
— it reimplements the guard condition inline and asserts "the guard
fires for stdio" as correct behaviour, codifying the bug. This new
integration test stubs `curl` in PATH, creates a temporary plugin
root with a manifest but no binary, runs `bash run.sh stdio`, and
asserts exit 0 + that the expected binary file is produced.

Red now; turns green once the stdio fast-fail is removed (see #127).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The existing stdio first-install integration test (added earlier in this
PR) stubbed curl to write `#!/bin/sh; exit 0` as the "downloaded binary"
and only checked the launcher exited 0 with a file on disk. That passes
on any launcher that reaches the download step — it can't distinguish a
launcher that actually execs a working MCP server from one that swallows
stdin/stdout, closes pipes early, or silently drops to the wrong binary
path. It also tested only run.sh; run.bat (with the same fast-fail bug)
was untested end-to-end.

Replaces the stub with a real mock MCP server (scripts/testdata/
mock_mcp_server/main.go — pure Go, no CGO, cross-compiles on all three
OSes). The test now:

  - Builds the mock for the current OS/arch
  - Creates a clean plugin root with a manifest but no bin/
  - Stubs curl (POSIX) / curl.bat (Windows) to copy the mock into the
    path the launcher will exec
  - Pipes a real JSON-RPC `initialize` request on stdin
  - Asserts the stdout response contains `jsonrpc: "2.0"` AND
    `serverInfo.name == "mock-lumen"`

A passing test transitively proves the launcher did not fast-exit in
stdio mode, reached the download code path, wrote the artefact where it
would exec it, set it executable, and exec'd it with stdin/stdout
inherited correctly — the actual MCP-server-alive contract.

Adds scripts/test_run_windows.ps1 (same shape, against run.bat) and
wires it into the CI `scripts` matrix as a new pwsh step. Also expands
the matrix to ubuntu-latest + macos-latest + windows-latest and adds
actions/setup-go so the mock can be built.

Both the POSIX test and the Windows test are RED against the current
launchers because run.sh AND run.bat both still contain the stdio
fast-fail guard — merging #127 plus an equivalent run.bat change will
flip them green.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aeneasr aeneasr changed the title test(scripts): failing integration test for stdio first-install bug (#125) test(scripts): cross-platform MCP handshake tests for first-install (#125) Apr 14, 2026
aeneasr and others added 2 commits April 14, 2026 12:23
Previous version used `return` inside a try/finally block in a
dot-sourced script, which exits the script without running the summary
or `exit 1` line — making CI mark a real failure as success. Also
relied on Start-Process -RedirectStandardInput, which doesn't reliably
propagate the child's exit code back through PowerShell.

Rewrites both: uses System.Diagnostics.Process directly for
deterministic stdin/stdout/stderr wiring and exit-code reporting, and
restructures into an if/elseif chain so all paths flow through the
final summary + exit logic.

The test will now correctly fail the Windows job when run.bat fast-
exits in stdio mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When run.bat hits the stdio fast-fail and `exit /b 1`s, its stdin pipe
is already closed by the time PowerShell tries to write the initialize
request — producing a `pipe is being closed` exception that masks the
real diagnostic. Catch it and let the exit-code check produce the
proper "#125 — MCP server would be dead for the session" failure
message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aeneasr aeneasr merged commit 656bb47 into main Apr 14, 2026
6 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant