Run StreamableHTTP transport tests in process instead of over sockets#2767
Conversation
Final installment of the in-process test migration: this was the last file spawning uvicorn subprocesses on bind-then-close ports with readiness polling, which races under pytest-xdist when two workers pick the same ephemeral port. Two tests in this file have flaked exactly that way under parallel load. All four subprocess servers (basic, JSON-response, event-store, context-aware) become in-process apps served through the interaction suite's StreamingASGITransport, held open by the session manager's run() context. Raw `requests` calls become httpx calls against the bridge client; the sync request-validation tests become anyio tests. The second-GET-409 test now holds the first stream open by construction, where the subprocess version noted it "might fail if the first stream fully closed before this runs". Assertions are unchanged, with documented exceptions now that the server handlers run as traced in-process code: - The long_running_with_checkpoints tool and the slow:// resource branch had no callers and are removed, so the expected tools/list count drops from 10 to 9 in five tests. - Dead defensive arms become asserts (sampling non-text fallback, close_sse_stream truthiness checks, the context server's unknown-tool fallthrough and request checks), and the event store's replay-from-unknown-event arm becomes a lookup that requires a stored event, since unreachable branches now fail branch coverage instead of hiding in an untraced subprocess. - test_client_crash_handled no longer sleeps between crashing clients; the bridge drains each client's teardown before the next connects. Three pragmas in src/mcp/server/streamable_http.py covered only by the formerly untraced subprocess (close_standalone_sse_stream, its session message callback, and the JSON-mode Accept rejection) are now executed by traced tests and removed. With the last wait_for_server user migrated, the helper is deleted from tests/test_helpers.py; run_uvicorn_in_thread stays for the websocket smoke test.
The tool slept a fixed duration after each close_sse_stream() so the client's retry-interval reconnect could land before the next close. That is a timing margin, not synchronization: a reconnect delayed past the margin made the close a silent no-op, two cycles merged, and the exact resumption-token count failed. The tool now waits, bounded by fail_after(5), for the client-side resumption-token callback to observe each cycle's two new tokens (the checkpoint and the new connection's priming event). The priming event is sent only after the server has re-registered the resumed stream, so once the client holds its token the next close is guaranteed to sever a live connection. The token count becomes a consequence of causality rather than margins, no sleep remains, the retry interval drops from 500ms to 50ms, and a genuinely failed reconnect now fails loudly at the timeout instead of silently merging cycles. The test now defines its own server with the gated tool inline, since the gate closes over per-test state; the shared server's tool_with_multiple_stream_closes had no other users and is removed, which moves the tools/list count assertions from 9 to 8.
There was a problem hiding this comment.
I didn't find any bugs — the src changes are limited to pragma/docstring removals with no behavior change — but this is a large rewrite of the streamable HTTP test suite (including a redesigned reconnection test), so it's worth a human look before merging.
Extended reasoning...
Overview
This PR is the final installment of the in-process test migration series (#2764, #2765). It rewrites tests/shared/test_streamable_http.py to drive all four test servers (basic, JSON-response, event-store, context) through StreamingASGITransport instead of spawning uvicorn subprocesses on bind-then-close ports. Source changes in src/mcp/server/streamable_http.py are limited to removing three # pragma: no cover markers and one stale docstring note — no behavior change. tests/test_helpers.py drops wait_for_server, which I verified has zero remaining callers; run_uvicorn_in_thread is retained for the websocket test.
Security risks
None of substance. The in-process test fixtures disable DNS-rebinding protection (enable_dns_rebinding_protection=False), but that protection is exercised separately in tests/server/test_streamable_http_security.py, and the change is confined to test fixtures — production defaults are untouched.
Level of scrutiny
The src diff is trivial, but the test rewrite is large (the entire file is restructured), converts 19 synchronous requests-based validation tests to async httpx, deletes two unused server tools (shifting tools/list count assertions from 10 to 8), and substantially redesigns test_streamable_http_multiple_reconnections around milestone events shared between the client-side resumption-token callback and the server tool. That synchronization redesign is clever but nontrivial, and a maintainer should confirm the new tests still exercise the intended behaviors with equivalent strength to the subprocess-based originals.
Other factors
The pattern follows the two already-merged sibling PRs, the author reports a green full run with 100% line+branch coverage and strict-no-cover clean, and the bug hunting system found no issues. The main residual risk is test-fidelity (weaker coverage or subtly changed assertions), which is exactly the kind of judgment call a human reviewer of the previous two PRs is best positioned to make — hence deferring rather than approving despite the absence of findings.
Final installment of the in-process test migration (#2764, #2765):
tests/shared/test_streamable_http.pywas the last file spawning uvicorn subprocesses on bind-then-close ports, the pattern that races under pytest-xdist when two workers pick the same ephemeral port. Two of its tests have flaked exactly this way under parallel load (test_server_validates_protocol_version_header,test_streamable_http_client_mcp_headers_override_defaults).Closes #2704.
Motivation and Context
Same mechanism and fix as the previous two PRs. All four subprocess servers (basic, JSON-response, event-store, context) become in-process apps driven through
StreamingASGITransport; with this, no test in the repo binds a port outside the one pre-bound websocket smoke test.What changed beyond the plumbing swap:
requests-library tests; they're now anyio + httpx over the bridge with the same methods, headers, bodies, and assertions (including suppressing the library-defaultAcceptheader where the old tests did).wait_for_serveris deleted fromtests/test_helpers.py— zero users remain.run_uvicorn_in_threadstays fortests/shared/test_ws.py.# pragma: no coverinsrc/mcp/server/streamable_http.pyare removed (close_standalone_sse_stream, its session-message callback, the JSON-mode Accept rejection): those lines were only ever executed inside the untraced subprocess; the migrated tests now cover them, and keeping the pragmas would failstrict-no-cover.close_standalone_sse_streamclaimed client reconnection for standalone GET streams "is NOT implemented — this is a known gap", but the note was self-contradictory from the day it landed — the same commit (Add SSE polling support (SEP-1699) #1654) implemented the auto-reconnect and the test that proves it.long_running_with_checkpointstool and theslow://resource branch had no callers anywhere; they're deleted, andtool_with_multiple_stream_closesmoves out of the shared server into the reworked reconnection test's inline server, which together move the tools/list count assertions from 10 to 8 in five tests. Unreachable handler branches elsewhere become dispatch asserts, since they now fail branch coverage instead of hiding in an untraced subprocess.test_streamable_http_client_resumption's flag-polling loop (while not flag: sleep(0.1), unbounded) is now twoanyio.Events awaited underfail_after(5).test_get_sse_stream's 409 assertion is deterministic by construction (the first GET is held open across the second), where the old version's comment admitted it could race.One test was further reworked rather than carried over:
test_streamable_http_multiple_reconnectionspreviously paced its server tool with fixed sleeps against the client's retry interval — a timing margin that could silently merge two close/reconnect cycles on a starved runner and break its exact token count. The tool is now gated on the client-side resumption-token callback observing each cycle's reconnect (the priming event is only sent after the server has re-registered the resumed stream, so the count becomes a consequence of causality rather than margins). No sleep remains in it, its retry interval drops to 50ms, and a genuinely failed reconnect now fails loudly at afail_after(5)instead of silently merging cycles.Deliberately retained elsewhere: the remaining genuinely time-based tests (the
elapsed >= 0.4retry-interval assertion, tool sleeps that create disconnect windows for single-cycle reconnect tests) keep their real-clock behaviour and byte-identical assertions — in-process execution strictly tightens their margins versus the subprocess version. A follow-up may apply the same gating pattern to those remaining sleeps; that's kept out of this PR so the harness swap stays auditable.How Has This Been Tested?
./scripts/testgreen: 1526 passed, 100% line+branch coverage includingtests/,strict-no-covercleanpytest-xdist -n 4, 3× — stable; the nine resumption/reconnection tests 5× soloBreaking Changes
None — test-only, plus pragma/docstring removals in src.
Types of changes
Checklist
Additional context
Several tests here overlap with
tests/interaction/transports/coverage (resumption-token replay, stream-close auto-reconnect, priming events, standalone GET delivery); deduplication is inventoried for a follow-up ticket, kept out of scope so this diff remains a pure harness swap.AI Disclaimer