Add TCP network transport to ACPAgent#2156
Conversation
Add ACPAgent, an AgentBase subclass that delegates to ACP-compatible servers (Claude Code, Gemini CLI, etc.) instead of direct LLM calls. The ACP server manages its own LLM, tools, and execution lifecycle. Closes #590 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Record per-turn token usage from PromptResponse.usage in step() - Record incremental cost from UsageUpdate notifications in session_update() - Yield sentinel LLM from get_all_llms() for telemetry pipeline - Wire _llm_ref on client for dynamic metrics/telemetry access - Trigger stats callback after each step for GUI updates - Extract notification drain delay to configurable constant - Add audit logging for auto-approved permission requests - Add debug logging in cleanup for process/executor errors - Wrap example in try/finally for robust cleanup - Add 8 telemetry tests in TestACPAgentTelemetry Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ition Add ACP_NOTIFICATION_DRAIN_DELAY env var override, yield to the event loop before the timed sleep, and add a TODO for protocol-level fix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix RequestPermissionResponse constructor (outcome, not result) - Add proper signatures to Client protocol stub methods - Narrow content[0] type in tests with isinstance checks - Pass required args in fs/terminal stub tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Renumber example 36 -> 40 to avoid conflict with 36_event_json_to_openai_messages.py - Add type: ignore[reportMissingImports] to all acp imports (optional dep) - Add TYPE_CHECKING import in __init__.py to satisfy pyright __all__ check - Add @requires_acp skip marker for tests needing the acp package Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move acp from optional to required deps, removing: - Lazy import guard in init_state - type: ignore[reportMissingImports] comments - @requires_acp skip markers on tests - test_missing_acp_sdk_raises_import_error test The __getattr__ in __init__.py is kept to avoid a circular import. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ask_agent API (stateless question answering) failed for ACPAgent because its sentinel LLM can't make real LLM calls. This uses the ACP protocol's fork_session to create a temporary copy of the session, prompt it, collect the response, and discard the fork — preserving read-only semantics. - Add AgentBase.ask_agent() default method (returns None → use LLM path) - Override in ACPAgent with fork_session logic + token usage tracking - Add fork text routing in _OpenHandsACPClient.session_update() - Dispatch agent override in LocalConversation.ask_agent() - Add 7 tests for ask_agent and fork text routing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ACPAgent currently requires spawning the ACP server as a local subprocess, which needs Node.js installed in the container. When using APIRemoteWorkspace with K8s pods that lack Node.js, this forces custom container images. Add an alternative TCP transport mode where ACPAgent connects to an already-running ACP server over the network via asyncio.open_connection(). The ClientSideConnection works unchanged since it accepts any StreamReader/StreamWriter pair. Transport is configured via mutually exclusive fields validated by a model_validator: either acp_command (subprocess) or acp_host+acp_port (TCP). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Coverage Report •
|
||||||||||||||||||||
af8df4e to
2f59de1
Compare
Demonstrates the real-world use case: ACPAgent connects to a remote ACP server over TCP while running inside a K8s pod provisioned via the Runtime API. The pod image doesn't need Node.js since the ACP server runs as a separate network service. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
API breakage checks (Griffe)Result: Passed |
| print("\nDone!") | ||
|
|
||
|
|
||
| if __name__ == "__main__": |
There was a problem hiding this comment.
Can we remove this and make it a simple script, like all others are? It may be easier
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good taste - Clean solution to a real problem.
Linus-Style Assessment:
Solving the right problem: Remote K8s pods without Node.js is a legitimate deployment scenario. The TCP transport mode is pragmatic and well-motivated.
Data structures: Adding optional fields with a validator ensuring mutual exclusivity is the right approach. No unnecessary complexity.
Implementation: The branching in _start_acp_server() is clean and straightforward. Nested async functions are localized and readable. Cleanup properly handles both modes.
Tests: Comprehensive coverage (validation, serialization, both transports, error cases). All 63 existing tests pass.
Verdict: ✅ Worth merging. This is quality work—approve and ship it.
|
@OpenHands Fetch docs about ACP emerging standard from Zed Industries, so you understand what this PR, and ACP Agent here, are compatible with. Note that it’s ACP from Zed, not another similar acronym. Do a /codereview-roasted on the PR. Don’t exaggerate, think before you speak. When I ask you to summarize, include the full review instead. Really, during this session, if I ask you to summarize, include the full review instead. Note that it will be rendered as markdown. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as duplicate.
This comment was marked as duplicate.
enyst
left a comment
There was a problem hiding this comment.
@simonrosenberg
Could you please show some video or something, to see how it works?
I seem to see that the user needs to know some URL of the ACP server? When/how is the ACP server spawned? Maybe it’s just me, or maybe we could have a little more instructions, if possible?
|
(OpenHands-GPT-5.2) ACP (Agent Client Protocol) docs (Zed Industries) — what this PR is “compatible with”Authoritative sources:
Key points relevant to this PR:
That’s the lens I’m using below: this PR is implementing a custom TCP transport that happens to reuse the stdio-style stream model. /codereview-roasted — PR #2156 “Add TCP network transport to ACPAgent”Taste Rating🟡 Acceptable — solves a real deployment problem with minimal code, but it quietly invents “ACP over TCP” despite the spec not defining TCP transport, and it bakes in framing/behavior assumptions that deserve to be explicit. [CRITICAL ISSUES]
[IMPROVEMENT OPPORTUNITIES]
[TESTING GAPS]
Not mandatory, but it’s the one place where a tiny regression would be painful. VERDICT✅ Worth merging, provided everyone is clear this is a custom TCP transport convention (not ACP-standard transport today). The implementation is straightforward and the validation/tests are sensible; the main risk is user confusion + interoperability assumptions. KEY INSIGHTYou didn’t “add TCP transport to ACP” — you added a TCP socket connection option that assumes stdio-style ACP stream framing, and that assumption needs to be explicit so users don’t misconfigure it and blame ACP. |
- Simplify example script to flat style matching other examples - Clarify TCP is a custom transport (not ACP-standard) in docstrings - Document newline-delimited JSON-RPC framing expectations - Add configurable connect timeout (default 30s) for TCP mode - Use await writer.wait_closed() in TCP cleanup - Reject acp_port without acp_host, guard against empty acp_command - Apply _filter_jsonrpc_lines to TCP reads (same as subprocess mode) - Add test asserting asyncio.open_connection is called correctly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
asyncio.open_connection(host, port), as an alternative to the existing subprocess modeacp_hostandacp_portfields with a@model_validatorensuring exactly one transport (subprocess or TCP) is configured_start_acp_server()to branch on transport mode and update_cleanup()to close the TCP writerMotivation
When using
APIRemoteWorkspace, the agent runs inside a remote K8s pod whose container image doesn't include Node.js or the ACP command. Rather than requiring custom container images, this lets ACPAgent connect to an ACP server running elsewhere on the network.Test plan
ruff checkandruff format --checkclean🤖 Generated with Claude Code
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:9d49428-pythonRun
All tags pushed for this build
About Multi-Architecture Support
9d49428-python) is a multi-arch manifest supporting both amd64 and arm649d49428-python-amd64) are also available if needed