Releases: nnemirovsky/sluice
v0.16.0
Credential pools with automatic failover
One phantom identity the agent holds can now be backed by N real OAuth credentials, with transparent auto-failover when one is rate-limited or its auth fails.
- Pool model: `sluice pool create|list|status|rotate|remove`. A pool maps a single pool-stable phantom (byte-identical synthetic JWT, R3) to the currently active member; refreshed tokens are attributed back to the issuing member via the injected refresh token (R1, fail-closed).
- Auto-failover (Phase 2): `429` / `403 insufficient_quota` → rate-limited; `401` / token-body `invalid_grant` / `invalid_token` → auth-failure. The active member is cooled and switched synchronously in-memory before the response returns, so the agent's own retry lands on the next member. Monotonic cooldown extension; lazy recovery; degrade-never-hard-fail.
- QUIC request-side pool awareness + the R3 pool-stable phantom (response-side R1/failover is HTTP-only by capability).
- Data model: migrations `000006_credential_pools`, `000007_pool_membership_epoch`.
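The failure classification in the auto-failover bullet can be sketched as a small pure function. This is an illustration only: `failureKind` and `classifyFailure` are hypothetical names, and matching the error code by substring in the response body is an assumption, not sluice's actual parser.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// failureKind is a hypothetical label for how an upstream response is treated.
type failureKind int

const (
	failureNone failureKind = iota
	failureRateLimited
	failureAuth
)

func (k failureKind) String() string {
	switch k {
	case failureRateLimited:
		return "rate-limited"
	case failureAuth:
		return "auth-failure"
	default:
		return "none"
	}
}

// classifyFailure maps a status code and response body to a failure kind.
// Assumption: error codes are detected by substring match on the body.
func classifyFailure(status int, body string) failureKind {
	switch {
	case status == http.StatusTooManyRequests:
		return failureRateLimited
	case status == http.StatusForbidden && strings.Contains(body, "insufficient_quota"):
		return failureRateLimited
	case status == http.StatusUnauthorized:
		return failureAuth
	case status >= 400 &&
		(strings.Contains(body, "invalid_grant") || strings.Contains(body, "invalid_token")):
		return failureAuth
	}
	return failureNone
}

func main() {
	fmt.Println(classifyFailure(429, ""))
	fmt.Println(classifyFailure(403, `{"error":{"code":"insufficient_quota"}}`))
	fmt.Println(classifyFailure(400, `{"error":"invalid_grant"}`))
	fmt.Println(classifyFailure(403, `{"error":"forbidden"}`))
}
```

A plain 403 without `insufficient_quota` is deliberately left unclassified here, so an ordinary permission error would not cool a healthy member.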
Approval-prompt coalescing
Concurrent approvals to the same dest:port collapse into one Telegram prompt; one tap dismisses the whole burst, and the final coalesced count is folded into the existing resolve/cancel edit (zero extra Telegram API calls). MCP tool calls opt out (arg-sensitive). QUIC keeps its own packet-dedup.
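As a rough sketch of the coalescing idea, assuming an in-memory map keyed by `dest:port`: the first request for a key sends the prompt, later requests join the same burst, and resolving returns the coalesced count. The `coalescer` and `burst` types are hypothetical and not taken from sluice's code.

```go
package main

import (
	"fmt"
	"sync"
)

// burst is one pending prompt shared by every request to the same dest:port.
type burst struct {
	count   int
	done    chan struct{} // closed when the operator decides
	allowed bool          // valid only after done is closed
}

// coalescer is a hypothetical holder for pending approval bursts.
type coalescer struct {
	mu      sync.Mutex
	pending map[string]*burst
}

func newCoalescer() *coalescer {
	return &coalescer{pending: make(map[string]*burst)}
}

// join registers a request for key. first is true only for the request that
// should actually send the prompt; later callers wait on the same burst.
func (c *coalescer) join(key string) (b *burst, first bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	b, ok := c.pending[key]
	if !ok {
		b = &burst{done: make(chan struct{})}
		c.pending[key] = b
	}
	b.count++
	return b, !ok
}

// resolve applies one decision to the whole burst and returns how many
// requests were coalesced, so the count can be folded into the final edit.
func (c *coalescer) resolve(key string, allowed bool) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	b, ok := c.pending[key]
	if !ok {
		return 0
	}
	delete(c.pending, key)
	b.allowed = allowed
	close(b.done)
	return b.count
}

func main() {
	c := newCoalescer()
	key := "api.example.com:443"
	for i := 0; i < 3; i++ {
		_, first := c.join(key)
		fmt.Println("send prompt:", first)
	}
	fmt.Println("coalesced:", c.resolve(key, true))
}
```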
Failover hardening (fixes found in live operation)
- Operator-park stranding fixed: `pool rotate` parks the displaced member with reason `manual rotate`; that member is healthy, just deprioritized. The all-cooled degrade now prefers an operator-parked-but-healthy member over a genuinely-failed one, so a rotate onto an exhausted account fails over to the healthy peer instead of self-looping (which previously hard-failed the agent). Normal position-order rotate semantics are unchanged.
- Self-failover spam fixed: when there is no distinct failover target (`to == from`), the event is classified as pool exhaustion, with a distinct `pool_exhausted` audit action and an honest "pool exhausted" operator notice instead of a meaningless self-referential `cred_failover`. `FailoverEvent.Exhausted` carries the distinction.
- Notice dedup: identical `(pool, from, to, tag)` signals are deduplicated within a 30s window, so an agent retry storm yields one audit row + one notice instead of N.
Fail-before/pass-after tests cover the degrade preference, the pool-exhausted suppression + dedup, and real failover to an operator-parked-but-healthy peer.
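The notice dedup can be illustrated with a keyed last-seen map. This is a minimal sketch under stated assumptions: `signalKey`, `noticeDedup`, and `shouldEmit` are hypothetical names, the window is measured from the last emitted signal, and old entries are never pruned.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// signalKey mirrors the (pool, from, to, tag) tuple from the release note.
type signalKey struct{ pool, from, to, tag string }

// noticeDedup is a hypothetical suppressor for repeated failover signals.
type noticeDedup struct {
	mu     sync.Mutex
	window time.Duration
	seen   map[signalKey]time.Time // time of the last emitted signal per key
}

func newNoticeDedup(window time.Duration) *noticeDedup {
	return &noticeDedup{window: window, seen: make(map[signalKey]time.Time)}
}

// shouldEmit reports whether a signal arriving at now should produce an audit
// row and a notice. Identical signals inside the window are suppressed.
func (d *noticeDedup) shouldEmit(k signalKey, now time.Time) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if last, ok := d.seen[k]; ok && now.Sub(last) < d.window {
		return false
	}
	d.seen[k] = now
	return true
}

// demoStart is an arbitrary fixed instant so the demo is deterministic.
var demoStart = time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)

func main() {
	d := newNoticeDedup(30 * time.Second)
	k := signalKey{pool: "openai", from: "acct-a", to: "acct-b", tag: "rate-limited"}
	offsets := []time.Duration{0, 5 * time.Second, 29 * time.Second, 31 * time.Second}
	for _, off := range offsets {
		fmt.Println(off, d.shouldEmit(k, demoStart.Add(off)))
	}
}
```

Passing the clock in as a parameter keeps the window logic testable without sleeping.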
Known limitation
sluice's pooled token-host phantom expansion currently also rewrites non-refresh OAuth grants (e.g. device_code) to the pool's shared token host, which breaks a fresh in-container OAuth login for a pooled provider. Perform the initial login outside the proxy (or before binding the pool). Tracked for a follow-up.
v0.15.1
Bug Fixes
- proxy: SSH jump host close race that dropped the agent's exec reply (#41)
When an upstream SSH server replied + wrote data + sent exit-status + closed the channel in one burst, sluice's wait on the three upstream-to-agent goroutines completed and closed srcChan while the agent-to-upstream forwarder was still mid-reply for the agent's exec request. The agent's session.SendRequest("exec", true, ...) observed SSH_MSG_CHANNEL_CLOSE before the SUCCESS reply landed on ch.msg, gossh surfaced the closed channel as io.EOF, and session.Output("whoami") failed with EOF even though the upstream succeeded. The fix tracks in-flight agent-to-upstream requests with a mutex+cond barrier so the close path drains any pending reply before srcChan.CloseWrite() / srcChan.Close(). Symptom was visible on the CI e2e-linux runners since d27b05e narrowed the close timing window; production SSH clients have enough natural latency between reply and close that the race window almost never opens.
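The mutex+cond barrier described above can be sketched in isolation. The `inflightBarrier` type and its method names are hypothetical; the sketch only shows the pattern of counting in-flight requests and making the close path wait until the count reaches zero.

```go
package main

import (
	"fmt"
	"sync"
)

// inflightBarrier counts agent-to-upstream requests whose reply is pending.
type inflightBarrier struct {
	mu      sync.Mutex
	cond    *sync.Cond
	pending int
}

func newInflightBarrier() *inflightBarrier {
	b := &inflightBarrier{}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// begin marks a request as in flight. Call it before forwarding the request.
func (b *inflightBarrier) begin() {
	b.mu.Lock()
	b.pending++
	b.mu.Unlock()
}

// end marks a reply as delivered and wakes the close path when none remain.
func (b *inflightBarrier) end() {
	b.mu.Lock()
	b.pending--
	if b.pending == 0 {
		b.cond.Broadcast()
	}
	b.mu.Unlock()
}

// drain blocks until no request is in flight.
func (b *inflightBarrier) drain() {
	b.mu.Lock()
	for b.pending > 0 {
		b.cond.Wait()
	}
	b.mu.Unlock()
}

// runDemo records the order of the reply and the close.
func runDemo() []string {
	b := newInflightBarrier()
	var events []string

	b.begin() // the agent's exec request is now in flight
	go func() {
		defer b.end()
		events = append(events, "reply") // forwarder delivers the reply
	}()

	b.drain() // the close path waits for the pending reply
	events = append(events, "close")
	return events
}

func main() {
	fmt.Println(runDemo())
}
```

Calling `begin` before the forwarding goroutine starts matters: if the counter were incremented inside the goroutine, `drain` could observe zero and close the channel first.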
v0.15.0
Bug Fixes
- proxy: URL-encoded phantom tokens now swap correctly in `application/x-www-form-urlencoded` bodies, URL query strings, URL paths, request headers, streaming bodies, QUIC/HTTP3 paths, and WebSocket text frames (#40)
OAuth refresh round-trips for providers that POST grant_type=refresh_token as form-urlencoded data (Anthropic Claude Code, Google) now go through the phantom swap cleanly. Previously the colon in SLUICE_PHANTOM:<name> got percent-encoded to %3A on the wire and the scanner missed it, so the upstream received the phantom verbatim and returned invalid_grant. The fix matches both casings (%3A and %3a) per RFC 3986 §2.1, uses path-correct escaping (PathEscape vs QueryEscape) so secrets containing spaces don't corrupt URL paths, and keeps secrets in byte slices that SecureBytes.Release() can zero.
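To illustrate why both casings matter, here is a minimal sketch of a form-body swap. It is not sluice's implementation: `swapEncodedPhantom` is a hypothetical helper, it uses plain strings for brevity (so it does not show the zeroable byte-slice handling), and it assumes the colon is the only character in the phantom that gets percent-encoded.

```go
package main

import (
	"bytes"
	"fmt"
	"net/url"
	"strings"
)

// swapEncodedPhantom replaces the percent-encoded phantom in a form body with
// the query-escaped secret, accepting both %3A and %3a for the colon.
func swapEncodedPhantom(body []byte, phantom, secret string) []byte {
	upper := url.QueryEscape(phantom)                // e.g. SLUICE_PHANTOM%3Aname
	lower := strings.ReplaceAll(upper, "%3A", "%3a") // same token, lowercase hex
	encSecret := []byte(url.QueryEscape(secret))

	body = bytes.ReplaceAll(body, []byte(upper), encSecret)
	return bytes.ReplaceAll(body, []byte(lower), encSecret)
}

func main() {
	body := []byte("grant_type=refresh_token&refresh_token=SLUICE_PHANTOM%3aanthropic")
	out := swapEncodedPhantom(body, "SLUICE_PHANTOM:anthropic", "rt 1/x")
	fmt.Println(string(out))
}
```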
Internals
- Each `phantomPair` now carries precomputed `encodedPhantom` and `encodedPhantomLower` byte slices populated once at pair construction time, so the hot-path swap reads precomputed bytes instead of recomputing `url.QueryEscape` on every request, header, and stream chunk
- Added byte-in/byte-out `queryEscapeBytes` and `pathEscapeBytes` helpers (RFC 3986 §2.3 unreserved sets) that replace `url.QueryEscape(string(secret.Bytes()))` patterns and avoid leaving immutable string copies of secrets on the heap
- `swapPhantomBytes` selects between query and path escaping via an explicit `pathContext bool` parameter instead of comparing the human-readable location label, so the type system enforces the encoding choice
- Allocation-free fast paths in `encodePhantomForPair` and `encodePhantomLowerForPair` skip the byte/string copy when the input contains no characters that would change under escape
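A byte-in/byte-out escaper with a no-allocation fast path might look like the following. The helper names match the release note, but the bodies are a guess for illustration: this version escapes every byte outside the RFC 3986 §2.3 unreserved set, which is stricter than Go's `url.PathEscape`, and the shared `escapeBytes` function is hypothetical.

```go
package main

import "fmt"

const upperHex = "0123456789ABCDEF"

// isUnreserved reports whether c is in the RFC 3986 section 2.3 unreserved set.
func isUnreserved(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	case c == '-', c == '.', c == '_', c == '~':
		return true
	}
	return false
}

// escapeBytes percent-encodes in without ever converting it to a string.
// With pathContext false a space becomes '+', otherwise it becomes %20.
func escapeBytes(in []byte, pathContext bool) []byte {
	extra := 0
	for _, c := range in {
		if !isUnreserved(c) {
			extra += 2
		}
	}
	if extra == 0 {
		return in // fast path: nothing to escape, no allocation
	}
	out := make([]byte, 0, len(in)+extra)
	for _, c := range in {
		switch {
		case isUnreserved(c):
			out = append(out, c)
		case c == ' ' && !pathContext:
			out = append(out, '+')
		default:
			out = append(out, '%', upperHex[c>>4], upperHex[c&0x0f])
		}
	}
	return out
}

func queryEscapeBytes(in []byte) []byte { return escapeBytes(in, false) }
func pathEscapeBytes(in []byte) []byte  { return escapeBytes(in, true) }

func main() {
	secret := []byte("tok en/1")
	fmt.Println(string(queryEscapeBytes(secret)))
	fmt.Println(string(pathEscapeBytes(secret)))
}
```

Because the result stays a `[]byte`, the caller can overwrite it after use, which is not possible with the immutable string that `url.QueryEscape` returns.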
v0.14.0
New Features
- CIDR rule destinations: rules whose destination contains a `/` are now interpreted as CIDR (e.g. `192.168.0.0/16`, `2001:db8::/32`) and matched via IP containment instead of being treated as literal glob patterns (#39)
- HTTP Host header peeking on port 80 / 8080: SOCKS5 CONNECT requests that arrive with a bare IP and a non-Allow / non-Deny verdict now defer policy evaluation, peek the request's `Host` header, and re-evaluate against the recovered hostname. Mirrors the existing TLS SNI peek path. Eliminates the need for one approval rule per IP behind a hostname rule (e.g. Tailscale's DERP probes hitting dozens of `derp[N].tailscale.com` IPs) (#39)
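The CIDR rule behaviour can be approximated with the standard `net/netip` package. This is a sketch, not sluice's matcher: `matchDestination` is a hypothetical function, and falling back to `path.Match` for non-CIDR patterns is an assumption about how glob rules behave.

```go
package main

import (
	"fmt"
	"net/netip"
	"path"
	"strings"
)

// matchDestination treats a pattern containing '/' as a CIDR prefix and
// matches by IP containment; anything else is matched as a glob.
func matchDestination(pattern, dest string) bool {
	if strings.Contains(pattern, "/") {
		prefix, err := netip.ParsePrefix(pattern)
		if err != nil {
			return false // malformed CIDR never matches
		}
		addr, err := netip.ParseAddr(dest)
		if err != nil {
			return false // hostnames cannot match a CIDR rule
		}
		// Unmap so an IPv4-mapped IPv6 address still matches an IPv4 prefix.
		return prefix.Contains(addr.Unmap())
	}
	ok, err := path.Match(pattern, dest)
	return err == nil && ok
}

func main() {
	fmt.Println(matchDestination("192.168.0.0/16", "192.168.1.5"))
	fmt.Println(matchDestination("2001:db8::/32", "2001:db8:1::1"))
	fmt.Println(matchDestination("192.168.0.0/16", "10.0.0.1"))
	fmt.Println(matchDestination("*.example.com", "api.example.com"))
}
```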
Security hardening for the new HTTP Host path
- Spoofing guard verifies the recovered Host actually binds to the destination IP via the DNS interceptor's reverse cache or a forward DNS lookup. A claim like `Host: api.openai.com` to an arbitrary IP is rejected before the verdict is upgraded (#39)
- Peek failure on a deferred port-80 connection attaches a per-request policy checker bound to the IP destination so the broker still gets to ask, instead of silently upgrading the original Ask verdict to an allow (#39)
- HTTP-host deferral is gated on broker presence so Ask-without-broker continues to collapse to Deny via the IP-based path before SOCKS5 success goes out, avoiding success-then-reset on the client side (#39)
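The forward-lookup half of the spoofing guard can be sketched with an injectable resolver. Everything named here (`lookupFunc`, `staticLookup`, `hostBindsToIP`) is hypothetical, and the reverse-cache path is omitted; the point is only that a claimed Host is trusted when it resolves to the destination IP and rejected otherwise, including on lookup failure.

```go
package main

import (
	"fmt"
	"net/netip"
)

// lookupFunc resolves a hostname to textual IP addresses. In a real proxy
// this could wrap a DNS resolver; here it is injected to keep the sketch
// deterministic.
type lookupFunc func(host string) ([]string, error)

// staticLookup builds a lookupFunc from a fixed table (for demos and tests).
func staticLookup(records map[string][]string) lookupFunc {
	return func(host string) ([]string, error) {
		addrs, ok := records[host]
		if !ok {
			return nil, fmt.Errorf("no such host: %s", host)
		}
		return addrs, nil
	}
}

// hostBindsToIP reports whether host resolves to destIP. Any parse or lookup
// error fails closed.
func hostBindsToIP(host, destIP string, lookup lookupFunc) bool {
	dest, err := netip.ParseAddr(destIP)
	if err != nil {
		return false
	}
	addrs, err := lookup(host)
	if err != nil {
		return false
	}
	for _, a := range addrs {
		got, err := netip.ParseAddr(a)
		if err == nil && got.Unmap() == dest.Unmap() {
			return true
		}
	}
	return false
}

func main() {
	lookup := staticLookup(map[string][]string{
		"api.example.com": {"203.0.113.10", "203.0.113.11"},
	})
	fmt.Println(hostBindsToIP("api.example.com", "203.0.113.11", lookup))
	fmt.Println(hostBindsToIP("api.example.com", "198.51.100.7", lookup))
}
```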
v0.13.2
Bug Fixes
- env-file ownership: chown the agent env file back to the runtime user after `docker exec` writes it as root, so `hermes claw migrate` and other agent-side writes keep working (#38)
- panic recovery in MITM response and stream paths: deferred recovers in `Response` and `StreamResponseModifier` log the stack and fall back to safe defaults so an OAuth handler panic no longer abandons the response body and triggers a `JSONDecodeError` in the agent (#38)
- OAuth-vs-static header dispatch: header bindings on OAuth credentials no longer substitute the full JSON envelope into the request header. A new metadata-driven helper (`extractInjectableSecret`) reads from `OAuthIndex` to extract just `access_token` for OAuth credentials and pass static credentials through unchanged. Mirrored into the QUIC proxy so HTTP/3 follows the same dispatch as HTTP/1 and HTTP/2 (#38)
- stream OAuth body leak guard: when a panic fires after `io.ReadAll` but before `swapOAuthTokens` returns, the recover no longer hands the agent the raw upstream bytes (which would contain real access and refresh tokens). The fallback is now `http.NoBody` until a successful swap produces a phantom-only buffer (#38)
- nil-input guard ordering: the `StreamResponseModifier` nil-input check now runs before the flow-nil early return, so a call with both `f == nil` and `in == nil` returns `http.NoBody` instead of a nil reader (#38)
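The leak-guard pattern from the last three fixes can be shown in a simplified form: the result defaults to `http.NoBody` and is only replaced once the swap has succeeded, so a panic in between can never expose the raw upstream bytes. `safeStreamModifier`, `runModifier`, and the `swap` callback are hypothetical stand-ins, not the real `StreamResponseModifier` signature.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"runtime/debug"
	"strings"
)

// safeStreamModifier reads the upstream body, applies swap, and returns the
// rewritten body. On nil input, error, or panic it returns http.NoBody.
func safeStreamModifier(in io.Reader, swap func([]byte) ([]byte, error)) (out io.Reader) {
	out = http.NoBody // safe default until a swap succeeds
	if in == nil {
		return out
	}
	defer func() {
		if r := recover(); r != nil {
			log.Printf("recovered in stream modifier: %v\n%s", r, debug.Stack())
			out = http.NoBody // never fall back to the raw upstream bytes
		}
	}()
	raw, err := io.ReadAll(in)
	if err != nil {
		return http.NoBody
	}
	swapped, err := swap(raw)
	if err != nil {
		return http.NoBody
	}
	return bytes.NewReader(swapped)
}

// runModifier is a small helper that returns the modified body as a string.
func runModifier(body string, swap func([]byte) ([]byte, error)) string {
	out, _ := io.ReadAll(safeStreamModifier(strings.NewReader(body), swap))
	return string(out)
}

func main() {
	ok := func(b []byte) ([]byte, error) {
		return bytes.ReplaceAll(b, []byte("real-token"), []byte("SLUICE_PHANTOM:x")), nil
	}
	boom := func([]byte) ([]byte, error) { panic("handler bug") }

	fmt.Printf("%q\n", runModifier(`{"access_token":"real-token"}`, ok))
	fmt.Printf("%q\n", runModifier(`{"access_token":"real-token"}`, boom))
}
```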
v0.13.1
Bug Fixes
Hermes deployment fixes that emerged from running v0.13.0 in production. All six are real correctness or compatibility issues, not just polish.
- OAuth response handler: now decompresses gzip/br/deflate before parsing the token JSON, and is wrapped in a deferred recover with snapshot/rollback so a malformed body cannot panic the proxy or leave a half-rewritten response with stripped encoding headers. Reproduced live against `auth.openai.com`, which returns gzip by default.
- Env-file marker block: sluice now writes phantom tokens into a fenced `BEGIN sluice-managed` / `END sluice-managed` block and replaces only that block on each call. Foreign keys (set by `hermes claw migrate`, the agent's own auth flow, or an operator) are preserved across both incremental updates and full reconciliation runs. Values are written single-quoted so the file is safe under both shell `source` and dotenv parsing.
- MCP gateway always mounts: the `/mcp` endpoint used to mount only when a sluice MCP upstream was registered. Agents that registered sluice as an MCP server (the documented setup) hit a 404 before the operator could add the first upstream. Now the gateway always starts; with zero upstreams it exposes an empty tool list.
- HermesProfile WireMCPCmd uses the bundled venv: a sh wrapper activates `/opt/hermes/.venv` when present so PyYAML is on the import path inside the official Hermes image. Native installs without the venv keep working via the system `python3`.
- SSH proxy exit-status race: `sshHandleChannel` previously called `srcChan.CloseWrite` from the upstream→agent data-copy goroutine the moment it saw EOF, racing the request-forwarder writing exit-status on the same channel. The fix holds the agent-side stdout EOF until every upstream→agent goroutine has drained, then issues `CloseWrite` followed by `Close`. Stdin direction is unchanged so upstream commands like `cat` still terminate correctly.
- Configurable Telegram agent label: approval messages used to read "OpenClaw wants to connect to..." regardless of the active profile. New `SetAgentDisplayName` is wired from the `--agent` flag at startup. Hermes deployments now read "Hermes wants to connect to...". The display name is HTML-escaped at render time.
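The marker-block behaviour can be sketched as a pure string transformation. The exact marker lines are not given in these notes, so `# BEGIN sluice-managed` and `# END sluice-managed` are assumptions, as are the function names and the shell-style escaping of embedded single quotes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Assumed marker lines; the real file format may differ.
const (
	beginMarker = "# BEGIN sluice-managed"
	endMarker   = "# END sluice-managed"
)

// renderBlock writes the managed variables, sorted and single-quoted.
func renderBlock(vars map[string]string) string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(beginMarker + "\n")
	for _, k := range keys {
		// Shell-style escaping of single quotes (assumption).
		v := strings.ReplaceAll(vars[k], "'", `'\''`)
		fmt.Fprintf(&b, "%s='%s'\n", k, v)
	}
	b.WriteString(endMarker + "\n")
	return b.String()
}

// replaceManagedBlock swaps only the fenced block and leaves every other
// line untouched. If no block exists yet, one is appended.
func replaceManagedBlock(content string, vars map[string]string) string {
	block := renderBlock(vars)
	if start := strings.Index(content, beginMarker); start >= 0 {
		if rel := strings.Index(content[start:], endMarker); rel >= 0 {
			end := start + rel + len(endMarker)
			if end < len(content) && content[end] == '\n' {
				end++ // consume the newline after the end marker
			}
			return content[:start] + block + content[end:]
		}
	}
	if content != "" && !strings.HasSuffix(content, "\n") {
		content += "\n"
	}
	return content + block
}

func main() {
	env := "FOO=bar\n" + beginMarker + "\nOLD='x'\n" + endMarker + "\nBAZ=qux\n"
	fmt.Print(replaceManagedBlock(env, map[string]string{
		"OPENAI_API_KEY": "SLUICE_PHANTOM:openai",
	}))
}
```

Running the replacement twice with the same variables yields the same file, which is what makes repeated reconciliation runs safe.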
Deploy files
The repo's compose.yml, compose.dev.yml, and Caddyfile switch to the Hermes stack as the supported deployment. A new bootstrap.sh runs hermes claw migrate against an existing OpenClaw home volume one time, then patches mcp_servers.sluice.url into ~/.hermes/config.yaml. Caddy cert paths moved off provider-specific /etc/cloudflare/... to standard FHS /etc/ssl/certs/agent.pem + /etc/ssl/private/agent.key.
Operators on OpenClaw who want to keep the v0.12.x deployment shape can pin to ghcr.io/nnemirovsky/sluice:0.12 and use the compose / Caddyfile from any v0.12.x tag.
v0.13.0
New Features
- add agent profile abstraction for Hermes support #36 @nnemirovsky
Sluice now supports nousresearch/hermes-agent as a first-class target alongside OpenClaw. The container managers (Docker, Apple Container, tart) consume an AgentProfile that captures the env file path, secrets-reload mechanism, and MCP wiring command for one agent runtime. Select with --agent <name> (or SLUICE_AGENT_PROFILE); default is openclaw so existing setups are unaffected.
The Hermes profile writes phantom tokens to ~/.hermes/.env and patches mcp_servers.<name>.url in ~/.hermes/config.yaml via an embedded python3 + pyyaml script. Hermes has no documented in-place secret reload, so new env values take effect on the next agent message; for MCP changes, run /reload-mcp from the Hermes chat session or restart the container once after first wire-up.
Adding a third agent profile is a single edit to internal/container/agent_profile.go.
v0.12.0
New Features
- add /mcp list, add, remove commands #35 @nnemirovsky
Improvements
- bump go-mitmproxy to v1.8.11 #34 @nnemirovsky
v0.11.0
New Features
- add ExecInspector for trampoline and dangerous pattern detection in MCP tool arguments
- add MITM response DLP scanning for HTTPS response bodies and headers
- add `sluice policy add redact` CLI subcommand and `/policy redact` Telegram command
Details
- ExecInspector (`internal/mcp/exec_inspect.go`) detects trampoline patterns (`bash -c`, `python -c`), dangerous commands (`rm -rf /`, `chmod 0?[0-7]?777`, `curl | sh`, fork bombs), env overrides (`GIT_SSH_COMMAND`, `LD_PRELOAD`, `DYLD_INSERT_LIBRARIES`), and shell metacharacters. Field-scoped scanning with recursion into nested maps (wrapped schemas), case-insensitive slot matching, and split-argv reconstruction across `command` + `args`. Default tool-name patterns anchored to the MCP `__` separator to avoid false positives on tools like `shellcheck`.
- Response DLP (`internal/proxy/response_dlp.go`) runs per-response regex scan of buffered response bodies and headers using `InspectRedactRule` rows from the policy store. Supports `gzip`, `br`, `deflate` (zlib-wrapped per RFC 9110), and `zstd`. Handles up to 2 stacked Content-Encoding layers. Bounded decompression via `io.LimitReader` capped at `maxProxyBody` (16 MiB). Distinct from phantom-token stripping, which protects outbound requests. This protects the agent from seeing real credentials leaked by upstreams in responses.
- Rule management across all channels. New CLI subcommand `sluice policy add redact <pattern> --replacement "[REDACTED_X]"` and Telegram `/policy redact <pattern> [replacement]`. HTTP API already supported this via `POST /api/rules` with `verdict: "redact"`. TOML import/export continues to work via `[[redact]]` blocks. SIGHUP reloads rebuild the engine and atomically swap via `atomic.Pointer`.
- Audit redaction. `exec_block` audit events include only the attack category (trampoline, dangerous_cmd, env_override, metachar), never the raw matched content, so audit logs cannot leak credentials embedded in blocked payloads.
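Bounded decompression with `io.LimitReader` can be illustrated for the gzip case. This is a sketch rather than the code in `response_dlp.go`: `decompressBounded`, `gzipBytes`, and `errBodyTooLarge` are hypothetical names, and reading one byte past the limit to detect overflow is one possible design.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// maxProxyBody mirrors the 16 MiB cap named in the release note.
const maxProxyBody = 16 << 20

var errBodyTooLarge = errors.New("decompressed body exceeds limit")

// decompressBounded inflates a gzip payload but never reads more than
// limit+1 bytes, so a decompression bomb cannot exhaust memory.
func decompressBounded(compressed []byte, limit int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	// Read one byte past the limit so "exactly at limit" and "over limit"
	// can be told apart.
	out, err := io.ReadAll(io.LimitReader(zr, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(out)) > limit {
		return nil, errBodyTooLarge
	}
	return out, nil
}

// gzipBytes compresses plain; it exists only to build demo input.
func gzipBytes(plain []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(plain) // writes to a bytes.Buffer do not fail
	_ = zw.Close()
	return buf.Bytes()
}

func main() {
	payload := gzipBytes([]byte(`{"token":"secret"}`))

	out, err := decompressBounded(payload, maxProxyBody)
	fmt.Println(string(out), err)

	_, err = decompressBounded(payload, 4)
	fmt.Println(err)
}
```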
Known limitation
Responses with Content-Type: text/event-stream or bodies exceeding go-mitmproxy's StreamLargeBodies (5 MiB) enter streaming mode, which skips the buffered DLP scan. A one-per-connection WARNING log fires when DLP rules are configured but the response streams. Stream-aware DLP is listed as Future work.
PR: #33