Releases: nnemirovsky/sluice

v0.16.0

17 May 02:26
70fcff3

Credential pools with automatic failover

One phantom identity the agent holds can now be backed by N real OAuth credentials, with transparent auto-failover when one is rate-limited or its auth fails.

  • Pool model: sluice pool create|list|status|rotate|remove. A pool maps a single pool-stable phantom (byte-identical synthetic JWT, R3) to the currently active member; refreshed tokens are attributed back to the issuing member via the injected refresh token (R1, fail-closed).
  • Auto-failover (Phase 2): 429 / 403 insufficient_quota → rate-limited; 401 / token-body invalid_grant/invalid_token → auth-failure. The active member is cooled and switched synchronously in-memory before the response returns, so the agent's own retry lands on the next member. Monotonic cooldown extension; lazy recovery; degrade-never-hard-fail.
  • QUIC request-side pool awareness + the R3 pool-stable phantom (response-side R1/failover is HTTP-only by capability).
  • Data model: migrations 000006_credential_pools, 000007_pool_membership_epoch.
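
The failover classification described above can be sketched as a small pure function. This is a minimal illustration, not sluice's actual code: the `FailureClass` type, its constants, and the `classify` signature are hypothetical names, and `errCode` stands in for the `error` field of an upstream JSON body.

```go
package main

import (
	"fmt"
	"net/http"
)

// FailureClass is a hypothetical label for why a pool member gets cooled.
type FailureClass string

const (
	ClassNone        FailureClass = ""
	ClassRateLimited FailureClass = "rate-limited"
	ClassAuthFailure FailureClass = "auth-failure"
)

// classify maps an upstream status code and error code to a failure class,
// following the mapping given in the release notes.
func classify(status int, errCode string) FailureClass {
	switch {
	case status == http.StatusTooManyRequests:
		return ClassRateLimited
	case status == http.StatusForbidden && errCode == "insufficient_quota":
		return ClassRateLimited
	case status == http.StatusUnauthorized:
		return ClassAuthFailure
	case errCode == "invalid_grant" || errCode == "invalid_token":
		return ClassAuthFailure
	}
	// Anything else (including a plain 403) does not trigger failover here.
	return ClassNone
}

func main() {
	fmt.Println(classify(429, ""))
	fmt.Println(classify(403, "insufficient_quota"))
	fmt.Println(classify(400, "invalid_grant"))
}
```

Keeping the classification free of side effects makes it easy to test separately from the cooldown and member-switch logic.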

Approval-prompt coalescing

Concurrent approvals to the same dest:port collapse into one Telegram prompt; one tap dismisses the whole burst, and the final coalesced count is folded into the existing resolve/cancel edit (zero extra Telegram API calls). MCP tool calls opt out (arg-sensitive). QUIC keeps its own packet-dedup.

Failover hardening (fixes found in live operation)

  • Operator-park stranding fixed: pool rotate parks the displaced member with reason manual rotate — that member is healthy, just deprioritized. The all-cooled degrade now prefers an operator-parked-but-healthy member over a genuinely-failed one, so a rotate onto an exhausted account fails over to the healthy peer instead of self-looping (which previously hard-failed the agent). Normal position-order rotate semantics are unchanged.
  • Self-failover spam fixed: when there is no distinct failover target (to == from), it is classified as pool exhaustion — a distinct pool_exhausted audit action and an honest "pool exhausted" operator notice instead of a meaningless self-referential cred_failover. FailoverEvent.Exhausted carries the distinction.
  • Notice dedup: identical (pool, from, to, tag) signals are deduplicated within a 30s window, so an agent retry storm yields one audit row + one notice instead of N.
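
The notice dedup can be illustrated with a keyed time-window filter. This is a sketch under assumptions: the `signalKey` fields, `deduper` type, and `at` helper are hypothetical, the window is measured from the last emitted signal (the notes do not say whether suppressed duplicates extend it), and entries are never evicted.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dedupWindow matches the 30s window mentioned in the release notes.
const dedupWindow = 30 * time.Second

// signalKey identifies a failover signal; the field names are assumptions.
type signalKey struct {
	Pool, From, To, Tag string
}

// deduper suppresses identical signals seen within a fixed window.
type deduper struct {
	mu     sync.Mutex
	window time.Duration
	seen   map[signalKey]time.Time
}

func newDeduper(window time.Duration) *deduper {
	return &deduper{window: window, seen: make(map[signalKey]time.Time)}
}

// shouldEmit reports whether k has not been emitted within the window.
// The clock is passed in so the behavior is deterministic in tests.
func (d *deduper) shouldEmit(k signalKey, now time.Time) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if last, ok := d.seen[k]; ok && now.Sub(last) < d.window {
		return false
	}
	d.seen[k] = now
	return true
}

// at builds a timestamp a given number of seconds after the Unix epoch.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	d := newDeduper(dedupWindow)
	k := signalKey{Pool: "main", From: "acct-a", To: "acct-b", Tag: "rate-limited"}
	for _, sec := range []int64{0, 5, 29, 30} {
		fmt.Printf("t=%ds emit=%v\n", sec, d.shouldEmit(k, at(sec)))
	}
}
```

With this shape, a retry storm inside the window produces a single `true`, so the caller writes one audit row and sends one notice.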

Fail-before/pass-after tests cover the degrade preference, the pool-exhausted suppression + dedup, and real failover to an operator-parked-but-healthy peer.

Known limitation

sluice's pooled token-host phantom expansion currently also rewrites non-refresh OAuth grants (e.g. device_code) to the pool's shared token host, which breaks a fresh in-container OAuth login for a pooled provider. Perform the initial login outside the proxy (or before binding the pool). Tracked for a follow-up.

v0.15.1

12 May 04:14
a8317a5

Bug Fixes

  • proxy: SSH jump host close race that dropped the agent's exec reply (#41)

When an upstream SSH server replied + wrote data + sent exit-status + closed the channel in one burst, sluice's wait on the three upstream-to-agent goroutines completed and closed srcChan while the agent-to-upstream forwarder was still mid-reply for the agent's exec request. The agent's session.SendRequest("exec", true, ...) observed SSH_MSG_CHANNEL_CLOSE before the SUCCESS reply landed on ch.msg, gossh surfaced the closed channel as io.EOF, and session.Output("whoami") failed with EOF even though the upstream succeeded. The fix tracks in-flight agent-to-upstream requests with a mutex+cond barrier so the close path drains any pending reply before srcChan.CloseWrite() / srcChan.Close(). Symptom was visible on the CI e2e-linux runners since d27b05e narrowed the close timing window; production SSH clients have enough natural latency between reply and close that the race window almost never opens.
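
The mutex+cond barrier can be sketched as follows. This is an illustration of the general technique, not the code from the fix: `replyBarrier`, its methods, and `forwardThenClose` are hypothetical names, and the channel close is simulated by recording an event.

```go
package main

import (
	"fmt"
	"sync"
)

// replyBarrier counts in-flight requests and lets the close path wait for zero.
type replyBarrier struct {
	mu       sync.Mutex
	cond     *sync.Cond
	inflight int
}

func newReplyBarrier() *replyBarrier {
	b := &replyBarrier{}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// begin marks one request as in flight.
func (b *replyBarrier) begin() {
	b.mu.Lock()
	b.inflight++
	b.mu.Unlock()
}

// end marks one request as finished and wakes waiters when none remain.
func (b *replyBarrier) end() {
	b.mu.Lock()
	b.inflight--
	if b.inflight == 0 {
		b.cond.Broadcast()
	}
	b.mu.Unlock()
}

// drain blocks until no request is in flight.
func (b *replyBarrier) drain() {
	b.mu.Lock()
	for b.inflight > 0 {
		b.cond.Wait()
	}
	b.mu.Unlock()
}

// forwardThenClose simulates a forwarder still writing a reply while the
// close path runs, and returns the order in which the two events happened.
func forwardThenClose() []string {
	b := newReplyBarrier()
	var mu sync.Mutex
	var events []string
	record := func(e string) {
		mu.Lock()
		events = append(events, e)
		mu.Unlock()
	}

	b.begin() // registered before the forwarder goroutine starts
	go func() {
		defer b.end()
		record("reply")
	}()

	b.drain() // the close path waits for the pending reply
	record("close")
	return events
}

func main() {
	fmt.Println(forwardThenClose())
}
```

Registering the request with `begin` before the goroutine starts is what closes the window: `drain` cannot observe a zero count while a reply is still owed.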

v0.15.0

12 May 02:30
55092f4

Bug Fixes

  • proxy: URL-encoded phantom tokens now swap correctly in application/x-www-form-urlencoded bodies, URL query strings, URL paths, request headers, streaming bodies, QUIC/HTTP3 paths, and WebSocket text frames (#40)

OAuth refresh round-trips for providers that POST grant_type=refresh_token as form-urlencoded data (Anthropic Claude Code, Google) now go through the phantom swap cleanly. Previously the colon in SLUICE_PHANTOM:<name> got percent-encoded to %3A on the wire and the scanner missed it, so the upstream received the phantom verbatim and returned invalid_grant. The fix matches both casings (%3A and %3a) per RFC 3986 §2.1, uses path-correct escaping (PathEscape vs QueryEscape) so secrets containing spaces don't corrupt URL paths, and keeps secrets in byte slices that SecureBytes.Release() can zero.
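
The casing issue can be shown with a short sketch that swaps a phantom inside a form-urlencoded body. It rests on assumptions: `swapFormPhantom` and `lowerEscapes` are hypothetical helpers, the phantom name `claude` and the secret are made up, and strings are used for brevity even though the release keeps secrets in zeroable byte slices.

```go
package main

import (
	"bytes"
	"fmt"
	"net/url"
)

// lowerEscapes returns a copy of b with the hex digits of every %XX escape
// lowercased, e.g. "%3A" becomes "%3a".
func lowerEscapes(b []byte) []byte {
	out := append([]byte(nil), b...)
	for i := 0; i+2 < len(out); i++ {
		if out[i] == '%' {
			copy(out[i+1:i+3], bytes.ToLower(out[i+1:i+3]))
			i += 2
		}
	}
	return out
}

// swapFormPhantom replaces the phantom in a form-urlencoded body with the
// query-escaped secret. It matches the upper-case escape form, the
// lower-case escape form, and the raw phantom.
func swapFormPhantom(body []byte, phantom, secret string) []byte {
	upper := []byte(url.QueryEscape(phantom)) // colon becomes %3A
	lower := lowerEscapes(upper)              // colon becomes %3a
	repl := []byte(url.QueryEscape(secret))
	for _, needle := range [][]byte{upper, lower, []byte(phantom)} {
		body = bytes.ReplaceAll(body, needle, repl)
	}
	return body
}

func main() {
	body := []byte("grant_type=refresh_token&refresh_token=SLUICE_PHANTOM%3aclaude")
	out := swapFormPhantom(body, "SLUICE_PHANTOM:claude", "rt 1/abc")
	fmt.Println(string(out))
}
```

A URL path would need `url.PathEscape` for the replacement instead, which is the distinction the fix draws between path and query contexts.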

Internals

  • Each phantomPair now carries precomputed encodedPhantom and encodedPhantomLower byte slices populated once at pair construction time, so the hot-path swap reads precomputed bytes instead of recomputing url.QueryEscape on every request, header, and stream chunk
  • Added byte-in/byte-out queryEscapeBytes and pathEscapeBytes helpers (RFC 3986 §2.3 unreserved sets) that replace url.QueryEscape(string(secret.Bytes())) patterns and avoid leaving immutable string copies of secrets on the heap
  • swapPhantomBytes selects between query and path escaping via an explicit pathContext bool parameter instead of comparing the human-readable location label, so the type system enforces the encoding choice
  • Allocation-free fast paths in encodePhantomForPair and encodePhantomLowerForPair skip the byte/string copy when the input contains no characters that would change under escape

v0.14.0

08 May 13:31
399807c

New Features

  • CIDR rule destinations: rules whose destination contains a / are now interpreted as CIDR (e.g. 192.168.0.0/16, 2001:db8::/32) and matched via IP containment instead of being treated as literal glob patterns (#39)
  • HTTP Host header peeking on port 80 / 8080: SOCKS5 CONNECT requests that arrive with a bare IP and a non-Allow / non-Deny verdict now defer policy evaluation, peek the request's Host header, and re-evaluate against the recovered hostname. Mirrors the existing TLS SNI peek path. Eliminates the need for one approval rule per IP behind a hostname rule (e.g. tailscale's DERP probes hitting dozens of derp[N].tailscale.com IPs) (#39)
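
CIDR-versus-glob dispatch can be sketched with the standard `net/netip` package. This is a minimal illustration: `matchDestination` is a hypothetical name, and `path.Match` only stands in for sluice's own glob matcher, whose exact semantics are not described here.

```go
package main

import (
	"fmt"
	"net/netip"
	"path"
	"strings"
)

// matchDestination treats rules containing "/" as CIDR prefixes and all
// other rules as glob patterns.
func matchDestination(rule, dest string) bool {
	if strings.Contains(rule, "/") {
		prefix, err := netip.ParsePrefix(rule)
		if err != nil {
			return false // not a valid CIDR; this sketch simply refuses it
		}
		addr, err := netip.ParseAddr(dest)
		if err != nil {
			return false // hostnames never match a CIDR rule
		}
		// Unmap so an IPv4-mapped IPv6 address can match an IPv4 prefix.
		return prefix.Contains(addr.Unmap())
	}
	ok, err := path.Match(rule, dest)
	return err == nil && ok
}

func main() {
	fmt.Println(matchDestination("192.168.0.0/16", "192.168.10.5"))
	fmt.Println(matchDestination("2001:db8::/32", "2001:db8::1"))
	fmt.Println(matchDestination("192.168.0.0/16", "10.0.0.1"))
	fmt.Println(matchDestination("*.example.com", "api.example.com"))
}
```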

Security hardening for the new HTTP Host path

  • Spoofing guard verifies the recovered Host actually binds to the destination IP via the DNS interceptor's reverse cache or a forward DNS lookup. A claim like Host: api.openai.com to an arbitrary IP is rejected before the verdict is upgraded (#39)
  • Peek failure on a deferred port-80 connection attaches a per-request policy checker bound to the IP destination so the broker still gets to ask, instead of silently upgrading the original Ask verdict to an allow (#39)
  • HTTP-host deferral is gated on broker presence so Ask-without-broker continues to collapse to Deny via the IP-based path before SOCKS5 success goes out, avoiding success-then-reset on the client side (#39)
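
The forward-lookup half of the spoofing guard can be sketched with an injected resolver. Everything here is an assumption for illustration: `hostBindsToIP` and `lookupFunc` are hypothetical names, the reverse-cache path is omitted, and the static table in `main` stands in for DNS.

```go
package main

import (
	"fmt"
	"net/netip"
)

// lookupFunc has the same shape as net.LookupHost, so a real resolver or a
// cache can be plugged in.
type lookupFunc func(host string) ([]string, error)

// hostBindsToIP reports whether the claimed host resolves to the
// destination IP. Any lookup or parse failure counts as "does not bind".
func hostBindsToIP(host, dstIP string, lookup lookupFunc) bool {
	dst, err := netip.ParseAddr(dstIP)
	if err != nil {
		return false
	}
	addrs, err := lookup(host)
	if err != nil {
		return false
	}
	for _, a := range addrs {
		ip, err := netip.ParseAddr(a)
		if err == nil && ip.Unmap() == dst.Unmap() {
			return true
		}
	}
	return false
}

func main() {
	// Static table standing in for DNS; names and addresses are made up.
	table := map[string][]string{"api.example.com": {"203.0.113.7"}}
	lookup := func(h string) ([]string, error) { return table[h], nil }

	fmt.Println(hostBindsToIP("api.example.com", "203.0.113.7", lookup))
	fmt.Println(hostBindsToIP("api.example.com", "198.51.100.9", lookup))
}
```

Only when the check returns true would the verdict be re-evaluated against the hostname; otherwise the connection stays on the IP-based path.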

v0.13.2

07 May 11:02
61f4c00

Bug Fixes

  • env-file ownership: chown the agent env file back to the runtime user after docker exec writes it as root, so hermes claw migrate and other agent-side writes keep working (#38)
  • panic recovery in MITM response and stream paths: deferred recovers in Response and StreamResponseModifier log the stack and fall back to safe defaults so an OAuth handler panic no longer abandons the response body and triggers a JSONDecodeError in the agent (#38)
  • OAuth-vs-static header dispatch: header bindings on OAuth credentials no longer substitute the full JSON envelope into the request header. A new metadata-driven helper (extractInjectableSecret) reads from OAuthIndex to extract just access_token for OAuth credentials and pass static credentials through unchanged. Mirrored into the QUIC proxy so HTTP/3 follows the same dispatch as HTTP/1 and HTTP/2 (#38)
  • stream OAuth body leak guard: when a panic fires after io.ReadAll but before swapOAuthTokens returns, the recover no longer hands the agent the raw upstream bytes (which would contain real access and refresh tokens). The fallback is now http.NoBody until a successful swap produces a phantom-only buffer (#38)
  • nil-input guard ordering: the StreamResponseModifier nil-input check now runs before the flow-nil early return, so a call with both f == nil and in == nil returns http.NoBody instead of a nil reader (#38)
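
The fail-closed recovery pattern from these fixes can be sketched as below. This is an illustration only: `rewriteStream`, `drain`, and `run` are hypothetical names, and the `swap` callback stands in for the real token-swapping step.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"runtime/debug"
	"strings"
)

// rewriteStream reads the upstream body, applies swap, and returns the
// rewritten body. Until swap succeeds, the only body the caller can receive
// is http.NoBody, so raw upstream bytes never leak on a panic.
func rewriteStream(in io.Reader, swap func([]byte) []byte) (out io.Reader) {
	out = http.NoBody
	if in == nil {
		return out // nil input is handled before anything else
	}
	defer func() {
		if r := recover(); r != nil {
			log.Printf("stream modifier panic: %v\n%s", r, debug.Stack())
			out = http.NoBody
		}
	}()
	raw, err := io.ReadAll(in)
	if err != nil {
		return http.NoBody
	}
	return bytes.NewReader(swap(raw))
}

// drain reads r to the end and returns its contents as a string.
func drain(r io.Reader) string {
	b, _ := io.ReadAll(r)
	return string(b)
}

// run feeds input through rewriteStream and returns what the agent would see.
func run(input string, swap func([]byte) []byte) string {
	return drain(rewriteStream(strings.NewReader(input), swap))
}

func main() {
	ok := run("real-token", func([]byte) []byte { return []byte("phantom") })
	bad := run("real-token", func([]byte) []byte { panic("boom") })
	fmt.Printf("ok=%q bad=%q\n", ok, bad)
}
```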

v0.13.1

07 May 07:59
d27b05e

Bug Fixes

Hermes deployment fixes that emerged from running v0.13.0 in production. All six are real correctness or compatibility issues, not just polish.

  • OAuth response handler: now decompresses gzip/br/deflate before parsing the token JSON, and is wrapped in a deferred recover with snapshot/rollback so a malformed body cannot panic the proxy or leave a half-rewritten response with stripped encoding headers. Reproduced live against auth.openai.com which returns gzip by default.
  • Env-file marker block: sluice now writes phantom tokens into a fenced BEGIN sluice-managed / END sluice-managed block and replaces only that block on each call. Foreign keys (set by hermes claw migrate, the agent's own auth flow, or an operator) are preserved across both incremental updates and full reconciliation runs. Values are written single-quoted so the file is safe under both shell source and dotenv parsing.
  • MCP gateway always mounts: the /mcp endpoint used to mount only when a sluice MCP upstream was registered. Agents that registered sluice as an MCP server (the documented setup) hit a 404 before the operator could add the first upstream. Now the gateway always starts; with zero upstreams it exposes an empty tool list.
  • HermesProfile WireMCPCmd uses the bundled venv: a sh wrapper activates /opt/hermes/.venv when present so PyYAML is on the import path inside the official Hermes image. Native installs without the venv keep working via the system python3.
  • SSH proxy exit-status race: sshHandleChannel previously called srcChan.CloseWrite from the upstream→agent data-copy goroutine the moment it saw EOF, racing the request forwarder that writes exit-status on the same channel. The fix holds the agent-side stdout EOF until every upstream→agent goroutine has drained, then issues CloseWrite followed by Close. The stdin direction is unchanged, so upstream commands like cat still terminate correctly.
  • Configurable Telegram agent label: approval messages used to read "OpenClaw wants to connect to..." regardless of the active profile. New SetAgentDisplayName is wired from the --agent flag at startup. Hermes deployments now read "Hermes wants to connect to...". The display name is HTML-escaped at render time.
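
The env-file marker block can be illustrated with a small rewrite function. This is a sketch under several assumptions: the exact marker text (`# BEGIN sluice-managed`), the `updateEnv` and `renderBlock` names, and the `OPENAI_API_KEY` variable are hypothetical; values are assumed to contain no single quotes; and the managed block is always re-appended at the end of the file.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	beginMarker = "# BEGIN sluice-managed" // assumed marker text
	endMarker   = "# END sluice-managed"   // assumed marker text
)

// renderBlock renders the managed block with sorted keys and single-quoted
// values. Values are assumed to contain no single quotes.
func renderBlock(vars map[string]string) []string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	lines := []string{beginMarker}
	for _, k := range keys {
		lines = append(lines, k+"='"+vars[k]+"'")
	}
	return append(lines, endMarker)
}

// updateEnv drops any existing managed block, keeps every other line as-is,
// and appends a freshly rendered block.
func updateEnv(existing string, vars map[string]string) string {
	var kept []string
	trimmed := strings.TrimRight(existing, "\n")
	if trimmed != "" {
		inBlock := false
		for _, line := range strings.Split(trimmed, "\n") {
			switch {
			case line == beginMarker:
				inBlock = true
			case line == endMarker:
				inBlock = false
			case !inBlock:
				kept = append(kept, line) // foreign key, preserved
			}
		}
	}
	kept = append(kept, renderBlock(vars)...)
	return strings.Join(kept, "\n") + "\n"
}

func main() {
	existing := "FOO=bar\n" + beginMarker + "\nOLD='x'\n" + endMarker + "\n"
	fmt.Print(updateEnv(existing, map[string]string{
		"OPENAI_API_KEY": "SLUICE_PHANTOM:openai",
	}))
}
```

Because only lines between the markers are replaced, running the update twice with the same input leaves the file unchanged.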

Deploy files

The repo's compose.yml, compose.dev.yml, and Caddyfile switch to the Hermes stack as the supported deployment. A new bootstrap.sh runs hermes claw migrate against an existing OpenClaw home volume one time, then patches mcp_servers.sluice.url into ~/.hermes/config.yaml. Caddy cert paths moved off provider-specific /etc/cloudflare/... to standard FHS /etc/ssl/certs/agent.pem + /etc/ssl/private/agent.key.

Operators on OpenClaw who want to keep the v0.12.x deployment shape can pin to ghcr.io/nnemirovsky/sluice:0.12 and use the compose / Caddyfile from any v0.12.x tag.

#37 @nnemirovsky

v0.13.0

07 May 04:23
9cd1a3e

New Features

Sluice now supports nousresearch/hermes-agent as a first-class target alongside OpenClaw. The container managers (Docker, Apple Container, tart) consume an AgentProfile that captures the env file path, secrets-reload mechanism, and MCP wiring command for one agent runtime. Select with --agent <name> (or SLUICE_AGENT_PROFILE); default is openclaw so existing setups are unaffected.

The Hermes profile writes phantom tokens to ~/.hermes/.env and patches mcp_servers.<name>.url in ~/.hermes/config.yaml via an embedded python3 + pyyaml script. Hermes has no documented in-place secret reload, so new env values take effect on the next agent message; for MCP changes, run /reload-mcp from the Hermes chat session or restart the container once after first wire-up.

Adding a third agent profile is a single edit to internal/container/agent_profile.go.

v0.12.0

20 Apr 11:35
4943433

New Features

Improvements

v0.11.0

14 Apr 13:04
7e043b8

New Features

  • add ExecInspector for trampoline and dangerous pattern detection in MCP tool arguments
  • add MITM response DLP scanning for HTTPS response bodies and headers
  • add sluice policy add redact CLI subcommand and /policy redact Telegram command

Details

  • ExecInspector (internal/mcp/exec_inspect.go) detects trampoline patterns (bash -c, python -c), dangerous commands (rm -rf /, chmod 0?[0-7]?777, curl | sh, fork bombs), env overrides (GIT_SSH_COMMAND, LD_PRELOAD, DYLD_INSERT_LIBRARIES), and shell metacharacters. Field-scoped scanning with recursion into nested maps (wrapped schemas), case-insensitive slot matching, and split-argv reconstruction across command + args. Default tool-name patterns anchored to the MCP __ separator to avoid false positives on tools like shellcheck.
  • Response DLP (internal/proxy/response_dlp.go) runs per-response regex scan of buffered response bodies and headers using InspectRedactRule rows from the policy store. Supports gzip, br, deflate (zlib-wrapped per RFC 9110), and zstd. Handles up to 2 stacked Content-Encoding layers. Bounded decompression via io.LimitReader capped at maxProxyBody (16 MiB). Distinct from phantom-token stripping, which protects outbound requests. This protects the agent from seeing real credentials leaked by upstreams in responses.
  • Rule management across all channels. New CLI subcommand sluice policy add redact <pattern> --replacement "[REDACTED_X]" and Telegram /policy redact <pattern> [replacement]. HTTP API already supported this via POST /api/rules with verdict: "redact". TOML import/export continues to work via [[redact]] blocks. SIGHUP reloads rebuild the engine and atomically swap via atomic.Pointer.
  • Audit redaction. exec_block audit events include only the attack category (trampoline, dangerous_cmd, env_override, metachar), never the raw matched content, so audit logs cannot leak credentials embedded in blocked payloads.
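
Bounded decompression with `io.LimitReader` can be sketched for the gzip case as follows. This is a minimal illustration, not the Response DLP code: `gunzipBounded`, `gzipBytes`, and `errTooLarge` are hypothetical names, and rejecting oversize bodies with an error (rather than truncating) is an assumption.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

var errTooLarge = errors.New("decompressed body exceeds limit")

// gunzipBounded decompresses a gzip payload but refuses to produce more than
// limit bytes, which guards against decompression bombs.
func gunzipBounded(compressed []byte, limit int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	// Read one byte past the limit so "exactly at the limit" can be told
	// apart from "over the limit".
	out, err := io.ReadAll(io.LimitReader(zr, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(out)) > limit {
		return nil, errTooLarge
	}
	return out, nil
}

// gzipBytes compresses p; it exists only to build inputs for the example.
func gzipBytes(p []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(p)
	zw.Close()
	return buf.Bytes()
}

func main() {
	small, err := gunzipBounded(gzipBytes([]byte("hello")), 16)
	fmt.Printf("%q %v\n", small, err)

	_, err = gunzipBounded(gzipBytes(make([]byte, 1024)), 100)
	fmt.Println(err)
}
```

The same wrapper shape applies to the other encodings by swapping the reader constructor; the cap is enforced on decompressed bytes, not on the compressed input.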

Known limitation

Responses with Content-Type: text/event-stream or bodies exceeding go-mitmproxy's StreamLargeBodies (5 MiB) enter streaming mode, which skips the buffered DLP scan. A one-per-connection WARNING log fires when DLP rules are configured but the response streams. Stream-aware DLP is listed as Future work.

PR: #33

v0.10.2

13 Apr 06:24
6f673c4

Fixes

  • Stop Telegram from auto-linking destinations in /policy show, /policy allow, /policy deny, and approval prompts (#32). Destinations, host:port, and request URLs now render as inline monospace.