Description
Observed behavior
When hub-to-leaf traffic exceeds link capacity, the hub's write buffer saturates. PONG responses queue behind the data and never reach the leaf. The leaf detects a stale connection, disconnects, reconnects, and hits the same backlog. The cycle repeats indefinitely - data never transfers.
- NATS Server 2.12.2 (also tested on 2.10.22)
- Bandwidth-constrained leafnode link (~50kbit/s, simulating cellular)
Hub logs:
[INF] Slow Consumer Detected: WriteDeadline of 10s exceeded with 1 chunks of 2025807 total bytes.
[ERR] Leafnode Error 'Stale Connection'
Leaf logs (note cycling lid numbers):
[DBG] hub:7422 - lid:6 - Stale Client Connection - Closing
[INF] hub:7422 - lid:6 - Leafnode connection closed: Stale Connection - Remote: hub
[DBG] hub:7422 - lid:13 - Stale Client Connection - Closing
[INF] hub:7422 - lid:14 - Leafnode connection closed: Stale Connection - Remote: hub
Root Cause
Hub's write buffer saturation blocks PONG delivery:
- Hub's write buffer fills with data (hub -> leaf direction)
- Hub receives PING from leaf, queues PONG immediately (32us later)
- PONG stuck behind data - never delivered
- Leaf detects stale (no PONG after ping_max attempts), closes connection
- Leaf reconnects, hits same backlog, cycle repeats
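The divergence is also visible from the server monitoring endpoints. A rough sketch, assuming both containers expose the HTTP monitoring port (8222 on the hub, mapped to 8223 for the leaf here - both assumptions about the compose setup) and using the /leafz field names as I understand them:
# Hub side: out_msgs/out_bytes on its leafnode connection keep climbing
curl -s http://localhost:8222/leafz | jq '.leafs[] | {name, out_msgs, out_bytes, rtt}'
# Leaf side: in_msgs lags far behind the hub's out_msgs
curl -s http://localhost:8223/leafz | jq '.leafs[] | {name, in_msgs, in_bytes, rtt}'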
Verified Evidence
| Metric | Value |
|---|---|
| LMSGs queued by hub | 1917 |
| LMSGs received by leaf | 139 (7%) |
| PONGs sent by hub | 3 |
| PONGs received by leaf | 1 (33%) |
| Stale events on leaf | 8 |
| Connection cycles | 5 |
| Buffer backlog | ~29 seconds |
With ping_interval: 10s and ping_max: 2, stale triggers in 20 seconds - but 29 seconds of data sits ahead of the PONG.
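Raising the keepalive budget only postpones the problem, but for completeness here is a sketch of the relevant knobs (standard server config options; the values and file names are illustrative, not taken from the repro repo):
# Leaf: tolerate a longer gap before declaring the connection stale
cat >> leaf.conf << 'EOF'
ping_interval: "30s"
ping_max: 5
EOF
# Hub: allow more time to flush before "Slow Consumer Detected"
cat >> hub.conf << 'EOF'
write_deadline: "60s"
EOF
With a 29-second backlog this buys some headroom for this particular run, but any larger burst re-triggers the same cycle.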
Affected Scenarios
Any high-volume hub-to-leaf traffic over a slow link:
- JetStream mirrors/sources (bulk historical replication)
- Fan-out (hub publishes to leaf subscribers; see the sketch after this list)
- Large request/reply responses
- KV replication
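The fan-out case, for example, can be hit without JetStream at all - a sketch reusing the ports and credentials from the repro steps below (subject test.fanout is arbitrary):
# Subscriber on the leaf (4223), publisher on the hub (4222)
nats -s nats://app:[email protected]:4223 sub 'test.fanout' &
nats -s nats://app:[email protected]:4222 pub test.fanout \
  "$(head -c 1024 < /dev/zero | tr '\0' 'x')" --count=50000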
Expected behavior
Leafnode connection should remain stable during high-volume replication. PING/PONG keepalive should function regardless of data backlog, allowing slow links to eventually drain.
Server and client version
Server v2.12.2
Leafnode (client side) also v2.12.2
Host environment
Bandwidth-constrained leafnode link (~50kbit/s, simulating cellular)
Reproduced in Linux Docker containers on macOS (ARM)
Steps to reproduce
Will link repro repo in a comment.
docker compose up -d && sleep 5
# Create stream with 50MB test data
nats -s nats://app:[email protected]:4222 stream add TEST \
--subjects="test.>" --storage=file --replicas=1 --defaults
nats -s nats://app:[email protected]:4222 pub test.load \
"$(head -c 1024 < /dev/zero | tr '\0' 'x')" --count=50000
# Apply traffic shaping (50kbit/s)
docker exec nats-hub tc qdisc add dev eth0 root tbf rate 50kbit burst 16kbit latency 100ms
docker exec nats-leaf tc qdisc add dev eth0 root tbf rate 50kbit burst 16kbit latency 100ms
# Create mirror (triggers replication)
cat > mirror.json << 'EOF'
{"name":"TEST_MIRROR","storage":"file","mirror":{"name":"TEST","external":{"api":"$JS.hub.API"}}}
EOF
nats -s nats://app:[email protected]:4223 stream add --config=mirror.json
# Watch for stale events (~30-60 seconds)
watch -n5 'docker logs nats-hub 2>&1 | grep -E "Stale|Slow" | tail -5'
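To confirm the stall, a couple of checks (a sketch; container and stream names as above):
# Mirror on the leaf should stay (near) empty while the cycle repeats
nats -s nats://app:[email protected]:4223 stream info TEST_MIRROR
# Count stale events on the leaf
docker logs nats-leaf 2>&1 | grep -c "Stale Connection"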