Commit 03965f5

update
1 parent e650b93 commit 03965f5

4 files changed: 43 additions, 41 deletions

docs/guides/request_plane.md

Lines changed: 36 additions & 34 deletions
@@ -21,8 +21,8 @@ limitations under the License.
 
 Dynamo supports multiple transport mechanisms for its request plane (the communication layer between services). You can choose from three different request plane modes based on your deployment requirements:
 
-- **NATS** (default): Message broker-based request plane
-- **TCP**: Direct TCP connection for optimal performance
+- **TCP** (default): Direct TCP connection for optimal performance
+- **NATS**: Message broker-based request plane
 - **HTTP**: HTTP/2-based request plane
 
 This guide explains how to configure and use request plane in your Dynamo deployment.
@@ -59,51 +59,27 @@ export DYN_REQUEST_PLANE=<mode>
 ```
 
 Where `<mode>` is one of:
-- `nats` (default)
-- `tcp`
+- `tcp` (default)
+- `nats`
 - `http`
 
 The value is case-insensitive.
 
 ### Default Behavior
 
-If `DYN_REQUEST_PLANE` is not set or contains an invalid value, Dynamo defaults to `nats`.
+If `DYN_REQUEST_PLANE` is not set or contains an invalid value, Dynamo defaults to `tcp`.
 
 ## Usage Examples
 
-### Using NATS (Default)
+### Using TCP (Default)
 
-NATS is the default request plane and provides the most flexibility for complex deployments.
-
-**Prerequisites:**
-- NATS server must be running and accessible
-- Configure NATS connection via standard Dynamo NATS environment variables
-
-```bash
-# Explicitly set to NATS (optional, as it's the default)
-
-# Run your Dynamo service
-DYN_REQUEST_PLANE=nats python -m dynamo.frontend --http-port=8000 &
-DYN_REQUEST_PLANE=nats python -m dynamo.vllm --model Qwen/Qwen3-0.6B
-```
-
-**When to use NATS:**
-- Production deployments with service discovery
-- Currently (HA) highly available routers require durable messages persisted in NATS message broker. If you want to completely disable NATS, KV based routing won't be available
-- Multiple frontends and backends
-- Need for message replay and persistence features
-
-Limitations:
-- NATS does not support payloads beyond 16MB (use TCP for larger payloads)
-
-### Using TCP
-
-TCP provides direct, low-latency communication between services.
+TCP is the default request plane and provides direct, low-latency communication between services.
 
 **Configuration:**
 
 ```bash
-# Set request plane to TCP
+# TCP is the default, so no need to set DYN_REQUEST_PLANE explicitly
+# But you can explicitly set it if desired:
 export DYN_REQUEST_PLANE=tcp
 
 # Optional: Configure TCP server host and port
@@ -170,6 +146,32 @@ Additional HTTP-specific environment variables:
 - `DYN_HTTP2_KEEP_ALIVE_TIMEOUT_SECS`: Keep-alive timeout for HTTP client (default: 10 seconds)
 - `DYN_HTTP2_ADAPTIVE_WINDOW`: Enable adaptive flow control (default: true)
 
+### Using NATS
+
+NATS provides the most flexibility for complex deployments with advanced routing.
+
+**Prerequisites:**
+- NATS server must be running and accessible
+- Configure NATS connection via standard Dynamo NATS environment variables
+
+```bash
+# Explicitly set to NATS
+export DYN_REQUEST_PLANE=nats
+
+# Run your Dynamo service
+DYN_REQUEST_PLANE=nats python -m dynamo.frontend --http-port=8000 &
+DYN_REQUEST_PLANE=nats python -m dynamo.vllm --model Qwen/Qwen3-0.6B
+```
+
+**When to use NATS:**
+- Production deployments with service discovery
+- Currently (HA) highly available routers require durable messages persisted in NATS message broker. If you want to completely disable NATS, KV based routing won't be available
+- Multiple frontends and backends
+- Need for message replay and persistence features
+
+Limitations:
+- NATS does not support payloads beyond 16MB (use TCP for larger payloads)
+
 ## Complete Example
 
 Here's a complete example showing how to launch a Dynamo deployment with different request planes:
@@ -219,7 +221,7 @@ This abstraction means your application code doesn't need to change when switchi
 
 Request plane configuration is loaded from environment variables at startup and cached globally. The configuration hierarchy is:
 
-1. **Mode Selection**: `DYN_REQUEST_PLANE` (defaults to `nats`)
+1. **Mode Selection**: `DYN_REQUEST_PLANE` (defaults to `tcp`)
 2. **Transport-Specific Config**: Mode-specific environment variables (e.g., `DYN_TCP_*`, `DYN_HTTP2_*`)
 
 ## Migration Guide

launch/dynamo-run/src/flags.rs

Lines changed: 1 addition & 1 deletion
@@ -127,7 +127,7 @@ pub struct Flags {
     pub store_kv: String,
 
     /// Determines how requests are distributed from routers to workers. 'tcp' is fastest [nats|http|tcp].
-    #[arg(long, default_value = "nats", value_parser = ["nats", "http", "tcp"])]
+    #[arg(long, default_value = "tcp", value_parser = ["nats", "http", "tcp"])]
     pub request_plane: String,
 
     /// Everything after a `--`. Not currently used.

lib/runtime/src/distributed.rs

Lines changed: 4 additions & 4 deletions
@@ -555,12 +555,12 @@ impl DistributedConfig {
 /// Request plane transport mode configuration
 ///
 /// This determines how requests are distributed from routers to workers:
-/// - `Nats`: Use NATS for request distribution (default, legacy)
+/// - `Nats`: Use NATS for request distribution (legacy)
 /// - `Http`: Use HTTP/2 for request distribution
-/// - `Tcp`: Use raw TCP for request distribution with msgpack support
+/// - `Tcp`: Use raw TCP for request distribution with msgpack support (default)
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum RequestPlaneMode {
-    /// Use NATS for request plane (default for backward compatibility)
+    /// Use NATS for request plane
     Nats,
     /// Use HTTP/2 for request plane
     Http,
@@ -570,7 +570,7 @@ pub enum RequestPlaneMode {
 
 impl Default for RequestPlaneMode {
     fn default() -> Self {
-        Self::Nats
+        Self::Tcp
     }
 }
 
576576

tests/router/common.py

Lines changed: 2 additions & 2 deletions
@@ -332,12 +332,12 @@ async def send_request_with_retry(url: str, payload: dict, max_retries: int = 8)
     return False
 
 
-def get_runtime(store_backend="etcd", request_plane="nats"):
+def get_runtime(store_backend="etcd", request_plane="tcp"):
     """Create a DistributedRuntime instance for testing.
 
     Args:
         store_backend: Storage backend to use ("etcd" or "file"). Defaults to "etcd".
-        request_plane: How frontend talks to backend ("tcp", "http" or "nats"). Defaults to "nats".
+        request_plane: How frontend talks to backend ("tcp", "http" or "nats"). Defaults to "tcp".
     """
     try:
         # Try to get running loop (works in async context)
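
Note on the behavior change: the selection rule described in the updated guide can be illustrated with a minimal sketch. It assumes only what the guide states, namely that `DYN_REQUEST_PLANE` is case-insensitive and that an unset or invalid value falls back to `tcp`. The function name `resolve_request_plane` and the two constants are hypothetical and are not part of Dynamo's API; the actual default is defined by `RequestPlaneMode::default()` in `lib/runtime/src/distributed.rs`.

```python
import os

# Hypothetical names for illustration only; not Dynamo's real API.
VALID_MODES = ("tcp", "nats", "http")
DEFAULT_MODE = "tcp"  # was "nats" before this commit


def resolve_request_plane(environ=None):
    """Return the request plane mode implied by DYN_REQUEST_PLANE.

    Unset or unrecognized values fall back to DEFAULT_MODE, and the
    comparison is case-insensitive, as the guide describes.
    """
    env = os.environ if environ is None else environ
    mode = env.get("DYN_REQUEST_PLANE", "").lower()
    return mode if mode in VALID_MODES else DEFAULT_MODE
```

Under this sketch, a deployment that relied on the old implicit default would need to set `DYN_REQUEST_PLANE=nats` explicitly to keep using NATS after this commit.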
