Commit 5c2415d

docs: update docs to remove KVBM cuda graph limitation (#4902)
1 parent 31f31e8 commit 5c2415d

1 file changed: +0 -1 lines changed

docs/kvbm/trtllm-setup.md

Lines changed: 0 additions & 1 deletion
@@ -23,7 +23,6 @@ To learn what KVBM is, please check [here](kvbm_architecture.md)
 
 > [!Note]
 > - Ensure that `etcd` and `nats` are running before starting.
-> - KVBM does not currently support CUDA graphs in TensorRT-LLM.
 > - KVBM only supports TensorRT-LLM’s PyTorch backend.
 > - Disable partial reuse `enable_partial_reuse: false` in the LLM API config’s `kv_connector_config` to increase offloading cache hits.
 > - KVBM requires TensorRT-LLM v1.1.0rc5 or newer.
