Commit bff07b5

fix: Set TRTLLM_USE_UCX_KVCACHE=1 to reduce GPU memory usage
Signed-off-by: Jacky <[email protected]>
1 parent 8c37f8e commit bff07b5

1 file changed: +1 -0 lines changed


tests/fault_tolerance/cancellation/test_trtllm.py

Lines changed: 1 addition & 0 deletions
@@ -81,6 +81,7 @@ def __init__(self, request, mode: str = "prefill_and_decode"):
 
         # Set debug logging environment
         env = os.environ.copy()
+        env["TRTLLM_USE_UCX_KVCACHE"] = "1"
         env["DYN_LOG"] = "debug"
         env["DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS"] = '["generate"]'
         env["DYN_SYSTEM_PORT"] = port
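
The hunk above copies the parent environment and overrides selected variables before a worker process is launched. The following is a minimal sketch of that pattern, not the test's actual code: the helper names `build_worker_env` and `read_var_in_child` and the port value used in the usage note are hypothetical, and the child process here is a plain Python interpreter standing in for the real worker.

```python
import os
import subprocess
import sys


def build_worker_env(port: str) -> dict:
    """Hypothetical helper: copy the parent environment and apply the overrides from the diff."""
    env = os.environ.copy()  # a copy, so the test process's own environment is not modified
    env["TRTLLM_USE_UCX_KVCACHE"] = "1"  # the variable added by this commit
    env["DYN_LOG"] = "debug"
    env["DYN_SYSTEM_USE_ENDPOINT_HEALTH_STATUS"] = '["generate"]'
    env["DYN_SYSTEM_PORT"] = port  # environment values must be strings
    return env


def read_var_in_child(env: dict, name: str) -> str:
    """Start a child Python process with `env` and return the value it sees for `name`."""
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}, ''))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `read_var_in_child(build_worker_env("8081"), "TRTLLM_USE_UCX_KVCACHE")` is one way to check that the override reaches the child process; the port string is an arbitrary placeholder.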
