test: Pre-download models before tests are ran #4811

kthui · 2025-12-08T23:40:04Z

Overview:

Pre-download models before the first test case is ran, avoid model download time from be counted against the first test case.

Details:

Added pytest_runtestloop that checks for model marker and download those models before any test case is started.

The new pre-download is verified on CI, over pre_merge:

============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-8.4.2, pluggy-1.6.0 -- /opt/dynamo/venv/bin/python3
cachedir: /tmp/.pytest_cache
benchmark: 5.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /workspace
plugins: anyio-4.12.0, forked-1.6.0, asyncio-1.3.0, pytest_httpserver-1.1.3, md-report-0.7.0, mock-3.15.1, timeout-2.4.0, pytest_codeblocks-0.17.0, cov-7.0.0, xdist-3.8.0, benchmark-5.2.3, ai-dynamo-0.7.0, mypy-1.0.1
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function

collecting ... collected 450 items / 427 deselected / 23 selected
Fetching 10 files:   0%|          | 0/10 [00:00<?, ?it/s]
Fetching 10 files:  10%|█         | 1/10 [00:00<00:05,  1.56it/s]
Fetching 10 files:  70%|███████   | 7/10 [00:02<00:01,  2.64it/s]
Fetching 10 files: 100%|██████████| 10/10 [00:02<00:00,  3.67it/s]

Fetching 10 files:   0%|          | 0/10 [00:00<?, ?it/s]
Fetching 10 files:  10%|█         | 1/10 [00:00<00:01,  7.92it/s]
Fetching 10 files:  60%|██████    | 6/10 [00:04<00:03,  1.16it/s]
Fetching 10 files: 100%|██████████| 10/10 [00:04<00:00,  2.03it/s]

components/src/dynamo/vllm/tests/test_vllm_prometheus_utils.py::TestGetPrometheusExpfmt::test_vllm_use_case PASSED [  4%]
components/src/dynamo/vllm/tests/test_vllm_prometheus_utils.py::TestGetPrometheusExpfmt::test_error_handling PASSED [  8%]
components/src/dynamo/vllm/tests/test_vllm_unit.py::test_custom_jinja_template_invalid_path PASSED [ 13%]
components/src/dynamo/vllm/tests/test_vllm_unit.py::test_custom_jinja_template_valid_path PASSED [ 17%]
components/src/dynamo/vllm/tests/test_vllm_unit.py::test_custom_jinja_template_env_var_expansion PASSED [ 21%]
tests/dependencies/test_kvbm_imports.py::test_kvbm_wheel_exists_vllm PASSED [ 26%]
tests/dependencies/test_kvbm_imports.py::test_kvbm_imports_vllm PASSED   [ 30%]
tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_aggregated PASSED [ 34%]
tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_decode_cancel PASSED [ 39%]
tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_prefill_cancel PASSED [ 43%]
tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_worker_failure PASSED [ 47%]
tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_graceful_shutdown PASSED [ 52%]
tests/fault_tolerance/migration/test_vllm.py::test_no_request_migration_vllm_worker_failure PASSED [ 56%]
tests/fault_tolerance/migration/test_vllm.py::test_no_request_migration_vllm_graceful_shutdown PASSED [ 60%]
tests/profiler/test_profile_sla_dryrun.py::TestProfileSLADryRun::test_vllm_dryrun PASSED [ 65%]
tests/profiler/test_profile_sla_dryrun.py::TestProfileSLADryRun::test_profile_with_autogen_search_space_h100 PASSED [ 69%]
tests/router/test_router_e2e_with_vllm.py::test_vllm_kv_router_basic PASSED [ 73%]
tests/router/test_router_e2e_with_vllm.py::test_router_decisions_vllm_multiple_workers PASSED [ 78%]
tests/router/test_router_e2e_with_vllm.py::test_vllm_indexers_sync PASSED [ 82%]
tests/serve/test_vllm.py::test_serve_deployment[aggregated] PASSED       [ 86%]
tests/serve/test_vllm.py::test_serve_deployment[aggregated_lmcache] PASSED [ 91%]
tests/serve/test_vllm.py::test_serve_deployment[agg-request-plane-tcp] PASSED [ 95%]
tests/serve/test_vllm.py::test_serve_deployment[agg-request-plane-http] PASSED [100%]

=============================== warnings summary ===============================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
------ generated xml file: /workspace/test-results/pytest_test_report.xml ------
============================= slowest 10 durations =============================
182.64s call     tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_graceful_shutdown
180.68s call     tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_worker_failure
95.23s call     tests/serve/test_vllm.py::test_serve_deployment[aggregated]
95.12s call     tests/serve/test_vllm.py::test_serve_deployment[agg-request-plane-http]
94.10s call     tests/serve/test_vllm.py::test_serve_deployment[agg-request-plane-tcp]
91.60s call     tests/serve/test_vllm.py::test_serve_deployment[aggregated_lmcache]
85.69s call     tests/fault_tolerance/migration/test_vllm.py::test_no_request_migration_vllm_graceful_shutdown
75.50s call     tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_decode_cancel
75.21s call     tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_prefill_cancel
68.23s call     tests/fault_tolerance/cancellation/test_vllm.py::test_request_cancellation_vllm_aggregated
========= 23 passed, 427 deselected, 2 warnings in 1425.10s (0:23:45) ==========
sys:1: DeprecationWarning: builtin type swigvarlink has no __module__ attribute
🧪 Tests completed with exit code: 0

https://github.com/ai-dynamo/dynamo/actions/runs/20046561895/job/57493248332?pr=4811#step:8:108

Where should the reviewer start?

N/A

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A

Summary by CodeRabbit

Tests
- Refactored test infrastructure to optimize model download timing and eliminate redundant setup operations across test suites.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…begin Signed-off-by: Jacky <[email protected]>

Signed-off-by: Jacky <[email protected]>

This reverts commit 6924769. Signed-off-by: Jacky <[email protected]>

coderabbitai · 2025-12-09T00:41:49Z

Walkthrough

A new pytest hook is added to conftest.py to download models before test execution with timing measurements. Simultaneously, the predownload_models fixture parameter is removed from test function signatures across fault tolerance test suites, as the hook now handles downloads globally.

Changes

Cohort / File(s)	Summary
Pytest Configuration `tests/conftest.py`	Added `pytest_runtestloop(session)` hook that runs after test collection to download models with timing and logging. Imported `time` module for measurement.
Fault Tolerance Cancellation Tests `tests/fault_tolerance/cancellation/test_sglang.py`, `test_trtllm.py`, `test_vllm.py`	Removed `predownload_models` fixture parameter from test function signatures across 9 test functions (2 in sglang, 4 in trtllm, 3 in vllm). Test bodies unchanged.
Fault Tolerance Migration Tests `tests/fault_tolerance/migration/test_sglang.py`, `test_trtllm.py`, `test_vllm.py`	Removed `predownload_models` fixture parameter from test function signatures across 12 test functions (4 per file). Test bodies unchanged.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

pytest hook logic: Verify that pytest_runtestloop(session) correctly retrieves models_to_download from config, executes before tests, handles errors gracefully, and timing measurements are accurate
Parameter removals: Confirm the predownload_models dependency is truly obsolete across all modified tests and no test logic inadvertently relied on this fixture
Integration consistency: Ensure the global hook covers all models required by the removed fixture across different test backends (sglang, trtllm, vllm)

Poem

A hook hops in before the tests take flight,
Downloads all the models, sets things right,
The fixture fades, no longer needed here,
Pre-downloads run once, crystal clear! 🐰📥

Pre-merge checks

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: adding pre-download of models before test execution to avoid counting download time against test timeouts.
Description check	✅ Passed	The description follows the template structure with Overview, Details, Where to start, and Related Issues sections. It explains the change, provides CI verification, and summarizes the implementation.
Docstring Coverage	✅ Passed	Docstring coverage is 95.83% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80dfb82 and 6144d78.

📒 Files selected for processing (7)

tests/conftest.py (2 hunks)
tests/fault_tolerance/cancellation/test_sglang.py (2 hunks)
tests/fault_tolerance/cancellation/test_trtllm.py (4 hunks)
tests/fault_tolerance/cancellation/test_vllm.py (3 hunks)
tests/fault_tolerance/migration/test_sglang.py (4 hunks)
tests/fault_tolerance/migration/test_trtllm.py (4 hunks)
tests/fault_tolerance/migration/test_vllm.py (4 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-10-03T01:53:15.023Z

Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 3391
File: tests/fault_tolerance/cancellation/utils.py:323-390
Timestamp: 2025-10-03T01:53:15.023Z
Learning: In `tests/fault_tolerance/cancellation/utils.py`, the `poll_for_pattern` function's default `max_wait_ms` of 500ms is intentionally set to detect failures in cancellation signal propagation to TRT-LLM. This timeout covers only the time for the cancellation signal to reach TRT-LLM (not any generation routine), and if cancellation takes longer than 0.5s to propagate, it should be considered a test failure.

Applied to files:

tests/fault_tolerance/cancellation/test_trtllm.py

🧬 Code graph analysis (3)

tests/fault_tolerance/cancellation/test_sglang.py (4)

tests/conftest.py (1)

runtime_services (416-419)

tests/frontend/grpc/test_tensor_mocker_engine.py (1)

runtime_services (105-109)

tests/frontend/test_completion_mocker_engine.py (1)

runtime_services (137-141)

tests/frontend/test_vllm.py (1)

runtime_services (173-177)

tests/fault_tolerance/cancellation/test_vllm.py (4)

tests/conftest.py (2)

runtime_services (416-419)

set_ucx_tls_no_mm (82-93)

tests/frontend/grpc/test_tensor_mocker_engine.py (1)

runtime_services (105-109)

tests/frontend/test_completion_mocker_engine.py (1)

runtime_services (137-141)

tests/frontend/test_vllm.py (1)

runtime_services (173-177)

tests/fault_tolerance/migration/test_trtllm.py (1)

tests/conftest.py (2)

runtime_services (416-419)

set_ucx_tls_no_mm (82-93)

🪛 Ruff (0.14.8)

tests/fault_tolerance/cancellation/test_sglang.py

164-164: Unused function argument: runtime_services

(ARG001)

248-248: Unused function argument: runtime_services

(ARG001)

tests/fault_tolerance/migration/test_vllm.py

118-118: Unused function argument: runtime_services

(ARG001)

118-118: Unused function argument: set_ucx_tls_no_mm

(ARG001)

162-162: Unused function argument: runtime_services

(ARG001)

162-162: Unused function argument: set_ucx_tls_no_mm

(ARG001)

210-210: Unused function argument: runtime_services

(ARG001)

210-210: Unused function argument: set_ucx_tls_no_mm

(ARG001)

270-270: Unused function argument: runtime_services

(ARG001)

270-270: Unused function argument: set_ucx_tls_no_mm

(ARG001)

tests/fault_tolerance/cancellation/test_vllm.py

137-137: Unused function argument: runtime_services

(ARG001)

210-210: Unused function argument: runtime_services

(ARG001)

210-210: Unused function argument: set_ucx_tls_no_mm

(ARG001)

280-280: Unused function argument: runtime_services

(ARG001)

280-280: Unused function argument: set_ucx_tls_no_mm

(ARG001)

tests/fault_tolerance/migration/test_sglang.py

118-118: Unused function argument: runtime_services

(ARG001)

118-118: Unused function argument: set_ucx_tls_no_mm

(ARG001)

162-162: Unused function argument: runtime_services

(ARG001)

162-162: Unused function argument: set_ucx_tls_no_mm

(ARG001)

210-210: Unused function argument: runtime_services

(ARG001)

210-210: Unused function argument: set_ucx_tls_no_mm

(ARG001)

270-270: Unused function argument: runtime_services

(ARG001)

270-270: Unused function argument: set_ucx_tls_no_mm

(ARG001)

tests/fault_tolerance/migration/test_trtllm.py

114-114: Unused function argument: runtime_services

(ARG001)

114-114: Unused function argument: set_ucx_tls_no_mm

(ARG001)

158-158: Unused function argument: runtime_services

(ARG001)

158-158: Unused function argument: set_ucx_tls_no_mm

(ARG001)

206-206: Unused function argument: runtime_services

(ARG001)

206-206: Unused function argument: set_ucx_tls_no_mm

(ARG001)

266-266: Unused function argument: runtime_services

(ARG001)

266-266: Unused function argument: set_ucx_tls_no_mm

(ARG001)

tests/fault_tolerance/cancellation/test_trtllm.py

144-144: Unused function argument: runtime_services

(ARG001)

216-216: Unused function argument: runtime_services

(ARG001)

287-287: Unused function argument: runtime_services

(ARG001)

372-372: Unused function argument: runtime_services

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: trtllm (amd64)
GitHub Check: sglang (amd64)
GitHub Check: sglang (arm64)
GitHub Check: trtllm (arm64)
GitHub Check: vllm (amd64)
GitHub Check: vllm (arm64)
GitHub Check: operator (arm64)
GitHub Check: operator (amd64)
GitHub Check: Build and Test - dynamo

🔇 Additional comments (9)

tests/conftest.py (2)

20-20: LGTM: Import supports timing measurements.

The time import is correctly added to support the model download timing measurements in the new pytest_runtestloop hook.

230-248: LGTM: Hook correctly pre-downloads models before test execution.

The pytest_runtestloop hook is properly implemented to download models after collection but before any tests run. This effectively decouples model download time from individual test timeouts, addressing the PR objective.

Key aspects verified:

Hook retrieves models collected by pytest_collection_modifyitems (lines 208-228)

Timing measurements are logged appropriately

Hook implicitly returns None, allowing default test loop execution to proceed

tests/fault_tolerance/cancellation/test_sglang.py (2)

164-164: LGTM: Fixture removal aligns with new pre-download hook.

The predownload_models fixture has been correctly removed from the test signature. Model downloads are now handled by the pytest_runtestloop hook in conftest.py, which downloads models before any tests execute.

Note: The static analysis hint about runtime_services being unused is a false positive. This fixture provides essential side effects (starting NATS and etcd servers) via its context manager, even though the yielded values aren't directly referenced in the test body.

248-248: LGTM: Fixture removal aligns with new pre-download hook.

The predownload_models fixture has been correctly removed from the test signature, consistent with the new model download strategy.

Note: The static analysis hint about runtime_services being unused is a false positive - this fixture is used for its side effects.

tests/fault_tolerance/cancellation/test_trtllm.py (1)

144-144: LGTM: Fixture removals align with new pre-download hook.

The predownload_models fixture has been correctly removed from all four test function signatures. Model downloads are now centrally handled by the pytest_runtestloop hook in conftest.py, which pre-downloads models before test execution begins.

Note: The static analysis hints about runtime_services being unused are false positives. This fixture provides essential side effects (starting NATS and etcd servers) through its context manager setup/teardown, even though the yielded values aren't explicitly referenced in the test bodies.

Also applies to: 216-216, 287-287, 372-372

tests/fault_tolerance/cancellation/test_vllm.py (1)

137-137: LGTM: Fixture removals align with new pre-download hook.

The predownload_models fixture has been correctly removed from all three test function signatures. Models are now pre-downloaded via the pytest_runtestloop hook before test execution.

Note: The static analysis hints about runtime_services and set_ucx_tls_no_mm being unused are false positives. Both fixtures provide essential side effects:

runtime_services: Starts NATS and etcd servers via context manager

set_ucx_tls_no_mm: Sets the UCX_TLS environment variable to mitigate UCX/NIXL mm transport issues (documented in conftest.py lines 81-93)

Also applies to: 210-210, 280-280

tests/fault_tolerance/migration/test_sglang.py (1)

118-118: LGTM: Fixture removals align with new pre-download hook.

The predownload_models fixture has been correctly removed from all four test function signatures, consistent with the new centralized model download approach via the pytest_runtestloop hook.

Note: The static analysis hints about runtime_services and set_ucx_tls_no_mm being unused are false positives. Both fixtures provide essential side effects through their setup/teardown logic, even though their yielded values aren't explicitly used in the test bodies.

Also applies to: 162-162, 210-210, 270-270

tests/fault_tolerance/migration/test_trtllm.py (1)

114-114: LGTM: Fixture removals align with new pre-download hook.

The predownload_models fixture has been correctly removed from all four test function signatures. Model pre-downloading is now handled centrally by the pytest_runtestloop hook before test execution begins.

Note: The static analysis hints about runtime_services and set_ucx_tls_no_mm being unused are false positives. Both fixtures are used for their side effects (service orchestration and environment configuration), not their return values.

Also applies to: 158-158, 206-206, 266-266

tests/fault_tolerance/migration/test_vllm.py (1)

118-118: LGTM: Fixture removals align with new pre-download hook.

The predownload_models fixture has been correctly removed from all four test function signatures. The new pytest_runtestloop hook in conftest.py now handles model pre-downloading before any tests execute, ensuring download time doesn't count against individual test timeouts.

Note: The static analysis hints about runtime_services and set_ucx_tls_no_mm being unused are false positives. Both fixtures provide essential side effects through their setup/teardown mechanisms, even though their yielded values aren't directly referenced in the test bodies.

Also applies to: 162-162, 210-210, 270-270

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

keivenchang

NICE

kthui added 2 commits December 8, 2025 15:25

test: Auto pre-download model before Cancellation and Migration test …

6621766

…begin Signed-off-by: Jacky <[email protected]>

tmp: Run all Cancellation / Migration tests pre_merge

6924769

Signed-off-by: Jacky <[email protected]>

kthui self-assigned this Dec 8, 2025

pull-request-size bot added the size/M label Dec 8, 2025

github-actions bot added the test label Dec 8, 2025

Revert "tmp: Run all Cancellation / Migration tests pre_merge"

6144d78

This reverts commit 6924769. Signed-off-by: Jacky <[email protected]>

copy-pr-bot bot had a problem deploying to GITLAB December 9, 2025 00:38 Failure

kthui marked this pull request as ready for review December 9, 2025 00:39

kthui requested review from a team as code owners December 9, 2025 00:39

kthui changed the title ~~test: Pre-download models before test cases are ran~~ test: Pre-download models before tests are ran Dec 9, 2025

keivenchang approved these changes Dec 9, 2025

View reviewed changes

Merge branch 'main' into jacky-ft-test-timeout-download-time

f7b356b

copy-pr-bot bot temporarily deployed to GITLAB December 9, 2025 01:26 Inactive

kthui enabled auto-merge (squash) December 9, 2025 01:26

copy-pr-bot bot temporarily deployed to GITLAB December 9, 2025 01:34 Inactive

nnshah1 approved these changes Dec 9, 2025

View reviewed changes

kthui merged commit 4ca1679 into main Dec 9, 2025
30 of 31 checks passed

kthui deleted the jacky-ft-test-timeout-download-time branch December 9, 2025 03:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test: Pre-download models before tests are ran #4811

test: Pre-download models before tests are ran #4811

kthui commented Dec 8, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 9, 2025

Uh oh!

keivenchang left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

test: Pre-download models before tests are ran #4811

test: Pre-download models before tests are ran #4811

Conversation

kthui commented Dec 8, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 9, 2025

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

keivenchang left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kthui commented Dec 8, 2025 •

edited by coderabbitai bot

Loading