test: add ci tests for lora (agg and router) #4817

biswapanda · 2025-12-09T02:33:26Z

Overview:

Add CI tests for the lora shell scripts (agg and agg router) -

examples/backends/vllm/launch/lora/agg_lora.sh
examples/backends/vllm/launch/lora/agg_lora_router.sh

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes DEP-681

Summary by CodeRabbit

Tests
- Added comprehensive testing infrastructure for LoRA model integration
- Introduced end-to-end test coverage for LoRA inference across aggregated and router-based deployments
Chores
- Enhanced test utilities for MinIO-based model service management

_{✏️ Tip: You can customize this high-level summary in your review settings.}

biswapanda · 2025-12-09T04:49:02Z

@CodeRabbit review

coderabbitai · 2025-12-09T04:49:09Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2025-12-09T04:52:44Z

Walkthrough

The pull request introduces a comprehensive MinIO/LoRA testing infrastructure in conftest.py with a data class for configuration, a service class to manage Docker-based MinIO, and a pytest fixture that orchestrates LoRA download from Hugging Face and upload to MinIO. Additionally, test_vllm.py receives LoRA testing support through new test payload classes and two integration test cases for single and multi-worker LoRA inference scenarios.

Changes

Cohort / File(s)	Summary
MinIO Configuration & Service Infrastructure `tests/serve/conftest.py`	Introduces `MinioLoraConfig` dataclass with S3 URI generation and environment variable helpers. Adds `MinioService` class managing Docker lifecycle (start/stop), HTTP readiness probing with retry timeout, bucket creation via AWS S3 CLI, LoRA download from Hugging Face, and upload to MinIO with cleanup routines. Includes constants for MinIO endpoint, credentials, bucket, and LoRA repository details.
MinIO LoRA Fixture & Helper `tests/serve/conftest.py`	Introduces `load_lora_adapter()` function to load LoRA via system API endpoint with polling for model availability. Adds `minio_lora_service()` pytest fixture that orchestrates MinIO startup, bucket creation, LoRA download/upload, and cleanup.
LoRA Test Payload & Factories `tests/serve/test_vllm.py`	Adds `LoraTestChatPayload` class with state tracking and `_ensure_lora_loaded()` method that polls `/v1/models` endpoint until LoRA availability. Introduces `lora_chat_payload()` factory function for creating test payload instances with LoRA defaults.
LoRA Integration Tests `tests/serve/test_vllm.py`	Adds `test_lora_aggregated()` for single-deployment LoRA inference with MinIO integration. Adds `test_lora_aggregated_router()` for multi-worker LoRA inference with KV router and deterministic hash seeding. Both tests configure environment variables and exercise LoRA loading and inference workflows.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Docker lifecycle management: MinioService.start() and stop() methods with container orchestration and cleanup—verify proper error handling and resource cleanup on failures
HTTP readiness probing: MinioService._wait_for_ready() retry logic with timeout—check for off-by-one errors and appropriate exception handling
AWS S3 CLI interactions: Bucket creation and LoRA sync operations with endpoint override—verify correct argument passing and error scenarios
API polling: load_lora_adapter() function polling /v1/models endpoint—validate timeout logic and polling interval
State management in LoraTestChatPayload: The _ensure_lora_loaded() logic and URL override mechanism—ensure thread-safety and proper state transitions
Integration test orchestration: Both test_lora_aggregated and test_lora_aggregated_router fixture dependencies and environment variable setup—verify fixture ordering and isolation

Poem

🐰 A MinIO tale springs forth with glee,
LoRAs now dock in S3,
Docker containers, AWS CLI calls,
Through API polls and bucket halls,
Our tests now hop with fresh delight,
Each inference adapter–loaded just right! ✨

Pre-merge checks

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly identifies the main change: adding CI tests for LoRA functionality in both aggregated and router modes.
Description check	✅ Passed	The description covers the overview and related issues but lacks the 'Details' and 'Where should the reviewer start?' sections from the template.
Docstring Coverage	✅ Passed	Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (6)

tests/serve/test_vllm.py (3)
543-564: Use explicit None union in type hints.

The default mutable argument pattern with None is correct, but the type hints should explicitly indicate the parameter accepts None per PEP 484.
     def __init__(
         self,
         body: dict,
         lora_name: str,
         s3_uri: str,
         system_port: int = 8081,
         repeat_count: int = 1,
-        expected_response: list = None,
-        expected_log: list = None,
+        expected_response: list | None = None,
+        expected_log: list | None = None,
         timeout: int = 60,
     ):
566-614: Consider moving imports to module level.

The time and requests imports are placed inside the method. While this works, moving them to the module level improves readability and avoids repeated import resolution.
 import base64
 import logging
 import os
 import random
+import time
 from dataclasses import dataclass, field

 import pytest
+import requests
Then remove lines 569-571 from the method body.

622-653: Same type hint issue as __init__.
 def lora_chat_payload(
     lora_name: str,
     s3_uri: str,
     system_port: int = 8081,
     repeat_count: int = 2,
-    expected_response: list = None,
-    expected_log: list = None,
+    expected_response: list | None = None,
+    expected_log: list | None = None,
     max_tokens: int = 100,
     temperature: float = 0.0,
 ) -> LoraTestChatPayload:
tests/serve/conftest.py (3)
92-109: Consider pinning the MinIO Docker image version.

Using quay.io/minio/minio without a version tag means tests could behave differently across runs if the image is updated. Consider pinning to a specific version for reproducibility.
-            "quay.io/minio/minio",
+            "quay.io/minio/minio:RELEASE.2024-01-01T00-00-00Z",  # Pin to specific version
98-101: Hardcoded ports may cause conflicts in parallel test runs.

Ports 9000 and 9001 are hardcoded. If tests run in parallel or another MinIO instance is running, this will cause port conflicts. Consider using dynamic port allocation or documenting this limitation.

267-282: Consider accepting other 2xx status codes.

The check response.status_code != 200 rejects 201 (Created) and other successful status codes. If the API could return 201 for newly created resources, this would incorrectly raise an error.
-    if response.status_code != 200:
+    if not response.ok:  # Accepts any 2xx status code

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27a1b4e and 42c043a.

📒 Files selected for processing (2)

tests/serve/conftest.py (2 hunks)
tests/serve/test_vllm.py (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

tests/serve/test_vllm.py (3)

tests/serve/conftest.py (5)

MinioLoraConfig (39-64)

load_lora_adapter (267-282)

minio_lora_service (321-361)

get_s3_uri (50-52)

get_env_vars (54-64)

tests/utils/payload_builder.py (1)

chat_payload (129-156)

tests/utils/payloads.py (2)

ChatPayload (102-155)

url (44-46)

tests/serve/conftest.py (2)

tests/conftest.py (1)

logger (193-204)

tests/serve/test_vllm.py (1)

url (616-619)

🪛 Ruff (0.14.8)

tests/serve/test_vllm.py

550-550: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

551-551: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

612-614: Avoid specifying long messages outside the exception class

(TRY003)

627-627: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

628-628: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

662-662: Unused function argument: runtime_services

(ARG001)

662-662: Unused function argument: predownload_models

(ARG001)

704-704: Unused function argument: runtime_services

(ARG001)

704-704: Unused function argument: predownload_models

(ARG001)

tests/serve/conftest.py

32-32: Possible hardcoded password assigned to: "MINIO_SECRET_KEY"

(S105)

63-63: Probable insecure usage of temporary file or directory: "/tmp/dynamo_loras_minio_test"

(S108)

111-111: subprocess call: check for execution of untrusted input

(S603)

113-113: Avoid specifying long messages outside the exception class

(TRY003)

133-133: Avoid specifying long messages outside the exception class

(TRY003)

140-140: subprocess call: check for execution of untrusted input

(S603)

141-141: Starting a process with a partial executable path

(S607)

146-146: subprocess call: check for execution of untrusted input

(S603)

147-147: Starting a process with a partial executable path

(S607)

162-162: subprocess call: check for execution of untrusted input

(S603)

163-170: Starting a process with a partial executable path

(S607)

178-178: subprocess call: check for execution of untrusted input

(S603)

179-186: Starting a process with a partial executable path

(S607)

191-191: Avoid specifying long messages outside the exception class

(TRY003)

200-200: subprocess call: check for execution of untrusted input

(S603)

201-209: Starting a process with a partial executable path

(S607)

215-215: Avoid specifying long messages outside the exception class

(TRY003)

238-238: subprocess call: check for execution of untrusted input

(S603)

239-249: Starting a process with a partial executable path

(S607)

255-255: Avoid specifying long messages outside the exception class

(TRY003)

278-280: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and Test - dynamo

🔇 Additional comments (10)

tests/serve/test_vllm.py (4)

17-30: LGTM!

Import additions are appropriate for the new LoRA testing functionality.

616-619: Side effect in url() is intentional but worth documenting.

The url() method triggers LoRA loading as a side effect. The docstring mentions this, which is good. This design ensures the LoRA is loaded lazily before the first request.

656-695: LGTM!

The test structure is well-organized. The runtime_services and predownload_models parameters are pytest fixtures that set up required infrastructure even though they're not directly referenced in the function body.

698-763: Test structure for multi-worker LoRA is well-designed.

The approach of loading LoRA on both workers separately (ports 8081 and 8082) before running inference ensures proper multi-worker setup. Setting PYTHONHASHSEED=0 for deterministic KV event IDs is a good practice for reproducible testing.

tests/serve/conftest.py (6)

29-36: Constants are appropriate for local testing.

The hardcoded credentials (flagged by static analysis) are acceptable here as these are default MinIO credentials for a local test container, not production secrets.

38-64: LGTM!

The dataclass provides clean configuration management with sensible defaults for test environments.

161-191: Bucket existence check conflates errors.

The s3 ls command returning non-zero could mean the bucket doesn't exist, but it could also indicate authentication failure or network issues. For test infrastructure this is acceptable, but be aware that bucket creation might fail with a confusing error if the real issue is auth/network.

193-222: LGTM!

The LoRA download logic properly handles errors and cleans up the cache directory.

224-255: LGTM!

The upload logic correctly syncs to MinIO while excluding git-related files.

320-361: Fixture cleanup logic is correct.

The dual cleanup_temp() calls serve different purposes:

Before yield: cleans downloaded LoRA files to save space while MinIO retains the uploaded data

In finally: cleans everything including MinIO's data directory

This is working as intended, though adding a brief comment would improve clarity.

biswapanda added 2 commits December 8, 2025 14:44

feat: enable kv aware routing for loras

27a1b4e

tests: add test cases for lora agg and agg router

42c043a

biswapanda requested review from a team as code owners December 9, 2025 02:33

pull-request-size bot added the size/XL label Dec 9, 2025

biswapanda changed the title ~~tests: add ci tests for lora agg and router~~ test: add ci tests for lora agg and router Dec 9, 2025

github-actions bot added the test label Dec 9, 2025

biswapanda self-assigned this Dec 9, 2025

biswapanda changed the title ~~test: add ci tests for lora agg and router~~ test: add ci tests for lora (agg and router) Dec 9, 2025

coderabbitai bot reviewed Dec 9, 2025

View reviewed changes

fix

0651568

biswapanda requested a review from a team as a code owner December 9, 2025 04:56

copy-pr-bot bot had a problem deploying to GITLAB December 9, 2025 04:56 Failure

biswapanda force-pushed the bis/dep-546-lora-aware-kv-routing-support branch from 27a1b4e to 657702a Compare December 9, 2025 04:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test: add ci tests for lora (agg and router) #4817

test: add ci tests for lora (agg and router) #4817

Uh oh!

biswapanda commented Dec 9, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

biswapanda commented Dec 9, 2025

Uh oh!

coderabbitai bot commented Dec 9, 2025

Uh oh!

coderabbitai bot commented Dec 9, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

test: add ci tests for lora (agg and router) #4817

Are you sure you want to change the base?

test: add ci tests for lora (agg and router) #4817

Uh oh!

Conversation

biswapanda commented Dec 9, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

biswapanda commented Dec 9, 2025

Uh oh!

coderabbitai bot commented Dec 9, 2025

Uh oh!

coderabbitai bot commented Dec 9, 2025

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

biswapanda commented Dec 9, 2025 •

edited by coderabbitai bot

Loading