-
Notifications
You must be signed in to change notification settings - Fork 728
test: add ci tests for lora (agg and router) #4817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: bis/dep-546-lora-aware-kv-routing-support
Are you sure you want to change the base?
test: add ci tests for lora (agg and router) #4817
Conversation
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
WalkthroughThe pull request introduces a comprehensive MinIO/LoRA testing infrastructure in Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Poem
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (6)
tests/serve/test_vllm.py (3)
543-564: Use explicitNoneunion in type hints.The default mutable argument pattern with
Noneis correct, but the type hints should explicitly indicate the parameter acceptsNoneper PEP 484.def __init__( self, body: dict, lora_name: str, s3_uri: str, system_port: int = 8081, repeat_count: int = 1, - expected_response: list = None, - expected_log: list = None, + expected_response: list | None = None, + expected_log: list | None = None, timeout: int = 60, ):
566-614: Consider moving imports to module level.The
timeandrequestsimports are placed inside the method. While this works, moving them to the module level improves readability and avoids repeated import resolution.import base64 import logging import os import random +import time from dataclasses import dataclass, field import pytest +import requestsThen remove lines 569-571 from the method body.
622-653: Same type hint issue as__init__.def lora_chat_payload( lora_name: str, s3_uri: str, system_port: int = 8081, repeat_count: int = 2, - expected_response: list = None, - expected_log: list = None, + expected_response: list | None = None, + expected_log: list | None = None, max_tokens: int = 100, temperature: float = 0.0, ) -> LoraTestChatPayload:tests/serve/conftest.py (3)
92-109: Consider pinning the MinIO Docker image version.Using
quay.io/minio/miniowithout a version tag means tests could behave differently across runs if the image is updated. Consider pinning to a specific version for reproducibility.- "quay.io/minio/minio", + "quay.io/minio/minio:RELEASE.2024-01-01T00-00-00Z", # Pin to specific version
98-101: Hardcoded ports may cause conflicts in parallel test runs.Ports 9000 and 9001 are hardcoded. If tests run in parallel or another MinIO instance is running, this will cause port conflicts. Consider using dynamic port allocation or documenting this limitation.
267-282: Consider accepting other 2xx status codes.The check
response.status_code != 200rejects 201 (Created) and other successful status codes. If the API could return 201 for newly created resources, this would incorrectly raise an error.- if response.status_code != 200: + if not response.ok: # Accepts any 2xx status code
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
tests/serve/conftest.py(2 hunks)tests/serve/test_vllm.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/serve/test_vllm.py (3)
tests/serve/conftest.py (5)
MinioLoraConfig(39-64)load_lora_adapter(267-282)minio_lora_service(321-361)get_s3_uri(50-52)get_env_vars(54-64)tests/utils/payload_builder.py (1)
chat_payload(129-156)tests/utils/payloads.py (2)
ChatPayload(102-155)url(44-46)
tests/serve/conftest.py (2)
tests/conftest.py (1)
logger(193-204)tests/serve/test_vllm.py (1)
url(616-619)
🪛 Ruff (0.14.8)
tests/serve/test_vllm.py
550-550: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
551-551: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
612-614: Avoid specifying long messages outside the exception class
(TRY003)
627-627: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
628-628: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
662-662: Unused function argument: runtime_services
(ARG001)
662-662: Unused function argument: predownload_models
(ARG001)
704-704: Unused function argument: runtime_services
(ARG001)
704-704: Unused function argument: predownload_models
(ARG001)
tests/serve/conftest.py
32-32: Possible hardcoded password assigned to: "MINIO_SECRET_KEY"
(S105)
63-63: Probable insecure usage of temporary file or directory: "/tmp/dynamo_loras_minio_test"
(S108)
111-111: subprocess call: check for execution of untrusted input
(S603)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
140-140: subprocess call: check for execution of untrusted input
(S603)
141-141: Starting a process with a partial executable path
(S607)
146-146: subprocess call: check for execution of untrusted input
(S603)
147-147: Starting a process with a partial executable path
(S607)
162-162: subprocess call: check for execution of untrusted input
(S603)
163-170: Starting a process with a partial executable path
(S607)
178-178: subprocess call: check for execution of untrusted input
(S603)
179-186: Starting a process with a partial executable path
(S607)
191-191: Avoid specifying long messages outside the exception class
(TRY003)
200-200: subprocess call: check for execution of untrusted input
(S603)
201-209: Starting a process with a partial executable path
(S607)
215-215: Avoid specifying long messages outside the exception class
(TRY003)
238-238: subprocess call: check for execution of untrusted input
(S603)
239-249: Starting a process with a partial executable path
(S607)
255-255: Avoid specifying long messages outside the exception class
(TRY003)
278-280: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (10)
tests/serve/test_vllm.py (4)
17-30: LGTM!Import additions are appropriate for the new LoRA testing functionality.
616-619: Side effect inurl()is intentional but worth documenting.The
url()method triggers LoRA loading as a side effect. The docstring mentions this, which is good. This design ensures the LoRA is loaded lazily before the first request.
656-695: LGTM!The test structure is well-organized. The
runtime_servicesandpredownload_modelsparameters are pytest fixtures that set up required infrastructure even though they're not directly referenced in the function body.
698-763: Test structure for multi-worker LoRA is well-designed.The approach of loading LoRA on both workers separately (ports 8081 and 8082) before running inference ensures proper multi-worker setup. Setting
PYTHONHASHSEED=0for deterministic KV event IDs is a good practice for reproducible testing.tests/serve/conftest.py (6)
29-36: Constants are appropriate for local testing.The hardcoded credentials (flagged by static analysis) are acceptable here as these are default MinIO credentials for a local test container, not production secrets.
38-64: LGTM!The dataclass provides clean configuration management with sensible defaults for test environments.
161-191: Bucket existence check conflates errors.The
s3 lscommand returning non-zero could mean the bucket doesn't exist, but it could also indicate authentication failure or network issues. For test infrastructure this is acceptable, but be aware that bucket creation might fail with a confusing error if the real issue is auth/network.
193-222: LGTM!The LoRA download logic properly handles errors and cleans up the cache directory.
224-255: LGTM!The upload logic correctly syncs to MinIO while excluding git-related files.
320-361: Fixture cleanup logic is correct.The dual
cleanup_temp()calls serve different purposes:
- Before yield: cleans downloaded LoRA files to save space while MinIO retains the uploaded data
- In finally: cleans everything including MinIO's data directory
This is working as intended, though adding a brief comment would improve clarity.
27a1b4e to
657702a
Compare
Overview:
Add CI tests for the lora shell scripts (agg and agg router) -
examples/backends/vllm/launch/lora/agg_lora.shexamples/backends/vllm/launch/lora/agg_lora_router.shRelated Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.