feat(benchmark): AIPerf run script #1501
Conversation
Greptile Overview

Greptile Summary

This PR adds AIPerf benchmarking support to NeMo Guardrails with a well-structured command-line tool. The implementation includes YAML-based configuration, parameter sweep capabilities, and comprehensive test coverage.

Confidence Score: 3/5
Sequence Diagram

sequenceDiagram
participant User
participant CLI
participant AIPerfRunner
participant ConfigValidator
participant ServiceChecker
participant AIPerf
User->>CLI: nemoguardrails aiperf run --config-file config.yaml
CLI->>AIPerfRunner: Initialize with config path
AIPerfRunner->>ConfigValidator: Load and validate YAML
ConfigValidator->>ConfigValidator: Validate with Pydantic models
ConfigValidator-->>AIPerfRunner: Return AIPerfConfig
AIPerfRunner->>ServiceChecker: _check_service()
ServiceChecker->>ServiceChecker: GET /v1/models with API key
ServiceChecker-->>AIPerfRunner: Service available
alt Single Benchmark
AIPerfRunner->>AIPerfRunner: _build_command()
AIPerfRunner->>AIPerfRunner: _create_output_dir()
AIPerfRunner->>AIPerfRunner: _save_run_metadata()
AIPerfRunner->>AIPerf: subprocess.run(aiperf command)
AIPerf-->>AIPerfRunner: Benchmark results
AIPerfRunner->>AIPerfRunner: _save_subprocess_result_json()
else Batch Benchmarks with Sweeps
AIPerfRunner->>AIPerfRunner: _get_sweep_combinations()
loop For each sweep combination
AIPerfRunner->>AIPerfRunner: _build_command(sweep_params)
AIPerfRunner->>AIPerfRunner: _create_output_dir(sweep_params)
AIPerfRunner->>AIPerfRunner: _save_run_metadata()
AIPerfRunner->>AIPerf: subprocess.run(aiperf command)
AIPerf-->>AIPerfRunner: Benchmark results
AIPerfRunner->>AIPerfRunner: _save_subprocess_result_json()
end
end
AIPerfRunner-->>CLI: Return exit code
CLI-->>User: Display summary and exit
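
The flow in the diagram is mostly thin orchestration around `subprocess`. Below is a minimal, hypothetical sketch of that flow; the helper names follow the diagram, but the actual signatures and the exact `aiperf` flags used in this PR may differ.

```python
# Illustrative sketch only: mirrors the sequence diagram above, not the exact
# implementation in this PR. The aiperf flags are placeholders.
import json
import subprocess
from pathlib import Path

import requests


def check_service(base_url: str, api_key: str | None = None) -> None:
    """Quick GET to /v1/models (with the API key) to fail fast if the endpoint is unreachable."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    response = requests.get(f"{base_url}/v1/models", headers=headers, timeout=10)
    response.raise_for_status()


def run_benchmark(base_url: str, model: str, output_dir: Path, api_key: str | None = None) -> int:
    check_service(base_url, api_key)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Build the aiperf command (exact subcommand and flags depend on the AIPerf CLI version).
    cmd = ["aiperf", "profile", "--model", model, "--url", base_url]

    result = subprocess.run(cmd, capture_output=True, text=True)

    # Save the subprocess result alongside the benchmark artifacts.
    (output_dir / "subprocess_result.json").write_text(
        json.dumps(
            {"returncode": result.returncode, "stdout": result.stdout, "stderr": result.stderr},
            indent=2,
        )
    )
    return result.returncode
```

Checking /v1/models up front means a bad URL or API key fails immediately instead of after a full benchmark run.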
10 files reviewed, 3 comments
nemoguardrails/benchmark/aiperf/aiperf_configs/single_concurrency.yaml (review comment outdated, resolved)
9 files reviewed, no comments
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Documentation preview
10 files reviewed, no comments
10 files reviewed, no comments
10 files reviewed, 5 comments
Note: I added the API key towards the end of development to make testing against NVCF functions more convenient. I need to wrap it in a Pydantic SecretStr or something similar to prevent it from being logged.
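
For reference, a minimal sketch of the Pydantic `SecretStr` approach mentioned above (the model and field names here are hypothetical, not the ones in this PR):

```python
from pydantic import BaseModel, SecretStr


class EndpointConfig(BaseModel):
    base_url: str
    api_key: SecretStr | None = None  # masked in str()/repr(), so it won't leak into logs


cfg = EndpointConfig(base_url="https://example.invalid/v1", api_key="nvapi-dummy")
print(cfg)  # api_key shows as SecretStr('**********')
real_key = cfg.api_key.get_secret_value()  # an explicit call is required to read the actual value
```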
12 files reviewed, no comments
12 files reviewed, no comments
@tgasser-nv I noticed that the scope of this change is quite broad. It also introduces OpenAI-compatible endpoints on the server (at least for /chat/completions and /models), which is a major change. Given that, I think it might be better to wait until #1340 is finalized and merged. What do you think?
11 files reviewed, 1 comment
I reverted the OpenAI-compatible endpoints change; I added it by mistake. This isn't blocked by #1340.
11 files reviewed, no comments
* WIP: First round of performance regressions are working
* Remove the rampup_seconds calculation, use warmup request count instead
* Add benchmark directory to pyright type-checking
* Add aiperf typer to the top-level nemoguardrails app
* Add a quick GET to the /v1/models endpoint before running any benchmarks
* Add tests for aiperf Pydantic models
* Rename benchmark_seconds to benchmark_duration, add tokenizer optional field
* Add single-concurrency config, rename both
* Change configs to use NVCF hosted Llama 3.3 70B model
* Refactor single and sweep benchmark runs, add API key env var logic to get environment variable
* Add tests for run_aiperf.py
* Revert changes to llm/providers/huggingface/streamers.py
* Address greptile feedback
* Add README for AIPerf scripts
* Fix hard-coded forward-slash in path name
* Fix hard-coded forward-slash in path name
* Fix type: ignore line in huggingface streamers
* Remove content_safety_colang2 Guardrail config from benchmark
* Fix TextStreamer import Pyright waiver
* Revert changes to server to give OpenAI-compliant responses
* Add API key to /v1/models check, adjust description of AIPerf in CLI
* Revert server changes
* Address PR feedback
* Move aiperf code to top-level
* Update tests for new aiperf location
* Rename configs directory
* Create self-contained typer app, update README with new commands to run it
* Rebase onto develop and re-run ruff formatter
* Move aiperf under benchmark dir
* Move aiperf under benchmark dir
Description
AIPerf (GitHub, Docs) is NVIDIA's latest benchmarking tool for LLMs. It supports any OpenAI-compatible inference service and generates synthetic request loads, runs the benchmarks, and produces all the metrics needed for comparison.
This PR adds support for running AIPerf benchmarks from configs that control the model under test, the benchmark duration, and sweep parameters that expand into a batch of regression runs; a rough sketch of the config and sweep expansion is shown below.
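
The sketch below illustrates what such a config and its sweep expansion could look like. The YAML keys, field names, and values are illustrative assumptions rather than the exact schema in this PR; `benchmark_duration` and the optional `tokenizer` field come from the commit list above.

```python
# Hypothetical config schema and sweep expansion; the real Pydantic models and
# YAML keys in this PR may differ.
import itertools

import yaml
from pydantic import BaseModel


class AIPerfConfig(BaseModel):
    model: str
    url: str
    benchmark_duration: int                 # seconds
    tokenizer: str | None = None
    sweep: dict[str, list] = {}             # e.g. {"concurrency": [1, 4, 16]}


EXAMPLE_YAML = """
model: llama-3.3-70b-instruct
url: https://example.invalid/v1
benchmark_duration: 120
sweep:
  concurrency: [1, 4, 16]
"""


def get_sweep_combinations(sweep: dict[str, list]) -> list[dict]:
    """Expand the sweep block into one parameter dict per benchmark run."""
    if not sweep:
        return [{}]
    keys = list(sweep)
    return [dict(zip(keys, values)) for values in itertools.product(*sweep.values())]


config = AIPerfConfig(**yaml.safe_load(EXAMPLE_YAML))
for params in get_sweep_combinations(config.sweep):
    print(params)  # {'concurrency': 1}, then {'concurrency': 4}, then {'concurrency': 16}
```

Each expanded parameter dict then gets its own output directory and aiperf invocation, which is how a single config produces a batch of regression runs.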
Test Plan
Prerequisites
See README.md for instructions on creating accounts and keys, installing dependencies, and running benchmarks.
Running a single test
Pre-commit tests
Unit-tests
Chat server
Related Issue(s)
Checklist