Add custom MoE quantization guide for HuggingFace models#1118
Conversation
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main    #1118    +/-  ##
=======================================
- Coverage   70.22%   70.21%   -0.02%
=======================================
  Files         228      228
  Lines       25952    25952
=======================================
- Hits        18224    18221       -3
- Misses       7728     7731       +3
```

View full report in Codecov by Sentry.
What does this PR do?
Type of change: documentation
Adds a new guide (`examples/llm_ptq/moe.md`) that explains how model developers can add quantization support for custom MoE architectures from HuggingFace checkpoints. The guide covers all existing MoE patterns in ModelOpt's `huggingface.py` plugin and provides step-by-step instructions for patching new ones.

Covered patterns:
Also adds a link to the new guide from `examples/llm_ptq/README.md`.

Testing
N/A — documentation only.
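To make the guide's subject concrete: an MoE layer routes each token to the top-k scoring experts, and PTQ replaces each expert's weights with quantized values. The sketch below is a hypothetical, dependency-free illustration of that pattern (top-k routing plus symmetric per-tensor int8 fake-quant); none of these class or function names come from ModelOpt, whose real implementation patches torch modules in the `huggingface.py` plugin.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization of a flat weight list."""
    amax = max(abs(w) for w in weights) or 1.0
    scale = amax / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

class Expert:
    """Stand-in for one expert MLP; stores int8 weights plus a scale."""

    def __init__(self, weights):
        self.q, self.scale = quantize_int8(weights)

    def forward(self, x):
        # Fake-quant style: dequantize, then compute in floating point.
        w = dequantize(self.q, self.scale)
        return sum(xi * wi for xi, wi in zip(x, w))

class MoELayer:
    """Stand-in MoE block: route each input to the top-k scoring experts."""

    def __init__(self, expert_weights, top_k=2):
        self.experts = [Expert(w) for w in expert_weights]
        self.top_k = top_k

    def forward(self, x, router_logits):
        ranked = sorted(range(len(self.experts)),
                        key=lambda i: -router_logits[i])
        chosen = ranked[: self.top_k]
        # Uniform averaging here; real routers weight by softmax scores.
        return sum(self.experts[i].forward(x) for i in chosen) / self.top_k
```

A quantization plugin for a new MoE architecture essentially has to locate the per-expert linear weights (the `Expert` analogue) inside the HuggingFace module tree so the quantizer can wrap them, which is what the guide's patching instructions cover.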
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: N/A
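The security items above can be checked mechanically. The snippet below is an illustrative, hypothetical pattern scan for the risky constructs the checklist names; it is not part of the repository's actual CI, just a sketch of what such a check looks like.

```python
import re

# Constructs the checklist warns about when hardcoded in examples/docs.
RISKY_PATTERNS = [
    r"trust_remote_code\s*=\s*True",   # runs arbitrary repo code on load
    r"weights_only\s*=\s*False",       # torch.load with full unpickling
    r"\bpickle\.load\b",               # raw pickle deserialization
]

def find_risky(source: str) -> list[str]:
    """Return the risky patterns that appear in a source snippet."""
    return [p for p in RISKY_PATTERNS if re.search(p, source)]
```

For example, `find_risky('torch.load(path, weights_only=False)')` flags the unsafe load, while the `weights_only=True` form passes clean.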