Add nvfp4_mse and nvfp4_local_hessian options to the ptq script. by bkartal-dev · Pull Request #1113 · NVIDIA/Model-Optimizer

bkartal-dev · 2026-03-24T18:42:22Z

What does this PR do?

Type of change: Bugfix

Adds the recently introduced quant configs, nvfp4_mse and nvfp4_local_hessian, to the example PTQ script.

Testing

I ran auto_quantize locally with these two quant configs and successfully exported the HF artifacts.

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

Is this change backward compatible?: ✅ / ❌ / N/A
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
Did you write any new necessary tests?: ✅ / ❌ / N/A
Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Summary by CodeRabbit

New Features
- Added support for two new quantization formats: nvfp4_mse and nvfp4_local_hessian, expanding export options available when using auto-quantize.
Bug Fixes / UX
- Updated the invalid-quantization error message to list the newly accepted format identifiers.

copy-pr-bot · 2026-03-24T18:42:27Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-03-24T18:42:37Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 75c87cf9-d695-4469-9649-3ec414b34459

📥 Commits

Reviewing files that changed from the base of the PR and between f8ee452 and 97551cd.

📒 Files selected for processing (2)

examples/llm_ptq/hf_ptq.py
examples/llm_ptq/scripts/huggingface_example.sh

✅ Files skipped from review due to trivial changes (1)

examples/llm_ptq/scripts/huggingface_example.sh

🚧 Files skipped from review as they are similar to previous changes (1)

examples/llm_ptq/hf_ptq.py

📝 Walkthrough

Walkthrough

The Python auto-quantize whitelist now accepts nvfp4_local_hessian. The shell example's case whitelist and its "Unknown quant argument" message were expanded to include nvfp4_mse and nvfp4_local_hessian.

Changes

Cohort / File(s)	Summary
Python quantization whitelist `examples/llm_ptq/hf_ptq.py`	Added `nvfp4_local_hessian` to the hardcoded `qformat_list` used by `auto_quantize()` so it passes the assertion and can be resolved via `QUANT_CFG_CHOICES`.
Shell example whitelist & message `examples/llm_ptq/scripts/huggingface_example.sh`	Extended the `case $qformat in` whitelist to include `nvfp4_mse` and `nvfp4_local_hessian`; updated the "Unknown quant argument" error output to include these identifiers in the list of valid options.
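The whitelist-then-lookup pattern described in the table above can be sketched roughly as follows. This is a minimal illustration, not the code in `hf_ptq.py`: the dictionary values are placeholders, the format lists are an abbreviated subset, and the function name `resolve_auto_quantize_configs` is hypothetical.

```python
# Illustrative stand-ins only: the real mapping and whitelist live in
# examples/llm_ptq/hf_ptq.py and contain more entries than shown here.
QUANT_CFG_CHOICES = {
    "fp8": "<fp8 config>",
    "nvfp4": "<nvfp4 config>",
    "nvfp4_mse": "<nvfp4_mse config>",
    "nvfp4_local_hessian": "<nvfp4_local_hessian config>",
}

# Formats accepted on the auto-quantize path (abbreviated, hypothetical subset).
AUTO_QUANTIZE_QFORMATS = ["fp8", "nvfp4", "nvfp4_mse", "nvfp4_local_hessian"]


def resolve_auto_quantize_configs(qformats):
    """Check each requested format against the whitelist, then look up its config."""
    unsupported = [q for q in qformats if q not in AUTO_QUANTIZE_QFORMATS]
    # A format missing from the whitelist fails here, before any config lookup.
    assert not unsupported, (
        f"Unsupported auto_quantize formats: {unsupported}; "
        f"expected one of {AUTO_QUANTIZE_QFORMATS}"
    )
    return [QUANT_CFG_CHOICES[q] for q in qformats]


if __name__ == "__main__":
    print(resolve_auto_quantize_configs(["nvfp4_mse", "nvfp4_local_hessian"]))
```

In this sketch, a format that exists in the config mapping but is absent from the whitelist would still be rejected, which is the failure mode the PR removes for the two new formats.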

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: adding two new quantization format options (nvfp4_mse and nvfp4_local_hessian) to the PTQ example script.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Security Anti-Patterns	✅ Passed	The pull request contains only configuration changes extending quantization format validation whitelists by adding two new format strings to assertion and case statement without introducing security-sensitive patterns.


Signed-off-by: Bilal Kartal <bkartal@nvidia.com>

Edwardf0t1

Please fix the conflict.

Signed-off-by: bkartal-dev <bkartal@nvidia.com>

codecov · 2026-03-24T19:45:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.21%. Comparing base (76b6fd5) to head (97551cd).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1113      +/-   ##
==========================================
- Coverage   77.22%   77.21%   -0.02%     
==========================================
  Files         459      459              
  Lines       48975    48975              
==========================================
- Hits        37822    37815       -7     
- Misses      11153    11160       +7

Flag	Coverage Δ
examples	`41.18% <ø> (-0.16%)`	⬇️
unit	`52.28% <ø> (-0.02%)`	⬇️


bkartal-dev · 2026-03-24T22:05:52Z

The merge conflict is resolved.

kevalmorabia97 · 2026-03-25T16:58:54Z

/ok to test c50a346

cjluo-nv · 2026-03-31T21:41:50Z

Could you update the README to explain these options? Or is the goal this time only to enable them end to end, without promoting these two methods?

bkartal-dev · 2026-03-31T21:52:08Z

This only updates the end-to-end example script so that it does not fail when already-supported features are selected.

cjluo-nv

Summary: Adds nvfp4_mse and nvfp4_local_hessian to the allowed quantization format whitelists in the example PTQ script (hf_ptq.py) and its shell wrapper (huggingface_example.sh), so these already-supported engine configs can be used end-to-end without the scripts rejecting them.

Issues Found:

[Readability] huggingface_example.sh:56: The case pattern and error message are a single very long line that is becoming hard to scan. The new formats are added at different positions (e.g., nvfp4_local_hessian tacked on at the end, nvfp4_mse inserted mid-list), making future diffs messy. Not blocking, but consider alphabetical or grouped ordering, or a multi-line case pattern using \ continuation for readability as this list grows.
[Correctness] huggingface_example.sh:56 vs hf_ptq.py:270-286: The shell script whitelist includes nvfp4_svdquant and fp8_pc_pt, which are not in the Python auto_quantize whitelist. A user could therefore pass these formats through the shell script and then hit the Python assertion when --export_fmt hf triggers auto_quantize. This is a pre-existing issue (not introduced by this PR), but worth noting as the lists continue to diverge.
[Tests] No tests are added, but this is acceptable: the change only extends validation whitelists in example scripts, and the author confirms local end-to-end testing of both new configs. Codecov confirms no coverage regression.

Suggestions:

Consider extracting the supported format list into a single source of truth (e.g., a shared constant or config) to prevent the Python and shell whitelists from drifting further apart.
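As one possible way to catch the drift described above, a small check could compare the alternatives in the shell `case` pattern against the Python whitelist. The sketch below assumes the shell whitelist is a single `a|b|c)` pattern; the helper names and the abbreviated format lists are hypothetical and only mirror the formats named in this review.

```python
def parse_case_pattern(pattern: str) -> set[str]:
    """Split a shell case pattern such as 'fp8|nvfp4)' into its alternatives."""
    body = pattern.strip().rstrip(")")
    return {alt.strip() for alt in body.split("|") if alt.strip()}


def whitelist_drift(shell_pattern: str, python_formats: list[str]) -> dict[str, list[str]]:
    """Report formats that appear in only one of the two whitelists."""
    shell = parse_case_pattern(shell_pattern)
    python = set(python_formats)
    return {
        "shell_only": sorted(shell - python),
        "python_only": sorted(python - shell),
    }


if __name__ == "__main__":
    # Abbreviated, illustrative lists; the real ones are longer.
    shell_pattern = "fp8|fp8_pc_pt|nvfp4|nvfp4_mse|nvfp4_local_hessian|nvfp4_svdquant)"
    python_formats = ["fp8", "nvfp4", "nvfp4_mse", "nvfp4_local_hessian"]
    print(whitelist_drift(shell_pattern, python_formats))
```

A check like this could run in CI until the two lists are generated from one shared constant.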

Overall Assessment: Clean, minimal change that unblocks already-supported quantization formats in the example scripts. No correctness risk from the changes themselves.

kevalmorabia97 · 2026-04-18T06:40:12Z

/ok to test 97551cd

bkartal-dev requested a review from a team as a code owner March 24, 2026 18:42

bkartal-dev requested a review from Edwardf0t1 March 24, 2026 18:42

Add nvfp4_mse and nvfp4_local_hessian options to the ptq script.

f8ee452

Signed-off-by: Bilal Kartal <bkartal@nvidia.com>

bkartal-dev force-pushed the fix-example-script branch from 8278385 to f8ee452 Compare March 24, 2026 18:49

realAsma approved these changes Mar 24, 2026


realAsma requested a review from cjluo-nv March 24, 2026 19:14

Edwardf0t1 approved these changes Mar 24, 2026


Merge branch 'main' into fix-example-script

c50a346

Signed-off-by: bkartal-dev <bkartal@nvidia.com>

kevalmorabia97 added the cherry-pick-0.44 label ("After code freeze, cherry-pick into release branch for next rc. Only for bug fixes and doc updates") Mar 25, 2026

kevalmorabia97 enabled auto-merge (squash) March 25, 2026 16:59

cjluo-nv reviewed Apr 1, 2026


cjluo-nv approved these changes Apr 1, 2026


Merge branch 'main' into fix-example-script

97551cd

kevalmorabia97 removed the cherry-pick-0.44 label ("After code freeze, cherry-pick into release branch for next rc. Only for bug fixes and doc updates") Apr 18, 2026

kevalmorabia97 merged commit 92622a9 into NVIDIA:main Apr 18, 2026
41 of 42 checks passed

kevalmorabia97 merged 3 commits into NVIDIA:main from bkartal-dev:fix-example-script
