
Remove --skip-softmax option from hf_sa (only allow calibration option) #1123

Open
rohansjoshi wants to merge 1 commit into main from rohjoshi/rm-skipsoftmax-option

Conversation

@rohansjoshi
Contributor

@rohansjoshi rohansjoshi commented Mar 25, 2026

What does this PR do?

The hf_sa.py script applies an attention sparsity method to a model and exports a "sparsified" checkpoint. The SKIP_SOFTMAX_DEFAULT sparse attention config runs no calibration and makes no change to the checkpoint, so using it with hf_sa.py exports a checkpoint identical to the original one. To avoid that confusion, this PR removes the option altogether and switches the default config to SKIP_SOFTMAX_CALIB.
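The resulting CLI surface can be sketched as follows. This is a minimal illustration, not the actual hf_sa.py: the contents of SPARSE_ATTN_CFG_CHOICES shown here are assumed for demonstration, and only the shape of the `--sparse_attn` argument reflects the change described above.

```python
import argparse

# Hypothetical stand-in for the mapping in hf_sa.py after this PR:
# "skip_softmax" has been dropped, leaving only the calibration config.
SPARSE_ATTN_CFG_CHOICES = {
    "skip_softmax_calib": "SKIP_SOFTMAX_CALIB",  # placeholder value
}

parser = argparse.ArgumentParser(description="Sparse attention example (sketch)")
parser.add_argument(
    "--sparse_attn",
    type=str,
    default="skip_softmax_calib",
    choices=list(SPARSE_ATTN_CFG_CHOICES.keys()),
    help="Sparse attention configuration to apply.",
)

# With no flags given, the new default is the calibration config.
args = parser.parse_args([])
print(args.sparse_attn)
```

Passing the removed value now fails fast: `--sparse_attn skip_softmax` triggers an argparse "invalid choice" error rather than silently exporting an unchanged checkpoint.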

Summary by CodeRabbit

  • Chores
    • Updated sparse attention configuration in example script: removed "skip_softmax" option and changed default to "skip_softmax_calib". The script no longer accepts the previous "skip_softmax" configuration path.

Signed-off-by: Rohan Joshi <rohjoshi@nvidia.com>
@rohansjoshi rohansjoshi requested a review from a team as a code owner March 25, 2026 20:28
@rohansjoshi rohansjoshi requested a review from realAsma March 25, 2026 20:28
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

The change removes deprecated "skip_softmax" sparse attention configuration support from the HuggingFace sparse attention example script. The --sparse_attn CLI argument default is updated from "skip_softmax" to "skip_softmax_calib", and associated imports and configuration choices are adjusted accordingly.

Changes

Cohort / File(s) Summary
Sparse Attention Configuration Cleanup
examples/llm_sparsity/attention_sparsity/hf_sa.py
Removed SKIP_SOFTMAX_DEFAULT import, eliminated "skip_softmax" from SPARSE_ATTN_CFG_CHOICES, and updated --sparse_attn CLI default from "skip_softmax" to "skip_softmax_calib".

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: removing the --skip-softmax option and retaining only the calibration option in hf_sa.py.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Security Anti-Patterns ✅ Passed Pull request introduces only configuration changes to hf_sa.py with no security anti-patterns or new dependencies.



@rohansjoshi rohansjoshi requested a review from kaix-nv March 25, 2026 20:29
@github-actions
Contributor

PR Preview Action v1.8.1


🚀 View preview at
https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1123/

Built to branch gh-pages at 2026-03-25 20:32 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
examples/llm_sparsity/attention_sparsity/hf_sa.py (1)

230-235: Clarify CLI migration in help text.

Line 232 now defaults to a single supported value, which is fine. Consider explicitly stating that skip_softmax was removed to make CLI errors/migration clearer for existing users.

Suggested tweak
     parser.add_argument(
         "--sparse_attn",
         type=str,
         default="skip_softmax_calib",
         choices=list(SPARSE_ATTN_CFG_CHOICES.keys()),
-        help="Sparse attention configuration to apply.",
+        help="Sparse attention configuration to apply. Only 'skip_softmax_calib' is supported (legacy 'skip_softmax' removed).",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/llm_sparsity/attention_sparsity/hf_sa.py` around lines 230 - 235,
Update the CLI help for the "--sparse_attn" argument to note the migration from
the removed "skip_softmax" option to the new default "skip_softmax_calib" so
users get a clear migration message; modify the help string where the argument
is defined (the "--sparse_attn" parser entry that uses SPARSE_ATTN_CFG_CHOICES
and default="skip_softmax_calib") to explicitly state that "skip_softmax" was
removed and that "skip_softmax_calib" is now the default, and keep the choices
list unchanged to avoid surprises.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 68d1cedc-9562-4e37-b7a8-6f0c82c8a5f9

📥 Commits

Reviewing files that changed from the base of the PR and between 291498b and 03b8517.

📒 Files selected for processing (1)
  • examples/llm_sparsity/attention_sparsity/hf_sa.py

@codecov

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.18%. Comparing base (291498b) to head (03b8517).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1123      +/-   ##
==========================================
- Coverage   70.21%   70.18%   -0.03%     
==========================================
  Files         228      228              
  Lines       25952    25952              
==========================================
- Hits        18221    18215       -6     
- Misses       7731     7737       +6     

☔ View full report in Codecov by Sentry.
@rohansjoshi rohansjoshi removed the request for review from realAsma March 26, 2026 00:03
