
Remove --skip-softmax option from hf_sa (only allow calibration option) #1123

Open
rohansjoshi wants to merge 1 commit into main from rohjoshi/rm-skipsoftmax-option

Conversation

@rohansjoshi
Contributor

@rohansjoshi rohansjoshi commented Mar 25, 2026

What does this PR do?

The hf_sa.py script applies an attention sparsity method to a model and exports a "sparsified" checkpoint. The SKIP_SOFTMAX_DEFAULT sparse attention config runs no calibration and makes no change to the checkpoint, so using it with hf_sa.py exports a checkpoint identical to the original one. To avoid that confusion, this PR removes the option altogether and switches the default config to SKIP_SOFTMAX_CALIB.
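The resulting CLI surface can be sketched as follows. This is a minimal illustration, not the actual hf_sa.py: the contents of SPARSE_ATTN_CFG_CHOICES shown here are assumed for demonstration, and only the shape of the `--sparse_attn` argument reflects the change described above.

```python
import argparse

# Hypothetical stand-in for the mapping in hf_sa.py after this PR:
# "skip_softmax" has been dropped, leaving only the calibration config.
SPARSE_ATTN_CFG_CHOICES = {
    "skip_softmax_calib": "SKIP_SOFTMAX_CALIB",  # placeholder value
}

parser = argparse.ArgumentParser(description="Sparse attention example (sketch)")
parser.add_argument(
    "--sparse_attn",
    type=str,
    default="skip_softmax_calib",
    choices=list(SPARSE_ATTN_CFG_CHOICES.keys()),
    help="Sparse attention configuration to apply.",
)

# With no flags given, the new default is the calibration config.
args = parser.parse_args([])
print(args.sparse_attn)
```

Passing the removed value now fails fast: `--sparse_attn skip_softmax` triggers an argparse "invalid choice" error rather than silently exporting an unchanged checkpoint.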

Summary by CodeRabbit

  • Chores
    • Updated sparse attention configuration in example script: removed "skip_softmax" option and changed default to "skip_softmax_calib". The script no longer accepts the previous "skip_softmax" configuration path.

Signed-off-by: Rohan Joshi <rohjoshi@nvidia.com>
@rohansjoshi rohansjoshi requested a review from a team as a code owner March 25, 2026 20:28
@rohansjoshi rohansjoshi requested a review from realAsma March 25, 2026 20:28
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

The change removes deprecated "skip_softmax" sparse attention configuration support from the HuggingFace sparse attention example script. The --sparse_attn CLI argument default is updated from "skip_softmax" to "skip_softmax_calib", and associated imports and configuration choices are adjusted accordingly.

Changes

Cohort / File(s) Summary
Sparse Attention Configuration Cleanup
examples/llm_sparsity/attention_sparsity/hf_sa.py
Removed SKIP_SOFTMAX_DEFAULT import, eliminated "skip_softmax" from SPARSE_ATTN_CFG_CHOICES, and updated --sparse_attn CLI default from "skip_softmax" to "skip_softmax_calib".

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: removing the --skip-softmax option and retaining only the calibration option in hf_sa.py.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Security Anti-Patterns ✅ Passed Pull request introduces only configuration changes to hf_sa.py with no security anti-patterns or new dependencies.



@rohansjoshi rohansjoshi requested a review from kaix-nv March 25, 2026 20:29
@github-actions
Contributor

PR Preview Action v1.8.1


🚀 View preview at
https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1123/

Built to branch gh-pages at 2026-03-25 20:32 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
examples/llm_sparsity/attention_sparsity/hf_sa.py (1)

230-235: Clarify CLI migration in help text.

Line 232 now defaults to a single supported value, which is fine. Consider explicitly stating that skip_softmax was removed to make CLI errors/migration clearer for existing users.

Suggested tweak
     parser.add_argument(
         "--sparse_attn",
         type=str,
         default="skip_softmax_calib",
         choices=list(SPARSE_ATTN_CFG_CHOICES.keys()),
-        help="Sparse attention configuration to apply.",
+        help="Sparse attention configuration to apply. Only 'skip_softmax_calib' is supported (legacy 'skip_softmax' removed).",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/llm_sparsity/attention_sparsity/hf_sa.py` around lines 230 - 235,
Update the CLI help for the "--sparse_attn" argument to note the migration from
the removed "skip_softmax" option to the new default "skip_softmax_calib" so
users get a clear migration message; modify the help string where the argument
is defined (the "--sparse_attn" parser entry that uses SPARSE_ATTN_CFG_CHOICES
and default="skip_softmax_calib") to explicitly state that "skip_softmax" was
removed and that "skip_softmax_calib" is now the default, and keep the choices
list unchanged to avoid surprises.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 68d1cedc-9562-4e37-b7a8-6f0c82c8a5f9

📥 Commits

Reviewing files that changed from the base of the PR and between 291498b and 03b8517.

📒 Files selected for processing (1)
  • examples/llm_sparsity/attention_sparsity/hf_sa.py

@codecov

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.18%. Comparing base (291498b) to head (03b8517).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1123      +/-   ##
==========================================
- Coverage   70.21%   70.18%   -0.03%     
==========================================
  Files         228      228              
  Lines       25952    25952              
==========================================
- Hits        18221    18215       -6     
- Misses       7731     7737       +6     

☔ View full report in Codecov by Sentry.
@rohansjoshi rohansjoshi removed the request for review from realAsma March 26, 2026 00:03
