Merge puzzletron compression algorithm by danielkorzekwa · Pull Request #1121 · NVIDIA/Model-Optimizer

Is this change backward compatible?: ✅
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅
Did you write any new necessary tests?: ✅
Did you update Changelog?: ❌

Summary by CodeRabbit

New Features
- Added Puzzletron, an experimental LLM compression algorithm using Mixed Integer Programming-based Neural Architecture Search, supporting heterogeneous model modifications including FFN intermediate pruning, KV-head reduction, and expert removal.
- Added support for multiple model architectures (Llama, Mistral, Qwen, GPT-OSS, NeMo) with configurable compression workflows.
- Added knowledge distillation and inference evaluation utilities including Megatron-Bridge export support and lm-eval integration.
Documentation
- Added comprehensive guides and example configurations for running the Puzzletron compression pipeline.
Chores
- Updated workflows and build configuration to support new compression tooling.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

…iv.org/abs/2411.19146 (#464) Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

## What does this PR do? Add decilm modelling code --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Compress tutorial (PoC) + compress cli app. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Signed-off-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Signed-off-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Liana Mikaelyan <lmikaelyan@nvidia.com> Co-authored-by: Liana Mikaelyan <45925959+LianaMikael@users.noreply.github.com>

#545) ## What does this PR do? Add llama converter (no dependency on internal Nvidia code) - part 1/2 - change top-level dependencies in convert_llama3_to_decilm.py from puzzle_tools.... to modelopt..... - added modelopt.torch._compress.tools module - remove tokenization_mistral.py - not used scope of 2/2 part (will come once part 1/2 is merged): - change all deeper dependencies from from puzzle_tools.... to modelopt.... - test_convert_llama3_config_to_decilm_config.py should run without any internal nvidia dependencies --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

…ia code) (#552) ## What does this PR do? llama converter is self-contained now (no dependency on internal nvidia code) --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Make score pruning activations self-contained (no dependency on internal Nvidia code) - 1/6 - add integration test for attention pruning (pre-step to make adding activation scoring safer) --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? - Add score_pruning_activations.py Notes: - validate_model.py still depends on Nvidia internal code (will be changed in the subsequent MR) - sharded_checkpoint_utils.py - for now it needs to use DeciLM from internal Nvidia code, to be changed in the next MR --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

## What does this PR do? Add activation hooks used for pruning --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

…ng is self-contained now (#584) ## What does this PR do? Add sewing kit and utilities used for pruning scoring - pruning scoring is self-contained now - no dependency on internal Nvidia code. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

## What does this PR do? - Add L2NormHook and use it in megatron.py - Using L2NormHook removes code duplication between _DynamicSelfAttention and _DynamicMLP This is the first step towards reusing activation scores logic across Minitron and Puzzle. Next steps: - complete redesign of megatron.py - move other activation hooks logic to hooks.py - then combined those hooks.py with a similar hooks.py functoriality in puzzle (modelopt/torch/_compress/activation_scoring/activation_hooks/hooks.py) Questions: - why in the code before and after this redesign we store temp variables in two ways _register_temp_attribute and self.hook_handle)? ``` self._register_temp_attribute("_activation_hook", activation_hook) # TODO: confusion: why hook_handle is removed manually in export() and not using _register_temp_attribute? self.hook_handle = self.linear_fc2.register_forward_hook(activation_hook) ``` --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

## What does this PR do? Add pruning checkpoints for the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Add build replacement library to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Add subblock stats to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Add 1-block scoring to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

…utionHook (#610) ## What does this PR do? Add checkpoint save/load to ForwardHook + add IterativeChannelContributionHook. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

## What does this PR do? Add MIP step to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

## What does this PR do? **Type of change:** Improvement  - Fix tests for 2-gpu: Some places hard-coded cpu device for distributed communications which was causing this issue - Remove unused constrain_search_space.py - Remove `is_multi_layer_puzzle: False` case - Remove `use_greedy_search: False` case - Remove knapsack mip case - Remove unused `num_solutions` and `minimal_diversity` flags ## Testing  - GH CICD test passing - Tested on 2-gpu setup locally as well  ## Summary by CodeRabbit # Release Notes * **Refactor** * Optimized solver implementation with improved library integration. * Simplified model compression configuration by removing deprecated search options. * Consolidated optimization paths for streamlined processing. * **Chores** * Updated dependencies for improved compatibility. * **Documentation** * Clarified Model-Optimizer installation instructions in examples. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>  --------- Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

…hooks analysis (#670) ## What does this PR do? Fix a bug in IterativeChannelContributionHook + tools for activation hooks analysis --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Daniel Korzekwa <daniel.korzekwa@gmail.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

…unctions (#667) ## What does this PR do? - Remove `runtime.py` and directly use `modelopt.torch.utils.distributed` functions - Read model_dtype and autocast_dtype from validate_model_defaults.yaml instead of runtime object - Remove more unused functions - Remove unnecessary parse_args for intermediate steps, and improve docstrings ## Testing  - No change in functionality - Existing tests passing --------- Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

### What does this PR do? Merging dkorzekwa/anymodel_build_library_and_stats into dkorzekwa/anymodel_pruning - this MR is only for reviewing. Ultimately dkorzekwa/anymodel_build_library_and_stats should be merged into feature/puzzletron once dkorzekwa/anymodel_pruning scoring is merged there.  ## Summary by CodeRabbit * **New Features** * Integrated language-model descriptors to improve compatibility and produce model-specific stats. * **Bug Fixes** * Re‑enabled the single‑process library build and stats workflow step. * **Documentation** * Expanded usage docs and CLI guidance for the replacement library builder. * **Refactor** * Improved typing/annotations and normalized key‑value head handling, affecting attention memory/parameter calculations and textual outputs.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Merging dkorzekwa/any_model_calc_one_block_scores into dkorzekwa/anymodel_build_library_and_stats - this MR is only for reviewing. Ultimately dkorzekwa/any_model_calc_one_block_scores should be merged into feature/puzzletron once dkorzekwa/anymodel_build_library_and_stats is merged there.  ## Summary by CodeRabbit ## Release Notes * **New Features** * Enabled distributed scoring step in puzzletron workflow for improved performance. * **Improvements** * Extended model loading support to accommodate additional model formats and configurations. * Optimized GPU memory management during validation with improved cache synchronization. * Streamlined validation function signatures for easier usage.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Merging dkorzekwa/mip_and_realize_models into dkorzekwa/any_model_calc_one_block_scores - this MR is only for reviewing. Ultimately dkorzekwa/mip_and_realize_models should be merged into feature/puzzletron once dkorzekwa/any_model_calc_one_block_scores is merged there.  ## Summary by CodeRabbit ## Release Notes * **New Features** * Enabled model realization step during compression workflow after scoring phase completes. * **Bug Fixes** * Fixed key-value head calculation in attention configuration sourcing. * **Tests** * Strengthened validation checks for compression artifacts and output directories; added rank-aware assertions for model compression expectations. * **Chores** * Minor documentation formatting updates.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

### What does this PR do? Merging dkorzekwa/any_model_other_models into dkorzekwa/mip_and_realize_models - this MR is only for reviewing. Ultimately dkorzekwa/any_model_other_models should be merged into feature/puzzletron once dkorzekwa/mip_and_realize_models is merged there.  ## Summary by CodeRabbit * **New Features** * Added support for multiple model architectures: Mistral Small, Nemotron H, Nemotron H v2, Qwen2, Qwen3 8B, and Qwen3 VL 30B. * Introduced new pruning configurations and optimization pipelines for supported models. * Added comprehensive model descriptor framework enabling automated weight conversion and configuration handling. * Extended support for Mixture of Experts (MoE) models with expert removal pruning capabilities. * **Tests** * Enhanced test coverage with parametrized configurations for multiple model variants.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

### What does this PR do? Merging dkorzekwa/anymodel_gptoss into dkorzekwa/any_model_other_models - this MR is only for reviewing. Ultimately dkorzekwa/anymodel_gptoss should be merged into feature/puzzletron once dkorzekwa/any_model_other_models is merged there.  ## Summary by CodeRabbit * **New Features** * Added support for GPT-OSS, Nemotron V2, Qwen2, and Qwen3 models. * Enabled MXFP4 quantization for GPT-OSS model compression. * Added expert removal pruning for mixture-of-experts (MoE) models. * **Bug Fixes** * Fixed padding token validation to ensure it doesn't exceed vocabulary size. * **Tests** * Enabled test coverage for GPT-OSS model workflows.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Merge any_model tutorial for Puzzletron.  ## Summary by CodeRabbit * **New Features** * MIP sweep mode for multi-rate memory-compression searches * Triton-ready HuggingFace deployable, lm-eval adapter, and Ray-compatible inference pathways * Megatron-Bridge distillation CLI/workflow with optional HF export * **New Configurations** * Extensive pruning/memory-sweep profiles for GPT‑Oss, Llama, Mistral, Nemotron, Qwen families * **Documentation** * GptOss guide and conversion example, expanded READMEs, MIP Quick Start, NeMo evaluator notes * **Chores** * Requirements updated (lm-eval bumped; math-verify, ray added)  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Merge anymodel mbridge distillation  ## Summary by CodeRabbit * **New Features** - Heterogeneous-layer support for Puzzletron AnyModel → Megatron-Core conversions - Knowledge-distillation workflow support and tooling - Export of trained models to HuggingFace format (new export helper) - Bridge support extended for Llama and Qwen3 architectures - New CLI options to control HuggingFace export destination and template model * **Tests** - Added integration test validating the distillation + export flow - Removed an obsolete trivial test * **Documentation** - Updated README to document HF export flags * **Chores** - CI: switched example-tests workflow to use NeMo and updated pip invocation  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

#1047) ### What does this PR do? The remaining difference between dkorzekwa/any_model and dkorzekwa/anymodel_mbridgedist  ## Summary by CodeRabbit * **New Features** * Added an end-to-end MIP sweep workflow that runs sweeps, collects metrics, and writes CSV reports. * **Improvements** * Simplified attention implementation selection and clarified attention display wording. * Realization pipeline now returns realized model solution paths. * Caching behavior adjusted for token ID retrieval. * Test model generation scales layer count with distributed size. * **Bug Fixes** * Checkpoint hook state restored using the correct load API. * **Chores** * Extended pre-commit license hook exclusion. * **Tests** * Added GPU/distributed test fixtures and dtype/checkout helpers.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

### What does this PR do? - Delete unused decilm code  ## Summary by CodeRabbit ## Release Notes * **Removals** * Removed model conversion utilities for Llama-Nemotron format * Removed DeciLM model classes, tokenizer implementations, and configuration utilities * Removed checkpoint import/export functionality * Removed heterogeneous transformer layer specifications and configuration builders * **Updates** * Updated pre-commit configuration for additional file exclusions * Updated imports across modules to reflect removed dependencies  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Delete not used decilm code  ## Summary by CodeRabbit * **Refactor** * Removed DeciLM-specific components including decoder layers, attention implementations, and specialized cache utilities, streamlining the codebase * Updated replacement library to use generic model configurations instead of DeciLM-specific types, improving compatibility with diverse architectures * Cleaned up internal utilities for attention masking, flash attention compatibility, and rotary position embeddings  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Integration tests for subblock stats (memory + num_of_params)  ## Summary by CodeRabbit * **Refactor** * Improved teacher model configuration loading and statistics computation for enhanced accuracy in memory and parameter calculations. * **Tests** * Added comprehensive tests validating teacher model memory and parameter statistics calculations with strict accuracy tolerances.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Refactoring of subblock stats to stop using DeciLM code and use anymodel instead.  ## Summary by CodeRabbit * **Refactor** * Restructured internal estimation logic for model memory and parameters to support broader model architectures. * Updated model initialization utilities. * **Tests** * Updated baseline metrics for validation tests.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Removing unused code from modelopt/torch/puzzletron/decilm/deci_lm_hf_code - completed.  ## Summary by CodeRabbit * **Chores** * Removed vendored Transformers modules including activation functions, cache utilities, and configuration classes. * Removed custom DeciLM configuration class. * Updated internal type signatures from DeciLMConfig to standard Hugging Face PretrainedConfig. * Simplified internal function signatures by removing unused parameters. * Enhanced code quality checks via pre-commit configuration updates.  --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

### What does this PR do? Remove hardcoded trust_remote_code=true  ## Summary by CodeRabbit ## Release Notes * **Documentation** * Clarified documentation for parameter counting in model layer calculations. * **Bug Fixes** * Remote code execution is now disabled by default in model configuration loading. Explicitly enable `trust_remote_code` when using custom modeling code.  Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

### What does this PR do? Use pruning importance hooks from prune/importance_hooks/ in modelopt.torch.puzzletron --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

coderabbitai · 2026-03-25T19:14:11Z

📝 Walkthrough

Walkthrough

Introduces Puzzletron, a comprehensive LLM model compression algorithm featuring model conversion to a heterogeneous block-config format, activation scoring, multi-stage pruning (FFN/attention/expert/hidden-dim), MIP-based architecture search, and model realization. Includes AnyModel framework for supporting multiple model families, distributed execution infrastructure, evaluation tools, and extensive configuration examples for multiple model variants.

Changes

Cohort / File(s)	Summary
Workflow & Configuration `.github/workflows/example_tests.yml`, `.github/workflows/gpu_tests.yml`, `.github/CODEOWNERS`, `.pre-commit-config.yaml`	Updated CI/CD workflows to support puzzletron example tests, changed GPU test matrix from multiple frameworks to single `gpu-puzzletron` entry, added Puzzletron CODEOWNERS entry, excluded Puzzletron files from linting checks.
Documentation & Examples `examples/pruning/README.md`, `examples/puzzletron/*`	Added Puzzletron end-to-end README documenting compression workflow, GPTOSS support, evaluation via lm-eval and NeMo Evaluator, MIP sweep mode, knowledge distillation guide, and multiple model config examples (Llama, Mistral, Qwen variants, Nemotron, GPT-OSS).
AnyModel Framework `modelopt/torch/puzzletron/anymodel/`	Introduced extensible model conversion framework with converter factory, model descriptor registry, support for multiple HF models, and model-specific layer handling (GPT-OSS, Llama, Mistral, Qwen, Nemotron variants).
Pruning & Initialization `modelopt/torch/puzzletron/pruning/`, `modelopt/torch/puzzletron/tools/bypassed_training/`	Added pluggable pruning mixins (FFN intermediate, KV-heads, expert removal, hidden dimension), initialization modes, and child checkpoint creation from parent via config overrides and pruning strategies.
Activation Scoring & Hooks `modelopt/torch/puzzletron/activation_scoring/`, `modelopt/torch/prune/importance_hooks/`	Introduced activation hook registration utilities, expert removal importance hooks for MoE pruning, and checkpoint save/resume for scored activations.
Block Configuration `modelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.py`	Defined frozen dataclass schema for heterogeneous block configs (attention/FFN/MoE/Mamba with per-subblock flags and validation).
Model Conversion Utilities `modelopt/torch/puzzletron/tools/checkpoint_utils.py`, `modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py`, `modelopt/torch/puzzletron/tools/sharded_checkpoint_utils.py`	Added checkpoint loading/saving utilities supporting safetensors sharding, config overrides, dynamic module caching, multi-threaded weight I/O, and per-rank shard construction.
Replacement Library & MIP `modelopt/torch/puzzletron/replacement_library/`, `modelopt/torch/puzzletron/mip/`	Implemented replacement library building (deduplication, block/subblock organization), MIP solver for multi-layer replacement selection, sweep mode for memory compression rates, and solution validation/realization orchestration.
Subblock Statistics `modelopt/torch/puzzletron/subblock_stats/`	Added computation of per-subblock memory (including KV-cache, attention, FFN, MoE) and parameter counts with dtype/batch/sequence-length parametrization.
Scoring & Validation `modelopt/torch/puzzletron/scoring/`	Introduced solution validation orchestration, metrics gathering from multiple solution/checkpoint variants, and scoring entrypoint for hub-to-validation workflow.
Module Stitching Framework `modelopt/torch/puzzletron/sewing_kit/`	Defined graph-based module composition system supporting internal/external/remote targets, activity context, passage-based I/O capture/override, and dependency-driven execution.
Evaluation & Distillation `examples/puzzletron/evaluation/`, `examples/puzzletron/mbridge_distillation/`, `modelopt/torch/puzzletron/export/mbridge/`	Added HuggingFace deployment wrapper for Triton/Ray inference, lm-eval integration, Megatron-Bridge distillation script, and heterogeneous bridge providers for HF-to-Megatron export.
Utilities & Infrastructure `modelopt/torch/puzzletron/tools/hydra_utils.py`, `modelopt/torch/puzzletron/tools/logger.py`, `modelopt/torch/puzzletron/tools/robust_json.py`, `modelopt/torch/puzzletron/dataset/`	Added Hydra config initialization, distributed logging with rank-selective output, robust JSON serialization for dataclasses/Path/Enum, and dataset preparation utilities.
Main Entrypoints `examples/puzzletron/main.py`, `modelopt/torch/puzzletron/puzzletron.py`, `modelopt/torch/puzzletron/build_library_and_stats.py`, `modelopt/torch/puzzletron/nas/plugins/puzzletron_nas_plugin.py`	Defined CLI orchestration for full vs. MIP-only modes, unified library+stats building, and NAS mode registration with searcher/converter/restore callbacks.
Importance Hook Extensions `modelopt/torch/prune/importance_hooks/expert_removal_hooks.py`, `modelopt/torch/prune/importance_hooks/__init__.py`, `modelopt/torch/prune/importance_hooks/base_hooks.py`	Added expert removal importance hooks via independent/router-based strategies, ranked-choice routing metadata, and minor fixes to activation hook kwargs handling and KV-head grouping logic.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Main Entrypoint
    participant Convert as Model Converter
    participant Score as Activation Scoring
    participant Prune as Pruning
    participant Library as Replacement Library
    participant MIP as MIP Optimizer
    participant Realize as Model Realization
    participant Evaluate as Evaluation

    CLI->>Convert: 1. Convert HF model to Puzzletron format
    Convert->>Convert: Apply ModelDescriptor, create block_configs
    Convert-->>CLI: Converted heterogeneous checkpoint
    
    CLI->>Score: 2. Score pruning activations
    Score->>Score: Collect forward activations with hooks
    Score-->>CLI: Activation logs per layer
    
    CLI->>Prune: 3. Generate pruned child checkpoints
    Prune->>Prune: For each pruning strategy (FFN/attention/expert)
    Prune->>Prune: Initialize children from parent via mode
    Prune-->>CLI: Multiple pruned checkpoint variants
    
    CLI->>Library: 4. Build replacement library
    Library->>Library: Deduplicate subblocks, organize by layer
    Library->>Library: Create layer replacement mappings
    Library-->>CLI: Replacement library + single sequence solution
    
    CLI->>MIP: 5. Run MIP optimization
    MIP->>MIP: Gather subblock stats + metrics
    MIP->>MIP: Solve multi-layer selection (memory constraint)
    MIP-->>CLI: Ranked solutions with objectives/costs
    
    CLI->>Realize: 6. Realize models from solutions
    Realize->>Realize: Load replacement library, select configs
    Realize->>Realize: Assemble heterogeneous model
    Realize-->>CLI: Final pruned HF checkpoints
    
    CLI->>Evaluate: 7. Evaluate (lm-eval or NeMo Evaluator)
    Evaluate->>Evaluate: Benchmark accuracy/throughput
    Evaluate-->>CLI: Metrics (loss, token accuracy, speed)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

The PR introduces a substantial, multi-component compression pipeline with significant new abstractions (factory patterns, descriptor-based extensibility, MIP solving, graph stitching), distributed execution logic, and intricate initialization/pruning strategies. While some configurations are repetitive, the core infrastructure (converters, model descriptors, pruning mixins, MIP, replacement library, checkpoint utilities) contains dense logic and heterogeneous change patterns requiring separate reasoning per module. The high file count, variety of concerns (conversion, scoring, optimization, distillation, evaluation), and deep integration with existing modelopt patterns elevate complexity beyond simple refactoring.

Possibly related PRs

PR #1035: Overlaps significantly in Puzzletron example files and configuration structure, indicating parallel or iterative work on the same compression pipeline and model family support.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/puzzletron

github-actions · 2026-03-25T19:18:33Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1121/
Built to branch `gh-pages` at 2026-03-25 19:18 UTC. Preview will be ready when the GitHub Pages deployment is complete.

coderabbitai

Actionable comments posted: 12

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟡 Minor comments (15)

modelopt/torch/puzzletron/tools/logger.py-164-168 (1)
164-168: ⚠️ Potential issue | 🟡 Minor

Docstring says "rank -1 in node 0" but implementation may not match.

The lprint docstring says "Last rank prints only (rank -1 in node 0)" but the implementation checks local_rank != world_size - 1. If the intent is the last rank globally, this logic appears incorrect. If the intent is the last local rank within a node, the comparison should be against local world size, not global world size.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/tools/logger.py` around lines 164 - 168, The
docstring for lprint mismatches its rank-check logic; update either the
docstring or the implementation in lprint to match the intended behavior: if you
want "last rank in node 0" (last local rank) change the check to compare
local_rank against local_world_size - 1 (use the local world size variable used
elsewhere in logger or from torch/distributed) before calling logger.dist_log,
or if you intended the global last rank adjust the docstring to "last global
rank" and ensure the check compares local_rank against world_size - 1
accordingly; reference the lprint function and logger.dist_log to locate where
to change the comparison and/or docstring.
modelopt/torch/puzzletron/tools/logger.py-62-66 (1)
62-66: ⚠️ Potential issue | 🟡 Minor

Error message is inconsistent with valid choices.

The error message on line 65 lists ['all', 'main', 'local_main'] but 'last' is also a valid choice per line 62.
🐛 Proposed fix
         if ranks not in ["all", "main", "local_main", "last"]:
             raise NotImplementedError(
                 f"Could not broadcast msg {msg} - "
-                f"ranks parameters choices are ['all', 'main', 'local_main']. Got {ranks}"
+                f"ranks parameters choices are ['all', 'main', 'local_main', 'last']. Got {ranks}"
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/tools/logger.py` around lines 62 - 66, The
NotImplementedError message for the ranks check mentions the wrong set of valid
choices; update the f-string in the ranks validation (the raise
NotImplementedError block that references the variable ranks and msg) to include
'last' in the displayed list (i.e., ["all", "main", "local_main", "last"]) so
the error accurately reflects the valid options.
examples/pruning/README.md-10-10 (1)
10-10: ⚠️ Potential issue | 🟡 Minor

Fix compound adjective hyphenation in docs.

At Line 10, use “MIP-based” (not “MIP based”) for correct technical phrasing.
✏️ Proposed edit
-1. [Puzzletron](../puzzletron/README.md): An advanced pruning method by NVIDIA using Mixed Integer Programming (MIP) based NAS search algorithm.
+1. [Puzzletron](../puzzletron/README.md): An advanced pruning method by NVIDIA using a Mixed Integer Programming (MIP)-based NAS search algorithm.
As per coding guidelines, "**/*.{py,cpp,h,hpp,md}: All code (Python, C++, Markdown, etc.) must adhere to coding standards as checked by pre-commit hooks".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/pruning/README.md` at line 10, Update the compound adjective "MIP
based" in the Puzzletron README entry to the hyphenated form "MIP-based" so the
phrase reads "MIP-based NAS search algorithm"; locate the text string
"Puzzletron" / the line containing "MIP based NAS search algorithm" and replace
it with "MIP-based" to comply with the Markdown/style pre-commit checks.
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b_remove_experts_memory.yaml-14-17 (1)
14-17: ⚠️ Potential issue | 🟡 Minor

Fix the mismatch between target_memory and its comment.

16_000 MiB is about 15.6 GiB; 45 GiB would be 46_080 MiB. One of these is wrong, and the current ambiguity can make the example optimize for the wrong budget.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b_remove_experts_memory.yaml`
around lines 14 - 17, The mip.human_constraints.target_memory value
(target_memory: 16_000) and its comment (“45 GiB”) disagree; update one to match
the other: either set target_memory to 46_080 (MiB) if you mean 45 GiB, or
change the comment to match 16_000 MiB (~15.6 GiB). Locate the target_memory key
under mip -> human_constraints and make the numeric value and the GiB comment
consistent.
examples/puzzletron/configs/nemotron-nano-12b-v2/nemotron_nano_12b_v2_pruneffn_memory.yaml-14-17 (1)
14-17: ⚠️ Potential issue | 🟡 Minor

Align the MiB value with the GiB comment.

90_000 MiB is about 87.9 GiB, not 90 GiB. If the intended cap is 90 GiB, this should be 92_160; otherwise update the comment so the search target is unambiguous.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/puzzletron/configs/nemotron-nano-12b-v2/nemotron_nano_12b_v2_pruneffn_memory.yaml`
around lines 14 - 17, The target_memory under mip.human_constraints is
inconsistent with the "90 GiB" comment; update the value of target_memory to
92_160 (MiB) to represent 90 GiB, or alternatively change the comment to the
actual GiB represented by 90_000 MiB (~87.9 GiB); edit the target_memory key in
this YAML (mip -> human_constraints -> target_memory) accordingly to keep value
and comment aligned.
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/llama-3_2-3B_pruneffn_memory.yaml-17-17 (1)
17-17: ⚠️ Potential issue | 🟡 Minor

Memory unit comment is inaccurate.

Line 17 uses 45_000 MiB, which is ~43.95 GiB, not 45 GiB. Please update either the value or the comment to avoid misconfigured constraints.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/llama-3_2-3B_pruneffn_memory.yaml`
at line 17, The comment next to the target_memory value is incorrect:
target_memory: 45_000 is in MiB (~43.95 GiB), not 45 GiB; either set
target_memory to 46_080 (45 * 1024) to represent 45 GiB exactly or update the
inline comment to the correct GiB approximation (≈43.95 GiB); locate the
target_memory entry and apply one of these fixes so the value and its comment
match.
modelopt/torch/puzzletron/anymodel/models/llama/llama_converter.py-36-52 (1)
36-52: ⚠️ Potential issue | 🟡 Minor

Return type annotation mismatch: method declares List[BlockConfig] but returns List[dict].

The method signature declares a return type of List[BlockConfig], but line 50 calls .to_dict() on each BlockConfig instance, causing the method to actually return List[dict]. Update the return type annotation to match the actual behavior.
🔧 Suggested fix
     `@staticmethod`
-    def create_block_configs_from_main_config(config: LlamaConfig) -> List[BlockConfig]:
+    def create_block_configs_from_main_config(config: LlamaConfig) -> List[dict]:
         """Create uniform block configs for all Llama layers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/anymodel/models/llama/llama_converter.py` around
lines 36 - 52, The function create_block_configs_from_main_config currently
builds BlockConfig objects but calls .to_dict(), so its declared return type
List[BlockConfig] is incorrect; update the signature to return List[dict] (or
List[MutableMapping] if you prefer an abstract type) to match the actual
returned value, and ensure the typing import (List) is present/adjusted
accordingly; reference create_block_configs_from_main_config, LlamaConfig,
BlockConfig, and the .to_dict() call when making the change.
modelopt/torch/puzzletron/anymodel/converter/converter_factory.py-43-47 (1)
43-47: ⚠️ Potential issue | 🟡 Minor

Handle potential None from inspect.getmodule().

inspect.getmodule(ref) can return None if the module cannot be determined, which would cause an AttributeError when accessing .__name__.
🛡️ Proposed fix
-                raise KeyError(
-                    f"Could not register `{cls_name}`: {cls_type}, "
-                    f"`{cls_name}` is already registered and points to "
-                    f"`{inspect.getmodule(ref).__name__}.{ref.__name__}`"
-                )
+                module = inspect.getmodule(ref)
+                module_name = module.__name__ if module else "<unknown>"
+                raise KeyError(
+                    f"Could not register `{cls_name}`: {cls_type}, "
+                    f"`{cls_name}` is already registered and points to "
+                    f"`{module_name}.{ref.__name__}`"
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/anymodel/converter/converter_factory.py` around
lines 43 - 47, The KeyError message in converter_factory.py assumes
inspect.getmodule(ref) always returns a module; update the error construction in
the registration failure branch (where cls_name, cls_type and ref are used) to
guard against None from inspect.getmodule(ref) by using a safe fallback (e.g.,
module = inspect.getmodule(ref); module_name = module.__name__ if module is not
None else getattr(ref, "__module__", "unknown")) and use module_name rather than
inspect.getmodule(ref).__name__; ensure ref.__name__ (or getattr(ref,
"__name__", repr(ref))) is similarly accessed safely so the f-string does not
raise AttributeError.
modelopt/torch/puzzletron/puzzletron.py-60-74 (1)
60-74: ⚠️ Potential issue | 🟡 Minor

Step numbering is inconsistent.

Comments jump from "Step 2" to "Step 4", skipping "Step 3". This appears to be a typo or leftover from a refactor. Consider renumbering for clarity:
📝 Proposed fix
     # Step 2: pruning_ckpts (single process)
     if dist.is_master():
         pruning_ckpts.launch_prune_ckpt(hydra_cfg)
     dist.barrier()

-    # Step 4: build_library_and_stats (single process)
+    # Step 3: build_library_and_stats (single process)
     if dist.is_master():
         build_library_and_stats.launch_build_library_and_stats(hydra_cfg)
     dist.barrier()

-    # Step 5: calc_one_block_scores (distributed processing)
+    # Step 4: calc_one_block_scores (distributed processing)
     scoring.launch_scoring(hydra_cfg)

-    # Step 6: mip_and_realize_models (distributed processing)
+    # Step 5: mip_and_realize_models (distributed processing)
     mip_and_realize_models.launch_mip_and_realize_model(hydra_cfg)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/puzzletron.py` around lines 60 - 74, The step
comments are misnumbered (jump from "Step 2" to "Step 4"); update the inline
step comments in puzzletron.py to use a consistent sequence (e.g., Step 2 → Step
3 → Step 4 etc.) around the blocks that call pruning_ckpts.launch_prune_ckpt,
build_library_and_stats.launch_build_library_and_stats, scoring.launch_scoring,
and mip_and_realize_models.launch_mip_and_realize_model so the step numbers
reflect the actual order of execution.
examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/pruning_defaults.yaml-20-31 (1)
20-31: ⚠️ Potential issue | 🟡 Minor

Remove unused ffn_list definition or align with intermediate_size_list usage.

The file serves as a defaults template for Hydra configuration, but there's an inconsistency: ffn_list is defined on line 21 with no value, yet the actual FFN pruning configuration (ffn_pruning.yaml) populates and uses intermediate_size_list instead. Either rename ffn_list to intermediate_size_list for consistency, or remove the unused ffn_list definition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/pruning_defaults.yaml`
around lines 20 - 31, The defaults file defines an unused key ffn_list while the
FFN pruning logic uses intermediate_size_list (see ffn_pruning.yaml); fix by
either removing the ffn_list entry entirely or renaming it to
intermediate_size_list and ensuring the init mode key (mlp_init_mode) matches
the expected name used by ffn_pruning.yaml so the Hydra defaults align with the
pruning config.
modelopt/torch/puzzletron/tools/hydra_utils.py-29-36 (1)
29-36: ⚠️ Potential issue | 🟡 Minor

Validate block and mbs before division in warmup_steps.

block=0 or mbs=0 causes a runtime failure. Add explicit positive-value checks and fail early with a clear error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/tools/hydra_utils.py` around lines 29 - 36, The
warmup_steps function currently divides by block and mbs which will raise on
zero; update warmup_steps to validate that block and mbs are positive integers
(e.g., block > 0 and mbs > 0) at the top of the function and raise a clear
ValueError if not; keep the existing logic (compute steps =
(tokens//block)//mbs, compute w = pct * steps, return max(1, round(w))) but fail
fast with an explanatory error mentioning the parameter names block and mbs so
callers get a clear message when invalid values are passed.
modelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.py-46-47 (1)
46-47: ⚠️ Potential issue | 🟡 Minor

Avoid mutating caller-owned activation_hooks_kwargs.

This function modifies the input dict in place (["model"] = model), which can leak state across calls. Copy it first before enrichment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.py`
around lines 46 - 47, The code mutates the caller-owned dict
activation_hooks_kwargs by setting activation_hooks_kwargs["model"] = model;
instead, make a shallow copy of activation_hooks_kwargs (e.g., new_kwargs =
activation_hooks_kwargs.copy() or dict(activation_hooks_kwargs)), add the
"model" key to the copy, and use/return that copy so the original dict isn’t
mutated; update any return/usage to use the new_kwargs variable.
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/Llama-3_1-8B.yaml-13-13 (1)
13-13: ⚠️ Potential issue | 🟡 Minor

Minor typo in comment.

ppath should be path.
Proposed fix
-dataset_path: ??? # ppath to Nemotron-Post-Training-Dataset-v2
+dataset_path: ??? # path to Nemotron-Post-Training-Dataset-v2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/Llama-3_1-8B.yaml`
at line 13, Fix the typo in the YAML comment for the dataset key: change the
comment after the dataset_path entry so it reads "path to
Nemotron-Post-Training-Dataset-v2" instead of "ppath to
Nemotron-Post-Training-Dataset-v2" so the comment for the dataset_path key is
correct and clear.
modelopt/torch/puzzletron/tools/validate_model.py-138-140 (1)
138-140: ⚠️ Potential issue | 🟡 Minor

Potential issue: validation_full_iters could be zero.

If args.eval_samples is less than args.micro_batch_size, the integer division results in validation_full_iters = 0. This could cause unexpected behavior in downstream code that expects at least one iteration.
🛡️ Proposed fix: Add validation or use ceiling division
     validation_full_iters = (
-        args.eval_samples // args.micro_batch_size
+        max(1, args.eval_samples // args.micro_batch_size)
     )  # model pipeline, single data rank
Or add an assertion:
     validation_full_iters = (
         args.eval_samples // args.micro_batch_size
     )  # model pipeline, single data rank
+    assert validation_full_iters > 0, f"eval_samples ({args.eval_samples}) must be >= micro_batch_size ({args.micro_batch_size})"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/tools/validate_model.py` around lines 138 - 140,
The calculation of validation_full_iters can yield zero when args.eval_samples <
args.micro_batch_size; update the logic around validation_full_iters in
validate_model.py to ensure at least one validation iteration by either
validating inputs (raise/exit if args.eval_samples < args.micro_batch_size) or
computing a ceiling division and then taking max(1, ...); reference the variable
validation_full_iters and the CLI args args.eval_samples and
args.micro_batch_size when making the change so downstream loops that expect >=1
iteration do not break.
modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py-316-321 (1)
316-321: ⚠️ Potential issue | 🟡 Minor

Actually emit the missing-MoE-stats warning.

Warning(...) only instantiates an object; it does not notify the caller. Right now a missing moe_stats.json silently changes active_params from MoE-aware counts to total params. Use warnings.warn(...) or mprint(...) here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py` around lines
316 - 321, The code creates a pathlib object moe_stats_file and on missing file
only instantiates Warning(...) which does nothing; update the block around
moe_stats_file (using variables moe_stats_file, moe_stats_filename,
master_puzzle_dir and the downstream active_params logic) to actually emit a
warning by calling warnings.warn(...) (or mprint(...)) with a clear message like
f"MOE stats file {moe_stats_file} does not exist, can't calculate num active
params", and add `import warnings` at the top if using warnings.warn; keep
setting moe_stats_file = None so the fallback to total params remains.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 31341e54-9727-48f6-b865-606bb7607464

📥 Commits

Reviewing files that changed from the base of the PR and between 291498b and c5ec50b.

⛔ Files ignored due to path filters (2)

examples/puzzletron/mip_sweep_example.png is excluded by !**/*.png
uv.lock is excluded by !**/*.lock

📒 Files selected for processing (212)

.github/CODEOWNERS
.github/workflows/example_tests.yml
.github/workflows/gpu_tests.yml
.pre-commit-config.yaml
examples/pruning/README.md
examples/puzzletron/GPTOSS.md
examples/puzzletron/README.md
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b.yaml
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/gptoss-20b_remove_experts_memory.yaml
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/validate_model_defaults.yaml
examples/puzzletron/configs/gptoss-20b_remove_experts_memory/validate_solutions_defaults.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/Llama-3_1-8B.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/attn_pruning.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/hidden_dim_pruning.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_model_defaults.yaml
examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/validate_solutions_defaults.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/Llama-3_2-3B.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/llama-3_2-3B_pruneffn_memory.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/validate_model_defaults.yaml
examples/puzzletron/configs/llama-3_2-3B_pruneffn_memory/validate_solutions_defaults.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/Mistral-Small-24B.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/mistral-small-24b-instruct-2501_pruneffn_memory.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/pruning/attn_pruning.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/pruning/hidden_dim_pruning.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/validate_model_defaults.yaml
examples/puzzletron/configs/mistral-small-24b-instruct-2501_pruneffn_memory/validate_solutions_defaults.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/nemotron_nano_12b_v2.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/nemotron_nano_12b_v2_pruneffn_memory.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/attn_pruning.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/ffn_pruning.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/hidden_dim_pruning.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/pruning/pruning_defaults.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/validate_model_defaults.yaml
examples/puzzletron/configs/nemotron-nano-12b-v2/validate_solutions_defaults.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/pruning/attn_pruning.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/pruning/hidden_dim_pruning.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/qwen2_5_7b_instruct.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/qwen2_5_7b_instruct_pruneffn_memory.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/validate_model_defaults.yaml
examples/puzzletron/configs/qwen2_5_7b_instruct_pruneffn_memory/validate_solutions_defaults.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/pruning/attn_pruning.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/pruning/ffn_pruning.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/pruning/hidden_dim_pruning.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/pruning/pruning_defaults.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/qwen3_8b.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/qwen3_8b_pruneffn_memory.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/validate_model_defaults.yaml
examples/puzzletron/configs/qwen3-8b_pruneffn_memory/validate_solutions_defaults.yaml
examples/puzzletron/evaluation/hf_deployable_anymodel.py
examples/puzzletron/evaluation/lm_eval_anymodel.py
examples/puzzletron/evaluation/nemo_evaluator_instructions.md
examples/puzzletron/main.py
examples/puzzletron/mbridge_distillation/README.md
examples/puzzletron/mbridge_distillation/distill_hf.py
examples/puzzletron/requirements.txt
modelopt/torch/prune/importance_hooks/__init__.py
modelopt/torch/prune/importance_hooks/base_hooks.py
modelopt/torch/prune/importance_hooks/expert_removal_hooks.py
modelopt/torch/puzzletron/README.md
modelopt/torch/puzzletron/__init__.py
modelopt/torch/puzzletron/activation_scoring/activation_hooks/__init__.py
modelopt/torch/puzzletron/activation_scoring/activation_hooks/utils.py
modelopt/torch/puzzletron/activation_scoring/score_pruning_activations.py
modelopt/torch/puzzletron/anymodel/README.md
modelopt/torch/puzzletron/anymodel/__init__.py
modelopt/torch/puzzletron/anymodel/converter/__init__.py
modelopt/torch/puzzletron/anymodel/converter/convert_any_model.py
modelopt/torch/puzzletron/anymodel/converter/converter.py
modelopt/torch/puzzletron/anymodel/converter/converter_factory.py
modelopt/torch/puzzletron/anymodel/model_descriptor/__init__.py
modelopt/torch/puzzletron/anymodel/model_descriptor/model_descriptor.py
modelopt/torch/puzzletron/anymodel/model_descriptor/model_descriptor_factory.py
modelopt/torch/puzzletron/anymodel/models/__init__.py
modelopt/torch/puzzletron/anymodel/models/gpt_oss/__init__.py
modelopt/torch/puzzletron/anymodel/models/gpt_oss/gpt_oss_converter.py
modelopt/torch/puzzletron/anymodel/models/gpt_oss/gpt_oss_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/gpt_oss/gpt_oss_pruned_to_mxfp4.py
modelopt/torch/puzzletron/anymodel/models/llama/__init__.py
modelopt/torch/puzzletron/anymodel/models/llama/llama_converter.py
modelopt/torch/puzzletron/anymodel/models/llama/llama_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/mistral_small/__init__.py
modelopt/torch/puzzletron/anymodel/models/mistral_small/mistral_small_converter.py
modelopt/torch/puzzletron/anymodel/models/mistral_small/mistral_small_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h/__init__.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h/nemotron_h_converter.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h/nemotron_h_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h_v2/__init__.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h_v2/nemotron_h_v2_converter.py
modelopt/torch/puzzletron/anymodel/models/nemotron_h_v2/nemotron_h_v2_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/qwen2/__init__.py
modelopt/torch/puzzletron/anymodel/models/qwen2/qwen2_converter.py
modelopt/torch/puzzletron/anymodel/models/qwen2/qwen2_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/qwen3_8b/__init__.py
modelopt/torch/puzzletron/anymodel/models/qwen3_8b/qwen3_8b_converter.py
modelopt/torch/puzzletron/anymodel/models/qwen3_8b/qwen3_8b_model_descriptor.py
modelopt/torch/puzzletron/anymodel/models/qwen3_vl_30b_a3b_instruct/__init__.py
modelopt/torch/puzzletron/anymodel/models/qwen3_vl_30b_a3b_instruct/qwen3_vl_30b_a3b_instruct_converter.py
modelopt/torch/puzzletron/anymodel/models/qwen3_vl_30b_a3b_instruct/qwen3_vl_30b_a3b_instruct_model_descriptor.py
modelopt/torch/puzzletron/anymodel/puzzformer/__init__.py
modelopt/torch/puzzletron/anymodel/puzzformer/no_op.py
modelopt/torch/puzzletron/anymodel/puzzformer/utils.py
modelopt/torch/puzzletron/build_library_and_stats.py
modelopt/torch/puzzletron/dataset/__init__.py
modelopt/torch/puzzletron/dataset/prepare_dataset.py
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/__init__.py
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/block_config.py
modelopt/torch/puzzletron/decilm/deci_lm_hf_code/modeling_decilm.py
modelopt/torch/puzzletron/export/mbridge/__init__.py
modelopt/torch/puzzletron/export/mbridge/base.py
modelopt/torch/puzzletron/export/mbridge/distillation_provider.py
modelopt/torch/puzzletron/export/mbridge/export_mbridge_to_hf.py
modelopt/torch/puzzletron/export/mbridge/llama.py
modelopt/torch/puzzletron/export/mbridge/qwen3.py
modelopt/torch/puzzletron/mip/mip_and_realize_models.py
modelopt/torch/puzzletron/mip/mip_with_multi_layer_replacements.py
modelopt/torch/puzzletron/mip/run_puzzle.py
modelopt/torch/puzzletron/mip/sweep.py
modelopt/torch/puzzletron/mip/utils.py
modelopt/torch/puzzletron/nas/plugins/puzzletron_nas_plugin.py
modelopt/torch/puzzletron/pruning/expert_removal_pruning_mixin.py
modelopt/torch/puzzletron/pruning/ffn_intermediate_pruning_mixin.py
modelopt/torch/puzzletron/pruning/kv_heads_pruning_mixin.py
modelopt/torch/puzzletron/pruning/pruning_ckpts.py
modelopt/torch/puzzletron/pruning/pruning_mixin.py
modelopt/torch/puzzletron/pruning/pruning_utils.py
modelopt/torch/puzzletron/puzzletron.py
modelopt/torch/puzzletron/replacement_library/build_replacement_library.py
modelopt/torch/puzzletron/replacement_library/replacement_library.py
modelopt/torch/puzzletron/replacement_library/replacement_utils.py
modelopt/torch/puzzletron/scoring/scoring.py
modelopt/torch/puzzletron/sewing_kit/__init__.py
modelopt/torch/puzzletron/sewing_kit/core.py
modelopt/torch/puzzletron/sewing_kit/passage/__init__.py
modelopt/torch/puzzletron/sewing_kit/passage/core.py
modelopt/torch/puzzletron/sewing_kit/utils.py
modelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.py
modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py
modelopt/torch/puzzletron/tools/__init__.py
modelopt/torch/puzzletron/tools/bypassed_training/child_init.py
modelopt/torch/puzzletron/tools/bypassed_training/init_child_from_parent.py
modelopt/torch/puzzletron/tools/checkpoint_utils.py
modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py
modelopt/torch/puzzletron/tools/common.py
modelopt/torch/puzzletron/tools/hydra_utils.py
modelopt/torch/puzzletron/tools/kd_model.py
modelopt/torch/puzzletron/tools/logger.py
modelopt/torch/puzzletron/tools/post_init_sparse.py
modelopt/torch/puzzletron/tools/robust_json.py
modelopt/torch/puzzletron/tools/sharded_checkpoint_utils.py
modelopt/torch/puzzletron/tools/validate_model.py
modelopt/torch/puzzletron/tools/validate_puzzle_with_multi_replacements.py
modelopt/torch/puzzletron/tools/validation_utils.py
modelopt/torch/puzzletron/utils/checkpoint_manager.py
modelopt/torch/puzzletron/utils/data/dataloaders.py
modelopt/torch/puzzletron/utils/data/dataset.py
modelopt/torch/puzzletron/utils/dummy_modules.py
modelopt/torch/puzzletron/utils/parsing.py
modelopt/torch/puzzletron/utils/utils.py
modelopt/torch/puzzletron/utils/validate_runtime_pipeline.py
modelopt/torch/puzzletron/utils/validation.py
modelopt/torch/utils/robust_json.py
pyproject.toml
tests/_test_utils/torch/puzzletron/utils.py
tests/examples/puzzletron/mbridge_distillation/test_distill_hf.py
tests/gpu/torch/conftest.py
tests/gpu/torch/puzzletron/conftest.py
tests/gpu/torch/puzzletron/nas/plugins/test_nas_convert.py
tests/gpu/torch/puzzletron/nas/plugins/test_nas_search.py
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen2.5-7B-Instruct/Qwen2.5-7B-Instruct.yaml
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen2.5-7B-Instruct/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen3-8B/Qwen3-8B.yaml
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen3-8B/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen3-VL-30B-A3B-Instruct/Qwen3-VL-30B-A3B-Instruct.yaml
tests/gpu/torch/puzzletron/resources/configs/Qwen/Qwen3-VL-30B-A3B-Instruct/pruning/expert_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.1-8B-Instruct/Llama-3.1-8B-Instruct-attn-pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.1-8B-Instruct/Llama-3.1-8B-Instruct.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.1-8B-Instruct/pruning/attn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.1-8B-Instruct/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.2-3B-Instruct/Llama-3.2-3B-Instruct.yaml
tests/gpu/torch/puzzletron/resources/configs/meta-llama/Llama-3.2-3B-Instruct/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/mistralai/Mistral-Small-24B-Instruct-2501/Mistral-Small-24B-Instruct-2501.yaml
tests/gpu/torch/puzzletron/resources/configs/mistralai/Mistral-Small-24B-Instruct-2501/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16.yaml
tests/gpu/torch/puzzletron/resources/configs/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16/pruning/expert_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/nvidia/NVIDIA-Nemotron-Nano-12B-v2/NVIDIA-Nemotron-Nano-12B-v2.yaml
tests/gpu/torch/puzzletron/resources/configs/nvidia/NVIDIA-Nemotron-Nano-12B-v2/pruning/ffn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/openai/gpt-oss-20b/gpt-oss-20b.yaml
tests/gpu/torch/puzzletron/resources/configs/openai/gpt-oss-20b/pruning/expert_removal.yaml
tests/gpu/torch/puzzletron/resources/configs/pruning/attn_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/pruning/ffn_pruning_base.yaml
tests/gpu/torch/puzzletron/resources/configs/pruning/hidden_dim_pruning.yaml
tests/gpu/torch/puzzletron/resources/configs/pruning/pruning_defaults.yaml
tests/gpu/torch/puzzletron/resources/configs/validate_model_defaults.yaml
tests/gpu/torch/puzzletron/resources/configs/validate_solutions_defaults.yaml
tests/gpu/torch/puzzletron/resources/tokenizer/special_tokens_map.json
tests/gpu/torch/puzzletron/resources/tokenizer/tokenizer.json
tests/gpu/torch/puzzletron/resources/tokenizer/tokenizer_config.json
tests/gpu/torch/puzzletron/resources/tokenizer/truncate_tokenizer.py
tests/gpu/torch/puzzletron/test_puzzletron.py
tox.ini

coderabbitai · 2026-03-25T19:35:56Z

examples/puzzletron/configs/nemotron-nano-12b-v2/validate_solutions_defaults.yaml

+solutions_to_validate:
+skip_validation: false
+save_models: false
+bigger_is_better: false
+sort_solutions_by:
+calculate_full_score_ablations: false


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash python -m pip install --quiet pyyaml python - <<'PY' from pathlib import Path import yaml p = Path("examples/puzzletron/configs/nemotron-nano-12b-v2/validate_solutions_defaults.yaml") data = yaml.safe_load(p.read_text()) print("Top-level keys:", list(data.keys())) print("solutions_to_validate value:", data.get("solutions_to_validate")) for k in ["skip_validation", "save_models", "bigger_is_better", "sort_solutions_by"]: print(f"top-level has '{k}':", k in data) PY

Repository: NVIDIA/Model-Optimizer

Length of output: 413

Indentation bug: solutions_to_validate block is not nested.

Lines 6-10 define keys at the top level instead of under solutions_to_validate, which will cause the solutions_to_validate key to be null rather than a dictionary. This breaks any config consumption expecting solutions_to_validate.skip_validation etc.

🐛 Proposed fix

solutions_to_validate: -skip_validation: false -save_models: false -bigger_is_better: false -sort_solutions_by: -calculate_full_score_ablations: false + skip_validation: false + save_models: false + bigger_is_better: false + sort_solutions_by: + calculate_full_score_ablations: false

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

solutions_to_validate:

skip_validation: false

save_models: false

bigger_is_better: false

sort_solutions_by:

calculate_full_score_ablations: false

solutions_to_validate:

skip_validation: false

save_models: false

bigger_is_better: false

sort_solutions_by:

calculate_full_score_ablations: false

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@examples/puzzletron/configs/nemotron-nano-12b-v2/validate_solutions_defaults.yaml` around lines 5 - 10, The YAML file has an indentation bug: the keys skip_validation, save_models, bigger_is_better, sort_solutions_by, and calculate_full_score_ablations must be nested under the solutions_to_validate mapping; update the block so that solutions_to_validate is a mapping containing those keys (i.e., indent those five keys one level under solutions_to_validate) so that code referencing solutions_to_validate.skip_validation etc. sees a dictionary rather than null.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/mip/run_puzzle.py

+def _assert_valid_config(args, puzzle_profile):
+    required_args = (
+        "subblock_stats_path",
+        "objective",
+        "output_path",
+    )
+    missing_args = [arg for arg in required_args if arg not in args or getattr(args, arg) is None]
+    if missing_args:
+        mprint(f"error: The following arguments are required: {', '.join(missing_args)}")
+        exit(1)
+
+    # Make sure we have specified subblock_stats_args
+    if "subblock_stats_args" not in args and "subblock_stats_args" not in puzzle_profile:
+        mprint(
+            "error: Must specify `subblock_stats_arrs` in either puzzle_profile or as a commandline arg."
+        )
+        exit(1)
+
+    # Make sure we have specified constraints
+    if (
+        "mip_constraints" not in args
+        and "human_constraints" not in args
+        and "mip_constraints" not in puzzle_profile
+        and "human_constraints" not in puzzle_profile
+    ):
+        mprint(
+            "error: Must specify either `mip_constraints` or `human_constraints` in one of puzzle_profile or as a commandline argument."
+        )
+        exit(1)


⚠️ Potential issue | 🔴 Critical

The CLI validation path crashes on argparse.Namespace.

These membership checks use arg in args, but Namespace is not iterable. Running the script via __main__ therefore throws TypeError before any friendly validation message is emitted.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/mip/run_puzzle.py` around lines 395 - 423, The validation uses membership checks like "subblock_stats_args" in args and "mip_constraints" in args but args is an argparse.Namespace (not iterable), which raises TypeError; update _assert_valid_config to use getattr(args, '<name>', None) checks instead of "in args" (e.g., replace `"subblock_stats_args" not in args` with `getattr(args, "subblock_stats_args", None) is None`) and similarly for `"mip_constraints"` and `"human_constraints"`; keep the existing checks against puzzle_profile (which may be a dict) unchanged and ensure the missing_args logic still uses getattr as it already does.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/sewing_kit/core.py

+                    process_group = node.target.process_group
+                    peers = node.target.peer_rank
+                    if not isinstance(peers, Sequence):
+                        peers = [peers]
+
+                    if len(values_to_node) > 0:
+                        items_to_send = list(self.create_output_overrides(values_to_node).items())
+
+                        data_descriptors: list[RemoteDataDescriptor] = []
+                        tensors_to_send: list[torch.Tensor] = []
+
+                        for key, value in items_to_send:
+                            if isinstance(value, torch.Tensor):
+                                if value.is_cuda:
+                                    tensor_device = "cuda"
+                                elif value.is_cpu:
+                                    tensor_device = "cpu"
+                                else:
+                                    raise RuntimeError(
+                                        f"Invalid tensor device to send to remote target: {value.device}"
+                                    )
+
+                                data_descriptor = RemoteTensorDataDescriptor(
+                                    key=key,
+                                    device=tensor_device,
+                                    dtype=value.dtype,
+                                    shape=value.shape,
+                                )
+                                tensors_to_send.append(value)
+
+                            else:
+                                data_descriptor = RemotePythonDataDescriptor(
+                                    key=key,
+                                    value=value,
+                                )
+
+                            data_descriptors.append(data_descriptor)
+
+                        works: list[Optional[torch.distributed.Work]] = []
+                        for peer in peers:
+                            if process_group is not None:
+                                peer = torch.distributed.get_global_rank(process_group, peer)
+
+                            peer_works = distributed_isend_obj(data_descriptors, dst=peer)
+                            works.extend(peer_works)
+
+                            for tensor in tensors_to_send:
+                                work = torch.distributed.isend(tensor, dst=peer)
+                                works.append(work)
+
+                        if node.target.blocking:
+                            for work in works:
+                                if work is not None:
+                                    work.wait()


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, explore the file structure and locate the target file find . -name "core.py" -path "*puzzletron/sewing_kit*" | head -20

Repository: NVIDIA/Model-Optimizer

Length of output: 167

🏁 Script executed:

# Check the file size and read the relevant sections wc -l modelopt/torch/puzzletron/sewing_kit/core.py

Repository: NVIDIA/Model-Optimizer

Length of output: 114

🏁 Script executed:

# Read the specific lines mentioned in the review (624-677 and 687-718) sed -n '620,730p' modelopt/torch/puzzletron/sewing_kit/core.py | cat -n

Repository: NVIDIA/Model-Optimizer

Length of output: 6346

🏁 Script executed:

# Search for all uses of distributed_isend_obj, distributed_recv_obj in the file rg "distributed_(isend_obj|recv_obj|isend|irecv)" modelopt/torch/puzzletron/sewing_kit/core.py -B2 -A2

Repository: NVIDIA/Model-Optimizer

Length of output: 698

🏁 Script executed:

# Check the function signatures - find where these are imported from rg "from.*distributed_isend_obj|import.*distributed_isend_obj" modelopt/torch/puzzletron/sewing_kit/ -B2 -A2

Repository: NVIDIA/Model-Optimizer

Length of output: 445

🏁 Script executed:

# Check if there's a utils module or similar that defines distributed_isend_obj find modelopt/ -name "*.py" -type f -exec rg -l "def distributed_isend_obj|def distributed_recv_obj" {} \;

Repository: NVIDIA/Model-Optimizer

Length of output: 111

🏁 Script executed:

# Check the function signatures in utils.py sed -n '1,200p' modelopt/torch/puzzletron/sewing_kit/utils.py | grep -A 15 "def distributed_isend_obj\|def distributed_recv_obj"

Repository: NVIDIA/Model-Optimizer

Length of output: 48

🏁 Script executed:

# Get full definitions of both functions rg "def distributed_(isend_obj|recv_obj)" modelopt/torch/puzzletron/sewing_kit/utils.py -A 20

Repository: NVIDIA/Model-Optimizer

Length of output: 1582

🏁 Script executed:

# Check PyTorch distributed documentation - search for torch.distributed.isend signature rg "torch.distributed.isend|torch.distributed.irecv" modelopt/torch/puzzletron/sewing_kit/ -B2 -A8

Repository: NVIDIA/Model-Optimizer

Length of output: 3244

🏁 Script executed:

# Search for any examples in the codebase that use group parameter with these functions rg "distributed_isend_obj.*group|torch.distributed.isend.*group|torch.distributed.irecv.*group" modelopt/

Repository: NVIDIA/Model-Optimizer

Length of output: 652

🏁 Script executed:

# Search for any other potential P2P call sites in the file that might have similar issues rg "distributed.isend|distributed.irecv|distributed.send|distributed.recv|distributed_isend|distributed_recv" modelopt/torch/puzzletron/sewing_kit/core.py -n

Repository: NVIDIA/Model-Optimizer

Length of output: 483

Pass group=process_group to all distributed P2P calls.

The code extracts process_group and converts local peer ranks to global ranks when process_group is not None (lines 646 and 688 via torch.distributed.get_global_rank()), but the actual distributed_isend_obj(), isend(), distributed_recv_obj(), and irecv() calls at lines 667, 671, 690, and 713 do not receive the group parameter. This causes all communication to use the default process group instead of the specified subgroup, resulting in hangs or communication with the wrong peers.

🛠️ Suggested fix

- peer_works = distributed_isend_obj(data_descriptors, dst=peer) + peer_works = distributed_isend_obj( + data_descriptors, + dst=peer, + group=process_group, + ) @@ - work = torch.distributed.isend(tensor, dst=peer) + work = torch.distributed.isend( + tensor, + dst=peer, + group=process_group, + ) @@ - data_descriptors = distributed_recv_obj(src=peer) + data_descriptors = distributed_recv_obj( + src=peer, + group=process_group, + ) @@ - work = torch.distributed.irecv(tensor, src=peer) + work = torch.distributed.irecv( + tensor, + src=peer, + group=process_group, + )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/sewing_kit/core.py` around lines 624 - 677, The P2P calls are missing the process group argument so they use the default group; update all distributed send/recv calls to pass group=process_group where available: add group=process_group to calls to distributed_isend_obj(...) and distributed_recv_obj(...), and to torch.distributed.isend(...) and torch.distributed.irecv(...); keep the existing logic that maps local->global ranks via torch.distributed.get_global_rank(process_group, peer) and only pass group when process_group is not None so subgroup communication uses the correct group.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py

+def _load_subblock_configs_from_subblock_library(master_puzzle_dir: Path) -> list[SubblockConfig]:
+    subblocks_df = pd.read_json(master_puzzle_dir / "subblock_library.json")
+    subblocks_df["attention_config"] = subblocks_df["attention_config"].apply(
+        partial(_dataclass_from_dict, cls=AttentionConfig)
+    )
+    subblocks_df["ffn_config"] = subblocks_df["ffn_config"].apply(
+        partial(_dataclass_from_dict, cls=FFNConfig)
+    )
+    attention_configs = subblocks_df["attention_config"].dropna().drop_duplicates().tolist()
+    ffn_configs = subblocks_df["ffn_config"].dropna().drop_duplicates().tolist()
+    subblock_configs = attention_configs + ffn_configs
+    return subblock_configs


⚠️ Potential issue | 🔴 Critical

Return the same indexed shape from the fallback loader.

This fallback returns bare AttentionConfig/FFNConfig instances, but calculate_subblock_stats() later indexes each item as ["subblock_config"] and ["parent_layer_indices"]. When replacement_library.json is absent, this path fails immediately instead of producing stats.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py` around lines 405 - 416, The fallback loader _load_subblock_configs_from_subblock_library currently returns raw AttentionConfig/FFNConfig objects but callers (calculate_subblock_stats) expect list entries with keys/fields "subblock_config" and "parent_layer_indices" (i.e. SubblockConfig-shaped items); change the function to construct and return SubblockConfig instances (or dicts matching SubblockConfig) for each unique attention/ffn config, populating subblock_config with the AttentionConfig/FFNConfig and setting parent_layer_indices to a sensible default (e.g. empty list) or the original parent indices if present in the JSON, so the returned list matches the same indexed shape used by calculate_subblock_stats. Ensure the function name _load_subblock_configs_from_subblock_library and the SubblockConfig/AttentionConfig/FFNConfig symbols are used to locate and instantiate the correct dataclass.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/bypassed_training/child_init.py

+    # Prepare arguments for parallel processing
+    process_layer_partial = partial(
+        _process_single_layer,
+        pruning_mixin=pruning_mixin,
+        descriptor=descriptor,
+        parent_state_dict=original_state_dict,
+        new_state_dict=new_state_dict,
+        original_config=original_config,
+        new_config=new_config,
+        gqa_init_mode=gqa_init_mode,
+        mlp_init_mode=mlp_init_mode,
+        mlp_init_config=mlp_init_config,
+        linear_init_mode=linear_init_mode,
+        ignored_keys=ignored_keys,
+        keys=keys,
+        is_original_mha=is_original_mha,
+        head_size=head_size,
+        hidden_size=hidden_size,
+    )


⚠️ Potential issue | 🔴 Critical

Don't share a mutable keys map across worker threads.

create_child_state_dict() passes the same keys dict into every worker, and _init_mlp() mutates it with keys.pop(...). That introduces a real race in the fallback path: workers can remove each other's entries, yielding nondeterministic KeyErrors or copying the wrong leftovers into the final state dict.

Also applies to: 452-455, 694-696

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/bypassed_training/child_init.py` around lines 425 - 443, The issue is that the same mutable keys dict is shared across worker tasks causing races when _init_mlp mutates it; update each partial so it receives a worker-local copy of keys (e.g., pass keys.copy() or dict(keys)) instead of the original dict when creating process_layer_partial (and the other partials at the other sites noted), and ensure create_child_state_dict/_process_single_layer/_init_mlp keep using their existing names (create_child_state_dict, _process_single_layer, _init_mlp) but operate on the copied keys to avoid cross-thread pops and nondeterministic KeyError or wrong-state copying.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py

+def _write_file_process_safe(
+    content: Any,
+    path: Path | str,
+    write_fn: Callable[[Any, BinaryIO], None] = _write_text,
+) -> None:
+    """
+    Write a file in a multi-process safe way.
+    If another process tries to write the same file using this method, the current process
+    "gives up" and assumes that the matter is being taken care of by another process.
+
+    write_fn is a function that receives file contents and a binary file object,
+    and writes the content to the file. It can be _write_text (defined above), or torch.save,
+    or a similar function (not safetensors.torch.save_file since it expects a path).
+    """
+    with open(path, "wb") as f:
+        # Try to acquire an exclusive, non-blocking lock
+        try:
+            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
+        except BlockingIOError:
+            return  # Exit immediately if the lock is not acquired
+
+        write_fn(content, f)  # Write the content if lock is acquired
+        f.flush()  # Ensure data is written to disk
+
+        # Release the lock
+        fcntl.flock(f, fcntl.LOCK_UN)


⚠️ Potential issue | 🔴 Critical

Acquire the file lock before truncating the index file.

open(path, "wb") truncates the file before flock() runs. If another process already owns the lock, this contender still zeroes the file and then returns, so the writer you intended to protect can lose or corrupt the index.

🛠️ Proposed fix

- with open(path, "wb") as f: + with open(path, "ab+") as f: # Try to acquire an exclusive, non-blocking lock try: fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB) except BlockingIOError: return # Exit immediately if the lock is not acquired - + f.seek(0) + f.truncate() write_fn(content, f) # Write the content if lock is acquired f.flush() # Ensure data is written to disk + os.fsync(f.fileno())

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/checkpoint_utils_hf.py` around lines 335 - 360, The current _write_file_process_safe opens the file with "wb" which truncates before acquiring the flock, risking data loss; change the open/lock sequence so you acquire the exclusive non-blocking lock before truncating/writing: open the file in a non-truncating mode that still creates the file if missing (e.g., "a+b"), immediately acquire fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB), return on BlockingIOError, then seek to start and f.truncate(0) (or otherwise reset) before calling write_fn(content, f), flushing, and finally releasing the lock in _write_file_process_safe. Ensure you still support write_fn semantics and file creation behavior.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/checkpoint_utils.py

+warnings.filterwarnings("ignore", "You are using `torch.load` with `weights_only=False`*.")
+
+
+def load_state_dict(checkpoint_dir: Path | str) -> dict[str, torch.Tensor]:
+    checkpoint_dir = _normalize_checkpoint_dir(checkpoint_dir)
+
+    if (state_dict_path := checkpoint_dir / STATE_DICT_FILE_NAME).exists():
+        return torch.load(state_dict_path, map_location="cpu", weights_only=False)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n modelopt/torch/puzzletron/tools/checkpoint_utils.py | head -100

Repository: NVIDIA/Model-Optimizer

Length of output: 4528

🏁 Script executed:

cat pyproject.toml | grep -A 5 "torch"

Repository: NVIDIA/Model-Optimizer

Length of output: 1584

🏁 Script executed:

rg "trust_remote_code" modelopt/torch/puzzletron/tools/checkpoint_utils.py -A 2

Repository: NVIDIA/Model-Optimizer

Length of output: 810

Do not force pickle-based torch.load() for state dicts.

These paths explicitly set weights_only=False and then globally suppress the warning. That re-enables arbitrary code execution for untrusted checkpoints even though these files are plain tensor state dicts. Use weights_only=True; if a specific path truly needs pickle, document why that file is trusted inline instead of hiding the warning.

Suggested fix

-warnings.filterwarnings("ignore", "You are using `torch.load` with `weights_only=False`*.") - - def load_state_dict(checkpoint_dir: Path | str) -> dict[str, torch.Tensor]: @@ if (state_dict_path := checkpoint_dir / STATE_DICT_FILE_NAME).exists(): - return torch.load(state_dict_path, map_location="cpu", weights_only=False) + return torch.load(state_dict_path, map_location="cpu", weights_only=True) @@ if len(torch_paths) != 0: - load_fn = partial(torch.load, map_location="cpu", weights_only=False) + load_fn = partial(torch.load, map_location="cpu", weights_only=True) file_paths = torch_paths

As per coding guidelines, "Do not use torch.load(..., weights_only=False) unless a documented exception is provided."

Also applies to: 74-80

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/checkpoint_utils.py` around lines 38 - 45, The code is forcing pickle-based torch.load by setting weights_only=False and globally suppressing the warning; update all torch.load calls in this file (e.g., load_state_dict and the similar optimizer/state loaders around lines 74-80) to use weights_only=True for plain tensor state dicts, remove the global warnings.filterwarnings line, and only allow weights_only=False on a per-call basis with an inline, documented justification (no global suppression) when a trusted file genuinely requires pickle semantics.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/checkpoint_utils.py

+def copy_tokenizer(
+    source_dir_or_tokenizer_name: Path | str,
+    target_dir: Path | str,
+    on_failure: Literal["raise", "warn"] = "raise",
+) -> None:
+    """Prefer loading the tokenizer from huggingface hub (when tokenizer_name.txt file is available)
+    to avoid collision between transformers versions.
+    """
+    source_tokenizer_name_path = Path(source_dir_or_tokenizer_name) / "tokenizer_name.txt"
+    if source_tokenizer_name_path.exists():
+        source_dir_or_tokenizer_name = source_tokenizer_name_path.read_text().strip()
+
+    tokenizer = None
+    try:
+        tokenizer = AutoTokenizer.from_pretrained(
+            source_dir_or_tokenizer_name, trust_remote_code=True
+        )


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find all calls to copy_tokenizer() to understand its usage pattern rg "copy_tokenizer" --type py -B 3 -A 3

Repository: NVIDIA/Model-Optimizer

Length of output: 3915

🏁 Script executed:

# Check the full function signature and surrounding context fd "checkpoint_utils.py" --type py | head -1 | xargs cat -n | sed -n '155,185p'

Repository: NVIDIA/Model-Optimizer

Length of output: 235

🏁 Script executed:

# Search for ModelDescriptor classes in puzzletron to understand the pattern fd "descriptor" --type py | xargs rg "class.*Descriptor|requires_trust_remote_code" -B 2 -A 5 2>/dev/null | head -100

Repository: NVIDIA/Model-Optimizer

Length of output: 7792

🏁 Script executed:

# Get the full function definition and docstring cat -n modelopt/torch/puzzletron/tools/checkpoint_utils.py | sed -n '161,210p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1690

🏁 Script executed:

# Check if copy_tokenizer is ever passed a descriptor or trust_remote_code-related parameter rg "copy_tokenizer\(" --context 5

Repository: NVIDIA/Model-Optimizer

Length of output: 3576

copy_tokenizer() should not hard-code trust_remote_code=True.

This helper accepts arbitrary local paths or Hub IDs without any guarantee of trustworthiness, enabling execution of repository-provided Python during tokenizer loading. Add a trust_remote_code: bool = False parameter and pass the descriptor's setting from callers, matching the pattern used by load_model_config() in the same codebase.

Suggested fix

def copy_tokenizer( source_dir_or_tokenizer_name: Path | str, target_dir: Path | str, on_failure: Literal["raise", "warn"] = "raise", + trust_remote_code: bool = False, ) -> None: @@ try: tokenizer = AutoTokenizer.from_pretrained( - source_dir_or_tokenizer_name, trust_remote_code=True + source_dir_or_tokenizer_name, trust_remote_code=trust_remote_code )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/checkpoint_utils.py` around lines 161 - 177, The copy_tokenizer function currently hard-codes AutoTokenizer.from_pretrained(..., trust_remote_code=True); change the signature of copy_tokenizer to accept a trust_remote_code: bool = False parameter and pass that variable into AutoTokenizer.from_pretrained instead of True, and update any callers to forward the descriptor's trust_remote_code setting (same pattern used by load_model_config) so tokenizer loading honors the caller-provided trust flag rather than always allowing remote code execution.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/logger.py

+        elif (
+            (ranks == "main" and self.global_rank != 0)
+            or (ranks == "last" and self.local_rank != self.world_size - 1)
+            or (ranks == "local_main" and self.local_rank != 0)
+        ):
+            return


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for how lprint/last rank logging is used in the codebase rg -n "lprint|ranks.*last" --type=py

Repository: NVIDIA/Model-Optimizer

Length of output: 568

🏁 Script executed:

cd modelopt/torch/puzzletron/tools && head -100 logger.py | tail -60

Repository: NVIDIA/Model-Optimizer

Length of output: 2338

🏁 Script executed:

cat -n modelopt/torch/puzzletron/tools/logger.py | sed -n '50,90p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1838

🏁 Script executed:

rg -n "lprint\(" --type=py -A 2 -B 2 | head -50

Repository: NVIDIA/Model-Optimizer

Length of output: 380

🏁 Script executed:

rg -n "local_world_size|LOCAL_WORLD_SIZE" --type=py

Repository: NVIDIA/Model-Optimizer

Length of output: 160

Fix "last" rank logic to use global_rank instead of local_rank.

Line 74 compares self.local_rank against self.world_size - 1, which is incorrect. The docstring states "last" should log from "rank -1 in node 0" (the last global rank), and the lprint() function's docstring confirms this intent. This requires comparing self.global_rank != self.world_size - 1 to match the semantics of "main" (which checks global_rank != 0). The current code mixes local_rank (rank within a node) with world_size (global count), causing the "last" rank logic to fail.

- or (ranks == "last" and self.local_rank != self.world_size - 1) + or (ranks == "last" and self.global_rank != self.world_size - 1)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/logger.py` around lines 72 - 77, The "last" rank check uses self.local_rank against self.world_size - 1 which mixes per-node rank with global world size; update the condition in the logger method (the branch handling ranks == "last") to compare self.global_rank to self.world_size - 1 (i.e., change the check from self.local_rank != self.world_size - 1 to self.global_rank != self.world_size - 1) so the "last" semantics match the "main"/global checks used by lprint/logger.

coderabbitai · 2026-03-25T19:35:57Z

modelopt/torch/puzzletron/tools/validate_model.py

+        tokenizer = AutoTokenizer.from_pretrained(
+            tokenizer_name or args.model_name_or_path, trust_remote_code=True
+        )


⚠️ Potential issue | 🔴 Critical

CRITICAL: Hardcoded trust_remote_code=True violates security guidelines.

The AutoTokenizer.from_pretrained call hardcodes trust_remote_code=True, which enables arbitrary Python code execution from untrusted model sources. Per coding guidelines, this should be exposed as a caller-configurable parameter defaulting to False, or derived from the ModelDescriptor pattern (as the descriptor already provides requires_trust_remote_code()).

🔒 Proposed fix: Use descriptor's trust_remote_code setting

+from modelopt.torch.puzzletron.anymodel.model_descriptor import ModelDescriptorFactory + def prepare_dataloader( - args: DictConfig, tokenizer: PreTrainedTokenizerBase | None = None + args: DictConfig, tokenizer: PreTrainedTokenizerBase | None = None, trust_remote_code: bool = False ) -> DataLoader: if tokenizer is None: tokenizer_name = getattr(args, "tokenizer_name", None) assert (tokenizer_name is not None) or (args.model_name_or_path is not None) tokenizer = AutoTokenizer.from_pretrained( - tokenizer_name or args.model_name_or_path, trust_remote_code=True + tokenizer_name or args.model_name_or_path, trust_remote_code=trust_remote_code )

Then in validate_model, derive trust_remote_code from the descriptor:

trust_remote_code = descriptor.requires_trust_remote_code() val_dataloader = prepare_dataloader(args, tokenizer, trust_remote_code=trust_remote_code) if dist.is_master() else None

Based on learnings: "trust_remote_code should be determined by the ModelDescriptor pattern, not by a parameter passed to convert_model()."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/puzzletron/tools/validate_model.py` around lines 239 - 241, Replace the hardcoded trust_remote_code=True in the AutoTokenizer.from_pretrained call inside validate_model by deriving the flag from the ModelDescriptor (call descriptor.requires_trust_remote_code()); assign that result to a local trust_remote_code variable and pass it into AutoTokenizer.from_pretrained and any downstream callers such as prepare_dataloader (e.g., prepare_dataloader(..., trust_remote_code=trust_remote_code)) so that trust_remote_code is determined by the descriptor and not hardcoded.

codecov · 2026-03-25T19:37:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.19%. Comparing base (291498b) to head (c5ec50b).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1121      +/-   ##
==========================================
- Coverage   70.21%   70.19%   -0.02%     
==========================================
  Files         228      228              
  Lines       25952    25956       +4     
==========================================
  Hits        18221    18221              
- Misses       7731     7735       +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

kevalmorabia97 and others added 30 commits October 27, 2025 17:42

Add modelopt/torch/_compress CODEOWNERS

6c038f9

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Merge branch 'main' into feature/compress

230cee1

Remove llm_ptq example tests from CICD

54c5f0f

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Merge branch 'main' into feature/compress

ad1d18e

Add convert_llama3_config_to_decilm_config + unit test (#465)

cef3655

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Implement nas.convert() api for the compress algorithm (#482)

002b8b5

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

modelopt nas search() implementation for the compress algorithm (#490)

1c12fd8

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com> Co-authored-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Add decilm modelling code (#505)

f7d547f

## What does this PR do? Add decilm modelling code --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

llama converter is self-contained now (no dependency on internal nvid…

866e400

…ia code) (#552) ## What does this PR do? llama converter is self-contained now (no dependency on internal nvidia code) --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Merge branch 'main' into feature/compress

69726cc

Merge branch 'main' into feature/compress

07ca24d

Update README.md

2e559e7

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Add activation hooks used for pruning (#576)

f10be0d

## What does this PR do? Add activation hooks used for pruning --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add pruning checkpoints for the compress algorithm (#607)

1f72466

## What does this PR do? Add pruning checkpoints for the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add build replacement library to the compress algorithm. (#616)

97fe7f0

## What does this PR do? Add build replacement library to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add subblock stats to the compress algorithm (#623)

954103e

## What does this PR do? Add subblock stats to the compress algorithm. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add checkpoint save/load to ForwardHook + add IterativeChannelContrib…

56d95de

…utionHook (#610) ## What does this PR do? Add checkpoint save/load to ForwardHook + add IterativeChannelContributionHook. --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Merge branch 'main' into feature/compress

a1f63bc

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

danielkorzekwa and others added 18 commits March 12, 2026 17:51

Update uv.lock for nspect puzzletron scanning

2fbdf0e

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Merge branch 'main' into feature/puzzletron

f460d16

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Dkorzekwa/puzzletron use importance hooks from prune (#1115)

2f55c73

### What does this PR do? Use pruning importance hooks from prune/importance_hooks/ in modelopt.torch.puzzletron --------- Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Merge remote-tracking branch 'origin/main' into feature/puzzletron

c5ec50b

danielkorzekwa requested review from a team as code owners March 25, 2026 19:13

danielkorzekwa requested review from Edwardf0t1, jenchen13 and shengliangxu March 25, 2026 19:13

coderabbitai bot reviewed Mar 25, 2026

View reviewed changes

Conversation

danielkorzekwa commented Mar 25, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Usage

Testing

Before your PR is "Ready for review"

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

github-actions bot commented Mar 25, 2026

Built to branch gh-pages at 2026-03-25 19:18 UTC. Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 25, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

danielkorzekwa commented Mar 25, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 25, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-03-25 19:18 UTC.
Preview will be ready when the GitHub Pages deployment is complete.