Skip to content

feat: supervision#6

Open
xrusnack wants to merge 18 commits into
masterfrom
feature/supervision
Open

feat: supervision#6
xrusnack wants to merge 18 commits into
masterfrom
feature/supervision

Conversation

@xrusnack
Copy link
Copy Markdown
Member

@xrusnack xrusnack commented Mar 19, 2026

Summary by CodeRabbit

  • New Features

    • Added a comprehensive per-slide supervision system supporting annotation, prediction, CAM, and agreement modes for flexible label handling.
    • Exposed supervision utilities to assemble dataset-level supervision and per-slide summaries.
  • Configuration

    • Added training presets for crop-level and nuclei-level experiments and a baseline training configuration.
    • Expanded data and supervision configuration options and default MLflow URI wiring for dataset-specific runtimes.

@xrusnack xrusnack requested review from matejpekar and vejtek March 19, 2026 11:30
@xrusnack xrusnack self-assigned this Mar 19, 2026
@xrusnack xrusnack requested a review from a team March 19, 2026 11:30
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 19, 2026

Warning

Rate limit exceeded

@xrusnack has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 37 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 56 minutes and 37 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d26f6ed8-f30a-4ee9-ad38-d58b8e6a5391

📥 Commits

Reviewing files that changed from the base of the PR and between 6a042eb and a3c55f0.

📒 Files selected for processing (8)
  • configs/data/supervision/agreement.yaml
  • configs/data/supervision/annotation.yaml
  • configs/data/supervision/base.yaml
  • configs/data/supervision/cam.yaml
  • configs/data/supervision/prediction.yaml
  • configs/experiment/modeling/training/base.yaml
  • configs/ml.yaml
  • nuclei_graph/data/supervision.py
📝 Walkthrough

Walkthrough

Adds a new slide- and dataset-level supervision subsystem (configs + runtime factory) and new Hydra configurations for multiple supervision modes and training experiments; implements NucleiSupervision classes, a SupervisionStrategy factory, and build_supervision to produce DatasetSupervision mappings.

Changes

Cohort / File(s) Summary
Supervision configs
configs/data/supervision/base.yaml, configs/data/supervision/annotation.yaml, configs/data/supervision/cam.yaml, configs/data/supervision/agreement.yaml, configs/data/supervision/prediction.yaml
Add Hydra configs for supervision base and mode-specific overlays (annotation, cam, agreement, prediction); each composes the base and sets mode plus interpolated MLflow URIs.
Training experiment configs
configs/experiment/modeling/training/base.yaml, configs/experiment/modeling/training/crop_level.yaml, configs/experiment/modeling/training/nuclei_level.yaml
Add training base and two experiment presets (crop-level, nuclei-level) wiring supervision strategies, data placeholders, batch sizes, and metadata/run names.
Main ML config
configs/ml.yaml
Extend defaults to include nuclei-level training; expand dataset and structured data config (batch_size, workers, mlflow_uris, dataset, supervision.eval_strategy).
Supervision implementation
nuclei_graph/data/supervision.py, nuclei_graph/data/__init__.py
New module implementing abstract NucleiSupervision and concrete strategies (Dense, Annotation, Prediction, CAM, Agreement), SlideSupervision/DatasetSupervision dataclasses, SupervisionStrategy factory, and build_supervision builder. Export build_supervision from package.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Builder as build_supervision()
    participant Strategy as SupervisionStrategy
    participant Nuclei as NucleiSupervision
    participant Dataset as DatasetSupervision

    User->>Builder: provide sup_dfs, carcinoma_map, strategy
    Builder->>Builder: merge/join supervision DataFrames by slide_id
    loop per slide
        Builder->>Strategy: create(is_carcinoma, available label tensors)
        Strategy->>Nuclei: instantiate appropriate NucleiSupervision subclass
        Nuclei-->>Strategy: return supervision instance
        Strategy-->>Builder: supervision instance
        Builder->>Builder: wrap into SlideSupervision
    end
    Builder->>Dataset: assemble SlideSupervision map
    Dataset-->>Builder: DatasetSupervision
    Builder-->>User: return DatasetSupervision
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I stitched the slides with tiny thread,
Labels and CAMs hop in my head—
Agreement, prediction, annotation too,
A garden of signals, fresh as dew. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "feat: supervision" directly and concisely summarizes the main feature addition. The changeset introduces a complete supervision system with configuration files, core implementation, and public API exports, all centered on the supervision feature.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/supervision

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@xrusnack
Copy link
Copy Markdown
Member Author

/gemini summary

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

This pull request introduces a flexible and extensible supervision system for nucleus-level data within the nuclei_graph project. It provides a structured way to define and apply different labeling strategies, such as those derived from pathologist annotations, CAMs, or model predictions, to guide model training and evaluation. The changes include new Python classes for supervision logic, corresponding Hydra configurations, and updates to integrate this system into the overall data and experiment setup.

Highlights

  • New Supervision Framework: Introduced a comprehensive framework for defining and applying nucleus-level supervision strategies, including abstract and concrete classes for different supervision modes.
  • Configurable Supervision Modes: Added support for various supervision modes such as annotation-based, CAM-based, agreement-based (between annotation and CAM), and prediction-based, each with its own configuration.
  • Hydra Configuration Integration: Integrated the new supervision strategies into the Hydra configuration system, allowing for flexible selection and parameterization of supervision during training and evaluation.
  • New Training Experiment Configurations: Added new base and specific training experiment configurations (crop-level and nuclei-level) that leverage the newly introduced supervision framework.
  • Data Module Enhancement: Updated the core data module to include a utility function for building dataset-wide supervision based on the chosen strategy and available labels.
Changelog
  • configs/data/supervision/agreement.yaml
    • Added configuration for agreement-based nuclei supervision, combining annotation and CAM labels.
  • configs/data/supervision/annotation.yaml
    • Added configuration for annotation-based nuclei supervision.
  • configs/data/supervision/base.yaml
    • Added a base configuration for all nuclei supervision strategies, defining common parameters.
  • configs/data/supervision/cam.yaml
    • Added configuration for CAM-based nuclei supervision.
  • configs/data/supervision/prediction.yaml
    • Added configuration for prediction-based nuclei supervision.
  • configs/experiment/modeling/training/base.yaml
    • Added a base training configuration that incorporates supervision parameters for both training and validation strategies.
  • configs/experiment/modeling/training/crop_level.yaml
    • Added a new training experiment configuration for crop-level models, defaulting to annotation supervision.
  • configs/experiment/modeling/training/nuclei_level.yaml
    • Added a new training experiment configuration for nucleus-level models, using annotation for training and agreement for evaluation supervision.
  • configs/ml.yaml
    • Updated the default training experiment to 'nuclei_level' and expanded the data configuration to include batch size, worker counts, MLflow URIs for metadata and supervision, and an evaluation supervision strategy.
  • nuclei_graph/data/init.py
    • Imported and exposed the build_supervision function from the new supervision module.
  • nuclei_graph/data/supervision.py
    • Implemented abstract and concrete classes for various nuclei supervision types (Annotation, CAM, Prediction, Agreement).
    • Introduced SupervisionStrategy for dynamically creating supervision objects based on configuration.
    • Added build_supervision function to construct dataset-wide supervision maps.
Activity
  • A bot encountered an error while attempting to generate a summary.
  • Another bot reported exceeding the rate limit for commit reviews.
  • The author manually triggered a summary generation.

@RationAI RationAI deleted a comment from gemini-code-assist Bot Mar 19, 2026
@xrusnack
Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
configs/ml.yaml (1)

13-22: Consider clarifying the field override hierarchy.

The data section defines several fields that are also present in the experiment configs (e.g., batch_size, dataset, supervision.eval_strategy). Given the defaults composition order (line 4 places nuclei_level after _self_), values from nuclei_level.yaml will override these placeholders. While this is valid Hydra behavior, the duplication might confuse users.

Consider either:

  • Adding a comment explaining the override hierarchy
  • Consolidating required fields to avoid duplication
  • Using more specific placeholder names to clarify which configs should provide values

For example, nuclei_level.yaml sets batch_size: 32, which will override batch_size: ??? defined here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@configs/ml.yaml` around lines 13 - 22, The data section currently repeats
fields (data.batch_size, data.dataset, data.supervision.eval_strategy) that are
also provided by included level configs (e.g., nuclei_level.yaml) and this can
be confusing given the defaults composition order that places nuclei_level after
_self_. Update configs to clarify the override hierarchy by either adding a
concise comment in configs/ml.yaml explaining that nuclei_level.yaml will
override these placeholders, consolidating/removing duplicated placeholders here
so only one source supplies values, or renaming the placeholders to explicit
names (e.g., data.batch_size_default or data.batch_size_override) so it's clear
which file should supply the final value; reference data.batch_size,
data.dataset, data.supervision.eval_strategy, nuclei_level.yaml and the _self_
inclusion when making the change.
configs/experiment/modeling/training/nuclei_level.yaml (1)

12-20: Ensure TODO placeholders are addressed before production use.

The config contains multiple TODO sections (trainer, dataset, sampler, model) that must be populated for the experiment to run. This same pattern appears across multiple training experiment configs (crop_level.yaml, base.yaml), indicating systemic scaffolding. No documentation was found to guide completion of these placeholders.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@configs/experiment/modeling/training/nuclei_level.yaml` around lines 12 - 20,
The YAML contains unfilled TODO placeholders (trainer, dataset, sampler, model)
that will prevent runs; replace each TODO with concrete config entries: for
trainer fill optimizer, learning_rate, epochs, device, checkpointing and seed
(e.g., optimizer name, lr schedule, max_epochs), for dataset replace dataset
with dataset_name/path, split strategy and transforms/preprocessing, for sampler
replace sampler with sampler_type and any params (e.g., weighted, random,
batch_sampler settings), and for model specify architecture, pretrained flag,
num_classes and any head/hyperparameters; apply the same fixes to
crop_level.yaml and base.yaml, validate each file against the training schema or
config loader, and add/update short docs/comments describing required keys and
example values so future PRs don't reintroduce TODOs.
nuclei_graph/data/supervision.py (1)

197-209: Inner join may silently drop slides with partial supervision data.

The reduce with inner merge means only slides present in all provided DataFrames will be included. If a slide has annotation labels but missing CAM labels, it will be silently excluded. Consider whether an outer join with explicit handling of missing columns would be more appropriate, or at minimum log which slides were dropped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nuclei_graph/data/supervision.py` around lines 197 - 209, The current
reduce+pd.merge in sup_groups uses how="inner", which silently drops slides not
present in every DataFrame (sources from sup_dfs); change the merge strategy to
how="outer" (or perform pairwise outer merges) to preserve slides with partial
supervision, then explicitly handle missing columns/rows (e.g., fillna, add
boolean flags, or filter later) and log which slide_ids were dropped or have
missing supervision types; update references to sources/sup_dfs/sup_groups and
the reduce(pd.merge(..., how="inner")) call accordingly so downstream code can
detect and handle NaNs rather than losing rows silently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@configs/experiment/modeling/training/base.yaml`:
- Around line 14-15: The supervision block only declares train_strategy
(supervision.train_strategy: ???) but other code expects
supervision.eval_strategy; add a symmetric placeholder by declaring
supervision.eval_strategy: ??? so missing eval_strategy fails fast and clearly
during config composition—update the YAML block that defines supervision (the
train_strategy entry and add a new eval_strategy entry) to include the eval
placeholder.

In `@nuclei_graph/data/supervision.py`:
- Around line 174-176: Annotate the __init__ to satisfy mypy and clarify uris
usage: change def __init__(self, mode: str, **uris: str) -> None and add a typed
attribute assignment like self.uris: dict[str, str] = uris; if the stored uris
are intentionally unused, either remove the assignment entirely or rename to
self._uris: dict[str, str] = uris (or add a # noqa / # unused comment) to signal
future use. Ensure you import any typing primitives if needed for older Python
versions.
- Around line 180-188: Update the create signature to annotate the kwargs so
mypy knows the types of label lists: change def create(self, is_carcinoma: bool,
**all_labels) -> NucleiSupervision: to def create(self, is_carcinoma: bool,
**all_labels: Sequence[Any]) -> NucleiSupervision: and add the necessary typing
imports (Sequence, Any) at the top; keep the rest of the body the same
(filtered_labels usage will then be typed as a dict of sequences).
- Around line 220-224: The code calls sup_groups.get_group(slide_id) which
raises KeyError for carcinoma slides absent from the merged supervision groups;
to fix, guard that call by checking membership (e.g., if slide_id in
sup_groups.groups:) or wrap get_group in try/except KeyError, and on missing
group skip populating COL_TO_KWARG entries (or set explicit default tensors) and
optionally log a warning; update the block around is_carcinoma to use this check
and only iterate COL_TO_KWARG when the group exists to avoid the crash.

---

Nitpick comments:
In `@configs/experiment/modeling/training/nuclei_level.yaml`:
- Around line 12-20: The YAML contains unfilled TODO placeholders (trainer,
dataset, sampler, model) that will prevent runs; replace each TODO with concrete
config entries: for trainer fill optimizer, learning_rate, epochs, device,
checkpointing and seed (e.g., optimizer name, lr schedule, max_epochs), for
dataset replace dataset with dataset_name/path, split strategy and
transforms/preprocessing, for sampler replace sampler with sampler_type and any
params (e.g., weighted, random, batch_sampler settings), and for model specify
architecture, pretrained flag, num_classes and any head/hyperparameters; apply
the same fixes to crop_level.yaml and base.yaml, validate each file against the
training schema or config loader, and add/update short docs/comments describing
required keys and example values so future PRs don't reintroduce TODOs.

In `@configs/ml.yaml`:
- Around line 13-22: The data section currently repeats fields (data.batch_size,
data.dataset, data.supervision.eval_strategy) that are also provided by included
level configs (e.g., nuclei_level.yaml) and this can be confusing given the
defaults composition order that places nuclei_level after _self_. Update configs
to clarify the override hierarchy by either adding a concise comment in
configs/ml.yaml explaining that nuclei_level.yaml will override these
placeholders, consolidating/removing duplicated placeholders here so only one
source supplies values, or renaming the placeholders to explicit names (e.g.,
data.batch_size_default or data.batch_size_override) so it's clear which file
should supply the final value; reference data.batch_size, data.dataset,
data.supervision.eval_strategy, nuclei_level.yaml and the _self_ inclusion when
making the change.

In `@nuclei_graph/data/supervision.py`:
- Around line 197-209: The current reduce+pd.merge in sup_groups uses
how="inner", which silently drops slides not present in every DataFrame (sources
from sup_dfs); change the merge strategy to how="outer" (or perform pairwise
outer merges) to preserve slides with partial supervision, then explicitly
handle missing columns/rows (e.g., fillna, add boolean flags, or filter later)
and log which slide_ids were dropped or have missing supervision types; update
references to sources/sup_dfs/sup_groups and the reduce(pd.merge(...,
how="inner")) call accordingly so downstream code can detect and handle NaNs
rather than losing rows silently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: db751d6d-0769-4759-9de0-ca7d218d8bbf

📥 Commits

Reviewing files that changed from the base of the PR and between a377343 and 49aa6a5.

📒 Files selected for processing (11)
  • configs/data/supervision/agreement.yaml
  • configs/data/supervision/annotation.yaml
  • configs/data/supervision/base.yaml
  • configs/data/supervision/cam.yaml
  • configs/data/supervision/prediction.yaml
  • configs/experiment/modeling/training/base.yaml
  • configs/experiment/modeling/training/crop_level.yaml
  • configs/experiment/modeling/training/nuclei_level.yaml
  • configs/ml.yaml
  • nuclei_graph/data/__init__.py
  • nuclei_graph/data/supervision.py

Comment thread configs/experiment/modeling/training/base.yaml
Comment thread nuclei_graph/data/supervision.py Outdated
Comment thread nuclei_graph/data/supervision.py Outdated
Comment thread nuclei_graph/data/supervision.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
nuclei_graph/data/supervision.py (2)

197-198: Consider using explicit exception for empty sources validation.

Same as above—assert can be disabled with -O. An empty sources list indicates a configuration error that should fail reliably.

♻️ Proposed fix
     sources = [df for df in sup_dfs.values() if df is not None]
-    assert sources
+    if not sources:
+        raise ValueError("At least one supervision DataFrame must be provided")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nuclei_graph/data/supervision.py` around lines 197 - 198, Replace the
runtime-unsafe assert with an explicit check that raises a clear exception:
after building sources = [df for df in sup_dfs.values() if df is not None], add
an if not sources: raise ValueError(...) (or a custom ConfigurationError) with a
descriptive message mentioning sup_dfs/sources so the configuration error fails
reliably even when Python is run with optimizations disabled.

184-186: Consider using explicit exception instead of assert for runtime validation.

assert statements can be disabled when Python runs with optimization flags (-O or -OO). For carcinoma slides, missing required labels is a data integrity issue that should fail reliably.

♻️ Proposed fix
         if is_carcinoma:
             for k, v in filtered_labels.items():
-                assert len(v) > 0, f"Missing required label {k} for slide"
+                if len(v) == 0:
+                    raise ValueError(f"Missing required label '{k}' for carcinoma slide")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nuclei_graph/data/supervision.py` around lines 184 - 186, Replace the runtime
use of assert in the carcinoma-label check with an explicit exception so
failures can't be skipped under Python optimizations: in the block that checks
is_carcinoma and iterates filtered_labels (the loop using for k, v in
filtered_labels.items()), change the assertion to raise a clear exception (e.g.,
ValueError or a custom DataValidationError) that includes the missing label name
and context about the slide so the validation always fails reliably at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@nuclei_graph/data/supervision.py`:
- Around line 197-198: Replace the runtime-unsafe assert with an explicit check
that raises a clear exception: after building sources = [df for df in
sup_dfs.values() if df is not None], add an if not sources: raise
ValueError(...) (or a custom ConfigurationError) with a descriptive message
mentioning sup_dfs/sources so the configuration error fails reliably even when
Python is run with optimizations disabled.
- Around line 184-186: Replace the runtime use of assert in the carcinoma-label
check with an explicit exception so failures can't be skipped under Python
optimizations: in the block that checks is_carcinoma and iterates
filtered_labels (the loop using for k, v in filtered_labels.items()), change the
assertion to raise a clear exception (e.g., ValueError or a custom
DataValidationError) that includes the missing label name and context about the
slide so the validation always fails reliably at runtime.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5182ffe5-2ba7-44e3-a529-d3b3e392474c

📥 Commits

Reviewing files that changed from the base of the PR and between 49aa6a5 and 6a042eb.

📒 Files selected for processing (1)
  • nuclei_graph/data/supervision.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant