plot: add dotplot — ax-embeddable fold-change dotplot + tutorial example #9

Merged
settylab-dotto-bot[bot] merged 3 commits into main from dominik/kompot-dotplot
Apr 28, 2026

Collaborator

katosh commented Apr 24, 2026

Summary

Adds kompot.plot.dotplot — an ax-embeddable fold-change dotplot with optional Mahalanobis auto-pick. Library-only, purely additive.

  • Color = mean of a per-cell LFC layer within each groupby category (e.g. kompot_de_<c1>_to_<c2>_fold_change).
  • Size = fraction of cells in each category whose expression layer exceeds expr_threshold (default 0).
  • Gene selection is either an explicit list or auto-picked top-N by Mahalanobis from the latest kompot DE run (filter_key restricts candidates, e.g. to is_de=True).
  • axes=None → standalone 3-panel figure (main + colorbar + size legend). axes=(main, cbar, size_legend) → drop into an externally-laid-out composite figure.
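The color and size encodings described above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not kompot's implementation: `aggregate_dots`, the toy frames, and the `cell_type` label are hypothetical names, and the real function reads AnnData layers rather than DataFrames.

```python
import pandas as pd


def aggregate_dots(lfc, expr, groups, expr_threshold=0.0):
    """Return (color, size) frames indexed by group, one column per gene.

    Hypothetical helper: `lfc` and `expr` are cells x genes DataFrames that
    share an index with `groups`, a categorical Series of per-cell labels.
    """
    # Color: mean of the per-cell LFC values within each group.
    color = lfc.groupby(groups, observed=True).mean()
    # Size: fraction of cells per group whose expression exceeds the threshold
    # (the mean of a boolean column is the fraction of True values).
    size = (expr > expr_threshold).groupby(groups, observed=True).mean()
    return color, size


cells = [f"cell{i}" for i in range(6)]
groups = pd.Series(
    pd.Categorical(["A", "A", "A", "B", "B", "B"]), index=cells, name="cell_type"
)
lfc = pd.DataFrame(
    {"g1": [1.0, 2.0, 3.0, -1.0, -1.0, -1.0], "g2": [0.0] * 6}, index=cells
)
expr = pd.DataFrame(
    {"g1": [0.0, 1.0, 2.0, 0.0, 0.0, 5.0], "g2": [1.0] * 6}, index=cells
)

color, size = aggregate_dots(lfc, expr, groups)
```

With the default threshold of 0, a cell counts as expressing only if its value is strictly positive, which is how "exceeds expr_threshold" is read here.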

Why

Fig 3 and Fig 4 of the kompot paper each reimplemented this dotplot:

  • Fig 3 uses sc.pl.DotPlot + swap_axes + a custom legend rebuilder.
  • Fig 4 dropped scanpy entirely (~160 lines of matplotlib primitives) because sc.pl.DotPlot builds its own internal GridSpec and cannot be composed into a subfigure with a caller-provided ax=.

A third figure in a future paper (or any downstream kompot user who wants this view in a panel composite) would copy the same code again. Lifting a stateless, ax-embeddable version into kompot.plot collapses both call sites to a few lines and makes the Mahalanobis-ranked "clean DE" view first-class in the library.

Heatmap-primitive reuse

Per reviewer feedback, dotplot now shares its gene-selection / layer-fetch / colorbar semantics with kompot.plot.heatmap by reusing four existing helpers from kompot.plot.heatmap.utils:

| Helper | Purpose | Before refactor | After refactor |
| --- | --- | --- | --- |
| _infer_score_key | Mahalanobis-column inference from run_info | Local copy in dotplot.py | Imported from heatmap.utils |
| _prepare_gene_list | list-or-top-N-by-score gene resolution | Hand-rolled in dotplot.py | Imported from heatmap.utils (used for the no-filter_key path) |
| _get_expression_matrix | dense layer / X fetch with sparse + missing-layer handling | Local _resolve_layer | Imported from heatmap.utils |
| _setup_colormap_normalization | TwoSlopeNorm + cmap-string→object | Inline Normalize(vmin=-vabs, vmax=vabs) + plt.cm.get_cmap | Imported from heatmap.utils |

What stayed local

  • _infer_lfc_layer. Heatmap colors by per-condition expression means, while dotplot colors by the per-cell LFC layer kompot writes during DE. There is no heatmap analog; adding one would only complicate heatmap's existing API.
  • _group_aggregate. Heatmap uses cond1_df.groupby(groupby, observed=observed).mean() inline. Dotplot now uses the same pandas-groupby idiom (rather than the previous manual masking loop), so the two call sites can be lifted into a shared helper with little effort once a third caller appears. A TODO in dotplot.py:_group_aggregate flags that extraction as the intended next step.
  • Dot-size encoding and size legend. Dotplot-only — heatmap has no fraction-expressing concept, and heatmap's sidebar legend is for conditions not dot sizes.
  • filter_key (e.g. restrict auto-pick to is_de=True). Heatmap doesn't expose this; adding it to _prepare_gene_list would be a behavioral change in heatmap and is out of scope for this PR.
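The filter_key path that stays local can be illustrated with a small pandas sketch. The helper name `pick_top_genes`, its signature, and the column names `mahalanobis` and `is_de` are assumptions made for this example; in kompot the score column is inferred from run_info and the real code may differ.

```python
import pandas as pd


def pick_top_genes(var, score_key, n_top=15, filter_key=None):
    """Hypothetical sketch: top-N genes by score, optionally pre-filtered.

    `var` is a per-gene table (like adata.var) indexed by gene name.
    """
    candidates = var
    if filter_key is not None:
        # Restrict candidates to genes whose boolean flag is True.
        candidates = var[var[filter_key].astype(bool)]
    # Rank the remaining genes by descending score and keep the first n_top.
    return candidates[score_key].nlargest(n_top).index.tolist()


var = pd.DataFrame(
    {"mahalanobis": [5.0, 9.0, 1.0, 7.0], "is_de": [True, False, True, True]},
    index=["g1", "g2", "g3", "g4"],
)

top_all = pick_top_genes(var, "mahalanobis", n_top=2)
top_de = pick_top_genes(var, "mahalanobis", n_top=2, filter_key="is_de")
```

In this toy table the highest-scoring gene is not flagged as DE, so the filtered pick differs from the unfiltered one.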

What API shape changed vs. the original kompot-dotplot commit

Nothing visible to callers. Arguments, defaults, and returns are unchanged from ad2715b. All 15 behavioral tests still pass, plus one new test (test_dotplot_reuses_heatmap_primitives) that pins the reuse surface so a future rename in heatmap.utils surfaces here first.
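One way a test can pin a reuse surface is to assert object identity between the names a consumer module exposes and the ones the provider defines. The sketch below uses stand-in modules built with `types.ModuleType` so it runs without kompot installed; it is not the body of test_dotplot_reuses_heatmap_primitives, and `shared_helpers_are_identical` is a hypothetical name.

```python
import types

# Stand-ins for kompot.plot.heatmap.utils and kompot.plot.dotplot (assumption:
# the real test would import the actual modules instead).
heatmap_utils = types.ModuleType("heatmap_utils_standin")
dotplot_mod = types.ModuleType("dotplot_standin")

SHARED_HELPERS = (
    "_infer_score_key",
    "_prepare_gene_list",
    "_get_expression_matrix",
    "_setup_colormap_normalization",
)

for name in SHARED_HELPERS:
    helper = lambda *args, **kwargs: None  # placeholder body
    setattr(heatmap_utils, name, helper)
    # Mimics `from heatmap.utils import <name>` inside the consumer module.
    setattr(dotplot_mod, name, helper)


def shared_helpers_are_identical(consumer, provider, names=SHARED_HELPERS):
    """True if each name in `consumer` is the very same object as in `provider`.

    A rename in `provider` raises AttributeError here, so the failure shows up
    in the consumer's test suite.
    """
    return all(getattr(consumer, n, None) is getattr(provider, n) for n in names)
```

Identity (`is`) rather than equality is the point of such a check: a re-introduced local copy with the same behavior would still fail it.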

What's in the diff

  • kompot/plot/dotplot.py — rewired to import _infer_score_key, _prepare_gene_list, _get_expression_matrix, _setup_colormap_normalization from kompot.plot.heatmap.utils. Per-group aggregation rewritten to use pandas groupby().mean() to match heatmap's idiom.
  • kompot/plot/__init__.py — exports dotplot with the same try/except import pattern used for siblings.
  • tests/test_plot_dotplot.py — 16 tests covering standalone vs. embedded rendering, auto-pick-by-Mahalanobis, filter_key narrowing, symmetric color scale with vabs_pct cap, vabs_min floor, vmax override, min_cells drop, categories_order filter/reorder, and the heatmap-helper reuse surface.
  • CHANGELOG.md — new [Unreleased] section above 0.7.0.
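The symmetric color scale exercised by the tests could work roughly as follows. The exact semantics of vabs_pct, vabs_min, and vmax are not spelled out in this PR, so this sketch assumes one plausible reading: the limit is a percentile of the absolute values, floored at vabs_min, with vmax overriding both. `symmetric_norm` is a hypothetical helper; only `TwoSlopeNorm` is a real matplotlib class.

```python
import numpy as np
from matplotlib.colors import TwoSlopeNorm


def symmetric_norm(values, vabs_pct=99.0, vabs_min=0.1, vmax=None):
    """Hypothetical sketch of a zero-centered, symmetric color normalization."""
    values = np.asarray(values, dtype=float)
    finite = np.abs(values[np.isfinite(values)])
    if vmax is not None:
        # Assumed: an explicit vmax overrides the data-driven limit.
        vabs = float(vmax)
    else:
        # Assumed: cap at a percentile of |values| so outliers do not wash out
        # the scale, then apply a floor so near-zero data stays readable.
        cap = np.percentile(finite, vabs_pct) if finite.size else 0.0
        vabs = max(float(cap), vabs_min)
    return TwoSlopeNorm(vmin=-vabs, vcenter=0.0, vmax=vabs)


norm_full = symmetric_norm([-0.5, 0.0, 0.25, 2.0, np.nan], vabs_pct=100)
norm_floor = symmetric_norm([0.0, 0.01])
norm_override = symmetric_norm([0.0, 0.01], vmax=3.0)
```

Centering at zero keeps a zero fold change on the neutral color of a diverging colormap regardless of how the limits are chosen.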

Tutorial example

Adds a Dotplot Customization section to examples/02_differential_expression_detailed.ipynb, mirroring the existing Heatmap Customization section. Three new cells demonstrate:

  1. Auto-picking genes by Mahalanobis from the latest DE run, with categories_order used to drop a low-cell category.
  2. Explicit-gene mode on custom_genes — the same gene list used by the adjacent heatmap example, so readers can compare the two views of the same data.
  3. Embedded mode — rendering the dotplot into a caller-provided (main, cbar, size_legend) matplotlib axes triple, demonstrating composability into a larger figure layout.
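The embedded mode in item 3 follows a common matplotlib pattern: the plotting function accepts an optional axes triple and only creates its own figure when none is given. The sketch below uses a stand-in `draw_dots` function with made-up data to show that pattern; it is not kompot.plot.dotplot, and its parameters (apart from the axes triple idea and the dot_max name, which appear in this PR) are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def draw_dots(color, size, axes=None, dot_max=200.0):
    """Stand-in for an ax-embeddable dotplot (hypothetical, for illustration)."""
    if axes is None:
        # Standalone mode: build the three panels ourselves.
        _, axes = plt.subplots(
            1, 3, figsize=(6, 3), gridspec_kw={"width_ratios": [10, 0.4, 2]}
        )
    main, cbar_ax, size_ax = axes
    n_groups, n_genes = color.shape
    xs, ys = np.meshgrid(np.arange(n_genes), np.arange(n_groups))
    vabs = float(np.abs(color).max()) or 1.0
    dots = main.scatter(
        xs.ravel(), ys.ravel(), c=color.ravel(), s=size.ravel() * dot_max,
        cmap="RdBu_r", vmin=-vabs, vmax=vabs,
    )
    # Draw the colorbar into the caller's axes instead of stealing space.
    main.figure.colorbar(dots, cax=cbar_ax)
    # Size legend built from empty proxy scatters.
    for frac in (0.25, 0.5, 1.0):
        size_ax.scatter([], [], s=frac * dot_max, color="gray", label=f"{frac:.0%}")
    size_ax.legend(loc="center", frameon=False)
    size_ax.axis("off")
    return main


# Composite figure: one unrelated panel plus the dotplot's axes triple.
fig = plt.figure(figsize=(9, 4))
gs = fig.add_gridspec(1, 4, width_ratios=[6, 10, 0.4, 2])
other_panel = fig.add_subplot(gs[0, 0])
triple = tuple(fig.add_subplot(gs[0, i]) for i in (1, 2, 3))

rng = np.random.default_rng(0)
main_ax = draw_dots(rng.normal(size=(3, 5)), rng.uniform(size=(3, 5)), axes=triple)
```

Because the function never creates axes of its own in embedded mode, the caller's layout stays intact.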

What's NOT in this PR

  • No refactor of Fig 3's panel_j_dotplot.py / panel_m_dotplot.py — separate downstream PR.
  • No refactor of Fig 4's panel_N_dotplot — separate downstream PR.
  • No chimera-filter helpers (_IMPRINTED_EXCLUDE, _HB_RE) — fig-specific, stays in the notebook.
  • No changes to heatmap.py internals or its public API. The reuse is one-directional (dotplot imports from heatmap; heatmap is untouched).

Test plan

  • pytest tests/test_plot_dotplot.py — 16/16 pass locally
  • Regression: pytest tests/test_plot_imports_and_volcano_da.py tests/test_plot_functions.py — 60 passed, 1 skipped (untouched)
  • CI green on this PR

Execution status

Tutorial notebook outputs: not yet populated. Run jupyter nbconvert --execute examples/02_differential_expression_detailed.ipynb before merge (in whichever env you normally use for this tutorial). The existing mamba envs reachable from the agent's sandbox are currently broken on unrelated deps: kompot_v1 has a zarr/anndata mismatch and kompot_v2 has a numba/NumPy 2.4 mismatch. Execution was therefore attempted on a best-effort basis and then skipped, rather than leaving half-executed state in the notebook.

katosh added 3 commits April 28, 2026 16:21
Fold-change-per-group dotplot where color = mean of a per-cell LFC
layer within each groupby category and size = fraction of cells
expressing. Gene list can be passed explicitly or auto-picked top-N
by Mahalanobis from kompot DE run history (with optional is_de
filter). Pass axes=(main, cbar, size_legend) to embed into a
composite figure, or axes=None for standalone.

This was previously duplicated across fig 3 (scanpy DotPlot variant)
and fig 4 (matplotlib-primitive variant ~160 lines) of the kompot
paper. Lifting it here gives every kompot user an ax-composable
version — scanpy's DotPlot fights externally-provided ax= by building
its own GridSpec, which is why the fig-4 authors had to reimplement
the whole plot to get it into a composite subfigure.

Additive API change only; no existing plot code touched.

Rewire dotplot to share its gene-selection, layer-fetch, and
colormap-normalization plumbing with kompot.plot.heatmap rather
than duplicating it. Four helpers moved from local copies to
imports from heatmap.utils:

  * _infer_score_key — Mahalanobis column inference from run_info
  * _prepare_gene_list — list-or-top-N-by-score gene resolution
    (used for the no-filter_key path; the filter path stays local
    since heatmap has no equivalent)
  * _get_expression_matrix — dense layer / X fetch
  * _setup_colormap_normalization — TwoSlopeNorm + cmap resolution

Per-category aggregation rewritten to use pandas groupby().mean()
matching heatmap's idiom, so a future extraction into a shared
plot/_utils.py helper is trivial. TODO comment flags it.

What stayed local and why:
  * _infer_lfc_layer — heatmap colors by per-condition expression,
    not per-cell LFC; no analog in heatmap.utils.
  * _group_aggregate — see TODO; no third caller yet to justify
    lifting.
  * Dot-size encoding + size-legend — dotplot-only concerns.

Public API unchanged from ad2715b — all existing tests still pass,
plus one new test (test_dotplot_reuses_heatmap_primitives) pinning
the reuse surface so future renames in heatmap.utils surface here
first.

Mirrors the Heatmap Customization section on the same tutorial data,
showing dotplot as a fold-change + fraction-expressing alternative to
heatmap(fold_change_mode=True). Three parallel variants:

- auto-pick top-15 genes by Mahalanobis (matches heatmap default)
- explicit `genes=` list (matches heatmap custom-gene cell)
- scale tuning (vabs_pct, size_exponent, dot_max) — no heatmap analog;
  demonstrates the size-encoding dimension dotplot adds.

Output cells are empty; run `jupyter nbconvert --execute` (or re-render
via the docs build) before merge to populate them.
katosh force-pushed the dominik/kompot-dotplot branch from c2f8af3 to 4432d4f on April 28, 2026 23:22
settylab-dotto-bot[bot] merged commit 62e83b6 into main on Apr 28, 2026
4 of 5 checks passed
settylab-dotto-bot[bot] deleted the dominik/kompot-dotplot branch on April 28, 2026 23:53