2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -8,6 +8,7 @@ All notable changes to this project will be documented in this file.

- **`--dry-run` flag for `kompot de` CLI**: estimates memory, disk, and output field requirements without running the analysis. Outputs machine-parseable JSON to stdout and a human-readable report to stderr. Exit code reflects feasibility.
- **`kompot.configure_logging(stream)`**: reconfigure the kompot logger output stream. The CLI now logs to stderr by default, keeping stdout clean for machine-parseable output (dry-run JSON, table output).
- **`kompot.plot.dotplot`**: ax-embeddable fold-change-per-group dotplot. Color = mean of a per-cell LFC layer within each `groupby` category; size = fraction of cells expressing. Gene selection is either an explicit list or auto-picked top-N by Mahalanobis from run history (with optional `filter_key`, e.g. restricting to `is_de=True`). Pass `axes=(main, cbar, size_legend)` to compose into a larger figure, or leave `axes=None` for a standalone figure. Unlike `scanpy.pl.DotPlot`, this function does not build its own `GridSpec` and does not fight externally-provided axes, which is the whole reason it exists. Shares gene-selection, layer-fetch, and colormap-normalization primitives with `kompot.plot.heatmap` via the existing `heatmap.utils` helpers.
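
The `--dry-run` entry above says JSON goes to stdout, the human-readable report goes to stderr, and the exit code reflects feasibility. A minimal sketch of how a calling script might consume that split is shown below. The real `kompot de` arguments and the JSON schema are not given here, so the command is a stand-in Python one-liner and the `feasible` key is purely hypothetical.

```python
import json
import subprocess
import sys


def run_dry_run(cmd):
    """Run a command that prints JSON to stdout and a report to stderr.

    Returns (exit_code, parsed_json_or_None, stderr_text).
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # stdout is expected to hold only machine-parseable JSON;
    # the human-readable report arrives separately on stderr.
    estimate = json.loads(proc.stdout) if proc.stdout.strip() else None
    return proc.returncode, estimate, proc.stderr


# Stand-in for the real CLI call (hypothetical output, not kompot's schema).
fake_cli = [
    sys.executable,
    "-c",
    "import json, sys; "
    "print(json.dumps({'feasible': True})); "
    "print('report', file=sys.stderr); "
    "sys.exit(0)",
]

code, estimate, report = run_dry_run(fake_cli)
```

Because the report never touches stdout, the JSON can be parsed without filtering log lines, which is presumably the motivation for moving the logger to stderr.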

### Improvements

@@ -18,7 +19,6 @@ All notable changes to this project will be documented in this file.
- Add `smooth_expression()` module to Sphinx API docs.
- Add `RunInfo.to_settings()` and `call_args()` to documented members.
- Fix "Gene Expression Imputation" → "Gene Expression Smoothing" in docs toctree.

## [0.7.0] - 2026-04-13

### Breaking changes
142 changes: 113 additions & 29 deletions examples/02_differential_expression_detailed.ipynb
@@ -106,7 +106,7 @@
{
"data": {
"text/plain": [
"AnnData object with n_obs × n_vars = 8090 × 16285\n",
"AnnData object with n_obs \u00d7 n_vars = 8090 \u00d7 16285\n",
" obs: 'Compartment', 'Replicate', 'Age', 'Sample', 'Info', 'batch', 'doublet_score', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_hb', 'pct_counts_hb', 'S_score', 'G2M_score', 'phase', 'leiden', 'phenograph', 'highres_celltype', 'midres_celltype'\n",
" var: 'gene_ids', 'feature_types', 'genome', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'hb', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'highly_variable_nbatches', 'highly_variable_intersection'\n",
" uns: 'Age_colors', 'Compartment_colors', 'DMEigenValues', 'Info_colors', 'README', 'Replicate_colors', 'Sample_colors', 'batch_colors', 'draw_graph', 'highres_celltype_colors', 'hvg', 'leiden', 'leiden_colors', 'midres_celltype_colors', 'neighbors', 'pca', 'phase_colors', 'umap', 'DM_EigenValues'\n",
@@ -148,7 +148,7 @@
"\n",
"### FDR Settings (`kompot.FDRSettings`)\n",
"\n",
"- **`null_genes`**: Number of permuted genes for FDR estimation (default: \"auto\" → 2000)\n",
"- **`null_genes`**: Number of permuted genes for FDR estimation (default: \"auto\" \u2192 2000)\n",
" - Higher values give better FDR estimates but increase computation time\n",
" - Set to 0 to disable FDR computation\n",
"\n",
@@ -669,6 +669,90 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dotplot Customization\n",
"\n",
"The [dotplot](https://kompot.readthedocs.io/en/latest/plotting.html#kompot.plot.dotplot) function renders the same per-group fold-change summary as `kompot.plot.heatmap` but adds a second encoding dimension: dot **color** is the mean per-cell LFC (like `heatmap(fold_change_mode=True)`), and dot **size** is the fraction of cells in each category whose expression exceeds a threshold. Both encodings come from the same kompot DE run.\n",
"\n",
"It also accepts externally-provided axes (`axes=(main, cbar, size_legend)`), so it composes cleanly into figure-level layouts where `scanpy.pl.DotPlot`'s built-in GridSpec would fight back.\n",
"\n",
"### Auto-pick top genes by Mahalanobis\n",
"\n",
"With `genes=None`, the top `n_top` genes are picked by the Mahalanobis column inferred from the latest kompot DE run \u2014 same default as `heatmap`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"categories = [\n",
" c for c in adata.obs[CELL_TYPE_COLUMN].cat.categories\n",
" if c != \"Plasma cell\"\n",
"]\n",
"\n",
"kompot.plot.dotplot(\n",
" adata,\n",
" genes=None,\n",
" groupby=CELL_TYPE_COLUMN,\n",
" categories_order=categories,\n",
" n_top=15,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Custom gene list\n",
"\n",
"Pass an explicit `genes` list when you want to match a specific comparison or reproduce a figure \u2014 same shape as `heatmap`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kompot.plot.dotplot(\n",
" adata,\n",
" genes=custom_genes,\n",
" groupby=CELL_TYPE_COLUMN,\n",
" categories_order=categories,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Adjusting the color scale and size encoding\n",
"\n",
"By default the color scale is symmetric around 0, with limits set by the `vabs_pct`-th percentile of `|LFC|`. Lower `vabs_pct` to tighten the scale when a few outlier genes compress the visible dynamic range, and tune `size_exponent` / `dot_max` to emphasise differences in the fraction of expressing cells:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kompot.plot.dotplot(\n",
" adata,\n",
" genes=custom_genes,\n",
" groupby=CELL_TYPE_COLUMN,\n",
" categories_order=categories,\n",
" vabs_pct=90, # tighter color scale\n",
" size_exponent=2.0, # stronger fraction-expressing emphasis\n",
" dot_max=80,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -973,14 +1057,14 @@
" display: none;\n",
" }\n",
" .kompot-runinfo summary::before {\n",
" content: \"▶ \";\n",
" content: \"\u25b6 \";\n",
" display: inline-block;\n",
" font-size: 0.8em;\n",
" color: #888;\n",
" margin-right: 5px;\n",
" }\n",
" .kompot-runinfo details[open] > summary::before {\n",
" content: \"▼ \";\n",
" content: \"\u25bc \";\n",
" }\n",
" .kompot-runinfo table {\n",
" width: 100%;\n",
@@ -1168,14 +1252,14 @@
" display: none;\n",
" }\n",
" .kompot-comparison summary::before {\n",
" content: \"▶ \";\n",
" content: \"\u25b6 \";\n",
" display: inline-block;\n",
" font-size: 0.8em;\n",
" color: #888;\n",
" margin-right: 5px;\n",
" }\n",
" .kompot-comparison details[open] > summary::before {\n",
" content: \"▼ \";\n",
" content: \"\u25bc \";\n",
" }\n",
" .kompot-comparison table {\n",
" width: 100%;\n",
@@ -1433,17 +1517,17 @@
" Memory: 31.76 GB (10% of available)\n",
"\n",
"Memory Allocations:\n",
" • Mellon precision matrix L (condition 1, 2,917/3,116 cells) (np.int64(2917), np.int64(2917)): 64.92 MB\n",
" • Mellon precision matrix L (condition 2, 2,917/3,116 cells) (np.int64(3116), np.int64(3116)): 74.08 MB\n",
" • Imputed expression (condition 1) (8090, 18285): 1.10 GB → adata.layers['kompot_de_Young_imputed']\n",
" • Imputed expression (condition 2) (8090, 18285): 1.10 GB → adata.layers['kompot_de_Old_imputed']\n",
" • Fold change (8090, 18285): 1.10 GB → adata.layers['kompot_de_Young_to_Old_fold_change']\n",
" • Temporary matrices during predictions (batch_size=100) (100, 5000) + (100, 18285): 17.77 MB\n",
" • Peak intermediate arrays during predictions (~25 arrays) 25×(8090, 18285): 27.55 GB\n",
" • Function predictor covariances (per condition) (5000, 5000): 381.47 MB\n",
" • Combined covariance matrix (5000, 5000): 190.73 MB\n",
" • Cholesky decomposition (for Mahalanobis) (5000, 5000): 190.73 MB\n",
" • Mahalanobis batch processing (batch_size=100) (100, 5000): 3.81 MB\n",
" \u2022 Mellon precision matrix L (condition 1, 2,917/3,116 cells) (np.int64(2917), np.int64(2917)): 64.92 MB\n",
" \u2022 Mellon precision matrix L (condition 2, 2,917/3,116 cells) (np.int64(3116), np.int64(3116)): 74.08 MB\n",
" \u2022 Imputed expression (condition 1) (8090, 18285): 1.10 GB \u2192 adata.layers['kompot_de_Young_imputed']\n",
" \u2022 Imputed expression (condition 2) (8090, 18285): 1.10 GB \u2192 adata.layers['kompot_de_Old_imputed']\n",
" \u2022 Fold change (8090, 18285): 1.10 GB \u2192 adata.layers['kompot_de_Young_to_Old_fold_change']\n",
" \u2022 Temporary matrices during predictions (batch_size=100) (100, 5000) + (100, 18285): 17.77 MB\n",
" \u2022 Peak intermediate arrays during predictions (~25 arrays) 25\u00d7(8090, 18285): 27.55 GB\n",
" \u2022 Function predictor covariances (per condition) (5000, 5000): 381.47 MB\n",
" \u2022 Combined covariance matrix (5000, 5000): 190.73 MB\n",
" \u2022 Cholesky decomposition (for Mahalanobis) (5000, 5000): 190.73 MB\n",
" \u2022 Mahalanobis batch processing (batch_size=100) (100, 5000): 3.81 MB\n",
"\n",
"Output Fields:\n",
" adata.layers:\n",
@@ -1457,16 +1541,16 @@
" - kompot_de_Young_to_Old_is_de\n",
"\n",
"Info:\n",
" ℹ Null distribution will use 2000 additional genes (total: 18285 genes processed)\n",
" ℹ Cell batching reduces memory: Each of 4 prediction operations uses ~17.77 MB temporary arrays instead of 1.40 GB (saving 1.39 GB).\n",
" ℹ Prediction creates ~25 intermediate arrays of shape (8,090, 18285). These coexist at peak memory (27.55 GB) but are freed before completion.\n",
" ℹ Mahalanobis computation processes 100 genes per batch. Reduce via gp=GPSettings(batch_size=...) to lower peak memory (currently 3.81 MB for batch arrays).\n",
" \u2139 Null distribution will use 2000 additional genes (total: 18285 genes processed)\n",
" \u2139 Cell batching reduces memory: Each of 4 prediction operations uses ~17.77 MB temporary arrays instead of 1.40 GB (saving 1.39 GB).\n",
" \u2139 Prediction creates ~25 intermediate arrays of shape (8,090, 18285). These coexist at peak memory (27.55 GB) but are freed before completion.\n",
" \u2139 Mahalanobis computation processes 100 genes per batch. Reduce via gp=GPSettings(batch_size=...) to lower peak memory (currently 3.81 MB for batch arrays).\n",
"\n",
"Warnings:\n",
" ⚠ Results with result_key='kompot_de' already exist (run_id=1). Previous run: 2026-03-26T05:39:38.441389 comparing Young to Mid (null_genes=2000). Fields that will be overwritten: var.kompot_de_Young_to_Old_mahalanobis, var.kompot_de_Young_to_Old_mean_lfc, layers.kompot_de_Young_imputed, layers.kompot_de_Old_imputed, layers.kompot_de_Young_to_Old_fold_change and 2 more\n",
" \u26a0 Results with result_key='kompot_de' already exist (run_id=1). Previous run: 2026-03-26T05:39:38.441389 comparing Young to Mid (null_genes=2000). Fields that will be overwritten: var.kompot_de_Young_to_Old_mahalanobis, var.kompot_de_Young_to_Old_mean_lfc, layers.kompot_de_Young_imputed, layers.kompot_de_Old_imputed, layers.kompot_de_Young_to_Old_fold_change and 2 more\n",
"\n",
"================================================================================\n",
"STATUS: ⚠ FEASIBLE WITH WARNINGS - Proceed with caution\n",
"STATUS: \u26a0 FEASIBLE WITH WARNINGS - Proceed with caution\n",
"================================================================================\n"
]
}
@@ -1562,12 +1646,12 @@
"\n",
"This tutorial covered:\n",
"\n",
"✓ Customizing DE parameters (`null_genes`, `sigma`, `batch_size`) \n",
"✓ Advanced volcano plot options \n",
"✓ Expression visualization techniques \n",
"✓ Heatmap customization \n",
"✓ Managing multiple comparisons with `run_id` \n",
"✓ Resource planning with dry runs \n",
"\u2713 Customizing DE parameters (`null_genes`, `sigma`, `batch_size`) \n",
"\u2713 Advanced volcano plot options \n",
"\u2713 Expression visualization techniques \n",
"\u2713 Heatmap customization \n",
"\u2713 Managing multiple comparisons with `run_id` \n",
"\u2713 Resource planning with dry runs \n",
"\n",
"### Next Steps\n",
"\n",
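
The notebook's new "Dotplot Customization" cells describe two encodings (dot color as the mean per-cell LFC within each group, dot size as the fraction of cells expressing above a threshold) and a color scale that is symmetric around 0 and keyed on a percentile of `|LFC|`. The sketch below illustrates that arithmetic for a single gene in plain Python. It is not kompot's implementation: the function names, the `threshold` default, and the nearest-rank percentile rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean


def dot_encodings(lfc, expr, groups, threshold=0.0):
    """Per-group (mean LFC, fraction expressing) for one gene.

    lfc, expr, groups are parallel per-cell sequences.
    """
    by_group = defaultdict(list)
    for l, e, g in zip(lfc, expr, groups):
        by_group[g].append((l, e))
    out = {}
    for g, cells in by_group.items():
        color = mean(l for l, _ in cells)  # dot color
        size = sum(e > threshold for _, e in cells) / len(cells)  # dot size
        out[g] = (color, size)
    return out


def symmetric_limits(values, pct):
    """Symmetric (-v, v) limits from the pct-th percentile of |values|.

    Uses the nearest-rank percentile (an assumption; other
    interpolation rules would give slightly different limits).
    """
    ranked = sorted(abs(v) for v in values)
    k = max(1, math.ceil(pct / 100 * len(ranked)))
    v = ranked[k - 1]
    return -v, v


lfc = [1.0, 3.0, -2.0, -4.0]
expr = [0.0, 2.0, 1.0, 1.0]
groups = ["A", "A", "B", "B"]

enc = dot_encodings(lfc, expr, groups)
full = symmetric_limits(lfc, 100)
tight = symmetric_limits(lfc, 50)
```

Lowering the percentile shrinks the limits, so values beyond them saturate at the ends of the colormap while the bulk of the genes use more of the color range.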
14 changes: 14 additions & 0 deletions kompot/plot/__init__.py
@@ -152,6 +152,20 @@ def plot_smoothing(*args, **kwargs):
        raise ImportError("Smoothing plot unavailable due to missing dependencies.")


try:
    from .dotplot import dotplot

    __all__.append("dotplot")
except ImportError as e:
    logger.warning(f"Could not import dotplot function due to: {e}")

    def dotplot(*args, **kwargs):
        raise ImportError(
            "Dotplot unavailable due to missing dependencies. "
            "matplotlib is required."
        )


# Import StringDB report class
try:
    from .stringdb import StringDBReport
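
The `dotplot` block added to `kompot/plot/__init__.py` follows the same optional-import pattern as the neighbouring plot functions: try the import, and on failure log a warning and bind a stub that raises only when called. A generic, self-contained sketch of that pattern is shown below. The `optional_import` helper and the module names used in the demo are hypothetical and are not part of kompot.

```python
import importlib
import logging

logger = logging.getLogger("optional_import_demo")


def optional_import(module_name, attr, hint):
    """Return (callable, available) for module_name.attr.

    If the import fails, the callable is a stub that raises
    ImportError when invoked, so importing the package itself
    never fails because of a missing optional dependency.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr), True
    except (ImportError, AttributeError) as e:
        logger.warning("Could not import %s due to: %s", attr, e)

        def _stub(*args, **kwargs):
            raise ImportError(
                f"{attr} unavailable due to missing dependencies. {hint}"
            )

        return _stub, False


# A module that exists, and one that does not (hypothetical name).
sqrt, sqrt_ok = optional_import("math", "sqrt", "")
missing, missing_ok = optional_import(
    "no_such_module_xyz", "thing", "Install the extra dependency."
)
```

The design defers the error to call time: users who never call the optional function are unaffected, and those who do get a message naming the missing dependency.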