align Mahalanobis and DA PTP with manuscript; harmonize use_empirical_variance default; bump to 0.8.0#14
Conversation
Three statistical corrections + one default-value harmonization;
all called out in the manuscript-vs-implementation review.
* Mahalanobis denominator sums the two posterior covariances
(cov1 + cov2) instead of averaging them. The manuscript defines
D(a,b) = sqrt((mu_a - mu_b)^T (Sigma_a + Sigma_b)^(-1) (mu_a - mu_b)),
i.e. the variance of the difference of two independent posterior
estimators. The GP-only branch in differential_expression.py was
computing (cov1 + cov2) / 2 even though the sample-variance branch
at the same call site was already summing.
Effect: D shrinks by sqrt(2) in the GP-only regime. Relative
rankings unchanged; FDR re-calibrates against the null. The
resource-estimation comment is brought in line too.
* Differential-abundance posterior tail probability is one-sided:
PTP = Phi(-|z|), matching the manuscript prose ("probability that
the true density difference has the opposite sign to the estimated
change") and the one-sided formula. The implementation was
emitting 2 * Phi(-|z|) via a spurious + np.log(2).
Effect: PTPs are halved; neg_log10_fold_change_ptp increases by
log10(2). A previously-published cutoff of PTP < 1e-3 corresponded
to |z| >= 3.29 (two-sided) and now corresponds to |z| >= 3.09
(one-sided). Re-tune any ptp_threshold chosen against 0.7.0.
* use_empirical_variance defaults to False everywhere. The
recommended kompot.de() path already defaulted to False via
GPSettings; the deprecated wrappers (compute_differential_expression,
compute_smoothed_expression), DifferentialExpression.__init__,
ExpressionModel.__init__, smooth_expression(gp=None), and
smooth_config_template.yaml all defaulted to True. Now consistent
with the manuscript's "empirical variance is disabled by default"
statement. Opt in via use_empirical_variance=True / GPSettings.
Regression coverage in tests/test_audit_fixes.py:
* monkeypatches compute_mahalanobis_distances to capture the
combined-covariance kwarg from a controlled-kernel fit and
asserts equality with cov1 + cov2 (not the average);
* replicates the ln_ptp formula and asserts equality with the
closed-form Phi(-|z|) both directly and end-to-end through
DifferentialAbundance.fit().predict();
* introspects every public entry point's signature and asserts
the default is False; also parses both CLI YAML templates.
tests/test_empirical_variance.py::test_default_is_on was the only
pre-existing test pinning the old default; it has been flipped and
renamed test_default_is_off.
pyproject.toml and kompot/version.py bumped to 0.8.0; CHANGELOG
entry under [0.8.0] - 2026-05-28 documents the migration notes
above. Release tag is not cut by this commit.
Gene-set-enrichment lollipop: one row per enriched term, a stem to a dot whose x-position encodes significance (-log10(FDR) by default, or any score column) and whose area encodes the matched-gene count, with a dashed FDR=0.05 guide and an in-axes aesthetic key. Ported from the kompot manuscript (Fig 3 G/L) and generalized. Headline feature is input flexibility: accepts a kompot.plot.StringDBReport (its get_functional_enrichment() is called for you), the signal-sorted frame that method returns, or a generic enrichment table from another tool (gseapy/enrichr, GOATOOLS, clusterProfiler). Column-name mapping params (term_col/score_col/count_col/fdr_col) with case-insensitive autodetection — including the gseapy 'k/K' Overlap-string parser — bridge the schemas. Like dotplot, composes into an externally-provided ax= instead of building its own GridSpec. Additive only; no existing plot code touched. 22 tests.
|
Added One row per enriched term; a stem runs to a dot whose x-position encodes Headline feature is input flexibility: accepts a Additive only; no existing plot code touched. New file Synthetic-data example renders and details are tracked internally. |
The DE Mahalanobis posterior tail probability was computed with chi2.sf in linear space. For an embedding dimension on the order of tens, chi2.sf returns values numerically indistinguishable from 1.0 for the majority of genes (every gene with D^2 below the chi-squared mean), saturating the stored statistic and destroying gene-ranking resolution at the head of the distribution. Compute the tail probability with scipy.stats.chi2.logsf in float64 and store -log10(PTP) in a renamed field neg_log10_ptp, mirroring the differential-abundance path (neg_log10_lfc_ptp). logsf is evaluated directly (never forming 1 - cdf), so every gene keeps a distinct value and the stored statistic recovers the Mahalanobis ranking exactly. scipy float64 is used deliberately: jax runs in float32 unless x64 is enabled, and float32 logsf re-collapses the dynamic range. - predict() now returns neg_log10_ptp; the stored var field and de() table column are renamed ..._ptp -> ..._neg_log10_ptp. - volcano_de(y_axis_type="ptp") reads the pre-transformed column directly (no extra -log10) and maps the probability threshold onto the -log10 axis; significance_threshold is still given as a probability. - Regression tests assert ranking equivalence with the Mahalanobis distance, restored distinct-value count, and dynamic range beyond [0, 1] that the old linear field could not represent.
Brings the implementation into agreement with the manuscript on
three statistical points and harmonizes a default-argument
inconsistency that had crept in across deprecated wrappers. Targets
v0.8.0.These are scalar recalibrations, not changes to what is detected.
The DE Mahalanobis change is a single uniform multiplicative factor
(
/√2on the distance): it does not alter significance detectionor gene ordering — every gene's distance scales by the same constant,
the null scales identically, so relative rankings and FDR call-sets are
unchanged after recalibration. The DA PTP change is likewise a scalar
(a halving, i.e. a constant shift in log space): it leaves the ordering
of points intact and only makes a fixed default threshold slightly
more sensitive. Absolute thresholds tuned against
0.7.0should bere-checked, but no qualitative result moves.
Mahalanobis denominator — sum, not average
The manuscript defines the gene-wise distance as
D(a,b) = sqrt((μ_a − μ_b)^T (Σ_a + Σ_b)^(-1) (μ_a − μ_b)),but the GP-posterior branch in
kompot/differential/differential_expression.py:629was computingcombined_cov = (cov1 + cov2) / 2. This change drops the/2so thecombined covariance equals the variance of the difference of two
independent posterior estimators. The sample-variance branch at the
same call site was already correctly summed — which is what made the
inconsistency visible.
Effect — purely a scalar.
Dcontracts by√2(andD²by2)everywhere. Because the identical factor applies to both the
observed and the null distances, this is a uniform rescaling of the
axis: it does not change which genes are called significant and it
does not change gene order. FDR thresholds re-calibrate against the
rescaled null, so call-sets are essentially unchanged; only the raw
numeric magnitude of
Ddiffers.DA posterior tail probability — one-sided
The manuscript prose describes a one-sided test ("the probability that
the true density difference has the opposite sign to the estimated
change"), and the formula
PTP(x_i) = Φ(−|Δ(x_i)| / sqrt(σ_a² + σ_b²))is one-sided. Theimplementation at
kompot/differential/differential_abundance.py:607-610was emitting2 · Φ(−|z|)(two-sided). This change removes the spurious+ np.log(2)soln_ptp = log Φ(−|z|).Effect — a scalar shift, slightly more sensitive default. Numeric
PTPs are halved (a constant multiplicative factor;
neg_log10_fold_change_ptpshifts up by
log10(2) ≈ 0.301). Point ordering is untouched. Thepractical consequence is only that a fixed PTP cutoff becomes a touch
more permissive: a
PTP < 1e-3threshold that meant|z| ≥ 3.29(two-sided) now means
|z| ≥ 3.09(one-sided) — i.e. the defaultthreshold is slightly more sensitive. Re-tune any
ptp_thresholdcalibrated against
0.7.0if exact parity is required.use_empirical_variance—FalseeverywhereThe manuscript states empirical variance is disabled by default. This
was already true for the recommended
kompot.de()path (viaGPSettings(use_empirical_variance=False)) and forde_config_template.yaml. Several other publicly-exposed entry pointswere inconsistent (default
True):kompot.differential.DifferentialExpression.__init__kompot.differential.expression_model.ExpressionModel.__init__kompot.anndata.differential_expression.compute_differential_expression(deprecated wrapper)
kompot.anndata.smooth.compute_smoothed_expression(deprecated wrapper)kompot.anndata.smooth.smooth_expression(gp=Nonefallback)kompot/cli/templates/smooth_config_template.yamlAll now default to
False, matching the recommended path and themanuscript. Docstrings and the resource-estimation comment for
combined-covariance memory are brought in line. Code relying on
empirical variance must now opt in explicitly via
GPSettings(use_empirical_variance=True)/ the keyword arg.0.7.0 → 0.8.0pyproject.tomlandkompot/version.pybumped to0.8.0; a newCHANGELOG.mdsection under## [0.8.0]records the three behaviorchanges above with migration notes. No release tag is cut by this PR.
Tests
tests/test_audit_fixes.pyadds three focused regression suites pinningthe post-
0.8.0invariants:compute_mahalanobis_distancesto capture thecovariancekwarg froma controlled-kernel fit (
cov1 = 2I,cov2 = 3I) and asserts equalitywith
5 · I. Pre-0.8.0would have produced2.5 · I.ln_ptpformula and asserts itequals the closed-form
Φ(−|z|), both directly and through a fullDifferentialAbundance.fit().predict()round-trip.use_empirical_variancedefault — introspects every public entrypoint's signature and asserts the default is
False; also parses bothCLI YAML templates.
tests/test_empirical_variance.py— the one pre-existing test pinningthe old buggy default (
test_default_is_on) is flipped and renamedtest_default_is_off. The Mahalanobis-approach tests(
tests/test_mahalanobis_approaches.py) compare implementations againsteach other rather than pinned numerics, so they are insensitive to the
scale shift.