Skip to content

align Mahalanobis and DA PTP with manuscript; harmonize use_empirical_variance default; bump to 0.8.0#14

Open
settylab-dotto-bot[bot] wants to merge 3 commits into
mainfrom
dominik/audit-fixes-2026-05-28
Open

align Mahalanobis and DA PTP with manuscript; harmonize use_empirical_variance default; bump to 0.8.0#14
settylab-dotto-bot[bot] wants to merge 3 commits into
mainfrom
dominik/audit-fixes-2026-05-28

Conversation

@settylab-dotto-bot
Copy link
Copy Markdown
Contributor

@settylab-dotto-bot settylab-dotto-bot Bot commented May 29, 2026

Brings the implementation into agreement with the manuscript on
three statistical points and harmonizes a default-argument
inconsistency that had crept in across deprecated wrappers. Targets
v0.8.0.

These are scalar recalibrations, not changes to what is detected.
The DE Mahalanobis change is a single uniform multiplicative factor
(/√2 on the distance): it does not alter significance detection
or gene ordering — every gene's distance scales by the same constant,
the null scales identically, so relative rankings and FDR call-sets are
unchanged after recalibration. The DA PTP change is likewise a scalar
(a halving, i.e. a constant shift in log space): it leaves the ordering
of points intact and only makes a fixed default threshold slightly
more sensitive
. Absolute thresholds tuned against 0.7.0 should be
re-checked, but no qualitative result moves.

Mahalanobis denominator — sum, not average

The manuscript defines the gene-wise distance as
D(a,b) = sqrt((μ_a − μ_b)^T (Σ_a + Σ_b)^(-1) (μ_a − μ_b)),
but the GP-posterior branch in
kompot/differential/differential_expression.py:629 was computing
combined_cov = (cov1 + cov2) / 2. This change drops the /2 so the
combined covariance equals the variance of the difference of two
independent posterior estimators. The sample-variance branch at the
same call site was already correctly summed — which is what made the
inconsistency visible.

Effect — purely a scalar. D contracts by √2 (and by 2)
everywhere. Because the identical factor applies to both the
observed and the null distances, this is a uniform rescaling of the
axis: it does not change which genes are called significant and it
does not change gene order. FDR thresholds re-calibrate against the
rescaled null, so call-sets are essentially unchanged; only the raw
numeric magnitude of D differs.

DA posterior tail probability — one-sided

The manuscript prose describes a one-sided test ("the probability that
the true density difference has the opposite sign to the estimated
change"), and the formula
PTP(x_i) = Φ(−|Δ(x_i)| / sqrt(σ_a² + σ_b²)) is one-sided. The
implementation at
kompot/differential/differential_abundance.py:607-610 was emitting
2 · Φ(−|z|) (two-sided). This change removes the spurious
+ np.log(2) so ln_ptp = log Φ(−|z|).

Effect — a scalar shift, slightly more sensitive default. Numeric
PTPs are halved (a constant multiplicative factor; neg_log10_fold_change_ptp
shifts up by log10(2) ≈ 0.301). Point ordering is untouched. The
practical consequence is only that a fixed PTP cutoff becomes a touch
more permissive: a PTP < 1e-3 threshold that meant |z| ≥ 3.29
(two-sided) now means |z| ≥ 3.09 (one-sided) — i.e. the default
threshold is slightly more sensitive. Re-tune any ptp_threshold
calibrated against 0.7.0 if exact parity is required.

use_empirical_varianceFalse everywhere

The manuscript states empirical variance is disabled by default. This
was already true for the recommended kompot.de() path (via
GPSettings(use_empirical_variance=False)) and for
de_config_template.yaml. Several other publicly-exposed entry points
were inconsistent (default True):

  • kompot.differential.DifferentialExpression.__init__
  • kompot.differential.expression_model.ExpressionModel.__init__
  • kompot.anndata.differential_expression.compute_differential_expression
    (deprecated wrapper)
  • kompot.anndata.smooth.compute_smoothed_expression (deprecated wrapper)
  • kompot.anndata.smooth.smooth_expression (gp=None fallback)
  • kompot/cli/templates/smooth_config_template.yaml

All now default to False, matching the recommended path and the
manuscript. Docstrings and the resource-estimation comment for
combined-covariance memory are brought in line. Code relying on
empirical variance must now opt in explicitly via
GPSettings(use_empirical_variance=True) / the keyword arg.

0.7.0 → 0.8.0

pyproject.toml and kompot/version.py bumped to 0.8.0; a new
CHANGELOG.md section under ## [0.8.0] records the three behavior
changes above with migration notes. No release tag is cut by this PR.

Tests

tests/test_audit_fixes.py adds three focused regression suites pinning
the post-0.8.0 invariants:

  • Mahalanobis denominator — monkeypatches
    compute_mahalanobis_distances to capture the covariance kwarg from
    a controlled-kernel fit (cov1 = 2I, cov2 = 3I) and asserts equality
    with 5 · I. Pre-0.8.0 would have produced 2.5 · I.
  • DA PTP — replicates the exact ln_ptp formula and asserts it
    equals the closed-form Φ(−|z|), both directly and through a full
    DifferentialAbundance.fit().predict() round-trip.
  • use_empirical_variance default — introspects every public entry
    point's signature and asserts the default is False; also parses both
    CLI YAML templates.

tests/test_empirical_variance.py — the one pre-existing test pinning
the old buggy default (test_default_is_on) is flipped and renamed
test_default_is_off. The Mahalanobis-approach tests
(tests/test_mahalanobis_approaches.py) compare implementations against
each other rather than pinned numerics, so they are insensitive to the
scale shift.

Three statistical corrections + one default-value harmonization;
all called out in the manuscript-vs-implementation review.

* Mahalanobis denominator sums the two posterior covariances
  (cov1 + cov2) instead of averaging them. The manuscript defines
  D(a,b) = sqrt((mu_a - mu_b)^T (Sigma_a + Sigma_b)^(-1) (mu_a - mu_b)),
  i.e. the variance of the difference of two independent posterior
  estimators. The GP-only branch in differential_expression.py was
  computing (cov1 + cov2) / 2 even though the sample-variance branch
  at the same call site was already summing.
  Effect: D shrinks by sqrt(2) in the GP-only regime. Relative
  rankings unchanged; FDR re-calibrates against the null. The
  resource-estimation comment is brought in line too.

* Differential-abundance posterior tail probability is one-sided:
  PTP = Phi(-|z|), matching the manuscript prose ("probability that
  the true density difference has the opposite sign to the estimated
  change") and the one-sided formula. The implementation was
  emitting 2 * Phi(-|z|) via a spurious + np.log(2).
  Effect: PTPs are halved; neg_log10_fold_change_ptp increases by
  log10(2). A previously-published cutoff of PTP < 1e-3 corresponded
  to |z| >= 3.29 (two-sided) and now corresponds to |z| >= 3.09
  (one-sided). Re-tune any ptp_threshold chosen against 0.7.0.

* use_empirical_variance defaults to False everywhere. The
  recommended kompot.de() path already defaulted to False via
  GPSettings; the deprecated wrappers (compute_differential_expression,
  compute_smoothed_expression), DifferentialExpression.__init__,
  ExpressionModel.__init__, smooth_expression(gp=None), and
  smooth_config_template.yaml all defaulted to True. Now consistent
  with the manuscript's "empirical variance is disabled by default"
  statement. Opt in via use_empirical_variance=True / GPSettings.

Regression coverage in tests/test_audit_fixes.py:
* monkeypatches compute_mahalanobis_distances to capture the
  combined-covariance kwarg from a controlled-kernel fit and
  asserts equality with cov1 + cov2 (not the average);
* replicates the ln_ptp formula and asserts equality with the
  closed-form Phi(-|z|) both directly and end-to-end through
  DifferentialAbundance.fit().predict();
* introspects every public entry point's signature and asserts
  the default is False; also parses both CLI YAML templates.

tests/test_empirical_variance.py::test_default_is_on was the only
pre-existing test pinning the old default; it has been flipped and
renamed test_default_is_off.

pyproject.toml and kompot/version.py bumped to 0.8.0; CHANGELOG
entry under [0.8.0] - 2026-05-28 documents the migration notes
above. Release tag is not cut by this commit.
@settylab-dotto-bot settylab-dotto-bot Bot requested a review from katosh as a code owner May 29, 2026 22:51
@settylab-dotto-bot settylab-dotto-bot Bot changed the title Re-establish #11 audit fixes (v0.8.0) against restored main align Mahalanobis and DA PTP with manuscript; harmonize use_empirical_variance default; bump to 0.8.0 May 29, 2026
Gene-set-enrichment lollipop: one row per enriched term, a stem to a dot
whose x-position encodes significance (-log10(FDR) by default, or any score
column) and whose area encodes the matched-gene count, with a dashed
FDR=0.05 guide and an in-axes aesthetic key. Ported from the kompot
manuscript (Fig 3 G/L) and generalized.

Headline feature is input flexibility: accepts a kompot.plot.StringDBReport
(its get_functional_enrichment() is called for you), the signal-sorted frame
that method returns, or a generic enrichment table from another tool
(gseapy/enrichr, GOATOOLS, clusterProfiler). Column-name mapping params
(term_col/score_col/count_col/fdr_col) with case-insensitive autodetection
— including the gseapy 'k/K' Overlap-string parser — bridge the schemas.

Like dotplot, composes into an externally-provided ax= instead of building
its own GridSpec. Additive only; no existing plot code touched. 22 tests.
@settylab-dotto-bot
Copy link
Copy Markdown
Contributor Author

Added kompot.plot.lollipop to this branch (commit ed30264) — an
ax-embeddable gene-set-enrichment lollipop ported from the manuscript
(Fig 3 G/L) and generalized.

One row per enriched term; a stem runs to a dot whose x-position encodes
significance (-log10(FDR) by default, or any score column) and whose area
encodes the matched-gene count, with a dashed FDR = 0.05 guide and an
in-axes aesthetic key. Like dotplot, it composes into an externally-provided
ax= rather than building its own GridSpec.

Headline feature is input flexibility: accepts a kompot.plot.StringDBReport
(its get_functional_enrichment() is called for you), the signal-sorted
frame that method returns, or a generic enrichment table from another tool
(gseapy/enrichr, GOATOOLS, clusterProfiler) via term_col / score_col /
count_col / fdr_col mapping params with case-insensitive autodetection —
including the gseapy "k/K" Overlap-string count parser.

Additive only; no existing plot code touched. New file
kompot/plot/lollipop.py + tests/test_plot_lollipop.py (22 tests, green),
exported from kompot.plot, with CHANGELOG + docs updates.

Synthetic-data example renders and details are tracked internally.

The DE Mahalanobis posterior tail probability was computed with
chi2.sf in linear space. For an embedding dimension on the order of
tens, chi2.sf returns values numerically indistinguishable from 1.0
for the majority of genes (every gene with D^2 below the chi-squared
mean), saturating the stored statistic and destroying gene-ranking
resolution at the head of the distribution.

Compute the tail probability with scipy.stats.chi2.logsf in float64
and store -log10(PTP) in a renamed field neg_log10_ptp, mirroring the
differential-abundance path (neg_log10_lfc_ptp). logsf is evaluated
directly (never forming 1 - cdf), so every gene keeps a distinct value
and the stored statistic recovers the Mahalanobis ranking exactly.
scipy float64 is used deliberately: jax runs in float32 unless x64 is
enabled, and float32 logsf re-collapses the dynamic range.

- predict() now returns neg_log10_ptp; the stored var field and de()
  table column are renamed ..._ptp -> ..._neg_log10_ptp.
- volcano_de(y_axis_type="ptp") reads the pre-transformed column
  directly (no extra -log10) and maps the probability threshold onto
  the -log10 axis; significance_threshold is still given as a probability.
- Regression tests assert ranking equivalence with the Mahalanobis
  distance, restored distinct-value count, and dynamic range beyond
  [0, 1] that the old linear field could not represent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant