Fix VAE sampling: enable reparameterization by default, return mu for inference by Copilot · Pull Request #62 · ohsu-comp-bio/embedding-kit

Copilot · 2026-05-05T17:35:18Z

Encoder defaulted to sampling=False, which made it return the raw hidden state h as z instead of the reparameterized sample. As a result, every VAE silently behaved as a plain autoencoder.

Root cause

forward() returned (mu, logvar, h) when sampling=False (wrong: h ≠ mu). The build_encoder factory never passed sampling=True. Result: KL loss computed against mu/logvar but decoder received h.
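The return contract described above can be illustrated with a minimal sketch. It is not the embkit code: plain Python lists stand in for tensors, and the names reparameterize, forward_before, and forward_after are hypothetical, chosen only to contrast the reported bug with the intended behavior.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Elementwise z = mu + eps * std, with eps ~ N(0, 1) and std = exp(0.5 * logvar)."""
    return [m + rng.gauss(0.0, 1.0) * math.exp(0.5 * lv) for m, lv in zip(mu, logvar)]


def forward_before(mu, logvar, h, sampling, rng):
    # Behavior reported as the bug: with sampling off, h leaks out as z.
    z = reparameterize(mu, logvar, rng) if sampling else h
    return mu, logvar, z


def forward_after(mu, logvar, h, sampling, rng):
    # Behavior described as the fix: z is either a sample or mu, never h.
    z = reparameterize(mu, logvar, rng) if sampling else mu
    return mu, logvar, z


rng = random.Random(0)
mu, logvar = [0.5, -1.0], [0.0, 0.0]
h = [3.0, 4.0, 5.0]  # hidden state; it need not even share mu's shape

_, _, z_bug = forward_before(mu, logvar, h, sampling=False, rng=rng)
_, _, z_det = forward_after(mu, logvar, h, sampling=False, rng=rng)
_, _, z_smp = forward_after(mu, logvar, h, sampling=True, rng=rng)
```

In this sketch z_bug is the hidden state itself, z_det equals mu, and z_smp is a noisy draw around mu.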

Changes

encoder.py: Default sampling=True. When sampling=False, return (mu, logvar, mu) — the third element is now always the correct decode input (reparameterized z or deterministic mu).
base_vae.py: build_encoder() accepts and forwards sampling=True. encode() returns mu for stable deterministic embeddings. SimpleEncoder likewise returns mu.
vae.py: VAE.__init__ exposes sampling: bool = True, passed through to build_encoder. Serialized in to_dict/from_dict.
commands/model.py: Added --sampling/--no-sampling flag to train-vae (default True). encode command now uses result[0] (mu) instead of result[2] for inference output.
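As a rough sketch of how a paired boolean flag and a serialized sampling field could fit together: the PR does not say which CLI library the project uses, so argparse from the standard library is swapped in here, and VAEConfig is a hypothetical stand-in for whatever state VAE.to_dict/from_dict actually carry. The fallback for a missing "sampling" key is an assumption, not something the PR states.

```python
import argparse
from dataclasses import asdict, dataclass


@dataclass
class VAEConfig:
    """Hypothetical stand-in for the serialized VAE settings."""

    feature_dim: int
    latent_dim: int
    sampling: bool = True

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        # Assumption: older saved configs lack "sampling", so fall back
        # to the new default instead of raising KeyError.
        return cls(
            feature_dim=data["feature_dim"],
            latent_dim=data["latent_dim"],
            sampling=data.get("sampling", True),
        )


def build_parser():
    parser = argparse.ArgumentParser(prog="train-vae")
    # BooleanOptionalAction (Python 3.9+) generates both --sampling
    # and --no-sampling from a single declaration.
    parser.add_argument(
        "--sampling", action=argparse.BooleanOptionalAction, default=True
    )
    return parser


args = build_parser().parse_args(["--no-sampling"])
config = VAEConfig(feature_dim=64, latent_dim=16, sampling=args.sampling)
restored = VAEConfig.from_dict(config.to_dict())
```

Round-tripping through to_dict/from_dict keeps the flag, so a model trained with --no-sampling would reload as deterministic under these assumptions.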

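import torch
from embkit.models.vae.encoder import Encoder  # path inferred from src/embkit/models/vae/encoder.py

x = torch.randn(8, 64)  # illustrative batch: 8 samples, feature_dim=64

# Before: z was the raw hidden state h, not a VAE sample
enc = Encoder(feature_dim=64, latent_dim=16)  # sampling=False default
mu, logvar, z = enc(x)
# z was h here, so it did not match mu and was not a reparameterized sample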

# After: sampling=True by default; sampling=False gives z=mu (deterministic AE)
enc = Encoder(feature_dim=64, latent_dim=16)  # sampling=True
mu, logvar, z = enc(x)
# z = mu + eps*std  ✓

enc_det = Encoder(feature_dim=64, latent_dim=16, sampling=False)
mu, logvar, z = enc_det(x)
assert torch.allclose(z, mu)  # True ✓

Tests

Four new tests in test_encoder.py cover: sampling enabled produces stochastic z, disabled gives z == mu, default is True, and forward() always returns a 3-tuple.
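The four properties listed above could be checked along the following lines. This is a sketch against a hypothetical ToyEncoder written in plain Python, not the contents of test_encoder.py; the real tests presumably run against the torch-based Encoder.

```python
import math
import random


class ToyEncoder:
    """Hypothetical stand-in for the encoder; not the embkit class."""

    def __init__(self, sampling=True, seed=None):
        self.sampling = sampling
        self._rng = random.Random(seed)

    def forward(self, x):
        # Toy heads: mu echoes the input, logvar is zero (std = 1).
        mu = list(x)
        logvar = [0.0] * len(x)
        if self.sampling:
            z = [
                m + self._rng.gauss(0.0, 1.0) * math.exp(0.5 * lv)
                for m, lv in zip(mu, logvar)
            ]
        else:
            z = list(mu)
        return mu, logvar, z


def test_sampling_enabled_is_stochastic():
    enc = ToyEncoder(sampling=True, seed=0)
    _, _, z1 = enc.forward([0.0, 1.0])
    _, _, z2 = enc.forward([0.0, 1.0])
    assert z1 != z2  # two draws from the same input differ


def test_sampling_disabled_returns_mu():
    mu, _, z = ToyEncoder(sampling=False).forward([0.0, 1.0])
    assert z == mu


def test_default_is_sampling():
    assert ToyEncoder().sampling is True


def test_forward_returns_three_tuple():
    for flag in (True, False):
        out = ToyEncoder(sampling=flag, seed=0).forward([0.0, 1.0])
        assert isinstance(out, tuple) and len(out) == 3
```

Seeding the generator keeps the stochastic test reproducible while still exercising the sampling path.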


github-actions · 2026-05-05T18:03:37Z

☂️ Python Coverage

current status: ❌

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
2602	1255	48%	30%	🟢

New Files

No new covered files...

Modified Files

File	Coverage	Status
src/embkit/commands/model.py	35%	🔴
src/embkit/models/vae/base_vae.py	67%	🔴
src/embkit/models/vae/encoder.py	94%	🟢
src/embkit/models/vae/vae.py	37%	🔴
TOTAL	58%	🔴

updated for commit: 970fbc2 by action🐍

kbcoulter · 2026-05-11T21:06:44Z

I didn't see this on the coverage report, but the tests themselves are working.

kbcoulter

The new tests do not seem to appear in the coverage report (missing rather than excluded), but everything runs as expected and the math looks correct.

Initial plan

f0a5a3a

Copilot AI assigned Copilot and kellrott May 5, 2026

Copilot started work on behalf of kellrott May 5, 2026 17:35 View session

Copilot AI linked an issue May 5, 2026 that may be closed by this pull request

Potential Issues with sampling #61

Open

Fix VAE sampling: enable reparameterization by default, return mu for…

c140999

… inference Agent-Logs-Url: https://github.com/ohsu-comp-bio/embedding-kit/sessions/0b5866f0-ed9a-412e-9be8-afccd0368e05 Co-authored-by: kellrott <113868+kellrott@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix sampling behavior in VAE training process~~ Fix VAE sampling: enable reparameterization by default, return mu for inference May 5, 2026

Copilot finished work on behalf of kellrott May 5, 2026 17:44

Copilot AI requested a review from kellrott May 5, 2026 17:44

kbcoulter reviewed May 11, 2026

View reviewed changes

kellrott added 2 commits May 19, 2026 09:45

Turning off encoder sampling when in eval mode

0289d99

Fixing unit tests

970fbc2

Copilot wants to merge 4 commits into develop from copilot/fix-sampling-issues-during-training

3 participants
