You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
TortuosityMLMG no longer scales linearly on real porous microstructures. The bake-off in notebooks/profiling_and_tuning.ipynb and the python/tests/test_mlmg_porespy.py regression now return correct answers but with super-linear iteration growth, undermining the "MLMG is the default because it's the only O(N) option" rationale used to make MLMG the default solver (v4.3.0).
Background — how we got here
Three sequential issues, all fixed on branch claude/debug-mlmg-solver-GxIT6 → master:
Original bug ([resolved]): TortuositySolverBase::generateActivityMask marks dead-end and isolated phase-target cells as non-percolating. TortuosityHypre decouples those rows in the matrix (A_ii=1, A_ij=0, rhs=0 at src/props/TortuosityHypre.cpp:1100). TortuosityMLMG did not — MLABecLaplacian's B-coefficient field was built straight from m_mf_diff_coeff (which is 1 on every phase-target cell) with no mask awareness. Non-percolating islands then formed Neumann subdomains with no Dirichlet contact; MLMG drove the local residual to zero but those potentials were indeterminate, breaking the boundary flux balance that TortuositySolverBase::value() audits → "converged but produced an invalid result" NaN on every porespy run.
Pin-induced silent abort ([resolved]): pinning inactive cells via setScalars(1.0, 1.0) + acoef = 1 on inactive cells + harmonic-mean B-faces zeroed at active/inactive interfaces gave a well-posed operator, but MLMG default behaviour on non-convergence is amrex::Abort (SIGABRT), not throw. The existing try { mlmg.solve(...) } catch (std::exception&) did nothing useful → kernel died silently. Fixed by mlmg.setThrowException(true).
Unreachable tolerance ([resolved]): eps = 1e-11 default was chosen to keep the integrated per-cell residual under the 1e-4 boundary-flux conservation tolerance. With the pin, MLMG stalls around 1e-5 (~5% reduction per V-cycle) and never gets to 1e-11. Loosened to 1e-7 with maxiter = 400 — still three orders below the flux tolerance, so correctness unaffected.
The remaining problem
After (1)–(3), MLMG on porespy 64³ runs to completion with a correct answer, but: the V-cycles only give ~5% residual reduction each, not the order-of-magnitude per cycle that healthy geometric multigrid produces. Iteration count grows with problem size — this is the super-linear scaling we now see in §4 of the profiling notebook on heterogeneous data.
Root cause is structural, not a bug we can patch around: MLABecLaplacian's geometric coarsening averages the a-coefficient and B-faces across 2³ fine cells per coarse cell. A coarse cell covering a mix of active (a=0) and inactive (a=1) fine cells gets a partial pin (a≈0.5) and partially-zeroed B-faces. That coarse operator does not faithfully represent the restriction of the fine operator's action on the active subdomain. V-cycle correction therefore contains components that the fine smoother cannot fix, and convergence stagnates.
This is the same reason pcg+pfmg / gmres+pfmg fail in the bake-off: HYPRE's structured PFMG semicoarsens through the same masked rows and stalls.
What scales where (post-fix, current behaviour)
Geometry
MLMG outcome
Iter scaling
Uniform / fully connected (np.zeros)
Correct, fast
Near-linear (the §4 plot, with warm-up point trimmed) — a=0 everywhere, pin never activates, coarsening clean
Migrate TortuosityMLMG to MLEBABecLaplacian with the active mask encoded as embedded-boundary metadata.
AMReX's EB framework is designed for exactly this case: solid obstacles inside a fluid domain. Each cell carries a volume fraction vfrac (1 = fully open, 0 = fully covered) and each face an aperture (apx, apy, apz). The MG hierarchy is built with EB-aware restriction/prolongation: coarse cells track the union of their fine cells' coverage, and the EB-aware stencil at irregular cells is automatically derived from the apertures. No alpha*a pin needed — covered cells are simply absent from the system.
Why this should restore linear scaling:
Inactive (non-percolating) cells get vfrac = 0, are removed from the operator entirely
Active/inactive interface faces get aperture = 0, decoupling the two regions geometrically
MG coarsening preserves EB metadata correctly across levels — coarse-grid operators stay consistent with their fine counterparts
This is the AMReX-native analogue of HYPRE's row-decoupling, just with the bookkeeping done geometrically rather than algebraically
Why it's tractable for us:
AMReX is already built with -DAMReX_EB=ON in both containers/Dockerfile:93 and containers/Singularity.deps.def:176
The active mask we already compute in TortuositySolverBase::generateActivityMask (parallel flood fill from inlet & outlet seeds) is exactly the input EB needs — cell_active → vfrac = 1.0, cell_inactive → vfrac = 0.0
The EB2::Build + makeShop path with a custom implicit function (ActiveMaskIF) works without needing EB2::IF_* analytic shapes
Implementation (in progress on claude/issue-289-mlmg-eb-migration):
ActiveMaskIF struct converts the active mask into an EB implicit function (negative = fluid/active, positive = body/inactive)
setScalars(0.0, 1.0) — pure Laplacian, no alpha*a pin
EB default Neumann BC on the body surface (no flux into solid)
Domain-boundary ghost cells of m_mf_diff_coeff filled by extrapolation before B-coefficient computation (MLEBABecLap uses user B at boundary faces, unlike MLABecLaplacian which overrides internally)
EB2::IndexSpace::pop() at end of solve() to avoid leaking EB metadata
mlmg.setThrowException(true) retained for graceful failure
eps restored to 1e-11, maxiter to 200
Early CI results (uniform 32³): MLMG converges in 10 iterations with ~14x reduction per V-cycle — healthy MG behaviour restored. Boundary flux conservation was initially failing due to uninitialised ghost D at domain boundaries; fixed by explicit ghost-cell extrapolation.
Continuous SDF SDFImplicitFunction (returns signed distance)
Every cell is fully regular or fully covered
Interface cells are cut cells with 0 < vfrac < 1
Recovers MG scaling by removing inactive cells
Adds sub-voxel geometric accuracy at interfaces
To go from #289 → #69, the only change is which implicit function is passed to makeShop. The rest of the pipeline stays identical.
Sub-voxel accuracy strategy (revised from #69 discussion)
The original #69 proposal focused on ingesting ML probability maps. After analysis, a revised strategy acknowledges that OpenImpala's primary input is binary segmented XCT data, and the approaches for improving accuracy on that data differ significantly in robustness:
Approaches considered, ranked by robustness on heterogeneous 3D microstructures:
Approach
Accuracy
Effort
Robustness
Notes
Richardson extrapolation on binary grid
Best (rigorous)
Low
Excellent
Solve at N³, 2N³, 4N³ and extrapolate. O(dx) staircase error has known convergence rate. Quantifies discretisation error as a byproduct. Cost: 3–4 solves at increasing resolution.
Set D_cell = phase_fraction directly. No EB needed. Requires upstream pipeline to provide partial-volume data instead of binary thresholding.
Effective-medium face coefficients (Schneider et al. 2019)
Good
Medium
Good
Smooth the operator, not the geometry. Face D derived from local neighbourhood porosity. Works on existing solver path.
Anti-aliased SDF → EB (Frisken et al. 2000)
Good
Medium-high
Moderate
Analyse 2x2x2 boundary neighbourhoods to estimate sub-voxel interface position. Better than naive distance transform for corners/thin features.
Naive SDF from distance_transform_edt → EB
Variable
Low
Poor
Rounds all corners to circles/spheres, can't distinguish real features from staircase artifacts, systematically distorts thin features (< 5 voxels).
Key insight: the naive SDF-from-binary approach (compute Euclidean distance transform, feed to EB) gives false confidence on heterogeneous data. The staircase error from binary voxels is well-characterised and converges as O(dx); the SDF-smoothed error is geometry-dependent and harder to quantify. For highly heterogeneous media (porespy blobs, real battery electrodes), the smoothing can actually make results worse by rounding away real microstructural features.
Phase 2 (Feature Request: Sub-Voxel Accuracy via AMReX Embedded Boundaries (EB) #69): Accept an optional user-provided SDF/level-set (level_set= kwarg). Don't auto-compute from binary data. The user decides whether their upstream pipeline (ML segmentation, grayscale-to-SDF, partial-volume CT) produces something worth smoothing with. Document clearly: "For genuine sub-voxel accuracy, provide an SDF derived from the original grayscale or ML probability field, not from a distance transform on binary data."
Don't implement auto-smoothing (smooth_eb=True). It would give users false confidence on heterogeneous data.
For highest-accuracy validation on binary data: use Richardson extrapolation (solve at multiple resolutions, extrapolate). No geometric assumptions, quantifies discretisation error, works on any microstructure.
Implementation plan
src/props/TortuosityMLMG.{H,cpp} [done - on branch]
- Replace MLABecLaplacian with MLEBABecLap
- ActiveMaskIF implicit function from m_mf_active_mask
- EB2::Build → EBFArrayBoxFactory → MLEBABecLap pipeline
- Domain-boundary ghost-cell extrapolation for m_mf_diff_coeff
- Drop alpha*a pin, drop dc_masked workaround
- Restore eps=1e-11, maxiter=200
python/bindings/solvers.cpp [done - on branch]
- Updated default eps/maxiter to match header
src/props/TortuositySolverBase.{H,cpp} [optional]
- Expose m_mf_active_mask for other solvers to adopt EB pattern
tests/tTortuosityMLMG.cpp [TODO]
- Add non-uniform geometry test case
python/tests/test_mlmg_porespy.py [TODO]
- Add iteration-count regression (≤20 V-cycles at 32³)
notebooks/profiling_and_tuning.ipynb [TODO]
- Update §4 to validate scaling on porespy blobs, not np.zeros
Backup paths (lesser options)
Custom MLLinOp subclass carrying the active mask natively, overriding Fapply / Fsmooth / applyBC / restriction. Equivalent expressive power to EB but considerably more code and more chances to get the restriction wrong. Only worth considering if EB turns out to have a blocker we don't anticipate.
Wrap HYPRE BoomerAMG (algebraic multigrid). Linear scaling guaranteed on M-matrices regardless of mask. But: loses the matrix-free / low-memory property that justifies MLMG as the default in the first place, requires IJ-matrix assembly (we currently use the Struct interface), and adds AMG setup cost that's amortised poorly on small grids. Useful as a fallback for cases where EB can't handle the geometry (multi-phase D > 0 with high contrast?), but not the right primary path.
Compactify the active subdomain into a smaller BoxArray. Conceptually clean but the active region of porespy is highly irregular — load balancing collapses, and setLevelBC on a non-cuboidal domain isn't a thing in MLABecLaplacian. Reject.
Acceptance criteria
MLMG reaches eps = 1e-11 in ≤ 20 V-cycles on porous_blobs (32³, ~50% porosity).
Iteration count is bounded by a constant (say ≤ 30) across 32³, 64³, 96³, 128³, 256³ for porespy data. The §4 scaling fit on porous data gives p < 1.2.
test_mlmg_matches_hypre_within_one_percent still passes.
Bake-off (notebooks/profiling_and_tuning.ipynb §3) shows MLMG as the fastest solver on 64³ porespy, recovering the original "default solver since v4.3.0" claim.
Summary
TortuosityMLMGno longer scales linearly on real porous microstructures. The bake-off innotebooks/profiling_and_tuning.ipynband thepython/tests/test_mlmg_porespy.pyregression now return correct answers but with super-linear iteration growth, undermining the "MLMG is the default because it's the only O(N) option" rationale used to make MLMG the default solver (v4.3.0).Background — how we got here
Three sequential issues, all fixed on branch
claude/debug-mlmg-solver-GxIT6→master:Original bug ([resolved]):
TortuositySolverBase::generateActivityMaskmarks dead-end and isolated phase-target cells as non-percolating.TortuosityHypredecouples those rows in the matrix (A_ii=1, A_ij=0, rhs=0atsrc/props/TortuosityHypre.cpp:1100).TortuosityMLMGdid not —MLABecLaplacian's B-coefficient field was built straight fromm_mf_diff_coeff(which is1on every phase-target cell) with no mask awareness. Non-percolating islands then formed Neumann subdomains with no Dirichlet contact; MLMG drove the local residual to zero but those potentials were indeterminate, breaking the boundary flux balance thatTortuositySolverBase::value()audits → "converged but produced an invalid result" NaN on every porespy run.Pin-induced silent abort ([resolved]): pinning inactive cells via
setScalars(1.0, 1.0)+acoef = 1on inactive cells + harmonic-mean B-faces zeroed at active/inactive interfaces gave a well-posed operator, but MLMG default behaviour on non-convergence isamrex::Abort(SIGABRT), not throw. The existingtry { mlmg.solve(...) } catch (std::exception&)did nothing useful → kernel died silently. Fixed bymlmg.setThrowException(true).Unreachable tolerance ([resolved]):
eps = 1e-11default was chosen to keep the integrated per-cell residual under the 1e-4 boundary-flux conservation tolerance. With the pin, MLMG stalls around 1e-5 (~5% reduction per V-cycle) and never gets to 1e-11. Loosened to 1e-7 withmaxiter = 400— still three orders below the flux tolerance, so correctness unaffected.The remaining problem
After (1)–(3), MLMG on porespy 64³ runs to completion with a correct answer, but: the V-cycles only give ~5% residual reduction each, not the order-of-magnitude per cycle that healthy geometric multigrid produces. Iteration count grows with problem size — this is the super-linear scaling we now see in §4 of the profiling notebook on heterogeneous data.
Root cause is structural, not a bug we can patch around:
MLABecLaplacian's geometric coarsening averages thea-coefficient and B-faces across2³fine cells per coarse cell. A coarse cell covering a mix of active (a=0) and inactive (a=1) fine cells gets a partial pin (a≈0.5) and partially-zeroed B-faces. That coarse operator does not faithfully represent the restriction of the fine operator's action on the active subdomain. V-cycle correction therefore contains components that the fine smoother cannot fix, and convergence stagnates.This is the same reason
pcg+pfmg/gmres+pfmgfail in the bake-off: HYPRE's structured PFMG semicoarsens through the same masked rows and stalls.What scales where (post-fix, current behaviour)
np.zeros)a=0everywhere, pin never activates, coarsening cleanProposed fix — EB-based MLMG (Phase 1)
Migrate
TortuosityMLMGtoMLEBABecLaplacianwith the active mask encoded as embedded-boundary metadata.AMReX's EB framework is designed for exactly this case: solid obstacles inside a fluid domain. Each cell carries a volume fraction
vfrac(1 = fully open, 0 = fully covered) and each face an aperture (apx, apy, apz). The MG hierarchy is built with EB-aware restriction/prolongation: coarse cells track the union of their fine cells' coverage, and the EB-aware stencil at irregular cells is automatically derived from the apertures. Noalpha*apin needed — covered cells are simply absent from the system.Why this should restore linear scaling:
vfrac = 0, are removed from the operator entirelyWhy it's tractable for us:
-DAMReX_EB=ONin bothcontainers/Dockerfile:93andcontainers/Singularity.deps.def:176TortuositySolverBase::generateActivityMask(parallel flood fill from inlet & outlet seeds) is exactly the input EB needs —cell_active→vfrac = 1.0,cell_inactive→vfrac = 0.0EB2::Build+makeShoppath with a custom implicit function (ActiveMaskIF) works without needingEB2::IF_*analytic shapesImplementation (in progress on
claude/issue-289-mlmg-eb-migration):ActiveMaskIFstruct converts the active mask into an EB implicit function (negative = fluid/active, positive = body/inactive)EB2::Build(makeShop(if_obj), geom, ...)→EBFArrayBoxFactory→MLEBABecLapsetScalars(0.0, 1.0)— pure Laplacian, no alpha*a pinm_mf_diff_coefffilled by extrapolation before B-coefficient computation (MLEBABecLap uses user B at boundary faces, unlike MLABecLaplacian which overrides internally)EB2::IndexSpace::pop()at end ofsolve()to avoid leaking EB metadatamlmg.setThrowException(true)retained for graceful failureEarly CI results (uniform 32³): MLMG converges in 10 iterations with ~14x reduction per V-cycle — healthy MG behaviour restored. Boundary flux conservation was initially failing due to uninitialised ghost D at domain boundaries; fixed by explicit ghost-cell extrapolation.
Relationship to #69 (Sub-Voxel Accuracy via EB)
This issue implements Phase 1 infrastructure from #69. The same
EB2 → EBFArrayBoxFactory → MLEBABecLap → MLMGpipeline serves both:ActiveMaskIF(returns ±1)SDFImplicitFunction(returns signed distance)0 < vfrac < 1To go from #289 → #69, the only change is which implicit function is passed to
makeShop. The rest of the pipeline stays identical.Sub-voxel accuracy strategy (revised from #69 discussion)
The original #69 proposal focused on ingesting ML probability maps. After analysis, a revised strategy acknowledges that OpenImpala's primary input is binary segmented XCT data, and the approaches for improving accuracy on that data differ significantly in robustness:
Approaches considered, ranked by robustness on heterogeneous 3D microstructures:
D_cell = phase_fractiondirectly. No EB needed. Requires upstream pipeline to provide partial-volume data instead of binary thresholding.distance_transform_edt→ EBKey insight: the naive SDF-from-binary approach (compute Euclidean distance transform, feed to EB) gives false confidence on heterogeneous data. The staircase error from binary voxels is well-characterised and converges as O(dx); the SDF-smoothed error is geometry-dependent and harder to quantify. For highly heterogeneous media (porespy blobs, real battery electrodes), the smoothing can actually make results worse by rounding away real microstructural features.
Recommended strategy:
Phase 1 (this issue, TortuosityMLMG: recover linear scaling on porous geometry (alpha*a pin breaks MG coarsening) #289): EB with binary mask — proves the
EB2 → MLEBABecLappipeline, handles inactive-cell masking correctly, recovers MG scaling. No sub-voxel accuracy claim.Phase 2 (Feature Request: Sub-Voxel Accuracy via AMReX Embedded Boundaries (EB) #69): Accept an optional user-provided SDF/level-set (
level_set=kwarg). Don't auto-compute from binary data. The user decides whether their upstream pipeline (ML segmentation, grayscale-to-SDF, partial-volume CT) produces something worth smoothing with. Document clearly: "For genuine sub-voxel accuracy, provide an SDF derived from the original grayscale or ML probability field, not from a distance transform on binary data."Don't implement auto-smoothing (
smooth_eb=True). It would give users false confidence on heterogeneous data.For highest-accuracy validation on binary data: use Richardson extrapolation (solve at multiple resolutions, extrapolate). No geometric assumptions, quantifies discretisation error, works on any microstructure.
Implementation plan
Backup paths (lesser options)
Custom
MLLinOpsubclass carrying the active mask natively, overridingFapply/Fsmooth/applyBC/ restriction. Equivalent expressive power to EB but considerably more code and more chances to get the restriction wrong. Only worth considering if EB turns out to have a blocker we don't anticipate.Wrap HYPRE BoomerAMG (algebraic multigrid). Linear scaling guaranteed on M-matrices regardless of mask. But: loses the matrix-free / low-memory property that justifies MLMG as the default in the first place, requires IJ-matrix assembly (we currently use the Struct interface), and adds AMG setup cost that's amortised poorly on small grids. Useful as a fallback for cases where EB can't handle the geometry (multi-phase D > 0 with high contrast?), but not the right primary path.
Compactify the active subdomain into a smaller
BoxArray. Conceptually clean but the active region of porespy is highly irregular — load balancing collapses, andsetLevelBCon a non-cuboidal domain isn't a thing in MLABecLaplacian. Reject.Acceptance criteria
MLMGreacheseps = 1e-11in ≤ 20 V-cycles onporous_blobs(32³, ~50% porosity).p < 1.2.test_mlmg_matches_hypre_within_one_percentstill passes.notebooks/profiling_and_tuning.ipynb§3) shows MLMG as the fastest solver on 64³ porespy, recovering the original "default solver since v4.3.0" claim.Related
claude/issue-289-mlmg-eb-migration(commits2ddf43c…b5ec0ee)claude/debug-mlmg-solver-GxIT6(commitsf511bb5…711defc)TortuosityHypre.cpp:1100for the row-decoupling reference HYPRE uses