Avoid inst.body() duplicate call which reallocates AllocIds#120
Merged
automergerpr-permission-manager[bot] merged 1 commit intomasterfrom Feb 20, 2026
Merged
Avoid inst.body() duplicate call which reallocates AllocIds#120automergerpr-permission-manager[bot] merged 1 commit intomasterfrom
inst.body() duplicate call which reallocates AllocIds#120automergerpr-permission-manager[bot] merged 1 commit intomasterfrom
Conversation
Collaborator
|
@dkcumming , your PR investigation/summary hints at a deeper issue: the pipeline is doing three conceptually related things as separate steps, each destroying information that the next step needs. I'll have to give this a think 🤔 🧐 |
palinatolmach
approved these changes
Feb 20, 2026
Collaborator
Author
5 tasks
dkcumming
pushed a commit
that referenced
this pull request
Feb 26, 2026
## What's this about? So, #120 fixed the immediate alloc-id mismatch by carrying collected items forward instead of re-fetching them. That was the right call. But it left the underlying structure intact: three separate phases (mk_item, collect_unevaluated_constant_items, collect_interned_values), each with full access to `TyCtxt`, each free to call `inst.body()` or any other side-effecting rustc query whenever it felt like it. Nothing in the types prevented that, and the bug was a direct consequence: one phase called `inst.body()` a second time, rustc minted fresh AllocIds, and suddenly the alloc map had ids that didn't correspond to anything in the stored bodies. The question is: how do we make that class of bug structurally impossible, rather than just fixed for the one case we caught? The full decision record is in [`ADR-002`](https://github.com/cds-rs/stable-mir-json/blob/dc/declarative-spike/docs/adr/002-declarative-pipeline-with-allocmap-coherence.md). ### The restructuring The fix is to split the pipeline into phases with type signatures that enforce the boundary: ```rust collect_and_analyze_items(HashMap<String, Item>) -> (CollectedCrate, DerivedInfo) assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson ``` `CollectedCrate` holds items and unevaluated consts (the output of talking to rustc). `DerivedInfo` holds calls, allocs, types, and spans (the output of walking bodies). `assemble_smir` takes both by value and does pure data transformation; it structurally cannot call `inst.body()` because it has no `MonoItem` or `Instance` to call it on. That's the whole point: if you can't reach the query, you can't accidentally call it. The two body-walking visitors (`InternedValueCollector` and `UnevaluatedConstCollector`) are merged into a single `BodyAnalyzer` that walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: when `BodyAnalyzer` finds an unevaluated const, it records it; the outer loop creates the new `Item` (the one place `inst.body()` is called) and enqueues it. ### But what about catching regressions? Turns out the existing integration tests normalize away `alloc_id`s (via the jq filter), so they literally cannot catch this class of bug. The golden files don't contain alloc ids at all; you could scramble every id in the output and the tests would still pass. `AllocMap` replaces the bare `HashMap<AllocId, ...>` with a newtype that, under `#[cfg(debug_assertions)]`, tracks every insertion and flags duplicates. After the collect/analyze phase completes, `verify_coherence` walks every stored `Item` body with an `AllocIdCollector` visitor and asserts that each referenced `AllocId` exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time; zero cost in release builds. ### Other things that fell out of this - Static items now store their body in `MonoItemKind::MonoItemStatic` (collected once in `mk_item`), so the analysis phase never goes back to rustc for static bodies - `get_item_details` takes the pre-collected body as a parameter instead of calling `inst.body()` independently - The `items_clone` full HashMap clone is replaced by a `HashSet` of original item names (which is all the static fixup actually needed) - [we uncovered and fixed a very old bug](#121 (comment)) ### What's deleted `InternedValueCollector`, `UnevaluatedConstCollector`, `collect_interned_values`, `collect_unevaluated_constant_items`, the `InternedValues` type alias, and `items_clone`. Good riddance. ### Downstream impact The tighter allocs representation has already shown positive downstream effects in KMIR: the proof engine can now decode allocations inline (resolving to concrete values like `StringVal("123")`) instead of deferring them as opaque thunks. @dkcumming 's `offset-u8` test went from thunking through `#decodeConstant(constantKindAllo...)` to directly producing `toAlloc(allocId(0)), StringVal("123")`. The test's expected output needed updating, but the new failure mode is semantically grounded in actual data rather than deferred interpretation. ### Test plan - [x] `cargo build` compiles - [x] `cargo clippy` clean - [x] `cargo fmt` clean - [x] `make integration-test` passes (all 28 tests, identical output) - [x] KMIR downstream: `test_prove_rs[offset-u8-fail]` expected output updated
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR corrects a bug in generating the Stable MIR JSON artefacts. Prior to this fix allocations were being collected twice (calling
Instance::bodytwice), which resulted in certain allocs (slices) having multiple allocation ids. The fix is to refer to previously collected items so that the collector does not assign allocation ids again.Problem
There has been a disconnection between
TyandAllocIdin the.smir.jsonartefacts (for slices). A constant should be able to be traced to an allocation by theTythat will yield anAllocId. However attempting to do this lead to a case where theAllocIdfrom theTywas not in theAllocsmap, and allocations in the map hadAllocIds unseen in the rest of the.smir.json.Cause
The disconnection betwen the
Tys andAllocIds listed above came from there being a second collector pass callingInstance::bodyagain that assigned newAllocIds for slices that were not connected to theTys (still referencing the originals). The code that initiates the second collector pass is from stable-mir-json, while the code that re-allocates fresh ids is inrustcitself.A rough trace of the pipeline goes like this:
Details
collect_smirby let items = collect_items(tcx);MonoItemis mapped to anItemby mk_item(tcx, item.clone(), name)mk_itemif theMonoItemisMonoItemFn(inst: Instance)callInstance::body(&self)oninstby inst.body()rustc_smirContext::instance_body (forTablesWrapperthatimpl Context)Context::instance_bodycalls BodyBuilder::new(tables.tcx, instance) to create a new BodyBuilder which then calls the BodyBuilder::build methodBodyBuilder::buildcreatesmono_bodywhich is of type rustc_middle::mir::Body (notstable_miryet), which is then is converted to a stable_mir::mir::Body by mono_body.stable(tables)stable_mir::mir::Bodythrough stable will convert all theBasicBlocks,Statements,Terminators,LocalDecls, and importantly rustc_middle::mir::Const into a stable_mir::ty::{TyConst,MirConst}Constto covert is arustc_middle::mir::Const::Valthen it is converted with a new stable_mir::ty::Allocation as argument to the stable_mir::ty::ConstantKind::Allocated by ConstantKind::Allocated(alloc::new_allocation(ty, val, tables));Allocationwill match theConstValue, and if it isConstValue::Slicethen theAllocIdproduced will be from let alloc_id = tables.tcx.reserve_and_set_memory_alloc(data); which will call TyCtxt::reserve_alloc_id and which will create a new allocation id by calling AllocMap::reserve <- IMPORTANTAllocIdwas created from callinginst.body(), let's skip over the remaining details in executinglet items = collect_items(tcx);to collect_interned_values(tcx, items.iter().map(|i| &i.mono_item).collect::<Vec<_>>());collect_interned_valuesiterates throughitemsand if it matches aMonoItem::Fn(inst)then inst.body() is called again - PROBLEM. This repeats steps 4. - 9. again and at step 9. will allocate anotherAllocIdfor theInternedValueCollectorthat does not match what was originally allocated.Solution
Once the
Itemsare collected we can just thread a slice through and use theBodythat is already collected which meansinst.body()does not need to be called again and will not reallocate ids.