Skip to content

Avoid inst.body() duplicate call which reallocates AllocIds#120

Merged
automergerpr-permission-manager[bot] merged 1 commit intomasterfrom
dc/alloc-id-fix
Feb 20, 2026
Merged

Avoid inst.body() duplicate call which reallocates AllocIds#120
automergerpr-permission-manager[bot] merged 1 commit intomasterfrom
dc/alloc-id-fix

Conversation

@dkcumming
Copy link
Collaborator

This PR corrects a bug in generating the Stable MIR JSON artefacts. Prior to this fix allocations were being collected twice (calling Instance::body twice), which resulted in certain allocs (slices) having multiple allocation ids. The fix is to refer to previously collected items so that the collector does not assign allocation ids again.

Problem

There has been a disconnection between Ty and AllocId in the .smir.json artefacts (for slices). A constant should be able to be traced to an allocation by the Ty that will yield an AllocId. However attempting to do this lead to a case where the AllocId from the Ty was not in the Allocs map, and allocations in the map had AllocIds unseen in the rest of the .smir.json.

Cause

The disconnection betwen the Tys and AllocIds listed above came from there being a second collector pass calling Instance::body again that assigned new AllocIds for slices that were not connected to the Tys (still referencing the originals). The code that initiates the second collector pass is from stable-mir-json, while the code that re-allocates fresh ids is in rustc itself.

A rough trace of the pipeline goes like this:

Details
  1. First collection initiated in collect_smir by let items = collect_items(tcx);
  2. Each MonoItem is mapped to an Item by mk_item(tcx, item.clone(), name)
  3. In mk_item if the MonoItem is MonoItemFn(inst: Instance) call Instance::body(&self) on inst by inst.body()
  4. Instance::Body calls the rustc_smir Context::instance_body (for TablesWrapper that impl Context)
  5. Context::instance_body calls BodyBuilder::new(tables.tcx, instance) to create a new BodyBuilder which then calls the BodyBuilder::build method
  6. BodyBuilder::build creates mono_body which is of type rustc_middle::mir::Body (not stable_mir yet), which is then is converted to a stable_mir::mir::Body by mono_body.stable(tables)
  7. Converting to a stable_mir::mir::Body through stable will convert all the BasicBlocks, Statements, Terminators, LocalDecls, and importantly rustc_middle::mir::Const into a stable_mir::ty::{TyConst,MirConst}
  8. If the Const to covert is a rustc_middle::mir::Const::Val then it is converted with a new stable_mir::ty::Allocation as argument to the stable_mir::ty::ConstantKind::Allocated by ConstantKind::Allocated(alloc::new_allocation(ty, val, tables));
  9. The creation of the new Allocation will match the ConstValue, and if it is ConstValue::Slice then the AllocId produced will be from let alloc_id = tables.tcx.reserve_and_set_memory_alloc(data); which will call TyCtxt::reserve_alloc_id and which will create a new allocation id by calling AllocMap::reserve <- IMPORTANT
  10. Given 9. explains how the AllocId was created from calling inst.body(), let's skip over the remaining details in executing let items = collect_items(tcx); to collect_interned_values(tcx, items.iter().map(|i| &i.mono_item).collect::<Vec<_>>());
  11. collect_interned_values iterates through items and if it matches a MonoItem::Fn(inst) then inst.body() is called again - PROBLEM. This repeats steps 4. - 9. again and at step 9. will allocate another AllocId for the InternedValueCollector that does not match what was originally allocated.

Solution

Once the Items are collected we can just thread a slice through and use the Body that is already collected which means inst.body() does not need to be called again and will not reallocate ids.

@cds-amal
Copy link
Collaborator

@dkcumming , your PR investigation/summary hints at a deeper issue: the pipeline is doing three conceptually related things as separate steps, each destroying information that the next step needs. I'll have to give this a think 🤔 🧐

@dkcumming
Copy link
Collaborator Author

@cds-amal I will circle back to #121 after some downstream work this fix opens up

@automergerpr-permission-manager automergerpr-permission-manager bot merged commit 9a78109 into master Feb 20, 2026
5 checks passed
@automergerpr-permission-manager automergerpr-permission-manager bot deleted the dc/alloc-id-fix branch February 20, 2026 05:40
dkcumming pushed a commit that referenced this pull request Feb 26, 2026
## What's this about?

So, #120 fixed the immediate alloc-id mismatch by carrying collected
items forward instead of re-fetching them. That was the right call. But
it left the underlying structure intact: three separate phases (mk_item,
collect_unevaluated_constant_items, collect_interned_values), each with
full access to `TyCtxt`, each free to call `inst.body()` or any other
side-effecting rustc query whenever it felt like it. Nothing in the
types prevented that, and the bug was a direct consequence: one phase
called `inst.body()` a second time, rustc minted fresh AllocIds, and
suddenly the alloc map had ids that didn't correspond to anything in the
stored bodies.

The question is: how do we make that class of bug structurally
impossible, rather than just fixed for the one case we caught?

The full decision record is in
[`ADR-002`](https://github.com/cds-rs/stable-mir-json/blob/dc/declarative-spike/docs/adr/002-declarative-pipeline-with-allocmap-coherence.md).

### The restructuring

The fix is to split the pipeline into phases with type signatures that
enforce the boundary:

```rust
collect_and_analyze_items(HashMap<String, Item>)
  -> (CollectedCrate, DerivedInfo)

assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson
```

`CollectedCrate` holds items and unevaluated consts (the output of
talking to rustc). `DerivedInfo` holds calls, allocs, types, and spans
(the output of walking bodies). `assemble_smir` takes both by value and
does pure data transformation; it structurally cannot call `inst.body()`
because it has no `MonoItem` or `Instance` to call it on. That's the
whole point: if you can't reach the query, you can't accidentally call
it.

The two body-walking visitors (`InternedValueCollector` and
`UnevaluatedConstCollector`) are merged into a single `BodyAnalyzer`
that walks each body exactly once. The fixpoint loop for transitive
unevaluated constant discovery is integrated: when `BodyAnalyzer` finds
an unevaluated const, it records it; the outer loop creates the new
`Item` (the one place `inst.body()` is called) and enqueues it.

### But what about catching regressions?

Turns out the existing integration tests normalize away `alloc_id`s (via
the jq filter), so they literally cannot catch this class of bug. The
golden files don't contain alloc ids at all; you could scramble every id
in the output and the tests would still pass.

`AllocMap` replaces the bare `HashMap<AllocId, ...>` with a newtype
that, under `#[cfg(debug_assertions)]`, tracks every insertion and flags
duplicates. After the collect/analyze phase completes,
`verify_coherence` walks every stored `Item` body with an
`AllocIdCollector` visitor and asserts that each referenced `AllocId`
exists in the map. This catches both "walked a stale body" (missing ids)
and "walked the same body twice" (duplicate insertions) at dev time;
zero cost in release builds.

### Other things that fell out of this

- Static items now store their body in `MonoItemKind::MonoItemStatic`
(collected once in `mk_item`), so the analysis phase never goes back to
rustc for static bodies
- `get_item_details` takes the pre-collected body as a parameter instead
of calling `inst.body()` independently
- The `items_clone` full HashMap clone is replaced by a `HashSet` of
original item names (which is all the static fixup actually needed)
- [we uncovered and fixed a very old
bug](#121 (comment))

### What's deleted

`InternedValueCollector`, `UnevaluatedConstCollector`,
`collect_interned_values`, `collect_unevaluated_constant_items`, the
`InternedValues` type alias, and `items_clone`. Good riddance.

### Downstream impact

The tighter allocs representation has already shown positive downstream
effects in KMIR: the proof engine can now decode allocations inline
(resolving to concrete values like `StringVal("123")`) instead of
deferring them as opaque thunks. @dkcumming 's `offset-u8` test went
from thunking through `#decodeConstant(constantKindAllo...)` to directly
producing `toAlloc(allocId(0)), StringVal("123")`. The test's expected
output needed updating, but the new failure mode is semantically
grounded in actual data rather than deferred interpretation.

### Test plan

- [x] `cargo build` compiles
- [x] `cargo clippy` clean
- [x] `cargo fmt` clean
- [x] `make integration-test` passes (all 28 tests, identical output)
- [x] KMIR downstream: `test_prove_rs[offset-u8-fail]` expected output
updated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants