Skip to content

Highway to Compat (AC⚡DC rustc edition)#123

Open
cds-amal wants to merge 9 commits intoruntimeverification:masterfrom
cds-rs:spike/hex-rustc
Open

Highway to Compat (AC⚡DC rustc edition)#123
cds-amal wants to merge 9 commits intoruntimeverification:masterfrom
cds-rs:spike/hex-rustc

Conversation

@cds-amal
Copy link
Collaborator

@cds-amal cds-amal commented Feb 21, 2026

This PR builds on #121 (the pipeline refactor) with two goals: make it structurally easy to absorb rustc API changes, and remove the manual bookkeeping around which rustc commit our tests need.

This also addresses Issue #90 and supersedes PR #91 (see "Why this supersedes #91" below).

The compat layer (src/compat/)

All usage of rustc internal APIs (rustc_middle, rustc_smir, rustc_span, etc.) is now routed through a single module. When rustc changes its internals (which it does regularly), fixes concentrate in compat/ rather than scattering across the codebase. The design decision, boundary contract, and validation results are documented in ADR-003.

The module is organized by concern:

Submodule What it wraps
bridge.rs Stable-to-internal conversions (Instance, InstanceKind, unevaluated consts), plus the lifetime-free OpaqueInstanceKind
mono_collect.rs tcx.collect_and_partition_mono_items() and symbol name resolution
output.rs Output filename resolution from the compiler session
spans.rs span_to_location_info() and FileNameDisplayPreference
types.rs Type queries: generics, signatures, discriminants, attributes
mod.rs Crate re-exports (rustc_middle, rustc_smir, stable_mir, vendored serde) and common type aliases (TyCtxt, DefId)

The compat layer deliberately does not abstract over stable MIR's own public API. When stable_mir changes its types (e.g. Rvalue::AddressOf gaining a new variant), any consumer has to adapt; that's expected. The boundary is: rustc implementation details go through compat; the stable MIR contract flows through directly.

This also fixes a gap in mk_graph/, which was using extern crate stable_mir directly (bypassing the abstraction). Those imports now go through compat like everything else. A few thin compat wrappers that were just one-line pass-throughs got inlined at the call site; the compat layer wraps unstable boundaries, not every function call.

Single source of truth for the rustc commit

This PR adds a [metadata] section to rust-toolchain.toml with the rustc-commit that backs our pinned nightly:

[metadata]
rustc-commit = "a2545fd6fc66b4323f555223a860c451885d1d2b"

But that created a gap: now the commit hash lives in rust-toolchain.toml and is still hardcoded in .github/workflows/test.yml (the actions/checkout ref for rust-lang/rust). Two sources of truth, one of which will inevitably drift. The fix for this is to make the TOML the single source of truth and have everything else read from it.

(rustup silently ignores unknown keys, so the [metadata] section is safe; it's purely for our scripts and CI.)

What changes

CI (test.yml)

The ui-tests job now installs yq (via mikefarah/yq-action), reads the commit hash from rust-toolchain.toml, and passes it to actions/checkout for rust-lang/rust. No more hardcoded hash in the workflow. If the metadata key is missing or empty, the step fails loudly with a non-zero exit.

Local test scripts (run_ui_tests.sh, remake_ui_tests.sh)

Both scripts now source a shared helper (ensure_rustc_commit.sh) that:

  1. Reads metadata.rustc-commit from rust-toolchain.toml using yq
  2. Detects whether RUST_DIR_ROOT points to a regular clone or a bare repo
  3. For a regular clone: runs git checkout <commit> (or skips if already there)
  4. For a bare repo: creates a detached worktree at $RUST_DIR/<short-hash>/ (or reuses it if one already exists at the right commit)

This means you no longer have to manually check out the right commit before running make test-ui or make remake-ui-tests. Just point RUST_DIR_ROOT at your rust checkout and the scripts handle the rest.

yq (the Go version, mikefarah/yq) is now required for UI tests. If it's missing, the script tells you how to install it and exits.

A note on worktrees for the rust checkout

Rust's repo is enormous. When you inevitably have to switch between commits (toolchain bumps, bisecting, etc.), git checkout in a regular clone rewrites hundreds of thousands of files, blows away build caches, and generally makes you wait while incremental builds start over from scratch. Not great.

A bare repo with worktrees avoids all of this. Each worktree shares the same object store but keeps its own working tree and build artifacts. Switching between commits is just a cd; no files get rewritten, no caches get invalidated. If you've already built at one commit, you can hop to another worktree and come back without losing anything.

To convert an existing rust clone to bare+worktree:

cd ..
mv rust rust-tmp
git clone --bare git@github.com:rust-lang/rust.git rust
cd rust

# Create a worktree for the commit our tests need
git worktree add ./a2545fd6fc66 a2545fd6fc66b4323f555223a860c451885d1d2b --detach

Then point the test scripts at the repo root:

RUST_DIR_ROOT=~/oss/rust make test-ui

The scripts detect the bare repo and create (or reuse) the worktree automatically. When the toolchain bumps to a new nightly, a new worktree gets created at the new commit; the old one sticks around and you can prune it whenever you feel like it.

Stress test validation

To verify the abstraction actually holds, we ran two toolchain bumps on ephemeral spike branches (since deleted):

6-month jump (nightly-2024-11-29 to nightly-2025-06-01, rustc 1.85 to 1.89) and 13-month jump (nightly-2024-11-29 to nightly-2026-01-15, rustc 1.85 to 1.94).

All rustc internal API changes were fully contained in compat/ and driver.rs:

Change Where it was absorbed
collect_and_partition_mono_items tuple to MonoItemPartitions struct compat/mono_collect.rs
RunCompiler::new().run() becoming run_compiler() driver.rs
stable_mir renamed to rustc_public compat/mod.rs (one-line alias)
rustc_smir renamed to rustc_public_bridge compat/mod.rs, driver.rs
FileNameDisplayPreference::Remapped removed compat/spans.rs

Nothing leaked into printer/ or mk_graph/.

Changes that did affect printer/ and mk_graph/ were all stable MIR public API evolution (the kind of thing any consumer would need to handle regardless):

  • Rvalue::AddressOf changed from Mutability to RawPtrKind
  • StatementKind::Deinit and Rvalue::NullaryOp removed from stable MIR
  • AggregateKind::CoroutineClosure added (async closures)
  • Coroutine and Dynamic lost a field each
  • PointerCoercion::ReifyFnPointer gained a Safety parameter
  • GlobalAlloc::TypeId added
  • Ty::visit() return type changed from () to ControlFlow<T>

The 13-month bump also exposed the cost of the original mk_graph/ gap: when stable_mir was renamed to rustc_public, all 5 mk_graph files needed extern crate rustc_public as stable_mir, while printer/ needed zero import changes because it already went through compat. This PR closes that gap; a future rename would be a single-line change in compat/mod.rs.

Why this supersedes #91

PR #91 and this PR both address Issue #90 (upgrading the rustc dependency to a recent nightly), but they take fundamentally different approaches.

The core difference: this PR creates an adapter API (src/compat/) that gates all rustc access through a single module, so that API churn is absorbed in one place. PR #91 patches every call site across the codebase directly.

When rustc changes its internal APIs, the cost of adapting depends on how many files touch those APIs. In the codebase as it existed when #91 was written, rustc internals were scattered everywhere: printer.rs, mk_graph/, driver.rs, and various helpers all had their own extern crate declarations and direct imports. Every file that directly calls a rustc API is a file that breaks when that API changes.

PR #91 updates the toolchain from nightly-2024-11-29 to nightly-2025-07-20 and fixes every call site that broke. The result works, but the changes scatter across printer.rs, mk_graph.rs, driver.rs, Cargo.toml, the jq filter, and every golden test file. The next toolchain bump would require the same scattershot patching, because nothing has changed about how the codebase interfaces with rustc.

This PR invests in the adapter layer first. To make the difference concrete, here's what the stable_mir to rustc_public crate rename looks like in each approach:

PR #91's approach: update every file that declares extern crate stable_mir or imports from it: printer.rs, all 5 mk_graph/ files, and driver.rs.

This PR's approach: change one line in compat/mod.rs:

// before
pub extern crate stable_mir;
// after
pub extern crate rustc_public as stable_mir;

Every consumer imports via use crate::compat::stable_mir, so nothing else changes.

This PR doesn't bump the toolchain yet. It lays the groundwork so that the bump, when it lands, is a contained change in compat/ and driver.rs rather than a codebase-wide patch. PR #91 should be closed once this merges; the upgrade itself can follow as a focused PR that only touches the compat layer.

Test plan

  • cargo build and cargo clippy pass
  • make integration-test passes
  • make test-ui passes (requires RUST_DIR_ROOT)
  • Verify no extern crate rustc_* or extern crate stable_mir declarations remain outside src/compat/ and src/driver.rs

Reviewer ask: try a toolchain bump

The best way to evaluate this PR is to try bumping the nightly toolchain and see where breakage lands. Change the channel in rust-toolchain.toml to a newer nightly date, run cargo build, and check whether the compiler errors are confined to src/compat/ and src/driver.rs. If something breaks in printer/ or mk_graph/, that's either a compat layer gap we should fix before merging, or a stable MIR public API change (which is expected and out of scope for this abstraction).

@cds-amal
Copy link
Collaborator Author

@jberthold , would love your feedback if you have the time.

Copy link
Member

@jberthold jberthold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a quite substantial refactoring, and definitely worth merging.
While possibly the updates to new rustc versions won't happen too frequently, the proposed code has much cleaner structure.
I did not go through the many details of the changes and it is still in draft but this LGTM (very good).

src/printer.rs Outdated
Comment on lines +162 to +163
fn mono_item_name(tcx: TyCtxt<'_>, item: &MonoItem) -> String {
if let MonoItem::GlobalAsm(data) = item {
hash(data).to_string()
} else {
mono_item_name_int(tcx, &rustc_internal::internal(tcx, item))
}
}

fn mono_item_name_int<'a>(tcx: TyCtxt<'a>, item: &rustc_middle::mir::mono::MonoItem<'a>) -> String {
item.symbol_name(tcx).name.into()
crate::compat::mono_collect::mono_item_name(tcx, item)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read a few of these changes that replace a previous helper function with a call into a compat module, like fn blah(args..) -> .. { crate::compat::...::blah(args..) (sometimes with additional struct wrapper like the GenericData above).
Maybe these calls could be made directly through to compat::...?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great catch, @jberthold . I was focused on wrapping rustc calls behind a managed API and missed these, which are very much in the spirit of limiting the blast radius. Thanks for pointing it out.

Consider this the "Spirit of Radio" pass, tightening the signal and cutting the noise. 🎸

@dkcumming dkcumming self-assigned this Feb 26, 2026
@dkcumming
Copy link
Collaborator

After #121 this will need an update, but I think this is ready to go when that has happened. Looks like a great improvement

@cds-amal
Copy link
Collaborator Author

cds-amal commented Mar 2, 2026

yech. have to get yq installed on the runner. will look at this tomorrow.

cds-amal added 9 commits March 2, 2026 14:22
Concentrate all rustc-internal API calls into src/compat/ so that
toolchain upgrades touch one module (plus driver.rs) instead of
the whole codebase.

New module layout:
  src/compat/mod.rs          - centralized extern crate + re-exports
  src/compat/bridge.rs       - OpaqueInstanceKind, internal conversions
  src/compat/mono_collect.rs - mono item collection and symbol naming
  src/compat/spans.rs        - span-to-source-location resolution
  src/compat/types.rs        - type queries (generics, fn sigs, attrs)
  src/compat/output.rs       - output filename resolution

Key change: replace middle::ty::InstanceKind<'tcx> with an owned
OpaqueInstanceKind { debug_repr, is_reify_shim }, eliminating the
'tcx lifetime parameter from SmirJson, LinkMapKey, FnSymInfo,
LinkMap, DerivedInfo, and SmirJsonDebugInfo.

After this, printer.rs has zero extern crate rustc_* declarations
and zero direct tcx.query() calls.
Add ensure_rustc_commit.sh: a helper sourced by both UI test scripts
that reads the expected commit from rust-toolchain.toml [metadata] and
ensures the rust checkout (regular or bare+worktree) is at that commit.

Both run_ui_tests.sh and remake_ui_tests.sh now use RUST_SRC_DIR
(set by ensure_rustc_commit.sh) instead of using the raw directory
argument directly. CI workflow updated to use yq for TOML parsing.

Cherry-picked from d021298 (spike/hex-rustc), adapted for the
PR runtimeverification#126 test script rework on this branch.
Callers now go through the compat boundary directly instead of
forwarding through local wrappers:

- mono_collect: removed from collect.rs, callers import from compat
- mono_item_name: removed from util.rs, callers import from compat
- has_attr: removed from util.rs, re-exported from lib.rs via compat
- def_id_to_inst: removed from util.rs, callers use mono_instance
- GenericData newtype: inlined to Vec<(String, String)>
- SourceData type alias: removed, callers use compat path directly

Adapted from daa0db4 (spike/hex-rustc) for the printer/ directory
structure on this branch.
Expand the module-level docs in compat/mod.rs to spell out the rule:
nothing outside compat/ (and driver.rs) should touch rustc internals
directly. Add a submodule reference table, explain the re-exports, and
document each pub use item so cargo doc --open tells a coherent story.

Also add a doc comment to SourceData in spans.rs.
Documents the decision to route all rustc internal API usage through
src/compat/, the boundary contract (rustc internals go through compat;
stable MIR public API flows through directly), and validation results
from two toolchain bump stress tests (6-month and 13-month jumps).
yq is used to read the current rustc version from the
rust-toolchain.toml file.
@cds-amal cds-amal changed the title Isolate rustc internals and make toolchain bumps less hurt Highway to Compat (AC⚡DC rustc edition) Mar 2, 2026
@cds-amal cds-amal marked this pull request as ready for review March 2, 2026 19:36
@cds-amal cds-amal requested a review from a team March 2, 2026 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants