coverage: Defer part of counter-creation until codegen
Follow-up to #135481 and #135873.
One of the pleasant properties of the new counter-assignment algorithm is that we can stop partway through the process, store the intermediate state in MIR, and then resume the rest of the algorithm during codegen. This lets the codegen-side stage take into account which parts of the control-flow graph were eliminated by MIR opts, resulting in fewer physical counters and simpler counter expressions.
Those improvements end up completely obsoleting much larger chunks of code that were previously responsible for cleaning up the coverage metadata after MIR opts, while also doing a more thorough cleanup job.
(That change also unlocks some further simplifications that I've kept out of this PR to limit its scope.)
Don't reset cast kind without also updating the operand in `simplify_cast` in GVN
Consider this heavily elided segment of the pre-GVN example code that was committed as a test:
```rust
let _4: *const ();
let _5: *const [()];
let mut _6: *const ();
let _7: *mut ();
let mut _8: *const [()];
let mut _9: std::boxed::Box<()>;
let mut _10: *const ();
/* ... */
// Deref a box
_10 = copy ((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>) as *const () (Transmute);
_4 = copy _10;
_6 = copy _4;
// Inlined body of `slice::from_raw_parts`, to turn a unit pointer into a slice-of-unit pointer
_5 = *const [()] from (copy _6, copy _11);
_8 = copy _5;
// Cast the raw slice-of-unit pointer back to a unit pointer
_7 = copy _8 as *mut () (PtrToPtr);
```
A buggy optimization was rewriting the assignment to `_7` (which casts the slice-of-unit pointer back to a unit pointer) into:
```
_7 = copy _5 as *mut () (Transmute);
```
(Here `_8` was simply replaced with `_5` by copy propagation; that part is not important. The `CastKind` changing to `Transmute` is the important part.)
In #133324, two new optimizations were implemented:
* Peeking through unsized -> sized PtrToPtr casts whose operand is `AggregateKind::RawPtr`, turning them into PtrToPtr casts of the aggregate's base. In this case, this allows us to see that the value of `_7` is just a ptr-to-ptr cast of `_6`.
* Folding a PtrToPtr cast of an operand which is a Transmute cast into just a single Transmute, which (theoretically) allows us to treat `_7` as a transmute into `*mut ()` of the base of the cast of `_10`, which is the place projection of `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)`.
However, when applying those two subsequent optimizations, we must *not* update the CastKind of the final cast *unless* we also update the operand of the cast, since the operand may no longer make sense with the updated CastKind.
In this case, this is problematic because the type of `_8` is `*const [()]`, but the operand in the assignment of `_7` does *not* get turned into something like `((_9.0: std::ptr::Unique<()>).0: std::ptr::NonNull<()>)` -- **in other words, `try_to_operand` fails** -- because GVN only turns value nodes into locals or consts, not projections of locals. So we fail to update the operand, but we still update the `CastKind` to `Transmute`, which means we are now transmuting between types of different sizes (a wide pointer and a thin pointer).
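For illustration, the size mismatch is visible with plain `size_of` (a standalone sketch, not compiler code; the exact sizes assume a typical target where pointers are one `usize` wide):

```rust
use std::mem::size_of;

fn main() {
    // `*const [()]` is a wide pointer: a (data pointer, element count) pair.
    assert_eq!(size_of::<*const [()]>(), 2 * size_of::<usize>());
    // `*mut ()` is a thin pointer: just the data pointer.
    assert_eq!(size_of::<*mut ()>(), size_of::<usize>());
}
```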
r? `@scottmcm` or `@cjgillot`
Fixes #136361
Fixes #135997
Pretty print pattern type values with transmute if they don't satisfy their pattern
Instead of printing `0_u32 is 1..`, we now print the default fallback rendering that we also use for invalid bools, chars, ...: `{transmute(0x00000000): (u32) is 1..=}`.
These cases can occur in MIR dumps when const prop propagates a constant across a safety check that would otherwise prevent the UB value from ever existing. That's fine though, as it's dead code, and we always need to allow UB in dead code.
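For reference, pattern types are currently spelled via the unstable `pattern_type!` macro; a minimal sketch (assuming the current nightly feature gates, which may change):

```rust
#![feature(pattern_types, pattern_type_macro)]
use core::pat::pattern_type;

// A u32 whose value is statically required to be in 1.. ,
// i.e. the kind of type whose values are rendered here.
type NonZero = pattern_type!(u32 is 1..);

fn main() {}
```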
follow-up to https://github.com/rust-lang/rust/pull/136176
cc ``@compiler-errors`` ``@scottmcm``
r? ``@RalfJung`` because of the interpreter changes
Implement unstable `new_range` feature
Switches `a..b`, `a..`, and `a..=b` to resolve to the new range types.
For rust-lang/rfcs#3550
Tracking issue #123741
Also adds the re-export that was missed in the original implementation of `new_range_api`.
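For context, the new range types live in `core::range`; a minimal sketch using the unstable `new_range_api` feature (names may still change as the feature evolves):

```rust
#![feature(new_range_api)]
use core::range::Range;

// Unlike `std::ops::Range`, the new type is `Copy`, so `r` can be reused.
fn sum_bounds(r: Range<u32>) -> u32 {
    r.start + r.end
}

fn main() {
    let r = Range { start: 1, end: 4 };
    assert_eq!(sum_bounds(r) + sum_bounds(r), 10);
}
```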
Similar to how the alignment is already checked, this adds a check for null pointer dereferences in debug mode. Like the alignment check, it is implemented as a MIR pass.

This is related to a 2025H1 project goal for better UB checks in debug mode: https://github.com/rust-lang/rust-project-goals/pull/177.
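The kind of code the new check catches (a hypothetical example; the actual check is inserted by the MIR pass when debug assertions are enabled):

```rust
fn main() {
    let ptr: *const i32 = std::ptr::null();
    // With the new check and debug assertions enabled, this panics with a
    // clear message instead of silently executing UB.
    let _value = unsafe { *ptr };
}
```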
Render pattern types nicely in mir dumps
Avoid falling through to the fallback rendering that just does a hex dump.
r? ``@scottmcm``
Best reviewed commit by commit.
Add `#[optimize(none)]`
cc #54882
This extends the `optimize` attribute to add `none`, which corresponds to the LLVM `OptimizeNone` attribute.
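Usage would look like this (a sketch; the attribute is gated behind the unstable `optimize_attribute` feature):

```rust
#![feature(optimize_attribute)]

// Ask the backend not to optimize this function (LLVM `optnone`),
// e.g. to keep it debuggable in an otherwise optimized build.
#[optimize(none)]
fn keep_unoptimized(x: u32) -> u32 {
    x + 1
}

fn main() {
    assert_eq!(keep_unoptimized(1), 2);
}
```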
Not sure if an MCP is required for this, happy to file one if so.
Don't drop types with no drop glue when building drops for tailcalls
This is required because otherwise drops of `&mut` refs count as a usage of a 'two-phase temporary', causing an ICE.

Fixes #128097
The underlying issue is that the current code generates drops for `&mut` which are later counted as a second use of a two-phase temporary:
`bat t.rs -p`
```rust
#![expect(incomplete_features)]
#![feature(explicit_tail_calls)]
fn f(x: &mut ()) {
    let _y = String::new();
    become f(x);
}
fn main() {}
```
`rustc t.rs -Zdump_mir=f`
```text
error: internal compiler error: compiler/rustc_borrowck/src/borrow_set.rs:298:17: found two uses for 2-phase borrow temporary _4: bb2[1] and bb3[0]
 --> t.rs:6:5
  |
6 |     become f(x);
  |     ^^^^^^^^^^^
thread 'rustc' panicked at compiler/rustc_borrowck/src/borrow_set.rs:298:17:
Box<dyn Any>
stack backtrace:
[REDACTED]
error: aborting due to 1 previous error
```
`bat ./mir_dump/t.f.-------.renumber.0.mir -p -lrust`
```rust
// MIR for `f` 0 renumber
fn f(_1: &mut ()) -> () {
    debug x => _1;
    let mut _0: ();
    let mut _2: !;
    let _3: std::string::String;
    let mut _4: &mut ();
    scope 1 {
        debug _y => _3;
    }

    bb0: {
        StorageLive(_3);
        _3 = String::new() -> [return: bb1, unwind: bb4];
    }

    bb1: {
        FakeRead(ForLet(None), _3);
        StorageLive(_4);
        _4 = &mut (*_1);
        drop(_3) -> [return: bb2, unwind: bb3];
    }

    bb2: {
        StorageDead(_3);
        tailcall f(Spanned { node: move _4, span: t.rs:6:14: 6:15 (#0) });
    }

    bb3 (cleanup): {
        drop(_4) -> [return: bb4, unwind terminate(cleanup)];
    }

    bb4 (cleanup): {
        resume;
    }
}
```
Note how `_4` is moved into the tail call in `bb2` and dropped in `bb3`.
This PR adds a check that the locals we drop need dropping.
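The property the check keys on is whether a type has drop glue at all; a standalone illustration using `std::mem::needs_drop` (not the compiler-internal query, but the same idea):

```rust
fn main() {
    // `&mut ()` has no drop glue, so no drop (and hence no spurious second
    // use of the two-phase borrow temporary) needs to be emitted for it.
    assert!(!std::mem::needs_drop::<&mut ()>());
    // `String` does have drop glue, so it still gets a real drop.
    assert!(std::mem::needs_drop::<String>());
}
```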
r? `@oli-obk` (feel free to reassign, I'm not sure who would be a good reviewer, but thought you might have an idea)
cc `@beepster4096`, since you wrote the original drop implementation.
Reexport likely/unlikely in std::hint
Since `likely`/`unlikely` should be working now, we could reexport them in `std::hint`. I'm not sure if this is already approved or if it requires approval.
Tracking issue: #26179
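After the re-export, usage would look like this (a sketch, assuming the unstable `likely_unlikely` feature gate):

```rust
#![feature(likely_unlikely)]
use std::hint::{likely, unlikely};

fn classify(x: i32) -> &'static str {
    // The hints tell the optimizer which branch to favor; they do not
    // change the program's behavior.
    if unlikely(x < 0) {
        "rare negative case"
    } else if likely(x > 0) {
        "common positive case"
    } else {
        "zero"
    }
}

fn main() {
    assert_eq!(classify(7), "common positive case");
}
```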
coverage: Completely overhaul counter assignment, using node-flow graphs
The existing code for choosing where to put physical counter-increments gets the job done, but is very ad-hoc and hard to modify without introducing tricky regressions.
This PR replaces all of that with a more principled approach, based on the algorithm described in "Optimal measurement points for program frequency counts" (Knuth & Stevenson, 1973).
---
We start by ensuring that our graph has “balanced flow”, i.e. each node's flow (execution count) is equal to the sum of all its in-edge flows, and equal to the sum of all its out-edge flows. That isn't naturally true of control-flow graphs, so we introduce a wrapper type `BalancedFlowGraph` to fix that by introducing synthetic nodes and edges as needed.
Once our graph has balanced flow, the next step is to create another view of that graph in which each node's successors have all been merged into one “supernode”. Consequently, each node's out-edges can be coalesced into a single out-edge to one of those supernodes. Because of the balanced-flow property, the flow of that coalesced edge is equal to the flow of the original node.
Having expressed all of our node flows as edge flows, we can then analyze node flows using techniques for analyzing edge flows. We incrementally build a spanning tree over the merged supernodes, such that each new edge in the spanning tree represents a node whose flow can be computed from that of other nodes.
When this is done, we end up with a list of “counter terms” for each node, describing which nodes need physical counters, and how the remaining nodes can have their flow calculated by adding and subtracting those physical counters.
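To make "counter terms" concrete, here is a minimal standalone sketch (hypothetical types, not the real rustc data structures) of reconstructing a node's flow from physical counters:

```rust
#[derive(Clone, Copy)]
enum Op {
    Add,
    Sub,
}

/// One signed reference to a physically-counted node.
struct CounterTerm {
    counter: usize, // index into the list of physical counters
    op: Op,
}

/// Reconstruct a node's flow by adding and subtracting physical counters.
fn node_flow(terms: &[CounterTerm], physical: &[u64]) -> u64 {
    let mut flow: i64 = 0;
    for term in terms {
        let count = physical[term.counter] as i64;
        match term.op {
            Op::Add => flow += count,
            Op::Sub => flow -= count,
        }
    }
    flow as u64
}

fn main() {
    // Suppose flow(node) = counter0 - counter1.
    let physical = [10, 3];
    let terms = [
        CounterTerm { counter: 0, op: Op::Add },
        CounterTerm { counter: 1, op: Op::Sub },
    ];
    assert_eq!(node_flow(&terms, &physical), 7);
}
```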
---
The re-blessed coverage tests show that this results in modest or major improvements for our test programs. Some tests need fewer physical counters, some tests need fewer expression nodes for the same number of physical counters, and some tests show striking reductions in both.
Fix cycle error only occurring with -Zdump-mir
Fixes #134205
During MIR dumping, we evaluate static items in order to render their allocations. If a static item refers to itself, its own MIR will contain a reference to itself, so during MIR dumping we end up evaluating the static again, which tries to build its MIR again (MIR dumping happens during MIR building).
Thus I disabled evaluation of statics during MIR dumps whenever the MIR body isn't yet far enough along to be guaranteed cycle-free.
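A sketch of the shape of code that triggers this (a static whose value refers back to itself):

```rust
struct Node {
    next: &'static Node,
}

// Rendering `LOOP`'s allocation requires evaluating `LOOP`, which points
// back at itself, so dumping its MIR used to recurse into MIR building.
static LOOP: Node = Node { next: &LOOP };

fn main() {}
```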
Make MIR cleanup for functions with impossible predicates into a real MIR pass
It's a bit jarring to see the body of a function with an impossible-to-satisfy where clause suddenly go to a single `unreachable` terminator when looking at the MIR dump output in order, and I discovered it's because we manually replace the body outside of a MIR pass.
Let's make it into a fully fledged MIR pass so it's clearer what it's doing and when it's being applied.
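An example of such a function (assuming the unstable `trivial_bounds` feature, which allows writing an unsatisfiable where clause):

```rust
#![feature(trivial_bounds)]
#![allow(trivial_bounds)]

// `String: Copy` can never hold, so this function is impossible to call;
// the pass replaces its body with a single `unreachable` terminator.
fn never_callable()
where
    String: Copy,
{
    let s = String::new();
    let _also_s = s;
}

fn main() {}
```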
Add an InstSimplify for repetitive array expressions
I noticed in https://github.com/rust-lang/rust/pull/135068#issuecomment-2569955426 that GVN's implementation of this same transform was quite profitable on the deep-vector benchmark. But of course GVN doesn't run in unoptimized builds, so this is my attempt to write a version of this transform that benefits the deep-vector case and is fast enough to run in InstSimplify.
The benchmark suite indicates that this is effective.
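In source-level terms, the rewrite looks like this (a conceptual illustration; the actual transform operates on MIR aggregates):

```rust
// Before the simplification this is built element-by-element as
// Aggregate([x, x, x, x]); afterwards it is recognized as the repeat
// form Repeat(x, 4), i.e. the same MIR that `[x; 4]` produces.
fn make(x: u8) -> [u8; 4] {
    [x, x, x, x]
}

fn main() {
    assert_eq!(make(7), [7; 4]);
}
```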
Adds `#[rustc_force_inline]`, which is similar to always inlining but reports an error if the inlining was not possible, and which always attempts to inline annotated items, regardless of optimisation levels. It can only be applied to free functions, to guarantee that the MIR inliner will be able to resolve calls.
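Usage would look roughly like this (a sketch, assuming the internal `rustc_attrs` feature gate that guards rustc-internal attributes):

```rust
#![feature(rustc_attrs)]

#[rustc_force_inline]
fn always_inlined(x: u32) -> u32 {
    x * 2
}

fn main() {
    // If the MIR inliner cannot inline this call, the compiler reports an
    // error instead of silently emitting a normal call.
    assert_eq!(always_inlined(2), 4);
}
```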
We already did `Transmute`-then-`PtrToPtr`; this adds the nearly-identical `PtrToPtr`-then-`Transmute`.
It also adds `transmute(Foo(x))` → `transmute(x)`, when `Foo` is a single-field transparent type. That's useful for things like `NonNull { pointer: p }.as_ptr()`.
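A source-level illustration of that second rewrite (`Foo` here is a stand-in for any single-field `#[repr(transparent)]` type):

```rust
use std::mem::transmute;

#[repr(transparent)]
struct Foo(*const u8);

fn as_mut(x: *const u8) -> *mut u8 {
    // GVN can now fold `transmute(Foo(x))` into `transmute(x)`, since the
    // transparent wrapper adds no layout of its own.
    unsafe { transmute::<Foo, *mut u8>(Foo(x)) }
}

fn main() {
    let v = 5u8;
    let p: *const u8 = &v;
    assert_eq!(as_mut(p) as *const u8, p);
}
```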
Found these as I was looking at MCP807-related changes.