Generator optimization: Overlap locals that never have storage live at the same time
The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](08bfe16129/src/libstd/macros.rs (L365-L381))), which we rely on to implement the optimization.
More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**
To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.
Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout.
For the layout computation, we use a simplified approach.
1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.
Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".
So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:
You can think of a generator as an enum, where some fields are shared between variants. e.g.
```rust
enum Generator {
Unresumed,
Poisoned,
Returned,
Suspend0(A, B, X),
Suspend1(B),
Suspend2(A, Y, Z),
}
```
where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.
Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.
---
Depends on:
- [x] #59897 Multi-variant layouts for generators
- [x] #60840 Preserve local scopes in generator MIR
- [x] #61373 Emit StorageDead along unwind paths for generators
Before merging:
- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at https://github.com/rust-lang/rust/pull/59897#issuecomment-489468627)
get rid of visit_place recursion
r? @spastorino
this is groundwork for https://github.com/rust-lang/rust/pull/60913, since after that PR we won't be able to implement `visit_place` in a recursive manner without heavy cloning everywhere.
cc @eddyb this touches const qualif
Change visit api
r? @oli-obk
In the [first commit](37386d366a) of this PR, I'm changing `visit_place` to be the function that traverses the `Place` and have only that responsibility. Then there are two other functions `visit_place_base` and `visit_projection` which are the ones in charge of visiting the base and the projection. Visitor implementors can implement any of those.
In the [second commit](e786f631b8) we can already see some things that confuses me, which I think this division will make more clear. The old code, first checked if the place was a base, did something with it and then called `super_place` [here](e786f631b8 (diff-d583e4efe1a72516e274158e53223633L678)). `super_place` checks again if it's a base [here](https://github.com/rust-lang/rust/blob/master/src/librustc/mir/visit.rs#L679-L684) and in case is a local, visits the local and stuff like that. That's not very obvious on the code, and if I'm not wrong it's not needed. In this PR or we have [this](e786f631b8 (diff-d583e4efe1a72516e274158e53223633R673)) as I did or we can just do `- => self.super_place_base(...)` and that will be obvious that I'm letting the default implementation process the base.
rustc: remove Res::Upvar.
By keeping track of the current "`body_owner`" (the `DefId` of the current fn/closure/const/etc.) in several passes, `Res::Upvar` and `hir::Upvar` don't need to contain contextual information about the closure.
By leveraging [`indexmap`](https://docs.rs/indexmap), the list of upvars for a given closure can now also be queried, to check whether a local variable is a closure capture, and so `Res::Upvar` can be merged with `Res::Local`.
And finally, the `tcx.upvars(...)` query now collects upvars from HIR, without relying on `rustc_resolve`.
r? @petrochenkov cc @varkor @davidtwco
light refactoring of global AllocMap
* rename AllocKind -> GlobalAlloc. This stores the allocation itself, not just its kind.
* rename the methods that allocate stuff to have consistent names.
Cc @oli-obk
* rename AllocKind -> GlobalAlloc. This stores the allocation itself, not just its kind.
* rename the methods that allocate stuff to have consistent names.
Remove LLVM instruction stats and other (obsolete) codegen stats.
Both `-Z count_llvm_insns` and `-Z codegen-stats` are removed, as (AFAIK) they have been of little use in the last few years, especially after the transition to MIR->LLVM codegen.
Other than for the LLVM instruction counts, `-Z codegen-stats` has long been obsoleted anyway.
r? @nagisa cc @rust-lang/compiler
Changes the type `mir::Mir` into `mir::Body`
Fixes part 1 of #60229 (previously attempted in #60242).
I stumbled upon the issue and it seems that the previous attempt at solving it was not merged. This is a second try more up-to-date.
The commit should have changed comments as well.
At the time of writting, it passes the tidy and check tool.
The commit should have changed comments as well.
At the time of writting, it passes the tidy and check tool.
Revisions asked by eddyb :
- Renamed of all the occurences of {visit/super}_mir
- Renamed test structures `CachedMir` to `Cached`
Fixing the missing import on `AggregateKind`
rename Scalar::Bits to Scalar::Raw and bits field to data
Also use this opportunity to seal some abstraction leaks (other modules constructing `Scalar::Bits` directly instead of using a constructor).
r? @oli-obk