Don't store locals that have been moved from in generators
This avoids reserving storage in generators for locals that are moved
out of (and not re-initialized) prior to yield points. Fixes#59123.
This adds a new dataflow analysis, `RequiresStorage`, to determine whether the storage of a local can be destroyed without being observed by the program. The rules are:
1. StorageLive(x) => mark x live
2. StorageDead(x) => mark x dead
3. If a local is moved from, _and has never had its address taken_, mark it dead
4. If (any part of) a local is initialized, mark it live'
This is used to determine whether to save a local in the generator object at all, as well as which locals can be overlapped in the generator layout.
Here's the size in bytes of all testcases included in the change, before and after the change:
async fn test |Size before |Size after
-----------------|------------|----------
single | 1028 | 1028
single_with_noop | 2056 | 1032
joined | 5132 | 3084
joined_with_noop | 8208 | 3084
generator test |Size before |Size after
----------------------------|------------|----------
move_before_yield | 1028 | 1028
move_before_yield_with_noop | 2056 | 1032
overlap_move_points | 3080 | 2056
## Future work
Note that there is a possible extension to this optimization, which modifies rule 3 to read: "If a local is moved from, _**and either has never had its address taken, or is Freeze and has never been mutably borrowed**_, mark it dead." This was discussed at length in #59123 and then #61849. Because this would cause some behavior to be UB which was not UB before, it's a step that needs to be taken carefully.
A more immediate priority for me is inlining `std::mem::size_of_val(&x)` so it becomes apparent that the address of `x` is not taken. This way, using `size_of_val` to look at the size of your inner futures does not affect the size of your outer future.
cc @cramertj @eddyb @Matthias247 @nikomatsakis @RalfJung @Zoxc
Rollup of 8 pull requests
Successful merges:
- #62062 (Use a more efficient iteration order for forward dataflow)
- #62063 (Use a more efficient iteration order for backward dataflow)
- #62224 (rustdoc: remove unused derives and variants)
- #62228 (Extend the #[must_use] lint to boxed types)
- #62235 (Extend the `#[must_use]` lint to arrays)
- #62239 (Fix a typo)
- #62241 (Always parse 'async unsafe fn' + properly ban in 2015)
- #62248 (before_exec actually will only get deprecated with 1.37)
Failed merges:
r? @ghost
Stabilize `type_alias_enum_variants` in Rust 1.37.0
Stabilize `#![feature(type_alias_enum_variants)]` which allows type-relative resolution with highest priority to `enum` variants in both expression and pattern contexts. For example, you may now write:
```rust
enum Option<T> {
None,
Some(T),
}
type OptAlias<T> = Option<T>;
fn work_on_alias(x: Option<u8>) -> u8 {
match x {
OptAlias::Some(y) => y + 1,
OptAlias::None => 0,
}
}
```
Closes https://github.com/rust-lang/rfcs/issues/2218
Closes https://github.com/rust-lang/rust/issues/52118
r? @petrochenkov
Add `--pass $mode` to compiletest through `./x.py`
Adds a flag `--pass $mode` to compiletest, which is exposed through `./x.py`.
When `--pass $mode` is passed, `{check,build,compile,run}-pass` tests will be forced to run under the given `$mode` unless the directive `// ignore-pass` exists in the test file.
The modes are explained in https://github.com/rust-lang/rust/pull/61778:
- `check` has the same effect as `cargo check`
- `build` or `compile` have the same effect as `cargo build`
- `run` has the same effect as `cargo run`
On my machine, `./x.py -i test src/test/run-pass --stage 1 --pass check` takes 38 seconds whereas it takes 2 min 7 seconds without `--pass check`.
cc https://github.com/rust-lang/rust/issues/61712
r? @petrochenkov
Revert "Set test flag when rustdoc is running with --test option"
Reverts https://github.com/rust-lang/rust/pull/59940.
It caused doctests in this repository to no longer be tested including all of the core crate.
Don't ICE on item in `.await` expression
The code for lowering a `.await` expression missed that item IDs may already have been assigned for items inside of an `async` block, or for closures. This change means we no longer exit early after finding a `.await` in a block that isn't `async` and instead just emit the error. This avoids an ICE generated due to item IDs not being densely generated. (The `YieldSource` of the generated `yield` expression is used to avoid errors generated about having `yield` expressions outside of generator literals.)
r? @cramertj
Resolves#62009 and resolves#61685
Add suggestion for missing `.await` keyword
This commit adds a suggestion diagnostic for missing `.await`. In order to do this, the trait `Future` is promoted to a lang item.
Compiling code of the form:
```rust
#![feature(async_await)]
fn take_u32(x: u32) {}
async fn make_u32() -> u32 {
22
}
async fn foo() {
let x = make_u32();
take_u32(x)
}
fn main() {}
```
Will now result in the suggestion:
```
error[E0308]: mismatched types
--> src/main.rs:11:18
|
11 | take_u32(x)
| ^
| |
| expected u32, found opaque type
| help: consider using `.await` here: `x.await`
|
= note: expected type `u32`
found type `impl std::future::Future`
```
This commit does not cover chained expressions and therefore only covers the case originally pointed out in #61076. Cases I can think of that still need to be covered:
- [ ] Return places for functions
- [ ] Field access
- [ ] Method invocation
I'm planning to submit PRs for each of these separately as and when I have figured them out.
Clean up MIR drop generation
* Don't assign twice to the destination of a `while` loop containing a `break` expression
* Use `as_temp` to evaluate statement expression
* Avoid consecutive `StorageLive`s for the condition of a `while` loop
* Unify `return`, `break` and `continue` handling, and move it to `scopes.rs`
* Make some of the `scopes.rs` internals private
* Don't use `Place`s that are always `Local`s in MIR drop generation
Closes#42371Closes#61579Closes#61731Closes#61834Closes#61910Closes#62115
rustc: correctly transform memory_index mappings for generators.
Fixes#61793, closes#62011 (previous attempt at fixing #61793).
During #60187, I made the mistake of suggesting that the (re-)computation of `memory_index` in `ty::layout`, after generator-specific logic split/recombined fields, be done off of the `offsets` of those fields (which needed to be computed anyway), as opposed to the `memory_index`.
`memory_index` maps each field to its in-memory order index, which ranges over the same `0..n` values as the fields themselves, making it a bijective mapping, and more specifically a permutation (indeed, it's the permutation resulting from field reordering optimizations).
Each field has an unique "memory index", meaning a sort based on them, even an unstable one, will not put them in the wrong order. But offsets don't have that property, because of ZSTs (which do not increase the offset), so sorting based on the offset of fields alone can (and did) result in wrong orders.
Instead of going back to sorting based on (slices/subsets of) `memory_index`, or special-casing ZSTs to make sorting based on offsets produce the right results (presumably), as #62011 does, I opted to drop sorting altogether and focus on `O(n)` operations involving *permutations*:
* a permutation is easily inverted (see the `invert_mapping` `fn`)
* an `inverse_memory_index` was already employed in other parts of the `ty::layout` code (that is, a mapping from memory order to field indices)
* inverting twice produces the original permutation, so you can invert, modify, and invert again, if it's easier to modify the inverse mapping than the direct one
* you can modify/remove elements in a permutation, as long as the result remains dense (i.e. using every integer in `0..len`, without gaps)
* for splitting a `0..n` permutation into disjoint `0..x` and `x..n` ranges, you can pick the elements based on a `i < x` / `i >= x` predicate, and for the latter, also subtract `x` to compact the range to `0..n-x`
* in the general case, for taking an arbitrary subset of the permutation, you need a renumbering from that subset to a dense `0..subset.len()` - but notably, this is still `O(n)`!
* you can merge permutations, as long as the result remains disjoint (i.e. each element is unique)
* for concatenating two `0..n` and `0..m` permutations, you can renumber the elements in the latter to `n..n+m`
* some of these operations can be combined, and an inverse mapping (be it a permutation or not) can still be used instead of a forward one by changing the "domain" of the loop performing the operation
I wish I had a nicer / more mathematical description of the recombinations involved, but my focus was to fix the bug (in a way which preserves information more directly than sorting would), so I may have missed potential changes in the surrounding generator layout code, that would make this all more straight-forward.
r? @tmandry
Rollup of 7 pull requests
Successful merges:
- #61814 (Fix an ICE with uninhabited consts)
- #61987 (rustc: produce AST instead of HIR from `hir::lowering::Resolver` methods.)
- #62055 (Fix error counting)
- #62078 (Remove built-in derive macros `Send` and `Sync`)
- #62085 (Add test for issue-38591)
- #62091 (HirIdification: almost there)
- #62096 (Implement From<Local> for Place and PlaceBase)
Failed merges:
r? @ghost
Fix error counting
Count duplicate errors for `track_errors` and other error counting checks.
Add FIXMEs to make it clear that we should be moving away from this kind of logic.
Closes#61663
Fix HIR visit order
Fixes#61442
When rustc::middle::region::ScopeTree computes its yield_in_scope
field, it relies on the HIR visitor order to properly compute which
types must be live across yield points. In order for the computed scopes
to agree with the generated MIR, we must ensure that expressions
evaluated before a yield point are visited before the 'yield'
expression.
However, the visitor order for ExprKind::AssignOp
was incorrect. The left-hand side of a compund assignment expression is
evaluated before the right-hand side, but the right-hand expression was
being visited before the left-hand expression. If the left-hand
expression caused a new type to be introduced (e.g. through a
deref-coercion), the new type would be incorrectly seen as occuring
*after* the yield point, instead of before. This leads to a mismatch
between the computed generator types and the MIR, since the MIR will
correctly see the type as being live across the yield point.
To fix this, we correct the visitor order for ExprKind::AssignOp
to reflect the actual evaulation order.
Refactor miri pointer checks
Centralize bounds, alignment and NULL checking for memory accesses in one function: `memory.check_ptr_access`. That function also takes care of converting a `Scalar` to a `Pointer`, should that be needed. Not all accesses need that though: if the access has size 0, `None` is returned. Everyone accessing memory based on a `Scalar` should use this method to get the `Pointer` they need.
All operations on the `Allocation` work on `Pointer` inputs and expect all the checks to have happened (and will ICE if the bounds are violated). The operations on `Memory` work on `Scalar` inputs and do the checks themselves.
The only other public method to check pointers is `memory.ptr_may_be_null`, which is needed in a few places. No need for `check_align` or similar methods. That makes the public API surface much easier to use and harder to mis-use.
This should be largely no-functional-change, except that ZST accesses to a "true" pointer that is dangling or out-of-bounds are now considered UB. This is to be conservative wrt. whatever LLVM might be doing.
While I am at it, this also removes the assumption that the vtable part of a `dyn Trait`-fat-pointer is a `Pointer` (as opposed to a pointer cast to an integer, stored as raw bits).
r? @oli-obk