Implement token-based handling of attributes during expansion
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:
* Derive macros no longer lose spans when their input is modified
by eager cfg-expansion. This is accomplished by performing eager
cfg-expansion on the token stream that we pass to the derive
proc-macro.
* Inner attributes now preserve spans in all cases, including when we
have multiple inner attributes in a row.
This is accomplished through the following changes:
* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
These are very similar to a normal `TokenTree`, but they also track
the position of attributes and attribute targets within the stream.
They are built when we collect tokens during parsing.
An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
`AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
is created during the parsing of a nested AST node to make the 'outer'
AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`), we tokenize and reparse our target, capturing additional information about the locations of `#[cfg]` and `#[cfg_attr]` attributes at any depth within the target. This is a performance optimization, allowing us to perform less work in the typical case where captured tokens never have eager cfg-expansion run.
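As a minimal sketch of what eager cfg-expansion means from the user's side: the derive only ever sees the input after `#[cfg]` has been applied. The names `Config` and `describe` are made up for illustration, and `#[cfg(any())]` is used here only because it is always false.

```rust
// Illustrative only: `Config` and `describe` are hypothetical names.
#[derive(Debug, Default, PartialEq)]
struct Config {
    name: String,
    // `any()` with no predicates is always false, so this field is stripped
    // before the derives see the struct. Its type never has to resolve.
    #[cfg(any())]
    removed: NonExistentType,
    retries: u32,
}

/// Formats the default value using the derived `Debug` impl.
fn describe() -> String {
    format!("{:?}", Config::default())
}
```

Calling `describe()` yields output that mentions only `name` and `retries`, since the stripped field is never part of the derive input.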
Fix NixOS patching
Moving `.nix-deps` broke the rpath links, and with them bootstrap on
NixOS entirely.
This PR still produces a `.nix-deps` but only for the purposes of
producing a gc root. We rpath a symlink-resolved result instead.
For simplicity we also use `symlinkJoin` to produce a single
merged output directory, so that we don't need to update multiple
locations every time we add a library.
Fixes a regression from https://github.com/rust-lang/rust/pull/82739.
Expand derive invocations in left-to-right order
While derives were being collected in left-to-right order, the
corresponding `Invocation`s were being pushed in the wrong order.
Avoid `;` -> `,` recovery and unclosed `}` recovery from being too verbose
Those two recovery attempts have a very bad interaction that causes too
much unnecessary output. Add a simple gate to avoid interpreting a `;` as a
`,` when there are unclosed braces.
Fix #83498.
reduce threads spawned by ui-tests
The test harness already spawns enough tests to keep all cores busy.
Individual tests should keep their own threading to a minimum to avoid context switch overhead.
When running ui tests with lld enabled this shaves about 10% off that testsuite on my machine.
Resolves #81946
rustdoc: Don't generate blanket impls when running --show-coverage
`get_blanket_impls` is the slowest part of rustdoc, and the coverage pass
completely ignores blanket impls. This stops running it at all, and also
removes some unnecessary checks in `calculate_doc_coverage` that ignored
the impl anyway.
We don't currently measure --show-coverage in perf.rlo, but I tested
this locally on cargo and it brought the time down from 2.9 to 1.6
seconds.
This also adds back a commented-out test; rustdoc has been able to deal with `impl Trait` for almost a year now.
r? `@GuillaumeGomez`
The test harness already spawns enough tests to occupy all cores, so individual
tests should keep their own threading to a minimum to avoid context-switch
overhead.
Some tests fail with 1 CGU, so explicit compile flags have been added
to keep their old behavior.
Don't concatenate binders across types
Partially addresses #83737
There are actually two issues that I uncovered in #83737. The first is that we are concatenating bound vars across types, i.e. in
```
F: Fn(&()) -> &mut (dyn Future<Output = ()> + Unpin)
```
the bound vars on `Future` get set as `for<anon>` since those are the binders on `Fn(&())`. This is obviously wrong, since we should only concatenate directly nested trait refs. This is solved here by introducing a new `TraitRefBoundary` scope, which we put around the "syntactical" trait refs and across which we don't allow concatenation.
Now, this alone *shouldn't* be a super terrible problem. At least not until you consider the other issue, which is much more elusive and for which a "perfect" fix is harder to design. A repro can be seen in:
```
use core::future::Future;

async fn handle<F>(slf: &F)
where
    F: Fn(&()) -> &mut (dyn for<'a> Future<Output = ()> + Unpin),
{
    (slf)(&()).await;
}
```
Notice the `for<'a>` around `Future`. Here, `'a` is unused, so the `for<'a>` Binder gets changed to a `for<>` Binder in the generator witness, but the "local decl" still has it. This has heavy intersections with region anonymization and erasing. Luckily, it's not *super* common to find this unique set of circumstances. It only became apparent because of the first issue mentioned here. However, this *is* still a problem, so I'm leaving #83737 open.
r? `@nikomatsakis`
Merge idents when generating source content
The idea here is to not have a span for each part of a path. Currently, for `a::b::c` we generate `<span>a</span>::<span>b</span>::<span>c</span>`. With this change, we will generate `<span>a::b::c</span>`.
A nice "side-effect" is that it reduces the size of the output HTML too. :)
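The merging idea can be sketched roughly as follows. This is a toy model, not rustdoc's actual highlighter: the `Tok` enum and the `render` and `flush` functions are hypothetical, and HTML escaping is left out for brevity.

```rust
/// Hypothetical token kinds for this sketch; not rustdoc's real types.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    PathSep, // `::`
    Other(String),
}

/// Emits the pending path (if any) as a single `<span>` and clears it.
fn flush(path: &mut String, out: &mut String) {
    if !path.is_empty() {
        out.push_str("<span>");
        out.push_str(path);
        out.push_str("</span>");
        path.clear();
    }
}

/// Renders tokens as HTML with one `<span>` per path instead of per segment.
fn render(tokens: &[Tok]) -> String {
    let mut out = String::new();
    let mut path = String::new(); // pending run of idents joined by `::`
    for tok in tokens {
        match tok {
            Tok::Ident(name) => {
                // Two idents with no `::` between them are separate paths.
                if !path.is_empty() && !path.ends_with("::") {
                    flush(&mut path, &mut out);
                }
                path.push_str(name);
            }
            // A separator following an ident extends the pending path.
            Tok::PathSep if !path.is_empty() => path.push_str("::"),
            // A leading separator has no path to join, so emit it as-is.
            Tok::PathSep => out.push_str("::"),
            Tok::Other(text) => {
                flush(&mut path, &mut out);
                out.push_str(text);
            }
        }
    }
    flush(&mut path, &mut out);
    out
}
```

Feeding the tokens for `a::b::c` into `render` produces one span around the whole path, which is also where the size reduction in the generated HTML comes from.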
cc `@notriddle`
The issue was that the resulting debuginfo was too complex for LLVM to
translate into CodeView records correctly. As a result, it simply
ignored the debuginfo, which meant Windows debuggers could not display
any closed-over variables when stepping inside a closure.
This fixes that by spilling additional variables to the stack so that
the resulting debuginfo is simple (just `*my_variable.dbg.spill`) and
LLVM can generate the correct CV records.
rustc: Add a new `wasm` ABI
This commit implements the idea of a new ABI for the WebAssembly target,
one called `"wasm"`. This ABI is entirely of my own invention
and has no current precedent, but I think that the addition of this ABI
might help solve a number of issues with the WebAssembly targets.
When `wasm32-unknown-unknown` was first added to Rust I naively
"implemented an abi" for the target. I then went to write `wasm-bindgen`
which accidentally relied on details of this ABI. Turns out the ABI
definition didn't match C, which is causing issues for C/Rust interop.
Currently the compiler has a "wasm32 bindgen compat" ABI which is the
original implementation I added, and it's purely there for, well,
`wasm-bindgen`.
Another issue with the WebAssembly target is that it's not clear to me
when and if the default C ABI will change to account for WebAssembly's
multi-value feature (a feature that allows functions to return multiple
values). Even if this does happen, though, it seems like the C ABI will
be guided based on the performance of WebAssembly code and will likely
not match even what the current wasm-bindgen-compat ABI is today. This
leaves a hole in Rust's expressivity in binding WebAssembly where given
a particular import type, Rust may not be able to import that signature
with an updated C ABI for multi-value.
To fix these issues I had the idea of a new ABI for WebAssembly, one
called `wasm`. The definition of this ABI is "what you write
maps straight to wasm". The goal here is that whatever you write down in
the parameter list or in the return values goes straight into the
function's signature in the WebAssembly file. This special ABI is for
intentionally matching the ABI of an imported function from the
environment or exporting a function with the right signature.
With the addition of a new ABI, this enables rustc to:
* Eventually remove the "wasm-bindgen compat hack". Once this
ABI is stable, wasm-bindgen can switch to using it everywhere.
Afterwards the wasm32-unknown-unknown target can have its default ABI
updated to match C.
* Expose the ability to precisely match an ABI signature for a
WebAssembly function, regardless of what the C ABI that clang chooses
turns out to be.
* Continue to evolve the definition of the default C ABI to match what
clang does on all targets, since the purpose of that ABI will be
explicitly matching C rather than generating particular function
imports/exports.
Naturally this is implemented as an unstable feature initially, but it
would be nice for this to get stabilized (if it works) in the near-ish
future to remove the wasm32-unknown-unknown incompatibility with the C
ABI. Doing this, however, requires the feature to be on stable because
wasm-bindgen works with stable Rust.
Remove the insta-stable `cfg(wasm)`
The addition of `cfg(wasm)` was an oversight on my end that turns out to have a number
of downsides:
* It was introduced as an insta-stable addition, forgoing the usual
staging mechanism we use for potentially far-reaching changes;
* It is a breaking change for people who are using `--cfg wasm` either
directly or via cargo for other purposes;
* It is not entirely clear whether a bare `wasm` cfg is the right option, or
whether the `wasm` family of targets is special enough to warrant
special-casing these targets specifically.
As for the last point, there appears to be a fair amount of support for
reducing the boilerplate in specifying architectures from the same
family, while ignoring their pointer width. The suggested way forward
would be to propose such a change as a separate RFC as it is potentially
a quite contentious addition.
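For context, the boilerplate in question looks roughly like the sketch below, which spells out each architecture of the family with `target_arch`. The function names `is_wasm` and `platform` are made up for illustration.

```rust
/// Returns true when compiled for any WebAssembly architecture,
/// regardless of pointer width.
fn is_wasm() -> bool {
    cfg!(any(target_arch = "wasm32", target_arch = "wasm64"))
}

// The same predicate has to be repeated for conditional compilation.
#[cfg(any(target_arch = "wasm32", target_arch = "wasm64"))]
fn platform() -> &'static str {
    "wasm"
}

#[cfg(not(any(target_arch = "wasm32", target_arch = "wasm64")))]
fn platform() -> &'static str {
    "native"
}
```

A family-level cfg would collapse the repeated `any(...)` predicate into one name, which is the kind of change the paragraph above suggests taking through an RFC.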
cc #83879 `@devsnek`
rustdoc: Link to the docs on namespaces when an unknown disambiguator is found
cc https://github.com/rust-lang/rust/issues/83859
`@lopopolo` does this look about like what you expected?
r? `@camelid`
Rollup of 5 pull requests
Successful merges:
- #82497 (Fix handling of `--output-format json` flag)
- #83689 (Add more info for common trait resolution and async/await errors)
- #83952 (Account for `ExprKind::Block` when suggesting .into() and deref)
- #83965 (Add Debug implementation for hir::intravisit::FnKind)
- #83974 (Fix outdated crate names in `rustc_interface::callbacks`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup