proc_macro: Avoid cached TokenStream more often
This commit adds even more pessimization to use the cached `TokenStream` inside
of an AST node. As a reminder the `proc_macro` API requires taking an arbitrary
AST node and transforming it back into a `TokenStream` to hand off to a
procedural macro. Such functionality isn't actually implemented in rustc today,
so the way `proc_macro` works today is that it stringifies an AST node and then
reparses for a list of tokens.
This strategy unfortunately loses all span information, so we try to avoid it
whenever possible. Implemented in #43230 some AST nodes have a `TokenStream`
cache representing the tokens they were originally parsed from. This
`TokenStream` cache, however, has turned out to not always reflect the current
state of the item when it's being tokenized. For example `#[cfg]` processing or
macro expansion could modify the state of an item. Consequently we've seen a
number of bugs (#48644 and #49846) related to using this stale cache.
This commit tweaks the usage of the cached `TokenStream` to compare it to our
lossy stringification of the token stream. If the tokens that make up the cache
and the stringified token stream are the same then we return the cached version
(which has correct span information). If they differ, however, then we will
return the stringified version as the cache has been invalidated and we just
haven't figured that out.
Closes#48644Closes#49846
Hygiene 2.0: Avoid comparing fields by name
There are two separate commits here (not counting tests):
- The first one unifies named (`obj.name`) and numeric (`obj.0`) field access expressions in AST and HIR. Before field references in these expressions are resolved it doesn't matter whether the field is named or numeric (it's just a symbol) and 99% of code is common. After field references are resolved we work with
them by index for all fields (see the second commit), so it's again not important whether the field was named or numeric (this includes MIR where all fields were already by index).
(This refactoring actually fixed some bugs in HIR-based borrow checker where borrows through names (`S {
0: ref x }`) and indices (`&s.0`) weren't considered overlapping.)
- The second commit removes all by-name field comparison and instead resolves field references to their indices once, and then uses those resolutions. (There are still a few name comparisons in save-analysis, because save-analysis is weird, but they are made correctly hygienic).
Thus we are fixing a bunch of "secondary" field hygiene bugs (in borrow checker, lints).
Fixes https://github.com/rust-lang/rust/issues/46314
Merge the std_unicode crate into the core crate
[The standard library facade](https://github.com/rust-lang/rust/issues/27783) has historically contained a number of crates with different roles, but that number has decreased over time. `rand` and `libc` have moved to crates.io, and [`collections` was merged into `alloc`](https://github.com/rust-lang/rust/pull/42648). Today we have `core` that applies everywhere, `std` that expects a full operating system, and `alloc` in-between that only requires a memory allocator (which can be provided by users)… and `std_unicode`, which doesn’t really have a reason to be separate anymore. It contains functionality based on Unicode data tables that can be large, but as long as relevant functions are not called the tables should be removed from binaries by linkers.
This deprecates the unstable `std_unicode` crate and moves all of its contents into `core`, replacing them with `pub use` reexports. The crate can be removed later. This also removes the `CharExt` trait (replaced with inherent methods in libcore) and `UnicodeStr` trait (merged into `StrExt`). There traits were both unstable and not intended to be used or named directly.
A number of new items are newly-available in libcore and instantly stable there, but only if they were already stable in libstd.
Fixes#49319.
Use sort_by_cached_key where appropriate
A follow-up to https://github.com/rust-lang/rust/pull/48639, converting various slice sorting calls to `sort_by_cached_key` when the key functions are more expensive.
This commit adds even more pessimization to use the cached `TokenStream` inside
of an AST node. As a reminder the `proc_macro` API requires taking an arbitrary
AST node and transforming it back into a `TokenStream` to hand off to a
procedural macro. Such functionality isn't actually implemented in rustc today,
so the way `proc_macro` works today is that it stringifies an AST node and then
reparses for a list of tokens.
This strategy unfortunately loses all span information, so we try to avoid it
whenever possible. Implemented in #43230 some AST nodes have a `TokenStream`
cache representing the tokens they were originally parsed from. This
`TokenStream` cache, however, has turned out to not always reflect the current
state of the item when it's being tokenized. For example `#[cfg]` processing or
macro expansion could modify the state of an item. Consequently we've seen a
number of bugs (#48644 and #49846) related to using this stale cache.
This commit tweaks the usage of the cached `TokenStream` to compare it to our
lossy stringification of the token stream. If the tokens that make up the cache
and the stringified token stream are the same then we return the cached version
(which has correct span information). If they differ, however, then we will
return the stringified version as the cache has been invalidated and we just
haven't figured that out.
Closes#48644Closes#49846
Impressing confused Python users with magical diagnostics is perhaps
worth this not-grossly-unreasonable (only 40ish lines) extra complexity
in the parser?
Thanks to Vadim Petrochenkov for guidance.
This resolves#46836.
Correct a few stability attributes
* `const_indexing` language feature was stabilized in 1.26.0 by #46882
* `Display` impls for `PanicInfo` and `Location` were stabilized in 1.26.0 by #47687
* `TrustedLen` is still unstable so its impls should be as well even though `RangeInclusive` was stabilized by #47813
* `!Send` and `!Sync` for `Args` and `ArgsOs` were stabilized in 1.26.0 by #48005
* `EscapeDefault` has been stable since 1.0.0 so should continue to show that even though it was moved to core in #48735
This could be backported to beta like #49612
Expand macros in `extern {}` blocks
This permits macro and proc-macro and attribute invocations (the latter only with the `proc_macro` feature of course) in `extern {}` blocks, gated behind a new `macros_in_extern` feature.
A tracking issue is now open at #49476closes#48747
Rollup of 9 pull requests
Successful merges:
- #48658 (Add a generic CAS loop to std::sync::Atomic*)
- #49253 (Take the original extra-filename passed to a crate into account when resolving it as a dependency)
- #49345 (RFC 2008: Finishing Touches)
- #49432 (Flush executables to disk after linkage)
- #49496 (Add more vec![... ; n] optimizations)
- #49563 (add a dist builder to build rust-std components for the THUMB targets)
- #49654 (Host compiler documentation: Include private items)
- #49667 (Add more features to rust_2018_preview)
- #49674 (ci: Remove x86_64-gnu-incremental builder)
Failed merges:
Easy edition feature flag
We no longer gate features on epochs; instead we have a `#![feature(rust_2018_preview)]` that flips on a bunch of features (currently dyn_trait).
Based on #49001 to avoid merge conflicts
r? @nikomatsakis
Expand Attributes on Statements and Expressions
This enables attribute-macro expansion on statements and expressions while retaining the `stmt_expr_attributes` feature requirement for attributes on expressions.
closes#41475
cc #38356 @petrochenkov @jseyfried
r? @nrc
Fix escaped backslash in windows file not found message
When a module is declared, but no matching file exists, rustc gives
an error like `help: name the file either foo.rs or foo/mod.rs inside
the directory "src/bar"`. However, at on windows, the backslash was
double-escaped when naming the directory.
It did this because the string was printed in debug mode (`"{:?}"`) to
surround it with quotes. However, it should just be printed like any
other directory in an error message and surrounded by escaped quotes,
rather than relying on the debug print to add quotes (`"\"{}\""`).
I also checked the test suite to see if this output is being correctly tested. It's not - it only tests up to the word "directory". Presumably this is so that the test is not dependent on its exact position in the source tree. I don't know a better way to test this, unless the test suite supports regex?
When a module is declared, but no matching file exists, rustc gives
an error like 'help: name the file either foo.rs or foo/mod.rs inside
the directory "src/bar"'. However, at on windows, the backslash was
double-escaped when naming the directory.
It did this because the string was printed in debug mode ( "{:?}" ) to
surround it with quotes. However, it should just be printed like any
other directory in an error message and surrounded by escaped quotes,
rather than relying on the debug print to add quotes ( "\"{}\"" ).