Handle `macro_rules!` tokens consistently across crates
When we serialize a `macro_rules!` macro, we used a 'lowered' `TokenStream` for its body, which has all `Nonterminal`s expanded in-place via `nt_to_tokenstream`. This matters when an 'outer' `macro_rules!` macro expands to an 'inner' `macro_rules!` macro - the inner macro may use tokens captured from the 'outer' macro in its definition.
This means that invoking a foreign `macro_rules!` macro may use a different body `TokenStream` than when the same `macro_rules!` macro is invoked in the same crate. This difference is observable by proc-macros invoked by a `macro_rules!` macro - a `None`-delimited group will be seen in the same-crate case (inserted when convering `Nonterminal`s to the `proc_macro` crate's structs), but no `None`-delimited group in the cross-crate case.
To fix this inconsistency, we now insert `None`-delimited groups when 'lowering' a `Nonterminal` `macro_rules!` body, just as we do in `proc_macro_server`. Additionally, we no longer print extra spaces for `None`-delimited groups - as far as pretty-printing is concerned, they don't exist (only their contents do). This ensures that `Display` output of a `TokenStream` does not depend on which crate a `macro_rules!` macro was invoked from.
This PR is necessary in order to patch the `solana-genesis-programs` for the upcoming hygiene serialization breakage (https://github.com/rust-lang/rust/pull/72121#issuecomment-646924847). The `solana-genesis-programs` crate will need to use a proc macro to re-span certain tokens in a nested `macro_rules!`, which requires us to consistently use a `None`-delimited group.
See `src/test/ui/proc-macro/nested-macro-rules.rs` for an example of the kind of nested `macro_rules!` affected by this crate.
When a `macro_rules!` macro expands to another `macro_rules!` macro, we
may see `None`-delimited groups in odd places when another crate
deserializes the 'inner' macro. This commit 'unwraps' an outer
`None`-delimited group to avoid breaking existing code.
See https://github.com/rust-lang/rust/pull/73569#issuecomment-650860457
for more details.
The proper fix is to handle `None`-delimited groups systematically
throughout the parser, but that will require significant work. In the
meantime, this hack lets us fix important hygiene bugs in macros
The number of symbols we allocate (even early on) seems to be platform
dependent. We only care about hygiene for the purposes of this test,
so just set all of the symbol ids to zero
Normally, we encode a `Span` that references a foreign `SourceFile` by
encoding information about the foreign crate. When we decode this
`Span`, we lookup the foreign crate in order to decode the `SourceFile`.
However, this approach does not work for proc-macro crates. When we load
a proc-macro crate, we do not deserialzie any of its dependencies (since
a proc-macro crate can only export proc-macros). This means that we
cannot serialize a reference to an upstream crate, since the associated
metadata will not be available when we try to deserialize it.
This commit modifies foreign span handling so that we treat all foreign
`SourceFile`s as local `SourceFile`s when serializing a proc-macro.
All `SourceFile`s will be stored into the metadata of a proc-macro
crate, allowing us to cotinue to deserialize a proc-macro crate without
needing to load any of its dependencies.
Since the number of foreign `SourceFile`s that we load during a
compilation session may be very large, we only serialize a `SourceFile`
if we have also serialized a `Span` which requires it.
Show `SyntaxContext` in formatted `Span` debug output
This is only really useful in debug messages, so I've switched to
calling `span_to_string` in any place that causes a `Span` to end up in
user-visible output.
Don't lose empty `where` clause when pretty-printing
Previously, we would parse `struct Foo where;` and `struct Foo;`
identically, leading to an 'empty' `where` clause being omitted during
pretty printing. This will cause us to lose spans when proc-macros
involved, since we will have a collected `where` token that does not
appear in the pretty-printed item.
We now explicitly track the presence of a `where` token during parsing,
so that we can distinguish between `struct Foo where;` and `struct Foo;`
during pretty-printing
This is only really useful in debug messages, so I've switched to
calling `span_to_string` in any place that causes a `Span` to end up in
user-visible output.
Previously, we would parse `struct Foo where;` and `struct Foo;`
identically, leading to an 'empty' `where` clause being omitted during
pretty printing. This will cause us to lose spans when proc-macros
involved, since we will have a collected `where` token that does not
appear in the pretty-printed item.
We now explicitly track the presence of a `where` token during parsing,
so that we can distinguish between `struct Foo where;` and `struct Foo;`
during pretty-printing
Currently, the `Debug` impl for `proc_macro::Span` just prints out
the byte range. This can make debugging proc macros (either as a crate
author or as a compiler developer) very frustrating, since neither the
actual filename nor the `SyntaxContext` is displayed.
This commit adds a perma-unstable flag `-Z span-debug`. When enabled,
the `Debug` impl for `proc_macro::Span` simply forwards directly to
`rustc_span::Span`. Once #72618 is merged, this will start displaying
actual line numbers.
While `Debug` impls are not subject to Rust's normal stability
guarnatees, we probably shouldn't expose any additional information on
stable until `#![feature(proc_macro_span)]` is stabilized. Otherwise,
we would be providing a 'backdoor' way to access information that's
supposed be behind unstable APIs.
Fixes#68489
When checking two `TokenStreams` to see if they are 'probably equal',
we ignore the `IsJoint` information associated with each `TokenTree`.
However, the `IsJoint` information determines whether adjacent tokens
will be 'glued' (if possible) when construction the `TokenStream` - e.g.
`[Gt Gt]` can be 'glued' to `BinOp(Shr)`.
Since we are ignoring the `IsJoint` information, 'glued' and 'unglued'
tokens are equivalent for determining if two `TokenStreams` are
'probably equal'. Therefore, we need to 'unglue' all tokens in the
stream to avoid false negatives (which cause us to throw out the cached
tokens, losing span information).
Moving more build-pass tests to check-pass
One or two tests became build-pass without the FIXME because they really
needed build-pass (were failing without it).
Helps with #62277
---
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/rust-lang/rust/71340)
<!-- Reviewable:end -->
Increase verbosity when suggesting subtle code changes
Do not suggest changes that are actually quite small inline, to minimize the likelihood of confusion.
Fix#69243.