Properly handle attributes on statements
We now collect tokens for the underlying node wrapped by `StmtKind`
nstead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
Loading a macro from libstd causes us to load serialized
`SyntaxContext`s in a platform-dependent way, causing the printed spans
to differ between platforms.
When parsing a statement (e.g. inside a function body),
we now consider `struct Foo {};` and `$stmt;` to each consist
of two statements: `struct Foo {}` and `;`, and `$stmt` and `;`.
As a result, an attribute macro invoke as
`fn foo() { #[attr] struct Bar{}; }` will see `struct Bar{}` as its
input. Additionally, the 'unused semicolon' lint now fires in more
places.
We now collect tokens for the underlying node wrapped by `StmtKind`
instead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
Cache pretty-print/retokenize result to avoid compile time blowup
Fixes#79242
If a `macro_rules!` recursively builds up a nested nonterminal
(passing it to a proc-macro at each step), we will end up repeatedly
pretty-printing/retokenizing the same nonterminals. Unfortunately, the
'probable equality' check we do has a non-trivial cost, which leads to a
blowup in compilation time.
As a workaround, we cache the result of the 'probable equality' check,
which eliminates the compilation time blowup for the linked issue. This
commit only touches a single file (other than adding tests), so it
should be easy to backport.
The proper solution is to remove the pretty-print/retokenize hack
entirely. However, this will almost certainly break a large number of
crates that were relying on hygiene bugs created by using the reparsed
`TokenStream`. As a result, we will definitely not want to backport
such a change.
Fixes#79242
If a `macro_rules!` recursively builds up a nested nonterminal
(passing it to a proc-macro at each step), we will end up repeatedly
pretty-printing/retokenizing the same nonterminals. Unfortunately, the
'probable equality' check we do has a non-trivial cost, which leads to a
blowup in compilation time.
As a workaround, we cache the result of the 'probable equality' check,
which eliminates the compilation time blowup for the linked issue. This
commit only touches a single file (other than adding tests), so it
should be easy to backport.
The proper solution is to remove the pretty-print/retokenize hack
entirely. However, this will almost certainly break a large number of
crates that were relying on hygiene bugs created by using the reparsed
`TokenStream`. As a result, we will definitely not want to backport
such a change.
Use reparsed `TokenStream` if we captured any inner attributes
Fixes#78675
We now bail out of `prepend_attrs` if we ended up capturing any inner
attributes (which can happen in several places, due to token capturing
for `macro_rules!` arguments.
Treat trailing semicolon as a statement in macro call
See #61733 (comment)
We now preserve the trailing semicolon in a macro invocation, even if
the macro expands to nothing. As a result, the following code no longer
compiles:
```rust
macro_rules! empty {
() => { }
}
fn foo() -> bool { //~ ERROR mismatched
{ true } //~ ERROR mismatched
empty!();
}
```
Previously, `{ true }` would be considered the trailing expression, even
though there's a semicolon in `empty!();`
This makes macro expansion more token-based.
Fixes#78675
We now bail out of `prepend_attrs` if we ended up capturing any inner
attributes (which can happen in several places, due to token capturing
for `macro_rules!` arguments.
See https://github.com/rust-lang/rust/issues/61733#issuecomment-716188981
We now preserve the trailing semicolon in a macro invocation, even if
the macro expands to nothing. As a result, the following code no longer
compiles:
```rust
macro_rules! empty {
() => { }
}
fn foo() -> bool { //~ ERROR mismatched
{ true } //~ ERROR mismatched
empty!();
}
```
Previously, `{ true }` would be considered the trailing expression, even
though there's a semicolon in `empty!();`
This makes macro expansion more token-based.
This allows us to avoid synthesizing tokens in `prepend_attr`, since we
have the original tokens available.
We still need to synthesize tokens when expanding `cfg_attr`,
but this is an unavoidable consequence of the syntax of `cfg_attr` -
the user does not supply the `#` and `[]` tokens that a `cfg_attr`
expands to.
Fixes#74616
Makes progress towards #43081
Unblocks PR #76130
When pretty-printing an AST node, we may insert additional parenthesis
to ensure that precedence is properly preserved in code we output.
However, the proc macro implementation relies on comparing a
pretty-printed AST node to the captured `TokenStream`. Inserting extra
parenthesis changes the structure of the reparsed `TokenStream`, making
the comparison fail.
This PR refactors the AST pretty-printing code to allow skipping the
insertion of additional parenthesis. Several freestanding methods are
moved to trait methods on `PrintState`, which keep track of an internal
`insert_extra_parens` flag. This flag is normally `true`, but we expose
a public method which allows pretty-printing a nonterminal with
`insert_extra_parens = false`.
To avoid changing the public interface of `rustc_ast_pretty`, the
freestanding `_to_string` methods are changed to delegate to a
newly-crated `State`. The main pretty-printing code is moved to a new
`state` module to ensure that it does not accidentally call any of these
public helper functions (instead, the internal functions with the same
name should be used).
Makes progress towards #43081
In PR #73084, we started recursively expanded nonterminals during the
pretty-print/reparse check, allowing them to be properly compared
against the reparsed tokenstream.
Unfortunately, the recursive logic in that PR only handles the case
where a nonterminal appears inside a `TokenTree::Delimited`. If a
nonterminal appears directly in the expanded tokens of another
nonterminal, the inner nonterminal will not be expanded.
This PR fixes the recursive expansion of nonterminals, ensuring that
they are expanded wherever they occur.
Properly encode spans with a dummy location and non-root `SyntaxContext`
Previously, we would throw away the `SyntaxContext` of any span with a
dummy location during metadata encoding. This commit makes metadata Span
encoding consistent with incr-cache Span encoding - an 'invalid span'
tag is only used when it doesn't lose any information.
Ignore `|` and `+` tokens during proc-macro pretty-print check
Fixes#76182
This is an alternative to PR #76188
These tokens are not preserved in the AST in certain cases
(e.g. a leading `|` in a pattern or a trailing `+` in a trait bound).
This PR ignores them entirely during the pretty-print/reparse check
to avoid spuriously using the re-parsed tokenstream.
Previously, we would throw away the `SyntaxContext` of any span with a
dummy location during metadata encoding. This commit makes metadata Span
encoding consistent with incr-cache Span encoding - an 'invalid span'
tag is only used when it doesn't lose any information.
Fixes#76182
This is an alternative to PR #76188
These tokens are not preserved in the AST in certain cases
(e.g. a leading `|` in a pattern or a trailing `+` in a trait bound).
This PR ignores them entirely during the pretty-print/reparse check
to avoid spuriously using the re-parsed tokenstream.
Issue #74616 tracks a backwards-compatibility hack for certain macros.
This has is implemented by hard-coding the filenames and macro names of
certain code that we want to continue to compile.
However, the initial implementation of the hack was based on the
directory structure when building the crate from its repository (e.g.
`js-sys/src/lib.rs`). When the crate is build as a dependency, it will
include a version number from the clone from the cargo registry (e.g.
`js-sys-0.3.17/src/lib.rs`), which would fail the check.
This commit modifies the backwards-compatibility hack to check that
desired crate name (`js-sys` or `time-macros-impl`) is a prefix of the
proper part of the path.
See https://github.com/rust-lang/rust/issues/76070#issuecomment-687215646
for more details.
If a symbol name can only be imported from one place for a type, and
as long as it was not glob-imported anywhere in the current crate, we
can trim its printed path and print only the name.
This has wide implications on error messages with types, for example,
shortening `std::vec::Vec` to just `Vec`, as long as there is no other
`Vec` importable anywhere.
This adds a new '-Z trim-diagnostic-paths=false' option to control this
feature.
On the good path, with no diagnosis printed, we should try to avoid
issuing this query, so we need to prevent trimmed_def_paths query on
several cases.
This change also relies on a previous commit that differentiates
between `Debug` and `Display` on various rustc types, where the latter
is trimmed and presented to the user and the former is not.
Run cfg-stripping on generic parameters before invoking derive macros
Fixes#75930
This changes the tokens seen by a proc-macro. However, ising a `#[cfg]` attribute
on a generic paramter is unusual, and combining it with a proc-macro
derive is probably even more unusual. I don't expect this to cause any
breakage.
Fixes#75050
Previously, we would unconditionally suppress the panic hook during
proc-macro execution. This commit adds a new flag
-Z proc-macro-backtrace, which allows running the panic hook for
easier debugging.
Fixes#75930
This changes the tokens seen by a proc-macro. However, ising a `#[cfg]` attribute
on a generic paramter is unusual, and combining it with a proc-macro
derive is probably even more unusual. I don't expect this to cause any
breakage.
Use smaller def span for functions
Currently, the def span of a function encompasses the entire function
signature and body. However, this is usually unnecessarily verbose - when we are
pointing at an entire function in a diagnostic, we almost always want to
point at the signature. The actual contents of the body tends to be
irrelevant to the diagnostic we are emitting, and just takes up
additional screen space.
This commit changes the `def_span` of all function items (freestanding
functions, `impl`-block methods, and `trait`-block methods) to be the
span of the signature. For example, the function
```rust
pub fn foo<T>(val: T) -> T { val }
```
now has a `def_span` corresponding to `pub fn foo<T>(val: T) -> T`
(everything before the opening curly brace).
Trait methods without a body have a `def_span` which includes the
trailing semicolon. For example:
```rust
trait Foo {
fn bar();
}
```
the function definition `Foo::bar` has a `def_span` of `fn bar();`
This makes our diagnostic output much shorter, and emphasizes
information that is relevant to whatever diagnostic we are reporting.
We continue to use the full span (including the body) in a few of
places:
* MIR building uses the full span when building source scopes.
* 'Outlives suggestions' use the full span to sort the diagnostics being
emitted.
* The `#[rustc_on_unimplemented(enclosing_scope="in this scope")]`
attribute points the entire scope body.
All of these cases work only with local items, so we don't need to
add anything extra to crate metadata.
Currently, the def span of a funtion encompasses the entire function
signature and body. However, this is usually unnecessarily verbose - when we are
pointing at an entire function in a diagnostic, we almost always want to
point at the signature. The actual contents of the body tends to be
irrelevant to the diagnostic we are emitting, and just takes up
additional screen space.
This commit changes the `def_span` of all function items (freestanding
functions, `impl`-block methods, and `trait`-block methods) to be the
span of the signature. For example, the function
```rust
pub fn foo<T>(val: T) -> T { val }
```
now has a `def_span` corresponding to `pub fn foo<T>(val: T) -> T`
(everything before the opening curly brace).
Trait methods without a body have a `def_span` which includes the
trailing semicolon. For example:
```rust
trait Foo {
fn bar();
}```
the function definition `Foo::bar` has a `def_span` of `fn bar();`
This makes our diagnostic output much shorter, and emphasizes
information that is relevant to whatever diagnostic we are reporting.
We continue to use the full span (including the body) in a few of
places:
* MIR building uses the full span when building source scopes.
* 'Outlives suggestions' use the full span to sort the diagnostics being
emitted.
* The `#[rustc_on_unimplemented(enclosing_scope="in this scope")]`
attribute points the entire scope body.
* The 'unconditional recursion' lint uses the full span to show
additional context for the recursive call.
All of these cases work only with local items, so we don't need to
add anything extra to crate metadata.