Fix the start/end byte positions in the compiler JSON output
Track the changes made during normalization in the `SourceFile` and use this information to correct the `start_byte` and `end_byte` fields in the JSON output.
This should ensure the start/end byte fields can be used to index the original file, even if Rust normalized the source code for parsing purposes. Both CRLF to LF and BOM removal are handled with this one.
The rough plan was discussed with @matklad in rust-lang-nursery/rustfix#176 - although I ended up going with `u32` offset tracking so I wouldn't need to deal with `u32 + i32` arithmetics when applying the offset to the span byte positions.
Fixes#65029
Derive `Rustc{En,De}codable` for `TokenStream`.
`TokenStream` used to be a complex type, but it is now just a newtype
around a `Lrc<Vec<TreeAndJoint>>`. Currently it uses custom encoding
that discards the `IsJoint` and custom decoding that adds `NonJoint`
back in for every token tree. This requires building intermediate
`Vec<TokenTree>`s.
This commit makes `TokenStream` derive `Rustc{En,De}codable`. This
simplifies the code, and avoids the creation of the intermediate
vectors, saving up to 3% on various benchmarks. It also changes the AST
JSON output in one test.
r? @petrochenkov
Rollup of 14 pull requests
Successful merges:
- #64145 (Target-feature documented as unsafe)
- #65007 (Mention keyword closing policy)
- #65417 (Add more coherence tests)
- #65507 (Fix test style in unused parentheses lint test)
- #65591 (Add long error explanation for E0588)
- #65617 (Fix WASI sleep impl)
- #65656 (Add option to disable keyboard shortcuts in docs)
- #65678 (Add long error explanation for E0728)
- #65681 (Code cleanups following up on #65576.)
- #65686 (refactor and move `maybe_append` )
- #65688 (Add some tests for fixed ICEs)
- #65689 (bring back some Debug instances for Miri)
- #65695 (self-profiling: Remove module names from some event-ids in codegen backend.)
- #65706 (Add missing space in librustdoc)
Failed merges:
r? @ghost
`TokenStream` used to be a complex type, but it is now just a newtype
around a `Lrc<Vec<TreeAndJoint>>`. Currently it uses custom encoding
that discards the `IsJoint` and custom decoding that adds `NonJoint`
back in for every token tree. This requires building intermediate
`Vec<TokenTree>`s.
This commit makes `TokenStream` derive `Rustc{En,De}codable`. This
simplifies the code, and avoids the creation of the intermediate
vectors, saving up to 3% on various benchmarks. It also changes the AST
JSON output in one test.
Plugins deprecation: don’t suggest simply removing the attribute
Building Servo with a recent Nightly produces:
```rust
warning: use of deprecated attribute `plugin`: compiler plugins are deprecated. See https://github.com/rust-lang/rust/issues/29597
--> components/script/lib.rs:14:1
|
14 | #![plugin(script_plugins)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove this attribute
|
= note: `#[warn(deprecated)]` on by default
```
First, linking to https://github.com/rust-lang/rust/issues/29597 is not ideal since there is pretty much no discussion there of the deprecation and what can be used instead. This PR changes the link to the deprecation PR which does have more discussion.
Second, the “remove this attribute” suggestion is rather unhelpful. Just because a feature is deprecated doesn’t mean that simply removing its use without a replacement is acceptable.
In the case of custom lint, there is no replacement available. Prefixing a message with “help:” when telling users that they’re screwed honestly feels disrespectful.
This PR also changes the message to be more factual.
Avoid unnecessary `TokenTree` to `TokenStream` conversions
A `TokenStream` contains any number of `TokenTrees`. Therefore, a single `TokenTree` can be promoted to a `TokenStream`. But doing so costs two allocations: one for the single-element `Vec`, and one for the `Lrc`. (An `IsJoint` value also must be added; the default is `NonJoint`.)
The current code converts `TokenTree`s to `TokenStream`s unnecessarily in a few places. This PR removes some of these unnecessary conversions, both simplifying the code and speeding it up.
r? @petrochenkov
Likewise for `NestedMetaItem::tokens()`. Also, add
`MetaItemKind::token_trees_and_joints()`, which `MetaItemKind::tokens()`
now calls.
This avoids some unnecessary `TokenTree` to `TokenStream` conversions,
and removes the need for the clumsy
`TokenStream::append_to_tree_and_joint_vec()`.
The current code has this impl:
```
impl<T: Into<TokenStream>> iter::FromIterator<T> for TokenStream
```
If given an `IntoIterator<Item = TokenTree>`, it will convert each individual
`TokenTree` to a `TokenStream` (at the cost of two allocations: a `Vec`
and an `Lrc`). It will then merge those `TokenStream`s into a single
`TokenStream`. This is inefficient.
This commit changes the impl to this less general one:
```
impl iter::FromIterator<TokenTree> for TokenStream
```
It collects the `TokenTree`s into a single `Vec` first and then converts that
to a `TokenStream` by wrapping it in a single `Lrc`. The previous generality
was unnecessary; no other code needs changing.
This change speeds up several benchmarks by up to 4%.
Building Servo with a recent Nightly produces:
```rust
warning: use of deprecated attribute `plugin`: compiler plugins are deprecated. See https://github.com/rust-lang/rust/issues/29597
--> components/script/lib.rs:14:1
|
14 | #![plugin(script_plugins)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove this attribute
|
= note: `#[warn(deprecated)]` on by default
```
First, linking to https://github.com/rust-lang/rust/issues/29597 is not ideal
since there is pretty much no discussion there of the deprecation
and what can be used instead.
This PR changes the link to the deprecation PR which does have more discussion.
Second, the “remove this attribute” suggestion is rather unhelpful.
Just because a feature is deprecated doesn’t mean that simply removing its use
without a replacement is acceptable.
In the case of custom lint, there is no replacement available.
Prefixing a message with “help:” when telling users that they’re screwed
honestly feels disrespectful.
This PR also changes the message to be more factual.