Commit graph

3099 commits

Author SHA1 Message Date
Vadim Petrochenkov
6a4e0b3fae parser: Do not override syntactic context for dummy spans 2018-04-22 04:33:30 +03:00
bors
85f5dd489e Auto merge of #50052 - nnethercote:char_lit, r=Mark-Simulacrum
Avoid allocating when parsing \u{...} literals.

`char_lit` uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.

This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.

rustc-perf results, on a stage 2 build with jemalloc disabled:

<details>

```
regex-check
	avg: -5.4%	min: -6.5%	max: -2.7%
futures-check
	avg: -3.5%	min: -5.3%	max: -1.7%
regex-opt
	avg: -2.0%	min: -5.1%	max: -0.2%
regex
	avg: -2.3%	min: -5.0%	max: -0.6%
futures-opt
	avg: -3.0%	min: -4.8%	max: -1.1%
futures
	avg: -3.1%	min: -4.8%	max: -1.3%
clap-rs-check
	avg: -1.8%	min: -3.5%	max: -0.9%
coercions-check
	avg: -2.0%	min: -3.3%	max: -1.0%
hyper-check
	avg: -2.2%	min: -3.1%	max: -1.3%
hyper
	avg: -1.3%	min: -2.4%	max: -0.3%
hyper-opt
	avg: -0.9%	min: -2.3%	max: -0.1%
coercions
	avg: -1.1%	min: -2.2%	max: -0.4%
encoding-check
	avg: -1.7%	min: -2.2%	max: -0.9%
clap-rs-opt
	avg: -0.7%	min: -2.2%	max: 0.0%
coercions-opt
	avg: -1.2%	min: -2.1%	max: -0.3%
clap-rs
	avg: -0.8%	min: -1.9%	max: -0.4%
encoding-opt
	avg: -1.0%	min: -1.9%	max: -0.3%
encoding
	avg: -1.1%	min: -1.9%	max: -0.4%
piston-image-check
	avg: -0.7%	min: -1.3%	max: -0.3%
inflate-opt
	avg: -0.3%	min: -0.9%	max: -0.0%
piston-image
	avg: -0.3%	min: -0.8%	max: -0.1%
piston-image-opt
	avg: -0.3%	min: -0.7%	max: -0.1%
syn-check
	avg: -0.3%	min: -0.6%	max: -0.1%
deep-vector
	avg: 0.1%	min: -0.1%	max: 0.5%
syn-opt
	avg: -0.1%	min: -0.4%	max: 0.0%
html5ever
	avg: -0.2%	min: -0.4%	max: -0.0%
deep-vector-check
	avg: 0.0%	min: -0.3%	max: 0.3%
syn
	avg: -0.2%	min: -0.3%	max: -0.1%
html5ever-check
	avg: -0.3%	min: -0.3%	max: -0.2%
issue-46449-check
	avg: -0.1%	min: -0.2%	max: 0.2%
html5ever-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
deep-vector-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
issue-46449-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
unify-linearly-check
	avg: -0.0%	min: -0.2%	max: 0.1%
helloworld-check
	avg: 0.0%	min: -0.0%	max: 0.2%
parser-check
	avg: -0.0%	min: -0.2%	max: 0.0%
inflate
	avg: 0.0%	min: -0.0%	max: 0.1%
tokio-webpush-simple-check
	avg: -0.1%	min: -0.1%	max: -0.0%
regression-31157-check
	avg: 0.0%	min: -0.1%	max: 0.1%
issue-46449
	avg: 0.0%	min: -0.1%	max: 0.1%
tuple-stress-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
tuple-stress-check
	avg: -0.0%	min: -0.1%	max: 0.1%
tuple-stress
	avg: 0.0%	min: -0.0%	max: 0.1%
deeply-nested-check
	avg: 0.0%	min: -0.0%	max: 0.1%
regression-31157
	avg: -0.0%	min: -0.1%	max: 0.1%
deeply-nested-opt
	avg: -0.0%	min: -0.1%	max: 0.1%
parser-opt
	avg: -0.0%	min: -0.1%	max: 0.0%
parser
	avg: 0.1%	min: 0.0%	max: 0.1%
tokio-webpush-simple
	avg: -0.0%	min: -0.1%	max: 0.1%
regression-31157-opt
	avg: -0.0%	min: -0.1%	max: 0.1%
helloworld-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
unify-linearly-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
unused-warnings-check
	avg: 0.0%	min: 0.0%	max: 0.1%
tokio-webpush-simple-opt
	avg: -0.0%	min: -0.1%	max: 0.0%
helloworld
	avg: -0.0%	min: -0.0%	max: 0.1%
unused-warnings
	avg: 0.0%	min: -0.0%	max: 0.0%
deeply-nested
	avg: -0.0%	min: -0.0%	max: -0.0%
unused-warnings-opt
	avg: 0.0%	min: -0.0%	max: 0.0%
unify-linearly
	avg: 0.0%	min: -0.0%	max: 0.0%
inflate-check
	avg: 0.0%	min: -0.0%	max: 0.0%
```

</details>
2018-04-20 10:40:25 +00:00
Alex Crichton
e9348738fc proc_macro: Stay on the "use the cache" path more
Discovered in #50061 we're falling off the "happy path" of using a stringified
token stream more often than we should. This was due to the fact that a
user-written token like `0xf` is equality-different from the stringified token
of `15` (despite being semantically equivalent).

This patch updates the call to `eq_unspanned` with an even more awful solution,
`probably_equal_for_proc_macro`, which ignores the value of each token and
basically only compares the structure of the token stream, assuming that the AST
doesn't change just one token at a time.

While this is a step towards fixing #50061 there is still one regression
from #49154 which needs to be fixed.
2018-04-18 19:36:48 -07:00
Nicholas Nethercote
9f145022ef Avoid allocating when parsing \u{...} literals.
`char_lit` uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.

This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.
2018-04-19 09:17:40 +10:00
bors
3dfda16525 Auto merge of #49993 - nnethercote:shrink-Token, r=alexcrichton
Change the hashcounts in raw `Lit` variants from usize to u16.

This reduces the size of `Token` from 32 bytes to 24 bytes on 64-bit
platforms.
2018-04-18 14:44:54 +00:00
kennytm
95b7e6fe92
Rollup merge of #49852 - alexcrichton:fix-more-proc-macros, r=nrc
proc_macro: Avoid cached TokenStream more often

This commit adds even more pessimization to use the cached `TokenStream` inside
of an AST node. As a reminder the `proc_macro` API requires taking an arbitrary
AST node and transforming it back into a `TokenStream` to hand off to a
procedural macro. Such functionality isn't actually implemented in rustc today,
so the way `proc_macro` works today is that it stringifies an AST node and then
reparses for a list of tokens.

This strategy unfortunately loses all span information, so we try to avoid it
whenever possible. Implemented in #43230 some AST nodes have a `TokenStream`
cache representing the tokens they were originally parsed from. This
`TokenStream` cache, however, has turned out to not always reflect the current
state of the item when it's being tokenized. For example `#[cfg]` processing or
macro expansion could modify the state of an item. Consequently we've seen a
number of bugs (#48644 and #49846) related to using this stale cache.

This commit tweaks the usage of the cached `TokenStream` to compare it to our
lossy stringification of the token stream. If the tokens that make up the cache
and the stringified token stream are the same then we return the cached version
(which has correct span information). If they differ, however, then we will
return the stringified version as the cache has been invalidated and we just
haven't figured that out.

Closes #48644
Closes #49846
2018-04-14 15:21:19 +08:00
Vadim Petrochenkov
7e1f73beb6 macros: Do not match on "complex" nonterminals requiring AST comparisons 2018-04-14 02:28:39 +03:00
Vadim Petrochenkov
4f69b7fb85 Avoid comparing fields by name when possible
Resolve them into field indices once and then use those resolutions

+ Fix rebase
2018-04-12 23:06:03 +03:00
Vadim Petrochenkov
44acea4d88 AST/HIR: Merge field access expressions for named and numeric fields 2018-04-12 23:02:09 +03:00
Nicholas Nethercote
4d34bfd00a Change the hashcounts in raw Lit variants from usize to u16.
This reduces the size of `Token` from 32 bytes to 24 bytes on 64-bit
platforms.
2018-04-12 20:12:42 +10:00
bors
d26f9e42df Auto merge of #49698 - SimonSapin:unicode-for-everyone, r=alexcrichton
Merge the std_unicode crate into the core crate

[The standard library facade](https://github.com/rust-lang/rust/issues/27783) has historically contained a number of crates with different roles, but that number has decreased over time. `rand` and `libc` have moved to crates.io, and [`collections` was merged into `alloc`](https://github.com/rust-lang/rust/pull/42648). Today we have `core` that applies everywhere, `std` that expects a full operating system, and `alloc` in-between that only requires a memory allocator (which can be provided by users)… and `std_unicode`, which doesn’t really have a reason to be separate anymore. It contains functionality based on Unicode data tables that can be large, but as long as relevant functions are not called the tables should be removed from binaries by linkers.

This deprecates the unstable `std_unicode` crate and moves all of its contents into `core`, replacing them with `pub use` reexports. The crate can be removed later. This also removes the `CharExt` trait (replaced with inherent methods in libcore) and `UnicodeStr` trait (merged into `StrExt`). There traits were both unstable and not intended to be used or named directly.

A number of new items are newly-available in libcore and instantly stable there, but only if they were already stable in libstd.

Fixes #49319.
2018-04-12 00:35:33 +00:00
Simon Sapin
b2027ef17c Deprecate the std_unicode crate 2018-04-12 00:13:51 +02:00
kennytm
77777b4528
Rollup merge of #49525 - varkor:sort_by_cached_key-conversion, r=scottmcm
Use sort_by_cached_key where appropriate

A follow-up to https://github.com/rust-lang/rust/pull/48639, converting various slice sorting calls to `sort_by_cached_key` when the key functions are more expensive.
2018-04-11 19:56:41 +08:00
Alex Crichton
6d7cfd4f1a proc_macro: Avoid cached TokenStream more often
This commit adds even more pessimization to use the cached `TokenStream` inside
of an AST node. As a reminder the `proc_macro` API requires taking an arbitrary
AST node and transforming it back into a `TokenStream` to hand off to a
procedural macro. Such functionality isn't actually implemented in rustc today,
so the way `proc_macro` works today is that it stringifies an AST node and then
reparses for a list of tokens.

This strategy unfortunately loses all span information, so we try to avoid it
whenever possible. Implemented in #43230 some AST nodes have a `TokenStream`
cache representing the tokens they were originally parsed from. This
`TokenStream` cache, however, has turned out to not always reflect the current
state of the item when it's being tokenized. For example `#[cfg]` processing or
macro expansion could modify the state of an item. Consequently we've seen a
number of bugs (#48644 and #49846) related to using this stale cache.

This commit tweaks the usage of the cached `TokenStream` to compare it to our
lossy stringification of the token stream. If the tokens that make up the cache
and the stringified token stream are the same then we return the cached version
(which has correct span information). If they differ, however, then we will
return the stringified version as the cache has been invalidated and we just
haven't figured that out.

Closes #48644
Closes #49846
2018-04-10 13:05:13 -07:00
bors
67712d7945 Auto merge of #49390 - Zoxc:sync-syntax, r=michaelwoerister
More thread-safety changes

r? @michaelwoerister
2018-04-10 09:00:27 +00:00
Zack M. Davis
ba0dd8eb02 in which ! is suggested for erroneous identifier not
Impressing confused Python users with magical diagnostics is perhaps
worth this not-grossly-unreasonable (only 40ish lines) extra complexity
in the parser?

Thanks to Vadim Petrochenkov for guidance.

This resolves #46836.
2018-04-09 08:45:12 -07:00
Zack M. Davis
944c401736 don't suggest placing code in block if next token is open-brace
Thanks to the inestimably inimitable Esteban "Estebank" Küber for
pointing this out.

This is relevant to #46836.
2018-04-09 08:45:12 -07:00
varkor
57eedbaaf8 Convert sort_by to sort_by_cached_key 2018-04-09 16:44:19 +01:00
Vadim Petrochenkov
3a30bad6de Use Ident instead of Name in MetaItem 2018-04-06 11:52:16 +03:00
Vadim Petrochenkov
bfaf4180ae Make lifetime nonterminals closer to identifier nonterminals 2018-04-06 11:52:16 +03:00
Vadim Petrochenkov
b3b5ef186c Remove more duplicated spans 2018-04-06 11:50:49 +03:00
Vadim Petrochenkov
62000c072e Rename ast::Variant_::name into ident + Fix rebase 2018-04-06 11:48:19 +03:00
Vadim Petrochenkov
e2afefd80b Get rid of SpannedIdent 2018-04-06 11:48:19 +03:00
Vadim Petrochenkov
8719d1ed05 Rename PathSegment::identifier to ident 2018-04-06 11:46:26 +03:00
Vadim Petrochenkov
baae274fb7 Use Span instead of SyntaxContext in Ident 2018-04-06 11:46:26 +03:00
Alex Crichton
46492ffabd
Rollup merge of #49350 - abonander:macros-in-extern, r=petrochenkov
Expand macros in `extern {}` blocks

This permits macro and proc-macro and attribute invocations (the latter only with the `proc_macro` feature of course) in `extern {}` blocks, gated behind a new `macros_in_extern` feature.

A tracking issue is now open at #49476

closes #48747
2018-04-05 10:49:14 -05:00
Austin Bonander
5d74990ceb expand macro invocations in extern {} blocks 2018-04-03 13:16:11 -07:00
Aidan Hobson Sayers
9b5859aea1 Remove all unstable placement features
Closes #22181, #27779
2018-04-03 11:02:34 +02:00
bors
097efa9a99 Auto merge of #49124 - abonander:attr-macro-stmt-expr, r=abonander
Expand Attributes on Statements and Expressions

This enables attribute-macro expansion on statements and expressions while retaining the `stmt_expr_attributes` feature requirement for attributes on expressions.

closes #41475
cc #38356  @petrochenkov @jseyfried
r? @nrc
2018-04-02 10:38:28 +00:00
Austin Bonander
7c0124dd35 Expand attribute macros on statements and expressions.
Retains the `stmt_expr_attributes` feature requirement for attributes on expressions.

closes #41475
cc #38356
2018-04-02 01:56:12 -07:00
bors
d2235f20b5 Auto merge of #49478 - Phlosioneer:fix-windows-file-not-found, r=petrochenkov
Fix escaped backslash in windows file not found message

When a module is declared, but no matching file exists, rustc gives
an error like `help: name the file either foo.rs or foo/mod.rs inside
the directory "src/bar"`. However, at on windows, the backslash was
double-escaped when naming the directory.

It did this because the string was printed in debug mode (`"{:?}"`) to
surround it with quotes. However, it should just be printed like any
other directory in an error message and surrounded by escaped quotes,
rather than relying on the debug print to add quotes (`"\"{}\""`).

I also checked the test suite to see if this output is being correctly tested. It's not - it only tests up to the word "directory". Presumably this is so that the test is not dependent on its exact position in the source tree. I don't know a better way to test this, unless the test suite supports regex?
2018-04-01 12:54:02 +00:00
Phlosioneer
19eedf98ff Fix escaped backslash in windows file not found message
When a module is declared, but no matching file exists, rustc gives
an error like 'help: name the file either foo.rs or foo/mod.rs inside
the directory "src/bar"'. However, at on windows, the backslash was
double-escaped when naming the directory.

It did this because the string was printed in debug mode ( "{:?}" ) to
surround it with quotes. However, it should just be printed like any
other directory in an error message and surrounded by escaped quotes,
rather than relying on the debug print to add quotes ( "\"{}\"" ).
2018-03-29 07:15:58 -04:00
John Kåre Alsaker
c979189867 Make ParseSess thread-safe 2018-03-28 01:27:58 +02:00
John Kåre Alsaker
11ccc4cac6 Make LazyTokenStream thread-safe 2018-03-28 01:27:58 +02:00
kennytm
5eb4689d1f
Rollup merge of #49395 - petrochenkov:obsolete, r=alexcrichton
libsyntax: Remove obsolete.rs

This little piece of infra is obsolete (ha-ha) and is unlikely to be used in the future, even if new obsolete syntax appears.
2018-03-27 10:47:51 +02:00
Vadim Petrochenkov
604bbee84c libsyntax: Remove obsolete.rs 2018-03-27 00:45:28 +03:00
Vadim Petrochenkov
a637dd00c8 Fix pretty-printing for raw identifiers 2018-03-27 00:07:16 +03:00
Hidehito Yabuuchi
3bfed9e43f Better diagnostics for '..' pattern fragment not in the last position 2018-03-24 07:54:20 +09:00
Alex Crichton
82bb41bdab Merge branch 'master' of https://github.com/Lymia/rust into rollup 2018-03-23 10:16:40 -07:00
Lymia Aluysia
bfb94ac5f4
Clean up raw identifier handling when recovering tokens from AST. 2018-03-22 10:34:51 -05:00
kennytm
8d3f3f0cac
Rollup merge of #49117 - nivkner:fixme_fixup3, r=estebank
address some FIXME whose associated issues were marked as closed

part of #44366
2018-03-22 22:43:37 +08:00
bors
75af15ee6c Auto merge of #49190 - kennytm:rollup, r=kennytm
Rollup of 17 pull requests

- Successful merges: #46518, #48810, #48834, #48902, #49004, #49092, #49096, #49099, #49104, #49125, #49139, #49152, #49157, #49161, #49166, #49176, #49184
- Failed merges:
2018-03-20 10:18:34 +00:00
Vadim Petrochenkov
7c90189e13 Stabilize slice patterns without ..
Merge `feature(advanced_slice_patterns)` into `feature(slice_patterns)`
2018-03-20 02:27:40 +03:00
kennytm
73846dca7b
Rollup merge of #49104 - csmoe:semicolon_error, r=petrochenkov
improve error message of inner attribute syntax

Fixes #49040
2018-03-20 07:15:23 +08:00
Lymia Aluysia
5c3d6320de
Return a is_raw parameter from Token::ident rather than having separate methods. 2018-03-18 12:16:02 -05:00
Lymia Aluysia
d2e7953d13
Move raw_identifiers check to the lexer. 2018-03-18 11:21:38 -05:00
Lymia Aluysia
7d5c29b9ea
Feature gate raw identifiers. 2018-03-18 10:07:19 -05:00
Lymia Aluysia
fad1648e0f
Initial implementation of RFC 2151, Raw Identifiers 2018-03-18 10:07:19 -05:00
bors
5e3ecdce4e Auto merge of #48917 - petrochenkov:import, r=oli-obk
syntax: Make imports in AST closer to the source and cleanup their parsing

This is a continuation of https://github.com/rust-lang/rust/pull/45846 in some sense.
2018-03-18 01:50:52 +00:00
Vadim Petrochenkov
636357b09a Cleanup import parsing
Fix spans of root segments
2018-03-17 22:12:21 +03:00