Commit graph

442 commits

Author SHA1 Message Date
Nicholas Nethercote
e2631208b1 Rename StringReader::curr as ch.
Likewise, rename StringReader::curr_is as ch_is.

This is a [breaking-change] for libsyntax.
2016-10-05 08:57:35 +11:00
Nicholas Nethercote
cb92f5c6d6 Rename StringReader::last_pos as pos.
This is a [breaking-change] for libsyntax.
2016-10-05 08:56:57 +11:00
Nicholas Nethercote
94565a4409 Rename StringReader::pos as next_pos.
This is a [breaking-change] for libsyntax.
2016-10-05 08:55:25 +11:00
Nicholas Nethercote
9e3dcb4549 Simplify start_bpos calculation in scan_comment().
The two branches of this `if` compute the same value. This commit gets
rid of the first branch, which makes this calculation identical to the
one in scan_block_comment().
2016-10-03 19:02:33 +11:00
Nicholas Nethercote
49960ad250 Streamline StringReader::bump.
First, assert! is redundant w.r.t. the unwrap() immediately afterwards.

Second, `byte_offset_diff` is effectively computed as
`current_byte_offset + ch.len_utf8() - current_byte_offset` (with `next`
as an intermediate) which is silly and can be simplified.
2016-10-03 18:57:18 +11:00
Nick Cameron
3863834d9c reviewer comments and rebasing 2016-09-23 07:19:31 +12:00
Nick Cameron
6a2d2c9495 Adds a ProcMacro form of syntax extension
This commit adds syntax extension forms matching the types for procedural macros 2.0 (RFC #1566), these still require the usual syntax extension boiler plate, but this is a first step towards proper implementation and should be useful for macros 1.1 stuff too.

Supports both attribute-like and function-like macros.
2016-09-22 08:47:57 +12:00
Jonathan Turner
c350ec7bb3 Fix old call in lexer tests 2016-08-07 07:50:27 -07:00
Jonathan Turner
8f044fae36 Remove BasicEmitter 2016-07-14 07:57:46 -04:00
Zack M. Davis
d37edef9dd prefer if let to match with None => {} arm in some places
This is a spiritual succesor to #34268/8531d581, in which we replaced a
number of matches of None to the unit value with `if let` conditionals
where it was judged that this made for clearer/simpler code (as would be
recommended by Manishearth/rust-clippy's `single_match` lint). The same
rationale applies to matches of None to the empty block.
2016-07-03 16:27:02 -07:00
Jonathan Turner
80f1c78752 make old school mode a bit more configurable 2016-06-23 15:19:40 -04:00
Jonathan Turner
51deb4fedb Address more travis errors 2016-06-23 08:07:35 -04:00
Jonathan Turner
6ae3502134 Move errors from libsyntax to its own crate 2016-06-23 08:07:35 -04:00
Ted Mielczarek
24e7491660 Add an abs_path member to FileMap, use it when writing debug info.
When items are inlined from extern crates, the filename in the debug info
is taken from the FileMap that's serialized in the rlib metadata.
Currently this is just FileMap.name, which is whatever path is passed to rustc.
Since libcore and libstd are built by invoking rustc with relative paths,
they wind up with relative paths in the rlib, and when linked into a binary
the debug info uses relative paths for the names, but since the compilation
directory for the final binary, tools trying to read source filenames
will wind up with bad paths. We noticed this in Firefox with source
filenames from libcore/libstd having bad paths.

This change stores an absolute path in FileMap.abs_path, and uses that
if available for writing debug info. This is not going to magically make
debuggers able to find the source, but it will at least provide sensible
paths.
2016-06-16 18:08:46 +01:00
bors
413bafdabf Auto merge of #33128 - xen0n:more-confusing-unicode-chars, r=nagisa
Add more aliases for Unicode confusable chars

Building upon #29837, this PR:

* added aliases for space characters,
* distinguished square brackets from parens, and
* added common CJK punctuation characters as aliases.

This will especially help CJK users who may have forgotten to switch off IME when coding.
2016-05-05 08:50:23 -07:00
Georg Brandl
9e2300015b lexer: do not display char confusingly in error message
Current code leads to messages like "... use a \xHH escape: \u{e4}"
which is confusing.

The printed span already points to the offending character, which
should be enough to identify the non-ASCII problem.

Fixes: #29088
2016-05-02 07:45:57 +02:00
mitaa
6887202ea3 Make some fatal lexer errors recoverable 2016-04-27 20:48:18 +02:00
Vadim Petrochenkov
6c44bea644 syntax: Check paths in visibilities for type parameters
syntax: Merge PathParsingMode::NoTypesAllowed and PathParsingMode::ImportPrefix
syntax: Rename PathParsingMode and its variants to better express their purpose
syntax: Remove obsolete error message about 'self lifetime
syntax: Remove ALLOW_MODULE_PATHS workaround
syntax/resolve: Adjust some error messages
resolve: Compare unhygienic (not renamed) names with keywords::Invalid, invalid identifiers may appear to be valid after renaming
2016-04-24 20:59:44 +03:00
Vadim Petrochenkov
546c052d22 syntax: Get rid of token::IdentStyle 2016-04-24 20:59:44 +03:00
Wang Xuerui
496081c5c7 add more confusable CJK square bracket aliases 2016-04-21 20:17:51 +08:00
Wang Xuerui
47d5c90fbe correct aliases for square brackets 2016-04-21 20:09:26 +08:00
Wang Xuerui
5f70e8f6cd add confusable space characters 2016-04-21 20:05:47 +08:00
Wang Xuerui
dbe700ef65 add more characters easily inputtable with CJK IMEs 2016-04-21 17:51:47 +08:00
bors
8b7c3f20e8 Auto merge of #29734 - Ryman:whitespace_consistency, r=Aatch
libsyntax: be more accepting of whitespace in lexer

Fixes #29590.

Perhaps this may need more thorough testing?

r? @Aatch
2016-03-07 20:06:17 -08:00
Kevin Butler
24578e0fe5 libsyntax: accept only whitespace with the PATTERN_WHITE_SPACE property
This aligns with unicode recommendations and should be stable for all future
unicode releases. See http://unicode.org/reports/tr31/#R3.

This renames `libsyntax::lexer::is_whitespace` to `is_pattern_whitespace`
so potentially breaks users of libsyntax.
2016-01-16 00:57:12 +00:00
Kevin Butler
8a27230102 libsyntax: use char::is_whitespace instead of custom implementations
Fixes #29590.
2016-01-14 22:47:50 +00:00
Greg Chapple
acc9428c6a Display better snippet for invalid char literal
Given this code:

    fn main() {
        let _ = 'abcd';
    }

The compiler would give a message like:

    error: character literal may only contain one codepoint: ';
    let _ = 'abcd';
                 ^~

With this change, the message now displays:

    error: character literal may only contain one codepoint: 'abcd'
    let _ = 'abcd'
            ^~~~~~

Fixes #30033
2016-01-14 17:34:42 +00:00
Tshepang Lekhonkhobe
aa3b4c668e re-instate comment that was mysteriously disappeared 2016-01-12 21:00:09 +02:00
Tshepang Lekhonkhobe
249b5c0b4a address review comment 2016-01-04 21:35:06 +02:00
Tshepang Lekhonkhobe
4a1062873e fix "make tidy" failure 2016-01-03 11:20:06 +02:00
Tshepang Lekhonkhobe
f20a139981 run rustfmt on syntax::parse::lexer 2016-01-03 11:14:09 +02:00
Nick Cameron
95dc7efad0 use structured errors 2015-12-30 14:27:59 +13:00
Nick Cameron
ff0c74f7d4 test errors 2015-12-17 10:00:16 +13:00
Nick Cameron
6309b0f5bb move error handling from libsyntax/diagnostics.rs to libsyntax/errors/*
Also split out emitters into their own module.
2015-12-17 09:35:50 +13:00
Huon Wilson
41f7f0c341 Add some unicode aliases for ". 2015-11-18 17:16:32 +11:00
Ravi Shankar
7f63c7cf4c Detect confusing unicode characters and show the alternative 2015-11-17 12:14:28 +05:30
Steve Klabnik
00e9ad1df8 Improve error message for char literals
If you try to put something that's bigger than a char into a char
literal, you get an error:

    fn main() {
        let c = 'ஶ்ரீ';
    }

    error: unterminated character constant:

This is a very compiler-centric message. Yes, it's technically
'unterminated', but that's not what you, the user did wrong.

Instead, this commit changes it to

    error: character literal may only contain one codepoint

As this actually tells you what went wrong.

Fixes #28851
2015-11-05 09:34:14 +01:00
Eli Friedman
329e487e58 Start pushing panics outward in lexer. 2015-10-27 20:09:10 -07:00
Barosl Lee
c7fa52df34 Prevent /**/ from being parsed as a doc comment
Previously, `/**/` was incorrectly regarded as a doc comment because it
starts with `/**` and ends with `*/`. However, this caused an ICE
because some code assumed that the length of a doc comment is at least
5. This commit adds an additional check to `is_block_doc_comment` that
tests the length of the input.

Fixes #28844.
2015-10-10 04:49:31 +09:00
Ms2ger
b093060c2a Stop re-exporting AttrStyle's variants and rename them. 2015-10-01 18:03:34 +02:00
Alex Crichton
48615a68fb std: Account for CRLF in {str, BufRead}::lines
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of
the `str::lines` and `BufRead::lines` iterators. Both iterators now account for
`\r\n` sequences in addition to `\n`, allowing for less surprising behavior
across platforms (especially in the `BufRead` case). Splitting *only* on the
`\n` character can still be achieved with `split('\n')` in both cases.

The `str::lines_any` function is also now deprecated as `str::lines` is a
drop-in replacement for it.

[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md

Closes #28032
2015-09-03 23:01:41 -07:00
Vadim Petrochenkov
405c616eaf Use consistent terminology for byte string literals
Avoid confusion with binary integer literals and binary operator expressions in libsyntax
2015-09-03 10:54:53 +03:00
Simonas Kazlauskas
cca0ea718d Replace illegal with invalid in most diagnostics 2015-07-29 01:59:31 +03:00
Nick Cameron
f47d20aecd Use a span from the correct file for the inner span of a module
This basically only affects modules which are empty (or only contain comments).

Closes #26755
2015-07-21 21:55:19 +12:00
bors
07be6299d8 Auto merge of #26947 - nagisa:unicode-escape-error, r=nrc
Inspired by the now-mysteriously-closed https://github.com/rust-lang/rust/pull/26782.

This PR introduces better error messages when unicode escapes have invalid format (e.g. `\uFFFF`). It also makes rustc always tell the user that escape may not be used in byte-strings and bytes and fixes some spans to not include unecessary characters and include escape backslash in some others.
2015-07-13 04:00:49 +00:00
Simonas Kazlauskas
4d65ef4549 Tell unicode escapes can’t be used as bytes earlier/more 2015-07-13 02:09:22 +03:00
Wesley Wiser
93ddee6cee Change some instances of .connect() to .join() 2015-07-10 19:40:46 -04:00
Simonas Kazlauskas
d22f189da1 Improve some of the string escape diagnostic spans 2015-07-10 22:28:51 +03:00
Simonas Kazlauskas
0bd5dd6449 Improve incomplete unicode escape reporting
This improves diagnostic messages when \u escape is used incorrectly and { is
missing. Instead of saying “unknown character escape: u”, it will now report
that unicode escape sequence is incomplete and suggest what the correct syntax
is.
2015-07-10 22:26:19 +03:00
Ariel Ben-Yehuda
a18d9842ed Make the unused_mut lint smarter with respect to locals.
Fixes #26332
2015-07-01 00:12:12 +03:00