Commit graph

170 commits

Author SHA1 Message Date
Vadim Petrochenkov
546c052d22 syntax: Get rid of token::IdentStyle 2016-04-24 20:59:44 +03:00
bors
8b7c3f20e8 Auto merge of #29734 - Ryman:whitespace_consistency, r=Aatch
libsyntax: be more accepting of whitespace in lexer

Fixes #29590.

Perhaps this may need more thorough testing?

r? @Aatch
2016-03-07 20:06:17 -08:00
Kevin Butler
24578e0fe5 libsyntax: accept only whitespace with the PATTERN_WHITE_SPACE property
This aligns with unicode recommendations and should be stable for all future
unicode releases. See http://unicode.org/reports/tr31/#R3.

This renames `libsyntax::lexer::is_whitespace` to `is_pattern_whitespace`
so potentially breaks users of libsyntax.
2016-01-16 00:57:12 +00:00
Kevin Butler
8a27230102 libsyntax: use char::is_whitespace instead of custom implementations
Fixes #29590.
2016-01-14 22:47:50 +00:00
Greg Chapple
acc9428c6a Display better snippet for invalid char literal
Given this code:

    fn main() {
        let _ = 'abcd';
    }

The compiler would give a message like:

    error: character literal may only contain one codepoint: ';
    let _ = 'abcd';
                 ^~

With this change, the message now displays:

    error: character literal may only contain one codepoint: 'abcd'
    let _ = 'abcd'
            ^~~~~~

Fixes #30033
2016-01-14 17:34:42 +00:00
Tshepang Lekhonkhobe
aa3b4c668e re-instate comment that was mysteriously disappeared 2016-01-12 21:00:09 +02:00
Tshepang Lekhonkhobe
249b5c0b4a address review comment 2016-01-04 21:35:06 +02:00
Tshepang Lekhonkhobe
4a1062873e fix "make tidy" failure 2016-01-03 11:20:06 +02:00
Tshepang Lekhonkhobe
f20a139981 run rustfmt on syntax::parse::lexer 2016-01-03 11:14:09 +02:00
Nick Cameron
95dc7efad0 use structured errors 2015-12-30 14:27:59 +13:00
Nick Cameron
ff0c74f7d4 test errors 2015-12-17 10:00:16 +13:00
Nick Cameron
6309b0f5bb move error handling from libsyntax/diagnostics.rs to libsyntax/errors/*
Also split out emitters into their own module.
2015-12-17 09:35:50 +13:00
Huon Wilson
41f7f0c341 Add some unicode aliases for ". 2015-11-18 17:16:32 +11:00
Ravi Shankar
7f63c7cf4c Detect confusing unicode characters and show the alternative 2015-11-17 12:14:28 +05:30
Steve Klabnik
00e9ad1df8 Improve error message for char literals
If you try to put something that's bigger than a char into a char
literal, you get an error:

    fn main() {
        let c = 'ஶ்ரீ';
    }

    error: unterminated character constant:

This is a very compiler-centric message. Yes, it's technically
'unterminated', but that's not what you, the user did wrong.

Instead, this commit changes it to

    error: character literal may only contain one codepoint

As this actually tells you what went wrong.

Fixes #28851
2015-11-05 09:34:14 +01:00
Eli Friedman
329e487e58 Start pushing panics outward in lexer. 2015-10-27 20:09:10 -07:00
Barosl Lee
c7fa52df34 Prevent /**/ from being parsed as a doc comment
Previously, `/**/` was incorrectly regarded as a doc comment because it
starts with `/**` and ends with `*/`. However, this caused an ICE
because some code assumed that the length of a doc comment is at least
5. This commit adds an additional check to `is_block_doc_comment` that
tests the length of the input.

Fixes #28844.
2015-10-10 04:49:31 +09:00
Ms2ger
b093060c2a Stop re-exporting AttrStyle's variants and rename them. 2015-10-01 18:03:34 +02:00
Alex Crichton
48615a68fb std: Account for CRLF in {str, BufRead}::lines
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of
the `str::lines` and `BufRead::lines` iterators. Both iterators now account for
`\r\n` sequences in addition to `\n`, allowing for less surprising behavior
across platforms (especially in the `BufRead` case). Splitting *only* on the
`\n` character can still be achieved with `split('\n')` in both cases.

The `str::lines_any` function is also now deprecated as `str::lines` is a
drop-in replacement for it.

[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md

Closes #28032
2015-09-03 23:01:41 -07:00
Vadim Petrochenkov
405c616eaf Use consistent terminology for byte string literals
Avoid confusion with binary integer literals and binary operator expressions in libsyntax
2015-09-03 10:54:53 +03:00
Simonas Kazlauskas
cca0ea718d Replace illegal with invalid in most diagnostics 2015-07-29 01:59:31 +03:00
Nick Cameron
f47d20aecd Use a span from the correct file for the inner span of a module
This basically only affects modules which are empty (or only contain comments).

Closes #26755
2015-07-21 21:55:19 +12:00
bors
07be6299d8 Auto merge of #26947 - nagisa:unicode-escape-error, r=nrc
Inspired by the now-mysteriously-closed https://github.com/rust-lang/rust/pull/26782.

This PR introduces better error messages when unicode escapes have invalid format (e.g. `\uFFFF`). It also makes rustc always tell the user that escape may not be used in byte-strings and bytes and fixes some spans to not include unecessary characters and include escape backslash in some others.
2015-07-13 04:00:49 +00:00
Simonas Kazlauskas
4d65ef4549 Tell unicode escapes can’t be used as bytes earlier/more 2015-07-13 02:09:22 +03:00
Wesley Wiser
93ddee6cee Change some instances of .connect() to .join() 2015-07-10 19:40:46 -04:00
Simonas Kazlauskas
d22f189da1 Improve some of the string escape diagnostic spans 2015-07-10 22:28:51 +03:00
Simonas Kazlauskas
0bd5dd6449 Improve incomplete unicode escape reporting
This improves diagnostic messages when \u escape is used incorrectly and { is
missing. Instead of saying “unknown character escape: u”, it will now report
that unicode escape sequence is incomplete and suggest what the correct syntax
is.
2015-07-10 22:26:19 +03:00
Ariel Ben-Yehuda
a18d9842ed Make the unused_mut lint smarter with respect to locals.
Fixes #26332
2015-07-01 00:12:12 +03:00
Yongqian Li
f21682ba2d fix minor indentation issues 2015-06-22 15:30:56 -07:00
bors
c23a9d42ea Auto merge of #25387 - eddyb:syn-file-loader, r=nikomatsakis
This allows compiling entire crates from memory or preprocessing source files before they are tokenized.

Minor API refactoring included, which is a [breaking-change] for libsyntax users:
* `ParseSess::{next_node_id, reserve_node_ids}` moved to rustc's `Session`
* `new_parse_sess` -> `ParseSess::new`
* `new_parse_sess_special_handler` -> `ParseSess::with_span_handler`
* `mk_span_handler` -> `SpanHandler::new`
* `default_handler` -> `Handler::new`
* `mk_handler` -> `Handler::with_emitter`
* `string_to_filemap(sess source, path)` -> `sess.codemap().new_filemap(path, source)`
2015-05-17 00:05:34 +00:00
Lee Jeffery
2dcc200be0 Fix stupid mistake from previous commit 2015-05-14 18:28:28 +01:00
Lee Jeffery
93af5f9b44 Make BytePos calculation same as original 2015-05-14 18:19:51 +01:00
Eduard Burtescu
f786437bd2 syntax: refactor (Span)Handler and ParseSess constructors to be methods. 2015-05-14 01:47:56 +03:00
Lee Jeffery
4f82c3151b Added test to check that newlines are stripped from comments 2015-05-13 22:06:26 +01:00
Lee Jeffery
aef0581513 Fix byte offset and error message inconsistencies 2015-05-13 22:05:01 +01:00
Lee Jeffery
a76244fcef Fix CRLF line-ending parsing for comments. 2015-05-08 20:33:58 +01:00
Geoffry Song
2d9831dea5 Interpolate AST nodes in quasiquote.
This changes the `ToTokens` implementations for expressions, statements,
etc. with almost-trivial ones that produce `Interpolated(*Nt(...))`
pseudo-tokens. In this way, quasiquote now works the same way as macros
do: already-parsed AST fragments are used as-is, not reparsed.

The `ToSource` trait is removed. Quasiquote no longer involves
pretty-printing at all, which removes the need for the
`encode_with_hygiene` hack. All associated machinery is removed.

A new `Nonterminal` is added, NtArm, which the parser now interpolates.
This is just for quasiquote, not macros (although it could be in the
future).

`ToTokens` is no longer implemented for `Arg` (although this could be
added again) and `Generics` (which I don't think makes sense).

This breaks any compiler extensions that relied on the ability of
`ToTokens` to turn AST fragments back into inspectable token trees. For
this reason, this closes #16987.

As such, this is a [breaking-change].

Fixes #16472.
Fixes #15962.
Fixes #17397.
Fixes #16617.
2015-04-25 21:42:10 -04:00
Johannes Oertel
07cc7d9960 Change name of unit test sub-module to "tests".
Changes the style guidelines regarding unit tests to recommend using a
sub-module named "tests" instead of "test" for unit tests as "test"
might clash with imports of libtest.
2015-04-24 23:06:41 +02:00
Erick Tryzelaar
19c8d70174 syntax: Copy unstable str::char_at into libsyntax 2015-04-21 10:23:53 -07:00
Erick Tryzelaar
2937cce70c syntax: Replace String::from_str with the stable String::from 2015-04-21 10:08:27 -07:00
Erick Tryzelaar
cfb9d286ea syntax: remove uses of .into_cow() 2015-04-21 10:08:26 -07:00
Tamir Duberstein
10f15e72e6 Negative case of len() -> is_empty()
`s/([^\(\s]+\.)len\(\) [(?:!=)>] 0/!$1is_empty()/g`
2015-04-14 20:26:03 -07:00
Manuel Hoffmann
4abade50d7 Added a help span which informs the user about the escaping of curly braces in a format string if a wrongly escaped one is detected in a string. 2015-04-13 15:56:10 +02:00
Phil Dawes
b2bcb7229a Work towards a non-panicing parser (libsyntax)
- Functions in parser.rs return PResult<> rather than panicing
- Other functions in libsyntax call panic! explicitly for now if they rely on panicing behaviour.
- 'panictry!' macro added as scaffolding while converting panicing functions.
  (This does the same as 'unwrap()' but is easier to grep for and turn into try!())
- Leaves panicing wrappers for the following functions so that the
  quote_* macros behave the same:
  - parse_expr, parse_item, parse_pat, parse_arm, parse_ty, parse_stmt
2015-04-05 09:52:50 +01:00
Alex Crichton
e3f2d45cb3 rollup merge of #23872: huonw/eager-lexing
Conflicts:
	src/libsyntax/parse/lexer/mod.rs
2015-03-31 15:53:10 -07:00
Aaron Turon
232424d995 Stabilize std::num
This commit stabilizes the `std::num` module:

* The `Int` and `Float` traits are deprecated in favor of (1) the
  newly-added inherent methods and (2) the generic traits available in
  rust-lang/num.

* The `Zero` and `One` traits are reintroduced in `std::num`, which
  together with various other traits allow you to recover the most
  common forms of generic programming.

* The `FromStrRadix` trait, and associated free function, is deprecated
  in favor of inherent implementations.

* A wide range of methods and constants for both integers and floating
  point numbers are now `#[stable]`, having been adjusted for integer
  guidelines.

* `is_positive` and `is_negative` are renamed to `is_sign_positive` and
  `is_sign_negative`, in order to address #22985

* The `Wrapping` type is moved to `std::num` and stabilized;
  `WrappingOps` is deprecated in favor of inherent methods on the
  integer types, and direct implementation of operations on
  `Wrapping<X>` for each concrete integer type `X`.

Closes #22985
Closes #21069

[breaking-change]
2015-03-31 07:50:25 -07:00
Huon Wilson
606f50c46d Lex binary and octal literals more eagerly.
Previously 0b12 was considered two tokens, 0b1 and 2, as 2 isn't a valid
base 2 digit. This patch changes that to collapse them into one (and
makes `0b12` etc. an error: 2 isn't a valid base 2 digit).

This may break some macro invocations of macros with `tt` (or syntax
extensions) that rely on adjacent digits being separate tokens and hence
is a

[breaking-change]

The fix is to separate the tokens, e.g. `0b12` -> `0b1 2`.

cc https://github.com/rust-lang/rfcs/pull/879
2015-03-31 12:16:42 +11:00
Manish Goregaokar
5eb4be4c56 Rollup merge of #23803 - richo:unused-braces, r=Manishearth
Pretty much what it says on the tin.
2015-03-28 18:12:06 +05:30
Richo Healey
cbce6bfbdb cleanup: Remove unused braces in use statements 2015-03-28 02:23:20 -07:00
Florian Hahn
afaa3b6a20 Prevent ICEs when parsing invalid escapes, closes #23620 2015-03-27 17:47:16 +01:00