libsyntax: accept only whitespace with the PATTERN_WHITE_SPACE property

This aligns with the Unicode recommendations and should remain stable across
all future Unicode releases. See http://unicode.org/reports/tr31/#R3.

This renames `libsyntax::lexer::is_whitespace` to `is_pattern_whitespace`,
so it potentially breaks users of libsyntax.
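
For illustration only, a predicate of this kind could look roughly like the sketch below. It is not the code from this commit: the function bodies and the `strip_pattern_whitespace` helper are hypothetical, and the code point list is the Pattern_White_Space set from PropList.txt written out by hand.

```rust
/// Hypothetical stand-in for a lexer's whitespace predicate.
/// Returns true only for the code points carrying the Unicode
/// Pattern_White_Space property (assumed list, written out by hand).
pub fn is_pattern_whitespace(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'..='\u{000D}' // tab, line feed, vertical tab, form feed, carriage return
            | '\u{0020}' // space
            | '\u{0085}' // next line
            | '\u{200E}' // left-to-right mark
            | '\u{200F}' // right-to-left mark
            | '\u{2028}' // line separator
            | '\u{2029}' // paragraph separator
    )
}

/// Hypothetical helper: drops every Pattern_White_Space character,
/// which makes it easy to see what a lexer would treat as separators.
pub fn strip_pattern_whitespace(s: &str) -> String {
    s.chars().filter(|c| !is_pattern_whitespace(*c)).collect()
}
```

Under this sketch, characters such as U+00A0 (no-break space) are not accepted as whitespace, because they have the White_Space property but not Pattern_White_Space.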
Kevin Butler 2015-11-12 02:43:43 +00:00
parent 9e3e43f3f6
commit 24578e0fe5
9 changed files with 57 additions and 36 deletions


@@ -9,10 +9,14 @@
// except according to those terms.
// Beware editing: it has numerous whitespace characters which are important
// Beware editing: it has numerous whitespace characters which are important.
// It contains one ranges from the 'PATTERN_WHITE_SPACE' property outlined in
// http://unicode.org/Public/UNIDATA/PropList.txt
//
// The characters in the first expression of the assertion can be generated
// from: "4\u{0C}+\n\t\r7\t*\u{20}2\u{85}/\u{200E}3\u{200F}*\u{2028}2\u{2029}"
pub fn main() {
assert_eq!(4 +  7 * 2
assert_eq!(4 +
/3*2, 4 + 7 * 2 / 3 * 2);
7 * 2…/3*2, 4 + 7 * 2 / 3 * 2);
}