diff --git a/doc/rust.md b/doc/rust.md
index ccdf43eb54a3..9b7e87c7c496 100644
--- a/doc/rust.md
+++ b/doc/rust.md
@@ -102,12 +102,12 @@ Productions](#special-unicode-productions).
 
 Some rules in the grammar -- notably [unary
-operators](#unary-operator-expressions), [binary
-operators](#binary-operator-expressions), [keywords](#keywords) and [reserved
-words](#reserved-words) -- are given in a simplified form: as a listing of a
-table of unquoted, printable whitespace-separated strings. These cases form a
-subset of the rules regarding the [token](#tokens) rule, and are assumed to be
-the result of a lexical-analysis phase feeding the parser, driven by a DFA,
-operating over the disjunction of all such string table entries.
+operators](#binary-operator-expressions), and [keywords](#keywords) --
+are given in a simplified form: as a listing of a table of unquoted,
+printable whitespace-separated strings. These cases form a subset of
+the rules regarding the [token](#tokens) rule, and are assumed to be
+the result of a lexical-analysis phase feeding the parser, driven by a
+DFA, operating over the disjunction of all such string table entries.
 
 When such a string enclosed in double-quotes (`"`) occurs inside the
 grammar, it is an implicit reference to a single member of such a string table
@@ -139,8 +139,7 @@ The `ident` production is any nonempty Unicode string of the following form:
 - The first character has property `XID_start`
 - The remaining characters have property `XID_continue`
 
-that does _not_ occur in the set of [keywords](#keywords) or [reserved
-words](#reserved-words).
+that does _not_ occur in the set of [keywords](#keywords).
 
 Note: `XID_start` and `XID_continue` as character properties cover the
 character ranges used to form the more familiar C and Java language-family
@@ -190,7 +189,7 @@ with any other legal whitespace element, such as a single space character.
 ## Tokens
 
 ~~~~~~~~ {.ebnf .gram}
-simple_token : keyword | reserved | unop | binop ;
+simple_token : keyword | unop | binop ;
 token : simple_token | ident | literal | symbol | whitespace token ;
 ~~~~~~~~
 
@@ -204,52 +203,32 @@ grammar as double-quoted strings. Other tokens have exact rules given.
 The keywords in [crate files](#crate-files) are the following strings:
 
 ~~~~~~~~ {.keyword}
-import export use mod dir
+import export use mod
 ~~~~~~~~
 
 The keywords in [source files](#source-files) are the following strings:
 
-*TODO* split these between type keywords and regular (value) keywords,
- and define two different `identifier` productions for the different
- contexts.
-
 ~~~~~~~~ {.keyword}
-alt any as assert
-be bind block bool break
-char check claim const cont
+alt assert
+be break
+check claim class const cont
 copy
 do
 else enum export
-f32 f64 fail false float fn for
-i16 i32 i64 i8 if iface impl import in int
+fail false fn for
+if iface impl import
 let log
 mod mutable
-native note
-of
-prove pure
+native
+pure
 resource ret
-self str syntax
 true type
-u16 u32 u64 u8 uint unchecked unsafe use
-vec
+unsafe use
 while
 ~~~~~~~~
 
 Any of these have special meaning in their respective grammars, and are
 excluded from the `ident` rule.
 
-### Reserved words
-
-The reserved words are the following strings:
-
-~~~~~~~~ {.reserved}
-m32 m64 m128
-f80 f16 f128
-class trait
-~~~~~~~~
-
-Any of these may have special meaning in future versions of the language, so
-are excluded from the `ident` rule.
-
 ### Literals
 
 A literal is an expression consisting of a single token, rather than a
@@ -389,10 +368,6 @@ literal.
 There are three floating-point suffixes: `f` (for the base `float`
 type), `f32`, and `f64` (the 32-bit and 64-bit floating point types).
 
-A set of suffixes are also reserved to accommodate literal support for
-types corresponding to reserved tokens. The reserved suffixes are `f16`,
-`f80`, `f128`, `m`, `m32`, `m64` and `m128`.
-
 Examples of floating-point literals of various forms:
 
 ~~~~
@@ -421,8 +396,7 @@ Symbols are a general class of printable [token](#tokens) that play structural
 roles in a variety of grammar productions. They are catalogued here for
 completeness as the set of remaining miscellaneous printable tokens that do not
 otherwise appear as [unary operators](#unary-operator-expressions), [binary
-operators](#binary-operator-expressions), [keywords](#keywords) or [reserved
-words](#reserved-words).
+operators](#binary-operator-expressions), or [keywords](#keywords).
 
 ## Paths
 
@@ -1431,9 +1405,9 @@ rec_expr : '{' ident ':' expr
 
 A _[record](#record-types) expression_ is one or more comma-separated
 name-value pairs enclosed by braces. A fieldname can be any identifier
-(including reserved words), and is separated from its value expression
-by a colon. To indicate that a field is mutable, the `mutable` keyword
-is written before its name.
+(including keywords), and is separated from its value expression by a
+colon. To indicate that a field is mutable, the `mutable` keyword is
+written before its name.
 
 ~~~~
 {x: 10f, y: 20f};
diff --git a/src/comp/syntax/parse/parser.rs b/src/comp/syntax/parse/parser.rs
index 65611c2f30f0..33b6432a09cb 100644
--- a/src/comp/syntax/parse/parser.rs
+++ b/src/comp/syntax/parse/parser.rs
@@ -147,12 +147,12 @@ fn new_parser(sess: parse_sess, cfg: ast::crate_cfg, rdr: reader,
 // interpreted as a specific kind of statement, which would be confusing.
 fn bad_expr_word_table() -> hashmap {
     let words = new_str_hash();
-    for word in ["mod", "if", "else", "while", "do", "alt", "for", "break",
-                 "cont", "ret", "be", "fail", "type", "resource", "check",
-                 "assert", "claim", "native", "fn", "pure",
-                 "unsafe", "import", "export", "let", "const",
-                 "log", "copy", "impl", "iface", "enum",
-                 "class", "trait"] {
+    for word in ["alt", "assert", "be", "break", "check", "claim",
+                 "class", "const", "cont", "copy", "do", "else", "enum",
+                 "export", "fail", "fn", "for", "if", "iface", "impl",
+                 "import", "let", "log", "mod", "mutable", "native", "pure",
+                 "resource", "ret", "trait", "type", "unchecked", "unsafe",
+                 "while"] {
         words.insert(word, ());
     }
     words