diff --git a/doc/rust.texi b/doc/rust.texi deleted file mode 100644 index 68aaa414a692..000000000000 --- a/doc/rust.texi +++ /dev/null @@ -1,3589 +0,0 @@ -\input texinfo @c -*-texinfo-*- -@c %**start of header -@setfilename rust.info -@settitle Rust Documentation -@setchapternewpage odd -@c %**end of header - -@include version.texi - -@ifinfo -This manual is for the ``Rust'' programming language. - - -@uref{http://www.rust-lang.org} - -Version: @gitversion - - -Copyright 2006-2010 Graydon Hoare - -Copyright 2009-2011 Mozilla Foundation - -See accompanying LICENSE.txt for terms. - -@end ifinfo - -@dircategory Programming -@direntry -* rust: (rust). Rust programming language -@end direntry - -@titlepage -@title Rust -@subtitle A safe, concurrent, practical language. -@author Graydon Hoare -@author Mozilla Foundation - -@page -@vskip 0pt plus 1filll - - -@uref{http://rust-lang.org} - -Version: @gitversion - -@sp 2 - -Copyright @copyright{} 2006-2010 Graydon Hoare - -Copyright @copyright{} 2009-2011 Mozilla Foundation - -See accompanying LICENSE.txt for terms. - -@end titlepage - -@everyfooting @| @emph{-- Draft @today --} @| - -@ifnottex -@node Top -@top Top - -Rust Documentation - -@end ifnottex - -@menu -* Disclaimer:: Notes on a work in progress. -* Introduction:: Background, intentions, lineage. -* Tutorial:: Gentle introduction to reading Rust code. -* Reference:: Systematic reference of language elements. -* Index:: Index -@end menu - -@ifnottex -Complete table of contents -@end ifnottex - -@contents - -@c ############################################################ -@c Disclaimer -@c ############################################################ - -@node Disclaimer -@chapter Disclaimer - -To the reader, - -Rust is a work in progress. The language continues to evolve as the design -shifts and is fleshed out in working code. Certain parts work, certain parts -do not, certain parts will be removed or changed. 
- -This manual is a snapshot written in the present tense. Some features -described do not yet exist in working code. Some may be temporary. It -is a @emph{draft}, and we ask that you not take anything you read here -as either definitive or final. The manual is to help you get a sense -of the language and its organization, not to serve as a complete -specification. At least not yet. - -If you have suggestions to make, please try to focus them on @emph{reductions} -to the language: possible features that can be combined or omitted. At this -point, every ``additive'' feature we're likely to support is already on the -table. The task ahead involves combining, trimming, and implementing. - - -@c ############################################################ -@c Introduction -@c ############################################################ - -@node Introduction -@chapter Introduction - -@quotation - We have to fight chaos, and the most effective way of doing that is - to prevent its emergence. -@flushright - - Edsger Dijkstra -@end flushright -@end quotation -@sp 2 - -Rust is a curly-brace, block-structured expression language. It visually -resembles the C language family, but differs significantly in syntactic and -semantic details. Its design is oriented toward concerns of ``programming in -the large'', that is, of creating and maintaining @emph{boundaries} -- both -abstract and operational -- that preserve large-system @emph{integrity}, -@emph{availability} and @emph{concurrency}. - -It supports a mixture of imperative procedural, concurrent actor, -object-oriented and pure functional styles. Rust also supports generic -programming and metaprogramming, in both static and dynamic styles. - -@menu -* Goals:: Intentions, motivations. -* Sales Pitch:: A summary for the impatient. -* Influences:: Relationship to past languages. 
-@end menu - - -@node Goals -@section Goals - -The language design pursues the following goals: - -@sp 1 -@itemize -@item Compile-time error detection and prevention. -@item Run-time fault tolerance and containment. -@item System building, analysis and maintenance affordances. -@item Clarity and precision of expression. -@item Implementation simplicity. -@item Run-time efficiency. -@item High concurrency. -@end itemize -@sp 1 - -Note that most of these goals are @emph{engineering} goals, not showcases for -sophisticated language technology. Most of the technology in Rust is -@emph{old} and has been seen decades earlier in other languages. - -All new languages are developed in a technological context. Rust's goals arise -from the context of writing large programs that interact with the internet -- -both servers and clients -- and are thus much more concerned with -@emph{safety} and @emph{concurrency} than older generations of program. Our -experience is that these two forces do not conflict; rather they drive system -design decisions toward extensive use of @emph{partitioning} and -@emph{statelessness}. Rust aims to make these a more natural part of writing -programs, within the niche of lower-level, practical, resource-conscious -languages. - - -@page -@node Sales Pitch -@section Sales Pitch - -The following comprises a brief ``sales pitch'' overview of the salient -features of Rust, relative to other languages. - -@itemize - -@sp 1 -@item No @code{null} pointers - -The initialization state of every slot is statically computed as part of the -typestate system (see below), and requires that all slots are initialized -before use. There is no @code{null} value; uninitialized slots are -uninitialized and can only be written to, not read. - -The common use for @code{null} in other languages -- as a sentinel value -- is -subsumed into the more general facility of disjoint union types. A program -must explicitly model its use of such types. 
- -@sp 1 -@item Lightweight tasks with no shared values - -Like many @emph{actor} languages, Rust provides an isolation (and concurrency) -model based on lightweight tasks scheduled by the language runtime. These -tasks are very inexpensive and statically unable to manipulate one another's -local memory. Breaking the rule of task isolation is possible only by calling -external (C/C++) code. - -Inter-task communication is typed, asynchronous, and simplex, based on passing -messages over channels to ports. - -@sp 1 -@item Predictable native code, simple runtime - -The meaning and cost of every operation within a Rust program is intended to -be easy to model for the reader. The code should not ``surprise'' the -programmer once it has been compiled. - -Rust compiles to native code. Rust compilation units are large and the -compilation model is designed around multi-file, whole-library or -whole-program optimization. The compiled units are standard loadable objects -(ELF, PE, Mach-O) containing standard debug information (DWARF) and are -compatible with existing, standard low-level tools (disassemblers, debuggers, -profilers, dynamic loaders). The compiled units include custom metadata that -carries full type and version information. - -The Rust runtime library is a small collection of support code for scheduling, -memory management, inter-task communication and logging. This library is -written in standard C++ and is quite straightforward. It presents a simple -interface to embeddings. No research-level virtual machine, JIT or garbage -collection technology is required. It should be relatively easy to adapt a -Rust front-end on to many existing native toolchains. - -@sp 1 -@item Integrated system-construction facility - -The units of compilation of Rust are multi-file amalgamations called -@emph{crates}. 
A crate is described by a separate, declarative type of source -file that guides the compilation of the crate, its packaging, its versioning, -and its external dependencies. Crates are also the units of distribution and -loading. Significantly: the dependency graph of crates is @emph{acyclic} and -@emph{anonymous}: there is no global namespace for crates, and module-level -recursion cannot cross crate barriers. - -Unlike many languages, individual modules do @emph{not} carry all the -mechanisms or restrictions of crates. Modules and crates serve different -roles. - -@sp 1 -@item Static control over memory allocation, packing and aliasing. - -Many values in Rust are allocated @emph{within} their containing stack-frame -or parent structure. Numbers, records, tuples and tags are all allocated this -way. To allocate such values in the heap, they must be explicitly -@emph{boxed}. A @dfn{box} is a pointer to a heap allocation that holds another -value, its @emph{content}. Boxes may be either shared or unique, depending -on which sort of storage management is desired. - -Boxing and unboxing in Rust is explicit, though in some cases (such as -name-component dereferencing) Rust will automatically dereference a -box to access its content. Box values can be passed and assigned -independently, like pointers in C; the difference is that in Rust they always -point to live contents, and are not subject to pointer arithmetic. - -In addition to boxes, Rust supports a kind of pass-by-pointer slot called a -reference. Forming or releasing a reference does not perform reference-count -operations; references can only be formed on values that will provably outlive -the reference. References are not ``general values'', in the sense that they -cannot be independently manipulated. They are a lot like C++'s references, -except that they are safe: the compiler ensures that they always point to live -values. 
- -In addition, every slot (stack-local allocation or reference) has a static -initialization state that is calculated by the typestate system. This permits -late initialization of slots in functions with complex control-flow, while -still guaranteeing that every use of a slot occurs after it has been -initialized. - -@sp 1 -@item Immutable data by default - -All types in Rust are immutable by default. A field within a type must be -declared as @code{mutable} in order to be modified. - -@sp 1 -@item Move semantics and unique pointers - -Rust differentiates copying values from moving them, and permits moving and -swapping values explicitly rather than copying. Moving can be more efficient and, -crucially, represents an indivisible transfer of ownership of a value from its -source to its destination. - -In addition, pointer types in Rust come in several varieties. One important -type of pointer related to move semantics is the @emph{unique} pointer, -denoted @code{~}, which is statically guaranteed to be the only pointer -pointing to its referent at any given time. - -Combining move-semantics and unique pointers, Rust permits a very lightweight -form of inter-task communication: values are sent between tasks by moving, and -only types composed of unique pointers can be sent. This statically ensures -there can never be sharing of data between tasks, while keeping the costs of -transferring data between tasks as cheap as moving a pointer. - -@sp 1 -@item Efficient closures - -Rust provides a variety of closure types, including a type that is guaranteed -not to escape to the heap. This is represented as just a stack-frame pointer -and a code pointer. Passing such ``downward'' closures into library code makes -for very efficient iteration and accessor-function patterns. - -@sp 1 -@item Direct interface to C code - -Rust can load and call many C library functions simply by declaring -them. 
Calling a C function is an ``unsafe'' action, and can only be taken -within a block marked with the @code{unsafe} keyword. Every unsafe block -in a Rust compilation unit must be explicitly authorized in the crate file. - -@sp 1 -@item Structural algebraic data types - -The Rust type system is primarily structural, and contains the standard -assortment of useful ``algebraic'' type constructors from functional -languages, such as function types, tuples, record types, vectors, and -nominally-tagged disjoint unions. Such values may be @emph{pattern-matched} in -an @code{alt} expression. - -@sp 1 -@item Generic code - -Rust supports a simple form of parametric polymorphism: functions, types and -objects can be parametrized by other types. - -@sp 1 -@item Argument binding - -Rust provides a mechanism of partially binding arguments to functions, -producing new functions that accept the remaining un-bound arguments. This -mechanism combines some of the features of lexical closures with some of the -features of currying, in a smaller and simpler package. - -@sp 1 -@item Local type inference - -To save some quantity of programmer key-pressing, Rust supports local type -inference: signatures of functions and objects always require type annotation, -but within the body of a function many slots can be declared without a type, -and Rust will infer the slot's type from its uses. - -@sp 1 -@item Structural object system - -Rust has a lightweight object system based on structural object types: there -is no ``class hierarchy'' nor any concept of inheritance. Method overriding -and object restriction are performed explicitly on object values, which are -little more than order-insensitive records of methods sharing a common private -value. - -@sp 1 -@item Static metaprogramming (syntactic extension) - -Rust supports a system for syntactic extensions that can be loaded into the -compiler, to implement user-defined notations, macros, program-generators and -the like. 
These notations are @emph{marked} using a special form of -bracketing, such that a reader unfamiliar with the extension can still parse -the surrounding text by skipping over the bracketed ``extension text''. - -@sp 1 -@item Idempotent failure - -If a task is killed by some external event, or if it evaluates the special -@code{fail} expression, it enters the @emph{failing} state. A failing task -unwinds its control stack, frees all of its owned resources (executing -destructors) and enters the @emph{dead} state. Failure is idempotent and -non-recoverable. - -@sp 1 -@item Supervision hierarchy - -Rust has a system for propagating task-failures, either directly to a -supervisor task, or indirectly by sending a message into a channel. - -@sp 1 -@item Resource types with deterministic destruction - -Rust includes a type constructor for @emph{resource} types, which have an -associated destructor and cannot be moved in memory. Resource types belong to -the kind of @emph{pinned} types, and any value that directly contains a -resource is implicitly pinned as well. - -Resources can only contain types from the pinned or unique kinds of type, -which means that unlike finalizers, there is always a deterministic, top-down -order to run the destructors of a resource and its sub-resources. - -@sp 1 -@item Typestate system - -Every storage slot in a Rust frame participates in not only a conventional -structural static type system, describing the interpretation of memory in the -slot, but also a @emph{typestate} system. The static typestates of a program -describe the set of @emph{pure, dynamic predicates} that provably hold over -some set of slots, at each point in the program's control-flow graph within -each frame. The static calculation of the typestates of a program is a -function-local dataflow problem, and handles user-defined predicates in a -similar fashion to the way the type system permits user-defined types. 
- -A short way of thinking of this is: types statically model values, -typestates statically model @emph{assertions that hold} before and -after statements and expressions. - -@end itemize - - -@page -@node Influences -@section Influences -@sp 2 - -@quotation - The essential problem that must be solved in making a fault-tolerant - software system is therefore that of fault-isolation. Different programmers - will write different modules, some modules will be correct, others will have - errors. We do not want the errors in one module to adversely affect the - behaviour of a module which does not have any errors. - -@flushright - - Joe Armstrong -@end flushright -@end quotation -@sp 2 - -@quotation - In our approach, all data is private to some process, and processes can - only communicate through communications channels. @emph{Security}, as used - in this paper, is the property which guarantees that processes in a system - cannot affect each other except by explicit communication. - - When security is absent, nothing which can be proven about a single module - in isolation can be guaranteed to hold when that module is embedded in a - system [...] -@flushright - - Robert Strom and Shaula Yemini -@end flushright -@end quotation -@sp 2 - -@quotation - Concurrent and applicative programming complement each other. The - ability to send messages on channels provides I/O without side effects, - while the avoidance of shared data helps keep concurrent processes from - colliding. -@flushright - - Rob Pike -@end flushright -@end quotation -@sp 2 - -@page -Rust is not a particularly original language. It may however appear unusual by -contemporary standards, as its design elements are drawn from a number of -``historical'' languages that have, with a few exceptions, fallen out of -favour. Five prominent lineages contribute the most: - -@itemize -@sp 1 -@item -The NIL (1981) and Hermes (1990) family. 
These languages were developed by -Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM -Watson Research Center (Yorktown Heights, NY, USA). - -@sp 1 -@item -The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes -Wikstr@"om, Mike Williams and others in their group at the Ericsson Computer -Science Laboratory (@"Alvsj@"o, Stockholm, Sweden). - -@sp 1 -@item -The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim, -Heinz Schmidt and others in their group at The International Computer Science -Institute of the University of California, Berkeley (Berkeley, CA, USA). - -@sp 1 -@item -The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These languages -were developed by Rob Pike, Phil Winterbottom, Sean Dorward and others in -their group at Bell Labs Computing Sciences Research Center (Murray Hill, NJ, -USA). - -@sp 1 -@item -The Napier (1985) and Napier88 (1988) family. These languages were developed -by Malcolm Atkinson, Ron Morrison and others in their group at the University -of St. Andrews (St. Andrews, Fife, UK). -@end itemize - -@sp 1 -Additional specific influences can be seen from the following languages: -@itemize -@item The structural algebraic types and compilation manager of SML. -@item The deterministic destructor system of C++. -@end itemize - -@c ############################################################ -@c Tutorial -@c ############################################################ - -@node Tutorial -@chapter Tutorial - -@emph{TODO}. - -@c ############################################################ -@c Reference -@c ############################################################ - -@node Reference -@chapter Reference - -@menu -* Ref.Lex:: Lexical structure. -* Ref.Path:: References to items. -* Ref.Gram:: Grammar. -* Ref.Comp:: Compilation and component model. -* Ref.Mem:: Semantic model of memory. -* Ref.Task:: Semantic model of tasks. -* Ref.Item:: The components of a module. 
-* Ref.Type:: The types of values held in memory. -* Ref.Typestate:: Predicates that hold at points in time. -* Ref.Stmt:: Components of an executable block. -* Ref.Expr:: Units of execution and evaluation. -* Ref.Run:: Organization of runtime services. -@end menu - -@node Ref.Lex -@section Ref.Lex -@c * Ref.Lex:: Lexical structure. -@cindex Lexical structure -@cindex Token - -The lexical structure of a Rust source file or crate file is defined in terms -of Unicode character codes and character properties. - -Groups of Unicode character codes and characters are organized into -@emph{tokens}. Tokens are defined as the longest contiguous sequence of -characters within the same token type (identifier, keyword, literal, symbol), -or interrupted by ignored characters. - -Most tokens in Rust follow rules similar to the C family. - -Most tokens (including whitespace, keywords, operators and structural symbols) -are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn -from Unicode characters specified by the @code{XID_Start} and -@code{XID_Continue} rules given by UAX #31@footnote{Unicode Standard Annex -#31: Unicode Identifier and Pattern Syntax}. String and character literals may -include the full range of Unicode characters. - -@emph{TODO: formalize this section much more}. - -@menu -* Ref.Lex.Ignore:: Ignored characters. -* Ref.Lex.Ident:: Identifier tokens. -* Ref.Lex.Key:: Keyword tokens. -* Ref.Lex.Res:: Reserved tokens. -* Ref.Lex.Num:: Numeric tokens. -* Ref.Lex.Text:: String and character tokens. -* Ref.Lex.Syntax:: Syntactic extension tokens. -* Ref.Lex.Sym:: Special symbol tokens. -@end menu - -@node Ref.Lex.Ignore -@subsection Ref.Lex.Ignore -@c * Ref.Lex.Ignore:: Ignored tokens. - -Characters considered to be @emph{whitespace} or @emph{comment} are ignored, -and are not considered as tokens. They serve only to delimit tokens. Rust is -otherwise a free-form language. 
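As a rough illustration of the ignored-character rules in this subsection, here is a minimal sketch of how a lexer might skip whitespace and comments before reading the next token. It is written in present-day Rust, not the dialect this manual describes. The names @code{skip_ignored} and @code{skip_nested_comment} are hypothetical. Returning @code{None} for an unterminated multi-line comment is an assumption; the manual does not say how that case is handled. The sketch also assumes the scan starts outside any string literal.

```rust
/// Skips whitespace and comments at the front of `rest` and returns the
/// remaining input, starting at the first character that is not ignored.
/// Returns `None` if a multi-line comment is never closed (an assumption:
/// the manual does not define this case).
fn skip_ignored(mut rest: &str) -> Option<&str> {
    loop {
        // Whitespace is exactly U+0020, U+0009, U+000A and U+000D.
        let trimmed =
            rest.trim_start_matches(|c: char| matches!(c, ' ' | '\t' | '\n' | '\r'));

        if let Some(after) = trimmed.strip_prefix("//") {
            // Single-line comment: extends to the next U+000A, or to the
            // end of input if there is none.
            rest = match after.find('\n') {
                Some(i) => &after[i..],
                None => "",
            };
        } else if let Some(after) = trimmed.strip_prefix("/*") {
            // Multi-line comment: may contain nested multi-line comments.
            rest = skip_nested_comment(after)?;
        } else {
            return Some(trimmed);
        }
    }
}

/// `rest` begins just after an opening "/*". Returns the input that follows
/// the matching "*/", tracking nesting depth along the way.
fn skip_nested_comment(mut rest: &str) -> Option<&str> {
    let mut depth = 1usize;
    while depth > 0 {
        let open = rest.find("/*");
        // With no closer left anywhere, the comment is unterminated.
        let close = rest.find("*/")?;
        match open {
            // An opener comes first: go one level deeper.
            Some(o) if o < close => {
                depth += 1;
                rest = &rest[o + 2..];
            }
            // A closer comes first: leave one level.
            _ => {
                depth -= 1;
                rest = &rest[close + 2..];
            }
        }
    }
    Some(rest)
}
```

Calling @code{skip_ignored} on a single-line comment followed by a newline and an identifier yields the identifier. The depth counter is what lets an inner comment close without ending the outer one.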
- -@dfn{Whitespace} is any of the following Unicode characters: U+0020 (space), -U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}). - -@dfn{Comments} are @emph{single-line comments} or @emph{multi-line comments}. - -A @dfn{single-line comment} is any sequence of Unicode characters beginning -with U+002F U+002F (@code{"//"}) and extending to the next U+000A character, -@emph{excluding} cases in which such a sequence occurs within a string literal -token. - -A @dfn{multi-line comment} is any sequence of Unicode characters beginning -with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}), -@emph{excluding} cases in which such a sequence occurs within a string literal -token. Multi-line comments may be nested. - -@node Ref.Lex.Ident -@subsection Ref.Lex.Ident -@c * Ref.Lex.Ident:: Identifier tokens. -@cindex Identifier token - -Identifiers follow the rules given by Unicode Standard Annex #31, in the form -closed under NFKC normalization, @emph{excluding} those tokens that are -otherwise defined as keywords or reserved -tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}. - -That is: an identifier starts with any character having derived property -@code{XID_Start}, or the character U+005F (underscore, @code{_}), and -continues with zero or more characters having derived property -@code{XID_Continue}. An identifier is NFKC-normalized during lexing, such -that all subsequent comparison of identifiers is performed on the -NFKC-normalized forms. - -@emph{TODO: define relationship between Unicode and Rust versions}. - -@footnote{This identifier syntax is a superset of the identifier syntaxes of C -and Java, and is modeled on Python PEP #3131, which formed the definition of -identifiers in Python 3.0 and later.} - -@node Ref.Lex.Key -@subsection Ref.Lex.Key -@c * Ref.Lex.Key:: Keyword tokens. 
- -The keywords are: -@cindex Keywords - -@sp 2 - -@include keywords.texi - -@node Ref.Lex.Res -@subsection Ref.Lex.Res -@c * Ref.Lex.Res:: Reserved tokens. - -The reserved tokens are: -@cindex Reserved - -@sp 2 - -@multitable @columnfractions .15 .15 .15 .15 .15 -@item @code{f16} -@tab @code{f80} -@tab @code{f128} -@item @code{m32} -@tab @code{m64} -@tab @code{m128} -@tab @code{dec} -@end multitable - -@sp 2 - -At present these tokens have no defined meaning in the Rust language. - -These tokens may correspond, in some current or future implementation, -to additional built-in types for decimal floating-point, extended -binary and interchange floating-point formats, as defined in the IEEE -754-1985 and IEEE 754-2008 specifications. - - -@node Ref.Lex.Num -@subsection Ref.Lex.Num -@c * Ref.Lex.Num:: Numeric tokens. -@cindex Number token -@cindex Hex token -@cindex Decimal token -@cindex Binary token -@cindex Floating-point token - -@c FIXME: This discussion isn't quite right since 'f' and 'i' can be used as -@c suffixes - -A @dfn{number literal} is either an @emph{integer literal} or a -@emph{floating-point literal}. - -@sp 1 -An @dfn{integer literal} has one of three forms: -@enumerate -@item A @dfn{decimal literal} starts with a @emph{decimal digit} and continues -with any mixture of @emph{decimal digits} and @emph{underscores}. - -@item A @dfn{hex literal} starts with the character sequence U+0030 -U+0078 (@code{"0x"}) and continues as any mixture of @emph{hex digits} -and @emph{underscores}. - -@item A @dfn{binary literal} starts with the character sequence U+0030 -U+0062 (@code{"0b"}) and continues as any mixture of @emph{binary digits} -and @emph{underscores}. - -@end enumerate - -By default, an integer literal is of type @code{int}. An integer literal may -be followed (immediately, without any spaces) by an @dfn{integer suffix}, which -changes the type of the literal. 
There are two kinds of integer literal -suffix: - -@enumerate -@item The @code{u} suffix gives the literal type @code{uint}. -@item Each of the signed and unsigned machine types @code{u8}, @code{i8}, -@code{u16}, @code{i16}, @code{u32}, @code{i32}, @code{u64} and @code{i64} -gives the literal the corresponding machine type. -@end enumerate - -@sp 1 -A @dfn{floating-point literal} has one of two forms: -@enumerate -@item Two @emph{decimal literals} separated by a period -character U+002E ('.'), with an optional @emph{exponent} trailing after the -second @emph{decimal literal}. -@item A single @emph{decimal literal} followed by an @emph{exponent}. -@end enumerate - -By default, a floating-point literal is of type @code{float}. A floating-point -literal may be followed (immediately, without any spaces) by a -@dfn{floating-point suffix}, which changes the type of the literal. There are -only two floating-point suffixes: @code{f32} and @code{f64}. Each of these -gives the floating-point literal the associated type, rather than -@code{float}. - -A set of suffixes is also reserved to accommodate literal support for -types corresponding to reserved tokens. The reserved suffixes are @code{f16}, -@code{f80}, @code{f128}, @code{m}, @code{m32}, @code{m64} and @code{m128}. - -@sp 1 -A @dfn{hex digit} is either a @emph{decimal digit} or else a character in the -ranges U+0061-U+0066 and U+0041-U+0046 (@code{'a'}-@code{'f'}, -@code{'A'}-@code{'F'}). - -A @dfn{binary digit} is either the character U+0030 or U+0031 (@code{'0'} or -@code{'1'}). - -An @dfn{exponent} begins with either of the characters U+0065 or U+0045 -(@code{'e'} or @code{'E'}), followed by an optional @emph{sign character}, -followed by a trailing @emph{decimal literal}. - -A @dfn{sign character} is either U+002B or U+002D (@code{'+'} or @code{'-'}). 
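The integer-literal rules above can be read as a small classification procedure. The following is a minimal sketch in present-day Rust, not the dialect this manual describes. The function name @code{integer_literal_type} is hypothetical. Requiring at least one digit after @code{0x} or @code{0b} is an assumption, since the text does not state it. Floating-point literals and the reserved suffixes are not handled and are simply rejected.

```rust
/// Returns the name of the type that the rules above assign to an integer
/// literal, or `None` if `lit` is not a well-formed integer literal.
fn integer_literal_type(lit: &str) -> Option<&'static str> {
    // The prefix selects the digit class: "0x" is hex, "0b" is binary,
    // and anything else is treated as decimal.
    let (body, radix) = if let Some(rest) = lit.strip_prefix("0x") {
        (rest, 16)
    } else if let Some(rest) = lit.strip_prefix("0b") {
        (rest, 2)
    } else {
        (lit, 10)
    };

    // The digit run is any mixture of digits and underscores; whatever
    // follows it is the candidate suffix.
    let split = body
        .find(|c: char| !(c.is_digit(radix) || c == '_'))
        .unwrap_or(body.len());
    let (digits, suffix) = body.split_at(split);

    // A decimal literal must start with a decimal digit. For hex and binary
    // literals this sketch assumes at least one digit is required.
    let well_formed = if radix == 10 {
        digits.starts_with(|c: char| c.is_ascii_digit())
    } else {
        digits.chars().any(|c| c != '_')
    };
    if !well_formed {
        return None;
    }

    // No suffix means `int`, `u` means `uint`, and a machine-type suffix
    // names its own type. Anything else is not an integer suffix.
    const MACHINE_TYPES: [&str; 8] =
        ["u8", "i8", "u16", "i16", "u32", "i32", "u64", "i64"];
    match suffix {
        "" => Some("int"),
        "u" => Some("uint"),
        _ => MACHINE_TYPES.iter().copied().find(|t| *t == suffix),
    }
}
```

Because neither @code{u} nor @code{i} is a hex digit, the end of the digit run is unambiguous even for hex literals such as @code{0xffu8}.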
- - -Examples of integer literals of various forms: -@example -123; // type int -123u; // type uint -123_u; // type uint -0xff00; // type int -0xffu8; // type u8 -0b1111_1111_1001_0000_i32; // type i32 -@end example - - -Examples of floating-point literals of various forms: -@example -123.0; // type float -0.1; // type float -0.1f32; // type f32 -12E+99_f64; // type f64 -@end example - - -@node Ref.Lex.Text -@subsection Ref.Lex.Text -@c * Ref.Lex.Text:: String and character tokens. -@cindex String token -@cindex Character token -@cindex Escape sequence -@cindex Unicode - -A @dfn{character literal} is a single Unicode character enclosed within two -U+0027 (single-quote) characters, with the exception of U+0027 itself, which -must be @emph{escaped} by a preceding U+005C character ('\'). - -A @dfn{string literal} is a sequence of any Unicode characters enclosed -within two U+0022 (double-quote) characters, with the exception of U+0022 -itself, which must be @emph{escaped} by a preceding U+005C character -('\'). - -Some additional @emph{escapes} are available in either character or string -literals. An escape starts with a U+005C ('\') and continues with one -of the following forms: -@itemize -@item An @dfn{8-bit codepoint escape} starts with U+0078 ('x') and is -followed by exactly two @dfn{hex digits}. It denotes the Unicode codepoint -equal to the provided hex value. -@item A @dfn{16-bit codepoint escape} starts with U+0075 ('u') and is followed - by exactly four @dfn{hex digits}. It denotes the Unicode codepoint equal to -the provided hex value. -@item A @dfn{32-bit codepoint escape} starts with U+0055 ('U') and is followed - by exactly eight @dfn{hex digits}. It denotes the Unicode codepoint equal to -the provided hex value. -@item A @dfn{whitespace escape} is one of the characters U+006E, U+0072, or -U+0074, denoting the Unicode values U+000A (LF), U+000D (CR) or U+0009 (HT) -respectively. 
-@item The @dfn{backslash escape} is the character U+005C ('\') which must be -escaped in order to denote @emph{itself}. -@end itemize - -@node Ref.Lex.Syntax -@subsection Ref.Lex.Syntax -@c * Ref.Lex.Syntax:: Syntactic extension tokens. - -Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}), -followed by an identifier, one of @code{fmt}, @code{env}, -@code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or -the name of a user-defined macro. This is followed by a vector literal. (Its -value will be interpreted syntactically; in particular, it need not be -well-typed.) - -@emph{TODO: formalize those terms more}. - -@node Ref.Lex.Sym -@subsection Ref.Lex.Sym -@c * Ref.Lex.Sym:: Special symbol tokens. - -@cindex Symbol -@cindex Operator - -The special symbols are: - -@sp 2 - -@multitable @columnfractions .1 .1 .1 .1 .1 .1 - -@item @code{@@} -@tab @code{_} -@item @code{#} -@tab @code{:} -@tab @code{.} -@tab @code{;} -@tab @code{,} -@item @code{[} -@tab @code{]} -@tab @code{@{} -@tab @code{@}} -@tab @code{(} -@tab @code{)} -@item @code{=} -@tab @code{<-} -@tab @code{<->} -@tab @code{->} -@item @code{+} -@tab @code{++} -@tab @code{+=} -@tab @code{-} -@tab @code{--} -@tab @code{-=} -@item @code{*} -@tab @code{/} -@tab @code{%} -@tab @code{*=} -@tab @code{/=} -@tab @code{%=} -@item @code{&} -@tab @code{|} -@tab @code{!} -@tab @code{~} -@tab @code{^} -@item @code{&=} -@tab @code{|=} -@tab @code{^=} -@tab @code{!=} -@item @code{>>} -@tab @code{>>>} -@tab @code{<<} -@tab @code{<<=} -@tab @code{>>=} -@tab @code{>>>=} -@item @code{<} -@tab @code{<=} -@tab @code{==} -@tab @code{>=} -@tab @code{>} -@item @code{&&} -@tab @code{||} -@end multitable - -@page -@page -@node Ref.Path -@section Ref.Path -@c * Ref.Path:: References to items. -@cindex Names of items or slots -@cindex Path name -@cindex Type parameters - -A @dfn{path} is a sequence of one or more path components separated by a -namespace qualifier (@code{::}). 
If a path consists of only one component, it -may refer to either an item or a slot in a local control -scope. @xref{Ref.Mem.Slot}. @xref{Ref.Item}. If a path has multiple -components, it refers to an item. - -Every item has a @emph{canonical path} within its crate, but the path naming -an item is only meaningful within a given crate. There is no global namespace -across crates; an item's canonical path merely identifies it within the -crate. @xref{Ref.Comp.Crate}. - -Path components are usually identifiers. @xref{Ref.Lex.Ident}. The last -component of a path may also have trailing explicit type arguments. - -Two examples of simple paths consisting of only identifier components: -@example -x; -x::y::z; -@end example - -In most contexts, the Rust grammar accepts a general @emph{path}, but -subsequent passes may restrict paths occurring in various contexts to refer to -slots or items, depending on the semantics of the occurrence. In other words: -in some contexts a slot is required (for example, on the left hand side of the -copy operator, @pxref{Ref.Expr.Copy}) and in other contexts an item is -required (for example, as a type parameter, @pxref{Ref.Item}). In no case is -the grammar made ambiguous by accepting a general path and interpreting the -reference in later passes. @xref{Ref.Gram}. - -An example of a path with type parameters: -@example -m::map; -@end example - -@page -@node Ref.Gram -@section Ref.Gram -@c * Ref.Gram:: Grammar. - -@emph{TODO: mostly LL(1), it reads like C++, Alef and bits of Napier; -formalize here}. - -@page -@node Ref.Comp -@section Ref.Comp -@c * Ref.Comp:: Compilation and component model. -@cindex Compilation model - -Rust is a @emph{compiled} language. Its semantics are divided along a -@emph{phase distinction} between compile-time and run-time. Those semantic -rules that have a @emph{static interpretation} govern the success or failure -of compilation. 
A program that fails to compile due to violation of a -compile-time rule has no defined semantics at run-time; the compiler should -halt with an error report, and produce no executable artifact. - -The compilation model centres on artifacts called @emph{crates}. Each -compilation is directed towards a single crate in source form, and if -successful produces a single crate in executable form. - -@menu -* Ref.Comp.Crate:: Units of compilation and linking. -* Ref.Comp.Attr:: Attributes of crates, modules and items. -* Ref.Comp.Syntax:: Syntax extensions. -@end menu - -@node Ref.Comp.Crate -@subsection Ref.Comp.Crate -@c * Ref.Comp.Crate:: Units of compilation and linking. -@cindex Crate - -A @dfn{crate} is a unit of compilation and linking, as well as versioning, -distribution and runtime loading. Crates are defined by @emph{crate source -files}, which are a type of source file written in a special declarative -language: @emph{crate language}.@footnote{A crate is somewhat analogous to an -@emph{assembly} in the ECMA-335 CLI model, a @emph{library} in the SML/NJ -Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a -@emph{configuration} in Mesa.} A crate source file describes: - -@itemize -@item Metadata about the crate, such as author, name, version, and copyright. -@item The source-file and directory modules that make up the crate. -@item Any external crates or native modules that the crate imports to its top level. -@item The organization of the crate's internal namespace. -@item The set of names exported from the crate. -@end itemize - -A single crate source file may describe the compilation of a large number of -Rust source files; it is compiled in its entirety, as a single indivisible -unit. The compilation phase attempts to transform a single crate source file, -and its referenced contents, into a single compiled crate. Crate source files -and compiled crates have a 1:1 relationship. 
- -The syntactic form of a crate is a sequence of @emph{directives}, some of -which have nested sub-directives. - -A crate defines an implicit top-level module: within this module, -all members of the crate have canonical path names. @xref{Ref.Path}. The -@code{mod} directives within a crate file specify sub-modules to include in -the crate: these are either directory modules, corresponding to directories in -the filesystem of the compilation environment, or file modules, corresponding -to Rust source files. The names given to such modules in @code{mod} directives -become prefixes of the paths of items defined within any included Rust source -files. - -If a .rs file exists in the filesystem alongside the .rc crate file, then it -will be used to provide the top-level module of the crate. Similarly, -directory modules may be paired with .rs files of the same name as the -directory to provide the code for those modules. These source files are never -mentioned explicitly in the crate file; they are simply used if they are -present. - -The @code{use} directives within the crate specify @emph{other crates} to scan -for, locate, import into the crate's module namespace during compilation, and -link against at runtime. Use directives may also occur independently in Rust -source files. These directives may specify loose or tight ``matching -criteria'' for imported crates, depending on the preferences of the crate -developer. In the simplest case, a @code{use} directive may specify only a -symbolic name and leave the task of locating and binding an appropriate crate -to a compile-time heuristic. In a more controlled case, a @code{use} directive -may specify any metadata as matching criteria, such as a URI, an author name -or version number, a checksum or even a cryptographic signature, in order to -select an appropriate imported crate. @xref{Ref.Comp.Attr}.
- -The compiled form of a crate is a loadable and executable object file full of -machine code, in a standard loadable operating-system format such as ELF, PE -or Mach-O. The loadable object contains metadata, describing: -@itemize -@item The publicly exported module structure of the crate. -@item Any metadata about the crate, defined by attributes. -@item The crates to dynamically link with at run-time, with matching criteria -derived from the same @code{use} directives that guided compile-time imports. -@end itemize - -@c This might come along sometime in the future. - -@c The @code{syntax} directives of a crate are similar to the @code{use} -@c directives, except they govern the syntax extension namespace (accessed -@c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax}) -@c available only at compile time. A @code{syntax} directive also makes its -@c extension available to all subsequent directives in the crate file. - -An example of a crate: - -@example -// Linkage attributes -#[ link(name = "projx", - vers = "2.5", - uuid = "9cccc5d5-aceb-4af5-8285-811211826b82") ]; - -// Additional metadata attributes -#[ desc = "Project X", - license = "BSD", - author = "Jane Doe" ]; - -// Import a module. -use std (ver = "1.0"); - -// Define some modules. -#[path = "foo.rs"] -mod foo; -mod bar @{ - #[path = "quux.rs"] - mod quux; -@} -@end example - -@node Ref.Comp.Attr -@subsection Ref.Comp.Attr -@cindex Attributes - -Static entities in Rust -- crates, modules and items -- may have attributes -applied to them.@footnote{Attributes in Rust are modeled on Attributes in -ECMA-335, C#} An attribute is a general, free-form piece of metadata that is -interpreted according to name, convention, and language and compiler version.
-Attributes may appear as any of: -@itemize -@item A single identifier, the attribute name -@item An identifier followed by the equals sign '=' and a literal, providing a key/value pair -@item An identifier followed by a parenthesized list of sub-attribute arguments -@end itemize - -Attributes are applied to an entity by placing them within a hash-list -(@code{#[...]}) as either a prefix to the entity or as a semicolon-delimited -declaration within the entity body. - -An example of attributes: - -@example -// A function marked as a unit test -#[test] -fn test_foo() @{ - ... -@} - -// General metadata applied to the enclosing module or crate. -#[license = "BSD"]; - -// A conditionally-compiled module -#[cfg(target_os="linux")] -mod bar @{ - ... -@} - -@end example - -In future versions of Rust, user-provided extensions to the compiler will be able -to interpret attributes. When this facility is provided, a distinction will be -made between language-reserved and user-available attributes. - -At present, only the Rust compiler interprets attributes, so all attribute -names are effectively reserved. Some significant attributes include: - -@itemize -@item The @code{cfg} attribute, for conditional-compilation by build-configuration -@item The @code{link} attribute, describing linkage metadata for a crate -@item The @code{test} attribute, for marking functions as unit tests. -@end itemize - -Other attributes may be added or removed during development of the language. - -@node Ref.Comp.Syntax -@subsection Ref.Comp.Syntax -@c * Ref.Comp.Syntax:: Syntax extension. -@cindex Syntax extension - -Rust provides a notation for @dfn{syntax extension}. The notation for invoking -a syntax extension is a marked syntactic form that can appear as an expression -in the body of a Rust program. @xref{Ref.Lex.Syntax}. - -After parsing, a syntax-extension invocation is expanded into a Rust -expression. The name of the extension determines the translation performed.
In -future versions of Rust, user-provided syntax extensions aside from macros -will be provided via external crates. - -At present, only a set of built-in syntax extensions, as well as macros -introduced inline in source code using the @code{macro} extension, may be -used. The current built-in syntax extensions are: - -@itemize -@item @code{fmt} expands into code to produce a formatted string, similar to - @code{printf} from C. -@item @code{env} expands into a string literal containing the value of the - named environment variable at compile-time. -@item @code{concat_idents} expands into an identifier which is the - concatenation of its arguments. -@item @code{ident_to_str} expands into a string literal containing the name of - its argument (which must be a literal). -@item @code{log_syntax} causes the compiler to pretty-print its arguments. -@end itemize - -Finally, @code{macro} is used to define a new macro. A macro can abstract over -second-class Rust concepts that are present in syntax. The arguments to -@code{macro} are a bracketed list of pairs (two-element lists). The pairs -consist of an invocation and the syntax to expand into. An example: - -@example -#macro[[#apply[fn, [args, ...]], fn(args, ...)]]; -@end example - -In this case, the invocation @code{#apply[sum, 5, 8, 6]} expands to -@code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as -simple as a single identifier) in the input syntax, the matcher will expect an -arbitrary number of occurrences of the thing preceding it, and bind syntax to -the identifiers it contains. If it follows an expression in the output syntax, -it will transcribe that expression repeatedly, according to the identifiers -(bound to syntax) that it contains. - -The behavior of @code{...} is known as Macro By Example.
It allows you to -write a macro with arbitrary repetition by specifying only one case of that -repetition, and following it by @code{...}, both where the repeated input is -matched, and where the repeated output must be transcribed. A more -sophisticated example: - -@example -#macro[#zip_literals[[x, ...], [y, ...]], - [[x, y], ...]]; -#macro[#unzip_literals[[x, y], ...], - [[x, ...], [y, ...]]]; -@end example - -In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to -@code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]} -expands to @code{[[1,2,3],[1,2,3]]}. - -Macro expansion takes place outside-in: that is, -@code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because -@code{unzip_literals} expects a list, not a macro invocation, as an -argument. - -@c -The macro system currently has some limitations. It's not possible to -destructure anything other than vector literals (therefore, the arguments to -complicated macros will tend to be an ocean of square brackets). Macro -invocations and @code{...} can only appear in expression positions. Finally, -macro expansion is currently unhygienic. That is, name collisions between -macro-generated and user-written code can cause unintentional capture. - - -@page -@node Ref.Mem -@section Ref.Mem -@c * Ref.Mem:: Semantic model of memory. -@cindex Memory model -@cindex Box -@cindex Slot - -A Rust program's memory consists of a static set of @emph{items}, a set of tasks -each with its own @emph{stack}, and a @emph{heap}. Immutable portions of the -heap may be shared between tasks, mutable portions may not. - -Allocations in the stack consist of @emph{slots}, and allocations in the heap -consist of @emph{boxes}. - -@menu -* Ref.Mem.Alloc:: Memory allocation model. -* Ref.Mem.Own:: Memory ownership model. -* Ref.Mem.Slot:: Stack memory model. -* Ref.Mem.Box:: Heap memory model. -@end menu - -@node Ref.Mem.Alloc -@subsection Ref.Mem.Alloc -@c * Ref.Mem.Alloc:: Memory allocation model.
-@cindex Item -@cindex Stack -@cindex Heap -@cindex Shared box -@cindex Task-local box - -The @dfn{items} of a program are those functions, objects, modules and types -that have their value calculated at compile-time and stored uniquely in the -memory image of the Rust process. Items are neither dynamically allocated nor -freed. - -A task's @dfn{stack} consists of activation frames automatically allocated on -entry to each function as the task executes. A stack allocation is reclaimed -when control leaves the frame containing it. - -The @dfn{heap} is a general term that describes two separate sets of boxes: -shared boxes -- which may be subject to garbage collection -- and unique -boxes. The lifetime of an allocation in the heap depends on the lifetime of -the box values pointing to it. Since box values may themselves be passed in -and out of frames, or stored in the heap, heap allocations may outlive the -frame they are allocated within. - - -@node Ref.Mem.Own -@subsection Ref.Mem.Own -@c * Ref.Mem.Own:: Memory ownership model. -@cindex Ownership - -A task owns all memory it can @emph{safely} reach through local variables, -shared or unique boxes, and/or references. Sharing memory between tasks can -only be accomplished using @emph{unsafe} constructs, such as raw pointer -operations or calling C code. - -When a task sends a value of @emph{unique} kind over a channel, it loses -ownership of the value sent and can no longer refer to it. This is statically -guaranteed by the combined use of ``move semantics'' and unique kinds, within -the communication system. - -When a stack frame is exited, its local allocations are all released, and its -references to boxes (both shared and unique) are dropped. - -A shared box may (in the case of a recursive, mutable shared type) be cyclic; -in this case the release of memory inside the shared structure may be deferred -until task-local garbage collection can reclaim it.
Code can ensure no such -delayed deallocation occurs by restricting itself to unique boxes and similar -unshared kinds of data. - -When a task finishes, its stack is necessarily empty and it therefore has no -references to any boxes; the remainder of its heap is immediately freed. - -@node Ref.Mem.Slot -@subsection Ref.Mem.Slot -@c * Ref.Mem.Slot:: Stack memory model. -@cindex Stack -@cindex Slot -@cindex Local slot -@cindex Reference slot - -A task's stack contains slots. - -A @dfn{slot} is a component of a stack frame. A slot is either @emph{local} or -a @emph{reference}. - -A @dfn{local} slot (or @emph{stack-local} allocation) holds a value directly, -allocated within the stack's memory. The value is a part of the stack frame. - -A @dfn{reference} references a value outside the frame. It may refer to a -value allocated in another frame @emph{or} a boxed value in the heap. The -reference-formation rules ensure that the referent will outlive the reference. - -Local slots are always implicitly mutable. - -Local slots are not initialized when allocated; the entire frame's worth of -local slots is allocated at once, on frame-entry, in an uninitialized -state. Subsequent statements within a function may or may not initialize the -local slots. Local slots can be used only after they have been initialized; -this condition is guaranteed by the typestate system. - -References are created for function arguments. If the compiler cannot prove -that the referred-to value will outlive the reference, it will try to set -aside a copy of that value to refer to. If this is not semantically safe (for -example, if the referred-to value contains mutable fields), it will reject the -program. If the compiler deems copying the value expensive, it will warn. - -A function can be declared to take an argument by mutable reference. This -allows the function to write to the slot that the reference refers to.
- -An example function that accepts a value by mutable reference: -@example -fn incr(&i: int) @{ - i = i + 1; -@} -@end example - -@node Ref.Mem.Box -@subsection Ref.Mem.Box -@c * Ref.Mem.Box:: Heap memory model. -@cindex Box -@cindex Dereference operator - -A @dfn{box} is a reference to a heap allocation holding another value. There -are two kinds of boxes: @emph{shared boxes} and @emph{unique boxes}. - -A @dfn{shared box} type or value is constructed by the prefix @emph{at} sigil @code{@@}. - -A @dfn{unique box} type or value is constructed by the prefix @emph{tilde} sigil @code{~}. - -Multiple shared box values can point to the same heap allocation; copying a -shared box value makes a shallow copy of the pointer (optionally incrementing -a reference count, if the shared box is implemented through -reference-counting). - -Unique box values exist in 1:1 correspondence with their heap allocation; -copying a unique box value makes a deep copy of the heap allocation and -produces a pointer to the new allocation. - -An example of constructing one shared box type and value, and one unique box type and value: -@example -let x: @@int = @@10; -let x: ~int = ~10; -@end example - -Some operations implicitly dereference boxes. Examples of such @dfn{implicit -dereference} operations are: -@itemize -@item arithmetic operators (@code{x + y - z}) -@item field selection (@code{x.y.z}) -@end itemize - -An example of an implicit-dereference operation performed on box values: -@example -let x: @@int = @@10; -let y: @@int = @@12; -assert (x + y == 22); -@end example - -Other operations act on box values as single-word-sized address values. For -these operations, to access the value held in the box requires an explicit -dereference of the box value. Explicitly dereferencing a box is indicated with -the unary @emph{star} operator @code{*}.
Examples of such @dfn{explicit -dereference} operations are: -@itemize -@item copying box values (@code{x = y}) -@item passing box values to functions (@code{f(x,y)}) -@end itemize - -An example of an explicit-dereference operation performed on box values: -@example -fn takes_boxed(b: @@int) @{ -@} - -fn takes_unboxed(b: int) @{ -@} - -fn main() @{ - let x: @@int = @@10; - takes_boxed(x); - takes_unboxed(*x); -@} -@end example - - -@page -@node Ref.Task -@section Ref.Task -@c * Ref.Task:: Semantic model of tasks. -@cindex Task -@cindex Process - -An executing Rust program consists of a tree of tasks. A Rust @dfn{task} -consists of an entry function, a stack, a set of outgoing communication -channels and incoming communication ports, and ownership of some portion of -the heap of a single operating-system process. - -Multiple Rust tasks may coexist in a single operating-system -process. Execution of multiple Rust tasks in a single operating-system process -may be either truly concurrent or interleaved by the runtime scheduler. Rust -tasks are lightweight: each consumes less memory than an operating-system -process, and switching between Rust tasks is faster than switching between -operating-system processes. - -@menu -* Ref.Task.Comm:: Inter-task communication. -* Ref.Task.Life:: Task lifecycle and state transitions. -* Ref.Task.Sched:: Task scheduling model. -* Ref.Task.Spawn:: Library interface for making new tasks. -* Ref.Task.Send:: Library interface for sending messages. -* Ref.Task.Recv:: Library interface for receiving messages. -@end menu - -@node Ref.Task.Comm -@subsection Ref.Task.Comm -@c * Ref.Task.Comm:: Inter-task communication. - -@cindex Communication -@cindex Port -@cindex Channel -@cindex Message passing -@cindex Send expression -@cindex Receive expression - -With the exception of @emph{unsafe} blocks, Rust tasks are isolated from -interfering with one another's memory directly. 
Instead of manipulating shared -storage, Rust tasks communicate with one another using a typed, asynchronous, -simplex message-passing system. - -A @dfn{port} is a communication endpoint that can @emph{receive} -messages. Ports receive messages from channels. - -A @dfn{channel} is a communication endpoint that can @emph{send} -messages. Channels send messages to ports. - -Each port is implicitly boxed and mutable; as such a port has a unique -per-task identity and cannot be replicated or transmitted. If a port value is -copied, both copies refer to the @emph{same} port. New ports can be -constructed dynamically and stored in data structures. - -Each channel is bound to a port when the channel is constructed, so the -destination port for a channel must exist before the channel itself. A channel -cannot be rebound to a different port from the one it was constructed with. - -Channels are weak: a channel does not keep the port it is bound to -alive. Ports are owned by their allocating task and cannot be sent over -channels; if a task dies its ports die with it, and all channels bound to -those ports no longer function. Messages sent to a channel connected to a dead -port will be dropped. - -Channels are immutable types with meaning known to the runtime; channels can -be sent over channels. - -Many channels can be bound to the same port, but each channel is bound to a -single port. In other words, channels and ports exist in an N:1 relationship, -N channels to 1 port. @footnote{It may help to remember nautical terminology -when differentiating channels from ports. Many different waterways -- -channels -- may lead to the same port.} - -Each port and channel can carry only one type of message. The message type is -encoded as a parameter of the channel or port type. The message type of a -channel is equal to the message type of the port it is bound to. The types of -messages must be of @emph{unique} kind. 
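The N:1 relationship between channels and ports can be sketched in present-day Rust, where the standard library's @code{std::sync::mpsc} module offers a similar multi-producer, single-consumer design. This is only an analogy under stated assumptions: it uses modern syntax and threads rather than the tasks and @code{std::comm} interface this draft describes, and the function name @code{fan_in} is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: spawns `n` senders that each send one message to a
/// single receiver, then returns the received messages in sorted order.
fn fan_in(n: u32) -> Vec<String> {
    // `rx` plays the role of the port; `tx` plays the role of a channel.
    // Both carry exactly one message type, here `String`.
    let (tx, rx) = mpsc::channel::<String>();

    let mut handles = Vec::new();
    for id in 0..n {
        // Cloning the sender yields another channel bound to the same port.
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // The message is moved into the channel; the sender gives it up.
            tx.send(format!("hello from {}", id)).unwrap();
        }));
    }
    // Drop the original sender so the receiver sees the end of the stream
    // once every clone has also been dropped.
    drop(tx);

    for handle in handles {
        handle.join().unwrap();
    }

    // Arrival order depends on scheduling, so sort before returning.
    let mut messages: Vec<String> = rx.iter().collect();
    messages.sort();
    messages
}
```

Calling @code{fan_in(3)} gathers one message from each of three senders through the single receiving endpoint. Sorting the result is a choice made for this sketch so that the output does not depend on scheduling order.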
- -Messages are generally sent asynchronously, with optional rate-limiting on the -transmit side. A channel contains a message queue and asynchronously sending a -message merely inserts it into the sending channel's queue; message receipt is -the responsibility of the receiving task. - -Messages are sent on channels and received on ports using standard library -functions. - -@node Ref.Task.Life -@subsection Ref.Task.Life -@c * Ref.Task.Life:: Task lifecycle and state transitions. - -@cindex Lifecycle of task -@cindex Scheduling -@cindex Running, task state -@cindex Blocked, task state -@cindex Failing, task state -@cindex Dead, task state -@cindex Soft failure -@cindex Hard failure - -The @dfn{lifecycle} of a task consists of a finite set of states and events -that cause transitions between the states. The lifecycle states of a task are: - -@itemize -@item running -@item blocked -@item failing -@item dead -@end itemize - -A task begins its lifecycle -- once it has been spawned -- in the -@emph{running} state. In this state it executes the statements of its entry -function, and any functions called by the entry function. - -A task may transition from the @emph{running} state to the @emph{blocked} -state any time it evaluates a communication expression on a port or channel that -cannot be immediately completed. When the communication expression can be -completed -- when a message arrives at a port, or a queue drains -sufficiently to complete a semi-synchronous send -- then the blocked task will -unblock and transition back to @emph{running}. - -A task may transition to the @emph{failing} state at any time, due to being -killed by some external event or internally, from the evaluation of a -@code{fail} expression. Once @emph{failing}, a task unwinds its stack and -transitions to the @emph{dead} state. Unwinding the stack of a task is done by -the task itself, on its own control stack.
If a value with a destructor is -freed during unwinding, the code for the destructor is run, also on the task's -control stack. Running the destructor code causes a temporary transition to a -@emph{running} state, and allows the destructor code to cause any subsequent -state transitions. The original task of unwinding and failing thereby may -suspend temporarily, and may involve (recursive) unwinding of the stack of a -failed destructor. Nonetheless, the outermost unwinding activity will continue -until the stack is unwound and the task transitions to the @emph{dead} -state. There is no way to ``recover'' from task failure. Once a task has -temporarily suspended its unwinding in the @emph{failing} state, failure -occurring from within this destructor results in @emph{hard} failure. The -unwinding procedure of hard failure frees resources but does not execute -destructors. The original (soft) failure is still resumed at the point where -it was temporarily suspended. - -A task in the @emph{dead} state cannot transition to other states; it exists -only to have its termination status inspected by other tasks, and/or to await -reclamation when the last reference to it drops. - -@node Ref.Task.Sched -@subsection Ref.Task.Sched -@c * Ref.Task.Sched:: Task scheduling model. - -@cindex Scheduling -@cindex Preemption -@cindex Yielding control - -The currently scheduled task is given a finite @emph{time slice} in which to -execute, after which it is @emph{descheduled} at a loop-edge or similar -preemption point, and another task is scheduled, pseudo-randomly. - -An executing task can @code{yield} control at any time, which deschedules it -immediately. Entering any other non-executing state (blocked, dead) similarly -deschedules the task. - - - -@node Ref.Task.Spawn -@subsection Ref.Task.Spawn -@c * Ref.Task.Spawn:: Calls for creating new tasks.
-@cindex Spawn expression - -A call to @code{std::task::spawn}, passing a 0-argument function as its single -argument, causes the runtime to construct a new task executing the passed -function. The passed function is referred to as the @dfn{entry function} for -the spawned task, and any captured environment it carries is moved from the -spawning task to the spawned task before the spawned task begins execution. - -The result of a @code{spawn} call is a @code{std::task::task} value. - -An example of a @code{spawn} call: -@example -import std::task::*; -import std::comm::*; - -fn helper(c: chan) @{ - // do some work. - let result = ...; - send(c, result); -@} - -let p: port; - -spawn(bind helper(chan(p))); -// let task run, do other things. -// ... -let result = recv(p); - -@end example - -@node Ref.Task.Send -@subsection Ref.Task.Send -@c * Ref.Task.Send:: Calls for sending a value into a channel. -@cindex Send call -@cindex Messages -@cindex Communication - -Sending a value into a channel is done by a library call to -@code{std::comm::send}, which takes a channel and a value to send, and moves -the value into the channel's outgoing buffer. - -An example of a send: -@example -import std::comm::*; -let c: chan = @dots{}; -send(c, "hello, world"); -@end example - -@node Ref.Task.Recv -@subsection Ref.Task.Recv -@c * Ref.Task.Recv:: Calls for receiving a value from a channel. -@cindex Receive call -@cindex Messages -@cindex Communication - -Receiving a value is done by a call to the @code{recv} method, on an object of -type @code{std::comm::port}. This call causes the receiving task to enter the -@emph{blocked reading} state until a task is sending a value to the port, at -which point the runtime pseudo-randomly selects a sending task and moves a -value from the head of one of the task queues to the call's return value, and -un-blocks the receiving task. @xref{Ref.Run.Comm}.
- -An example of a @emph{receive}: -@example -import std::comm::*; -let p: port = @dots{}; -let s: str = recv(p); -@end example - - - -@page -@node Ref.Item -@section Ref.Item -@c * Ref.Item:: The components of a module. - -@cindex Item -@cindex Type parameters -@cindex Module item - -An @dfn{item} is a component of a module. Items are entirely determined at -compile-time, remain constant during execution, and may reside in read-only -memory. - -There are four primary kinds of item: modules, functions, objects and type -definitions. - -All items form an implicit scope for the declaration of sub-items. In other -words, within a function or object, declarations of items can (in many cases) -be mixed with the statements, control blocks, and similar artifacts that -otherwise compose the item body. The meaning of these scoped items is the same -as if the item were declared outside the scope, except that the item's -@emph{path name} within the module namespace is qualified by the name of the -enclosing item. The exact locations in which sub-items may be declared are -given by the grammar. @xref{Ref.Gram}. - -Functions, objects and type definitions may be @emph{parametrized} by -type. Type parameters are given as a comma-separated list of identifiers -enclosed in angle brackets (@code{<>}), after the name of the item and before -its definition. The type parameters of an item are part of the name, not the -type of the item; in order to refer to the type-parametrized item, a -referencing name must in general provide type arguments as a list of -comma-separated types enclosed within angle brackets. In practice, the -type-inference system can usually infer such argument types from -context. There are no general parametric types. - -@menu -* Ref.Item.Mod:: Items defining modules. -* Ref.Item.Fn:: Items defining functions. -* Ref.Item.Pred:: Items defining predicates for typestates. -* Ref.Item.Obj:: Items defining objects.
-* Ref.Item.Type:: Items defining the types of values and slots. -* Ref.Item.Tag:: Items defining the constructors of a tag type. -@end menu - -@node Ref.Item.Mod -@subsection Ref.Item.Mod -@c * Ref.Item.Mod:: Items defining sub-modules. - -@cindex Module item -@cindex Importing names -@cindex Exporting names -@cindex Visibility control - -A @dfn{module item} contains declarations of other @emph{items}. The items -within a module may be functions, modules, objects or types. These -declarations have both static and dynamic interpretation. The purpose of a -module is to organize @emph{names} and control @emph{visibility}. Modules are -declared with the keyword @code{mod}. - -An example of a module: -@example -mod math @{ - type complex = (f64,f64); - fn sin(f64) -> f64 @{ - @dots{} - @} - fn cos(f64) -> f64 @{ - @dots{} - @} - fn tan(f64) -> f64 @{ - @dots{} - @} - @dots{} -@} -@end example - -Modules may also include any number of @dfn{import and export -declarations}. These declarations must precede any module item declarations -within the module, and control the visibility of names both within the module -and outside of it. - -@menu -* Ref.Item.Mod.Import:: Declarations for module-local synonyms. -* Ref.Item.Mod.Export:: Declarations for restricting visibility. -@end menu - -@node Ref.Item.Mod.Import -@subsubsection Ref.Item.Mod.Import -@c * Ref.Item.Mod.Import:: Declarations for module-local synonyms. - -@cindex Importing names -@cindex Visibility control - -An @dfn{import declaration} creates one or more local name bindings synonymous -with some other name. Usually an import declaration is used to shorten the -path required to refer to a module item. - -@emph{Note}: unlike many languages, Rust's @code{import} declarations do -@emph{not} declare linkage-dependency with external crates. Linkage -dependencies are independently declared with @code{use} -declarations. @xref{Ref.Comp.Crate}. 
- -An example of imports: -@example -import std::math::sin; -import std::option::*; -import std::str::@{char_at, hash@}; - -fn main() @{ - // Equivalent to 'log(info, std::math::sin(1.0));' - log(info, sin(1.0)); - - // Equivalent to 'log(info, std::option::some(1.0));' - log(info, some(1.0)); - - // Equivalent to 'log(info, std::str::hash(std::str::char_at("foo")));' - log(info, hash(char_at("foo"))); -@} -@end example - -@node Ref.Item.Mod.Export -@subsubsection Ref.Item.Mod.Export -@c * Ref.Item.Mod.Export:: Declarations for restricting visibility. - -@cindex Exporting names -@cindex Visibility control - -An @dfn{export declaration} restricts the set of local declarations within a -module that can be accessed from code outside the module. By default, all -local declarations in a module are exported. If a module contains an export -declaration, this declaration replaces the default export with the export -specified. - -An example of an export: -@example -mod foo @{ - export primary; - - fn primary() @{ - helper(1, 2); - helper(3, 4); - @} - - fn helper(x: int, y: int) @{ - @dots{} - @} -@} - -fn main() @{ - foo::primary(); // Will compile. - foo::helper(2,3); // ERROR: will not compile. -@} -@end example - -Multiple items may be exported from a single export declaration: - -@example -mod foo @{ - export primary, secondary; - - fn primary() @{ - helper(1, 2); - helper(3, 4); - @} - - fn secondary() @{ - @dots{} - @} - - fn helper(x: int, y: int) @{ - @dots{} - @} -@} -@end example - - -@node Ref.Item.Fn -@subsection Ref.Item.Fn -@c * Ref.Item.Fn:: Items defining functions. -@cindex Functions -@cindex Slots, function input and output - -A @dfn{function item} defines a sequence of statements associated with a name -and a set of parameters. Functions are declared with the keyword -@code{fn}.
Functions declare a set of @emph{input slots} as parameters, -through which the caller passes arguments into the function, and an -@emph{output slot} through which the function passes results back to the -caller. - -A function may also be copied into a first-class @emph{value}, in which case -the value has the corresponding @emph{function type}, and can be used -otherwise exactly as a function item (with a minor additional cost of calling -the function, as such a call is indirect). @xref{Ref.Type.Fn}. - -Every control path in a function ends with a @code{ret} expression or with a -diverging expression (described later in this section). If a control path -lacks a @code{ret} expression in source code, an implicit @code{ret} -expression is appended to the end of the control path during compilation, -returning the implicit @code{()} value. - -An example of a function: -@example -fn add(x: int, y: int) -> int @{ - ret x + y; -@} -@end example - -A special kind of function can be declared with a @code{!} character where the -output slot type would normally be. For example: -@example -fn my_err(s: str) -> ! @{ - log(info, s); - fail; -@} -@end example - -We call such functions ``diverging'' because they never return a value to the -caller. Every control path in a diverging function must end with a @code{fail} -or a call to another diverging function. The @code{!} -annotation does @emph{not} denote a type. Rather, the result type -of a diverging function is a special type called @math{\bot} (``bottom'') that -unifies with any type. Rust has no syntax for @math{\bot}. - -It might be necessary to declare a diverging function because, as mentioned -previously, the typechecker checks that every control path in a function ends -with a @code{ret} or diverging expression.
So, if @code{my_err} were declared -without the @code{!} annotation, the following code would not typecheck: -@example -fn f(i: int) -> int @{ - if i == 42 @{ - ret 42; - @} - else @{ - my_err("Bad number!"); - @} -@} -@end example - -The typechecker would complain that @code{f} doesn't return a value in the -@code{else} branch. With the @code{!} annotation on @code{my_err}, the -typechecker accepts @code{f} without an explicit @code{ret} in that branch: -the requirement that @code{f} return a value whenever it returns control to -the caller holds vacuously there, because the branch never returns -control. - -@node Ref.Item.Pred -@subsection Ref.Item.Pred -@c * Ref.Item.Pred:: Items defining predicates. -@cindex Predicate - -Any pure boolean function is called a @emph{predicate}, and may be used -as part of the static typestate system. @xref{Ref.Typestate.Constr}. A -predicate declaration is identical to a function declaration, except that it -is declared with the additional keyword @code{pure}. In addition, -the typechecker checks the body of a predicate with a restricted set of -typechecking rules. A predicate -@itemize -@item may not contain an assignment or -self-call expression; and -@item may only call other predicates, not general functions. -@end itemize - -An example of a predicate: -@example -pure fn lt_42(x: int) -> bool @{ - ret (x < 42); -@} -@end example - -A non-boolean function may also be declared with @code{pure fn}. This allows -predicates to call non-boolean functions as long as they are pure. For example: -@example -pure fn pure_length<@@T>(ls: list<T>) -> uint @{ /* ... */ @} - -pure fn nonempty_list<@@T>(ls: list<T>) -> bool @{ pure_length(ls) > 0u @} -@end example - -In this example, @code{nonempty_list} is a predicate---it can be used in a -typestate constraint---but the auxiliary function @code{pure_length} is -not. - -@emph{ToDo:} should actually define referential transparency.
- -The effect checking rules previously enumerated are a restricted set of -typechecking rules meant to approximate the universe of observably -referentially transparent Rust procedures conservatively. Sometimes, these -rules are @emph{too} restrictive. Rust allows programmers to violate these -rules by writing predicates that the compiler cannot prove to be referentially -transparent, using an escape-hatch feature called ``unchecked blocks''. When -writing code that uses unchecked blocks, programmers should always be aware -that they have an obligation to show that the code @emph{behaves} referentially -transparently at all times, even if the compiler cannot @emph{prove} -automatically that the code is referentially transparent. In the presence of -unchecked blocks, the compiler provides no static guarantee that the code will -behave as expected at runtime. Rather, the programmer has an independent -obligation to verify the semantics of the predicates they write. - -@emph{ToDo:} last two sentences are vague. - -An example of a predicate that uses an unchecked block: -@example -fn pure_foldl<@@T, @@U>(ls: list<T>, u: U, f: block(&T, &U) -> U) -> U @{ - alt ls @{ - nil. @{ u @} - cons(hd, tl) @{ pure_foldl(*tl, f(hd, u), f) @} - @} -@} - -pure fn pure_length<@@T>(ls: list<T>) -> uint @{ - fn count(_t: T, u: uint) -> uint @{ u + 1u @} - unchecked @{ - pure_foldl(ls, 0u, count) - @} -@} -@end example - -Despite its name, @code{pure_foldl} is a @code{fn}, not a @code{pure fn}, -because there is no way in Rust to specify that the higher-order function -argument @code{f} is a pure function. So, to use @code{pure_foldl} in a pure list -length function that a predicate could then use, we must use an -@code{unchecked} block wrapped around the call to @code{pure_foldl} in the -definition of @code{pure_length}. - - -@node Ref.Item.Obj -@subsection Ref.Item.Obj -@c * Ref.Item.Obj:: Items defining objects.
-@cindex Objects -@cindex Object constructors - -An @dfn{object item} defines the @emph{state} and @emph{methods} of a set of -@emph{object values}. Object values have object types. @xref{Ref.Type.Obj}. - -An @emph{object item} declaration -- in addition to providing a scope for -state and method declarations -- implicitly declares a static function called -the @emph{object constructor}, as well as a named @emph{object type}. The name -given to the object item is resolved to a type when used in type context, or a -constructor function when used in value context (such as a call). - -Example of an object item: -@example -obj counter(state: @@mutable int) @{ - fn incr() @{ - *state += 1; - @} - fn get() -> int @{ - ret *state; - @} -@} - -let c: counter = counter(@@mutable 1); - -c.incr(); -c.incr(); -assert c.get() == 3; -@end example - -Inside an object's methods, you can make @emph{self-calls} using the -@code{self} keyword. -@example -obj my_obj() @{ - fn get() -> int @{ - ret 3; - @} - fn foo() -> int @{ - let c = self.get(); - ret c + 2; - @} -@} - -let o = my_obj(); -assert o.foo() == 5; -@end example - -Rust objects are extendable with additional methods and fields using -@emph{anonymous object} expressions. @xref{Ref.Expr.AnonObj}. - -@node Ref.Item.Type -@subsection Ref.Item.Type -@c * Ref.Item.Type:: Items defining the types of values and slots. -@cindex Type definitions - -A @dfn{type definition} defines a set of possible values in -memory. @xref{Ref.Type}. Type definitions are declared with the keyword -@code{type}. Every value has a single, specific type; the type-specified -aspects of a value include: - -@itemize -@item Whether the value is composed of sub-values or is indivisible. -@item Whether the value represents textual or numerical information. -@item Whether the value represents integral or floating-point information. -@item The sequence of memory operations required to access the value. 
-@item The @emph{kind} of the type (pinned, unique or shared). -@end itemize - -For example, the type @code{@{x: u8, y: u8@}} defines the set of immutable -values that are composite records, each containing two unsigned 8-bit integers -accessed through the components @code{x} and @code{y}, and laid out in memory -with the @code{x} component preceding the @code{y} component. This type is of -@emph{unique} kind, meaning that there is no shared substructure with other -types, but it can be copied and moved freely. - -@node Ref.Item.Tag -@subsection Ref.Item.Tag -@c * Ref.Item.Tag:: Items defining the constructors of a tag type. -@cindex Tag types - -A @dfn{tag item} simultaneously declares a new nominal tag type -(@pxref{Ref.Type.Tag}) as well as a set of @emph{constructors} that can be -used to create or pattern-match values of the corresponding tag type. - -The constructors of a @code{tag} type may be recursive: that is, each constructor -may take an argument that refers, directly or indirectly, to the tag type the constructor -is a member of. Such recursion has restrictions: -@itemize -@item Recursive types can be introduced only through @code{tag} constructors. -@item A recursive @code{tag} item must have at least one non-recursive -constructor (in order to give the recursion a base case). -@item The recursive arguments of recursive tag constructors must be @emph{box} -values (in order to bound the in-memory size of the constructor). -@item Recursive type definitions can cross module boundaries, but not module -@emph{visibility} boundaries, nor crate boundaries (in order to simplify the -module system).
-@end itemize - -An example of a @code{tag} item and its use: -@example -tag animal @{ - dog; - cat; -@} - -let a: animal = dog; -a = cat; -@end example - -An example of a @emph{recursive} @code{tag} item and its use: -@example -tag list<T> @{ - nil; - cons(T, @@list<T>); -@} - -let a: list<int> = cons(7, @@cons(13, @@nil)); -@end example - - -@page -@node Ref.Type -@section Ref.Type -@cindex Types - -Every slot and value in a Rust program has a type. The @dfn{type} of a -@emph{value} defines the interpretation of the memory holding it. The type of -a @emph{slot} may also include constraints. @xref{Ref.Type.Constr}. - -Built-in types and type-constructors are tightly integrated into the language, -in nontrivial ways that are not possible to emulate in user-defined -types. User-defined types have limited capabilities. In addition, every -built-in type or type-constructor name is reserved as a @emph{keyword} in -Rust; they cannot be used as user-defined identifiers in any context. - -@menu -* Ref.Type.Mach:: Machine-level types. -* Ref.Type.Int:: The machine-dependent integer types. -* Ref.Type.Float:: The machine-dependent floating-point types. -* Ref.Type.Prim:: Primitive types. -* Ref.Type.Text:: Strings and characters. -* Ref.Type.Rec:: Labeled products of heterogeneous types. -* Ref.Type.Tup:: Unlabeled products of heterogeneous types. -* Ref.Type.Vec:: Open products of homogeneous types. -* Ref.Type.Tag:: Disjoint unions of heterogeneous types. -* Ref.Type.Fn:: Subroutine types. -* Ref.Type.Obj:: Abstract types. -* Ref.Type.Constr:: Constrained types. -* Ref.Type.Type:: Types describing types.
-@end menu - -@node Ref.Type.Mach -@subsection Ref.Type.Mach -@cindex Machine types -@cindex Floating-point types -@cindex Integer types -@cindex Word types - -The machine types are the following: - -@itemize -@item -The unsigned word types @code{u8}, @code{u16}, @code{u32} and @code{u64}, -with values drawn from the integer intervals -@iftex -@math{[0, 2^8 - 1]}, -@math{[0, 2^{16} - 1]}, -@math{[0, 2^{32} - 1]} and -@math{[0, 2^{64} - 1]} -@end iftex -@ifhtml -@html -[0, 2<sup>8</sup>-1], -[0, 2<sup>16</sup>-1], -[0, 2<sup>32</sup>-1] and -[0, 2<sup>64</sup>-1] -@end html -@end ifhtml - respectively. -@item -The signed two's complement word types @code{i8}, @code{i16}, @code{i32} and -@code{i64}, with values drawn from the integer intervals -@iftex -@math{[-(2^7), 2^7 - 1]}, -@math{[-(2^{15}), 2^{15} - 1]}, -@math{[-(2^{31}), 2^{31} - 1]} and -@math{[-(2^{63}), 2^{63} - 1]} -@end iftex -@ifhtml -@html -[-(2<sup>7</sup>), 2<sup>7</sup>-1], -[-(2<sup>15</sup>), 2<sup>15</sup>-1], -[-(2<sup>31</sup>), 2<sup>31</sup>-1] and -[-(2<sup>63</sup>), 2<sup>63</sup>-1] -@end html -@end ifhtml - respectively. -@item -The IEEE 754-2008 @code{binary32} and @code{binary64} floating-point types: -@code{f32} and @code{f64}, respectively. -@end itemize - -@node Ref.Type.Int -@subsection Ref.Type.Int -@cindex Machine-dependent types -@cindex Integer types -@cindex Word types - - -The Rust type @code{uint}@footnote{A Rust @code{uint} is analogous to a C99 -@code{uintptr_t}.} is an unsigned integer type with -target-machine-dependent size. Its size, in bits, is equal to the number of -bits required to hold any memory address on the target machine. - -The Rust type @code{int}@footnote{A Rust @code{int} is analogous to a C99 -@code{intptr_t}.} is a two's complement signed integer type with -target-machine-dependent size. Its size, in bits, is equal to the size of the -Rust type @code{uint} on the same target machine.
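The intervals given for the machine types, and the pointer-sized definition of @code{uint} and @code{int}, can be checked mechanically. The following is a minimal sketch written in present-day Rust rather than in the dialect this manual describes. It assumes that today's `usize` and `isize` play the roles of `uint` and `int`; the function names are hypothetical and chosen only for illustration.

```rust
use std::mem::size_of;

/// Number of bits in the pointer-sized unsigned integer type.
/// (Hypothetical helper; `usize` stands in for the manual's `uint`.)
fn address_bits() -> usize {
    size_of::<usize>() * 8
}

/// Checks a sample of the interval bounds listed for the machine types.
fn machine_type_bounds_hold() -> bool {
    u8::MAX as u64 == (1u64 << 8) - 1
        && u16::MAX as u64 == (1u64 << 16) - 1
        && u32::MAX as u64 == (1u64 << 32) - 1
        && u64::MAX as u128 == (1u128 << 64) - 1
        && i8::MIN as i64 == -(1i64 << 7)
        && i8::MAX as i64 == (1i64 << 7) - 1
        && i32::MIN as i64 == -(1i64 << 31)
        && i32::MAX as i64 == (1i64 << 31) - 1
}

/// The signed and unsigned pointer-sized types have the same width,
/// and that width equals the width of a raw pointer.
fn pointer_sized_types_agree() -> bool {
    size_of::<isize>() == size_of::<usize>()
        && size_of::<usize>() == size_of::<*const u8>()
}
```

On a typical 64-bit target `address_bits()` would be expected to return 64, but the value depends on the target, which is exactly the machine dependence the text describes.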
- -@node Ref.Type.Float -@subsection Ref.Type.Float -@cindex Machine-dependent types -@cindex Floating-point types - -The Rust type @code{float} is a machine-specific type equal to one of the -supported Rust floating-point machine types (@code{f32} or @code{f64}). It is -the largest floating-point type that is directly supported by hardware on the -target machine, or if the target machine has no floating-point hardware -support, the largest floating-point type supported by the software -floating-point library used to support the other floating-point machine types. - -Note that due to the preference for hardware-supported floating-point, the -type @code{float} may not be equal to the largest @emph{supported} -floating-point type. - - -@node Ref.Type.Prim -@subsection Ref.Type.Prim -@cindex Primitive types -@cindex Integer types -@cindex Floating-point types -@cindex Character type -@cindex Boolean type - -The primitive types are the following: - -@itemize -@item -The ``nil'' type @code{()}, having the single ``nil'' value -@code{()}.@footnote{The ``nil'' value @code{()} is @emph{not} a sentinel -``null pointer'' value for reference slots; the ``nil'' type is the implicit -return type from functions otherwise lacking a return type, and can be used in -other contexts (such as message-sending or type-parametric code) as a -zero-size type.} -@item -The boolean type @code{bool} with values @code{true} and @code{false}. -@item -The machine types. -@item -The machine-dependent integer and floating-point types. -@end itemize - -@node Ref.Type.Text -@subsection Ref.Type.Text -@cindex Text types -@cindex String type -@cindex Character type -@cindex Unicode -@cindex UCS-4 -@cindex UTF-8 - -The types @code{char} and @code{str} hold textual data. - -A value of type @code{char} is a Unicode character, represented as a 32-bit -unsigned word holding a UCS-4 codepoint. 
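As a hedged illustration of how textual data is represented in this subsection, the sketch below uses present-day Rust, not the dialect this manual describes. It assumes that a modern `char` occupies four bytes and that string data is stored as UTF-8 bytes; the helper names `char_width_bytes` and `utf8_len` are hypothetical.

```rust
use std::mem::size_of;

/// Size in bytes of one `char` value (a 32-bit word).
fn char_width_bytes() -> usize {
    size_of::<char>()
}

/// Number of bytes needed to encode one character as UTF-8.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

/// Byte length and character count of a string can differ,
/// because UTF-8 is a variable-width encoding.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}
```

The point of the contrast is that a character is a fixed-width value, while the same character inside a string may take between one and four bytes.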
- -A value of type @code{str} is a Unicode string, represented as a vector of -8-bit unsigned bytes holding a sequence of UTF-8 encoded code points. - -@node Ref.Type.Rec -@subsection Ref.Type.Rec -@cindex Record types -@cindex Structure types, see @i{Record types} - -The record type-constructor forms a new heterogeneous product of -values.@footnote{The record type-constructor is analogous to the @code{struct} -type-constructor in the Algol/C family, the @emph{record} types of the ML -family, or the @emph{structure} types of the Lisp family.} Fields of a record -type are accessed by name and are arranged in memory in the order specified by -the record type. - -An example of a record type and its use: -@example -type point = @{x: int, y: int@}; -let p: point = @{x: 10, y: 11@}; -let px: int = p.x; -@end example - -@node Ref.Type.Tup -@subsection Ref.Type.Tup -@cindex Tuple types - -The tuple type-constructor forms a new heterogeneous product of -values, similar to the record type-constructor. The differences are as follows: - -@itemize -@item tuple elements cannot be mutable, unlike record fields -@item tuple elements are not named and can be accessed only by pattern-matching -@end itemize - -Tuple types and values are denoted by listing the types or values of -their elements, respectively, in a parenthesized, comma-separated -list. Single-element tuples are not legal; all tuples have two or more values. - -The members of a tuple are laid out in memory contiguously, like a record, in -the order specified by the tuple type. - -An example of a tuple type and its use: -@example -type pair = (int,str); -let p: pair = (10,"hello"); -let (a, b) = p; -assert (b == "hello"); -@end example - - -@node Ref.Type.Vec -@subsection Ref.Type.Vec -@cindex Vector types -@cindex Array types, see @i{Vector types} - -The vector type-constructor represents a homogeneous array of values of a -given type. A vector has a fixed size.
The kind of a vector type depends on -the kind of its member type, as with other simple structural types. - -An example of a vector type and its use: -@example -let v: [int] = [7, 5, 3]; -let i: int = v[2]; -assert (i == 3); -@end example - -Vectors always @emph{allocate} a storage region sufficient to store the first -power of two worth of elements greater than or equal to the size of the -vector. This behaviour supports idiomatic in-place ``growth'' of a mutable -slot holding a vector: - -@example -let v: mutable [int] = [1, 2, 3]; -v += [4, 5, 6]; -@end example - -Normal vector concatenation causes the allocation of a fresh vector to hold -the result; in this case, however, the slot holding the vector recycles the -underlying storage in-place (since the reference-count of the underlying -storage is equal to 1). - -All accessible elements of a vector are always initialized, and access to a -vector is always bounds-checked. - - -@node Ref.Type.Tag -@subsection Ref.Type.Tag -@cindex Tag types -@cindex Union types, see @i{Tag types} - -A @emph{tag type} is a nominal, heterogeneous disjoint union -type.@footnote{The @code{tag} type is analogous to a @code{data} constructor -declaration in ML or a @emph{pick ADT} in Limbo.} A @code{tag} @emph{item} -consists of a number of @emph{constructors}, each of which is independently -named and takes an optional tuple of arguments. - -Tag types cannot be denoted @emph{structurally} as types, but must be denoted -by named reference to a @emph{tag item} declaration. @xref{Ref.Item.Tag}. - -@node Ref.Type.Fn -@subsection Ref.Type.Fn -@cindex Function types - -The function type-constructor @code{fn} forms new function types. A function -type consists of a sequence of input slots, an optional set of input -constraints (@pxref{Ref.Typestate.Constr}) and an output -slot. @xref{Ref.Item.Fn}. 
- -An example of a @code{fn} type: -@example -fn add(x: int, y: int) -> int @{ - ret x + y; -@} - -let x: int = add(5,7); - -type binop = fn(int,int) -> int; -let bo: binop = add; -x = bo(5,7); -@end example - -@node Ref.Type.Obj -@subsection Ref.Type.Obj -@c * Ref.Type.Obj:: Object types. -@cindex Object types - -An @dfn{object type} describes values of abstract type that carry some hidden -@emph{fields} and are accessed through a set of unordered -@emph{methods}. Every object item (@pxref{Ref.Item.Obj}) implicitly declares -an object type carrying methods with types derived from all the methods of the -object item. - -Object types can also be declared in isolation, independent of any object item -declaration. Such a ``plain'' object type can be used to describe an interface -that a variety of particular objects may conform to, by supporting a superset -of the methods. - -The kind of an object type serves as a restriction on the kinds of fields that -may be stored in it. Unique objects, for example, can only carry unique values -in their fields. - -An example of an object type with two separate object items supporting it, and -a client function using both items via the object type: - -@example - -type taker = - obj @{ - fn take(int); - @}; - -obj adder(x: @@mutable int) @{ - fn take(y: int) @{ - *x += y; - @} -@} - -obj sender(c: chan<int>) @{ - fn take(z: int) @{ - std::comm::send(c, z); - @} -@} - -fn give_ints(t: taker) @{ - t.take(1); - t.take(2); - t.take(3); -@} - -let p: port<int> = std::comm::mk_port(); - -let t1: taker = adder(@@mutable 0); -let t2: taker = sender(p.mk_chan()); - -give_ints(t1); -give_ints(t2); - -@end example - - - -@node Ref.Type.Constr -@subsection Ref.Type.Constr -@c * Ref.Type.Constr:: Constrained types.
-@cindex Constrained types - -A @dfn{constrained type} is a type that carries a @emph{formal constraint} -(@pxref{Ref.Typestate.Constr}), which is similar to a normal constraint except -that the @emph{base name} of any slots mentioned in the constraint must be the -special @emph{formal symbol} @emph{*}. - -When a constrained type is instantiated in a particular slot declaration, the -formal symbol in the constraint is replaced with the name of the declared slot -and the resulting constraint is checked immediately after the slot is -declared. @xref{Ref.Expr.Check}. - -An example of a constrained type with two separate instantiations: -@example -type ordered_range = @{low: int, high: int@} : less_than(*.low, *.high); - -let rng1: ordered_range = @{low: 5, high: 7@}; -// implicit: 'check less_than(rng1.low, rng1.high);' - -let rng2: ordered_range = @{low: 15, high: 17@}; -// implicit: 'check less_than(rng2.low, rng2.high);' -@end example - -@node Ref.Type.Type -@subsection Ref.Type.Type -@c * Ref.Type.Type:: Types describing types. -@cindex Type type - -@emph{TODO}. - - - -@node Ref.Typestate -@section Ref.Typestate -@c * Ref.Typestate:: The static system of predicate analysis. -@cindex Typestate system - -Rust programs have a static semantics that determine the types of values -produced by each expression, as well as the @emph{predicates} that hold over -slots in the environment at each point in time during execution. - -The latter semantics -- the dataflow analysis of predicates holding over slots --- is called the @emph{typestate} system. - -@menu -* Ref.Typestate.Point:: Discrete positions in execution. -* Ref.Typestate.CFG:: The control-flow graph formed by points. -* Ref.Typestate.Constr:: Predicates applied to slots. -* Ref.Typestate.Cond:: Constraints required and implied by a point. -* Ref.Typestate.State:: Constraints that hold at points. -* Ref.Typestate.Check:: Relating dynamic state to static typestate. 
-@end menu - -@node Ref.Typestate.Point -@subsection Ref.Typestate.Point -@c * Ref.Typestate.Point:: Discrete positions in execution. -@cindex Points - -Control flows from statement to statement in a block, and through the -evaluation of each expression, from one sub-expression to another. This -sequential control flow is specified as a set of @dfn{points}, each of which -has a set of points before and after it in the implied control flow. - -For example, this code: - -@example - s = "hello, world"; - print(s); -@end example - -Consists of 2 statements, 3 expressions and 12 points: - -@itemize -@item the point before the first statement -@item the point before evaluating the static initializer @code{"hello, world"} -@item the point after evaluating the static initializer @code{"hello, world"} -@item the point after the first statement -@item the point before the second statement -@item the point before evaluating the function value @code{print} -@item the point after evaluating the function value @code{print} -@item the point before evaluating the arguments to @code{print} -@item the point before evaluating the symbol @code{s} -@item the point after evaluating the symbol @code{s} -@item the point after evaluating the arguments to @code{print} -@item the point after the second statement -@end itemize - -Whereas this code: - -@example - print(x() + y()); -@end example - -Consists of 1 statement, 7 expressions and 14 points: - -@itemize -@item the point before the statement -@item the point before evaluating the function value @code{print} -@item the point after evaluating the function value @code{print} -@item the point before evaluating the arguments to @code{print} -@item the point before evaluating the arguments to @code{+} -@item the point before evaluating the function value @code{x} -@item the point after evaluating the function value @code{x} -@item the point before evaluating the arguments to @code{x} -@item the point after evaluating the arguments to 
@code{x} -@item the point before evaluating the function value @code{y} -@item the point after evaluating the function value @code{y} -@item the point before evaluating the arguments to @code{y} -@item the point after evaluating the arguments to @code{y} -@item the point after evaluating the arguments to @code{+} -@item the point after evaluating the arguments to @code{print} -@end itemize - - -The typestate system reasons over points, rather than statements or -expressions. This may seem counter-intuitive, but points are the more -primitive concept. Another way of thinking about a point is as a set of -@emph{instants in time} at which the state of a task is fixed. By contrast, a -statement or expression represents a @emph{duration in time}, during which the -state of the task changes. The typestate system is concerned with constraining -the possible states of a task's memory at @emph{instants}; it is meaningless -to speak of the state of a task's memory ``at'' a statement or expression, as -each statement or expression is likely to change the contents of memory. - -@node Ref.Typestate.CFG -@subsection Ref.Typestate.CFG -@c * Ref.Typestate.CFG:: The control-flow graph formed by points. -@cindex Control-flow graph - -Each @emph{point} can be considered a vertex in a directed @emph{graph}. Each -kind of expression or statement implies a number of points @emph{and edges} in -this graph. The edges connect the points within each statement or expression, -as well as between those points and those of nearby statements and expressions -in the program. The edges between points represent @emph{possible} indivisible -control transfers that might occur during execution. - -This implicit graph is called the @dfn{control-flow graph}, or @dfn{CFG}. - -@node Ref.Typestate.Constr -@subsection Ref.Typestate.Constr -@c * Ref.Typestate.Constr:: Predicates applied to slots. 
-@cindex Predicate -@cindex Constraint - -A @dfn{predicate} is a pure boolean function declared with the keywords -@code{pure fn}. @xref{Ref.Item.Pred}. - -A @dfn{constraint} is a predicate applied to specific slots. - -For example, consider the following code: - -@example -pure fn is_less_than(a: int, b: int) -> bool @{ - ret a < b; -@} - -fn test() @{ - let x: int = 10; - let y: int = 20; - check is_less_than(x,y); -@} -@end example - -This example defines the predicate @code{is_less_than}, and applies it to the -slots @code{x} and @code{y}. The constraint being checked on the third line of -the function is @code{is_less_than(x,y)}. - -Predicates can only apply to slots holding immutable values. The slots a -predicate applies to can themselves be mutable, but the types of values held -in those slots must be immutable. - -@node Ref.Typestate.Cond -@subsection Ref.Typestate.Cond -@c * Ref.Typestate.Cond:: Constraints required and implied by a point. -@cindex Condition -@cindex Precondition -@cindex Postcondition - -A @dfn{condition} is a set of zero or more constraints. - -Each @emph{point} has an associated @emph{condition}: - -@itemize -@item The @dfn{precondition} of a statement or expression is the condition -required at the point before it. -@item The @dfn{postcondition} of a statement or expression is the condition -enforced at the point after it. -@end itemize - -Any constraint present in the precondition and @emph{absent} in the -postcondition is considered to be @emph{dropped} by the statement or -expression. - -@node Ref.Typestate.State -@subsection Ref.Typestate.State -@c * Ref.Typestate.State:: Constraints that hold at points. -@cindex Typestate -@cindex Prestate -@cindex Poststate - -The typestate checking system @emph{calculates} an additional condition for -each point called its typestate. For a given statement or expression, we call -the two typestates associated with its two points the prestate and the -poststate.
- -@itemize -@item The @dfn{prestate} of a statement or expression is the typestate of the -point before it. -@item The @dfn{poststate} of a statement or expression is the typestate of the -point after it. -@end itemize - -A @dfn{typestate} is a condition that has @emph{been determined by the -typestate algorithm} to hold at a point. The distinction is subtle but -important: preconditions and postconditions are @emph{inputs} to the -typestate algorithm; prestates and poststates are @emph{outputs} from the -typestate algorithm. - -The typestate algorithm analyses the preconditions and postconditions of every -statement and expression in a block, and computes a condition for each -typestate. Specifically: - -@itemize -@item Initially, every typestate is empty. -@item Each statement or expression's poststate is given the union of its -prestate, precondition, and postcondition. -@item Each statement or expression's poststate has the difference between its -precondition and postcondition removed, that is, the constraints present in -the precondition but absent from the postcondition (the dropped constraints). -@item Each statement or expression's prestate is given the intersection of the -poststates of every predecessor point in the CFG. -@item The previous three steps are repeated until no typestates in the -block change. -@end itemize - -The typestate algorithm is a very conventional dataflow calculation, and can -be performed using bit-set operations, with one bit per predicate and one -bit-set per condition. - -After the typestates of a block are computed, the typestate algorithm checks -that every constraint in the precondition of a statement is satisfied by its -prestate. If any preconditions are not satisfied, the mismatch is considered a -static (compile-time) error. - - -@node Ref.Typestate.Check -@subsection Ref.Typestate.Check -@c * Ref.Typestate.Check:: Relating dynamic state to static typestate.
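The typestate algorithm described in Ref.Typestate.State can be sketched as a small fixed-point computation over bit-sets. The following is an illustrative sketch in present-day Rust, not the compiler's actual implementation. It assumes at most 64 constraints (one bit each), represents each statement or expression as a single node, and treats a node without predecessors as the block entry with an empty prestate. The names `Node`, `Typestate`, `compute_typestates` and `unsatisfied` are hypothetical.

```rust
/// A set of constraints, one bit per constraint (at most 64 in this sketch).
type Condition = u64;

/// One statement or expression in a block. Field names are hypothetical.
struct Node {
    precondition: Condition,
    postcondition: Condition,
    /// Indices of the nodes that precede this one in the CFG.
    preds: Vec<usize>,
}

#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Typestate {
    prestate: Condition,
    poststate: Condition,
}

fn compute_typestates(nodes: &[Node]) -> Vec<Typestate> {
    // Initially, every typestate is empty.
    let mut states = vec![Typestate::default(); nodes.len()];
    loop {
        let mut changed = false;
        for (i, node) in nodes.iter().enumerate() {
            // Prestate: intersection of the poststates of all predecessors.
            // Assumption: a node with no predecessors keeps an empty prestate.
            let prestate = if node.preds.is_empty() {
                0
            } else {
                node.preds
                    .iter()
                    .fold(u64::MAX, |acc, &p| acc & states[p].poststate)
            };
            // Constraints in the precondition but not the postcondition are dropped.
            let dropped = node.precondition & !node.postcondition;
            // Poststate: union of prestate, precondition and postcondition,
            // with the dropped constraints removed.
            let poststate =
                (prestate | node.precondition | node.postcondition) & !dropped;
            let next = Typestate { prestate, poststate };
            if next != states[i] {
                states[i] = next;
                changed = true;
            }
        }
        // Repeat until no typestate in the block changes.
        if !changed {
            return states;
        }
    }
}

/// Indices of nodes whose precondition is not satisfied by their prestate.
fn unsatisfied(nodes: &[Node], states: &[Typestate]) -> Vec<usize> {
    nodes
        .iter()
        .zip(states)
        .enumerate()
        .filter(|(_, (n, s))| n.precondition & !s.prestate != 0)
        .map(|(i, _)| i)
        .collect()
}
```

Because every typestate starts empty and each update can only add bits, the loop reaches a fixed point after finitely many passes. Any index returned by `unsatisfied` would correspond to the static error described above.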
-@cindex Check statement -@cindex Assertions, see @i{Check statement} - -The key mechanism that connects run-time semantics and compile-time analysis -of typestates is the use of @code{check} expressions. @xref{Ref.Expr.Check}. A -@code{check} expression guarantees that @emph{if} control were to proceed past -it, the predicate associated with the @code{check} would have succeeded, so -the constraint being checked @emph{statically} holds in subsequent -points.@footnote{A @code{check} expression is similar to an @code{assert} -call in a C program, with the significant difference that the Rust compiler -@emph{tracks} the constraint that each @code{check} expression -enforces. Naturally, @code{check} expressions cannot be omitted from a -``production build'' of a Rust program the same way @code{asserts} are -frequently disabled in deployed C programs.} - -It is important to understand that the typestate system has @emph{no insight} -into the meaning of a particular predicate. Predicates and constraints are not -evaluated in any way at compile time. Predicates are treated as specific (but -unknown) functions applied to specific (also unknown) slots. All the typestate -system does is track which of those predicates -- whatever they calculate -- -@emph{must have been checked already} in order for program control to reach a -particular point in the CFG. The fundamental building block, therefore, is the -@code{check} statement, which tells the typestate system ``if control passes -this point, the checked predicate holds''. - -From this building block, constraints can be propagated to function signatures -and constrained types, and the responsibility to @code{check} a constraint -pushed further and further away from the site at which the program requires it -to hold in order to execute properly. - - -@page -@node Ref.Stmt -@section Ref.Stmt -@c * Ref.Stmt:: Components of an executable block. 
-@cindex Statements - -A @dfn{statement} is a component of a block, which is in turn a component of -an outer block-expression or function. When a function is spawned into a task, -the task @emph{executes} statements in an order determined by the body of the -enclosing structure. Each statement causes the task to perform certain -actions. - -Rust has two kinds of statement: declarations and expressions. - -A declaration serves to introduce a @emph{name} that can be used in the block -@emph{scope} enclosing the statement: all statements before and after the -name, from the previous opening curly-brace (@code{@{}) up to the next closing -curly-brace (@code{@}}). - -An expression serves the dual roles of causing side effects and producing a -@emph{value}. Expressions are said to @emph{evaluate to} a value, and the side -effects are caused during @emph{evaluation}. Many expressions contain -sub-expressions as operands; the definition of each kind of expression -dictates whether or not, and in which order, it will evaluate its -sub-expressions, and how the expression's value derives from the value of its -sub-expressions. - -In this way, the structure of execution -- both the overall sequence of -observable side effects and the final produced value -- is dictated by the -structure of expressions. Blocks themselves are expressions, so the nesting -sequence of block, statement, expression, and block can repeatedly nest to an -arbitrary depth. - -@menu -* Ref.Stmt.Decl:: Statement declaring an item or slot. -* Ref.Stmt.Expr:: Statement evaluating an expression. -@end menu - -@node Ref.Stmt.Decl -@subsection Ref.Stmt.Decl -@c * Ref.Stmt.Decl:: Statement declaring an item or slot. -@cindex Declaration statement - -A @dfn{declaration statement} is one that introduces a @emph{name} into the -enclosing statement block. The declared name may denote a new slot or a new -item. The scope of the name extends to the entire containing block, both -before and after the declaration. 
Tag names may not be shadowed by variable names. - -@menu -* Ref.Stmt.Decl.Item:: Statement declaring an item. -* Ref.Stmt.Decl.Slot:: Statement declaring a slot. -@end menu - -@node Ref.Stmt.Decl.Item -@subsubsection Ref.Stmt.Decl.Item -@c * Ref.Stmt.Decl.Item:: Statement declaring an item. - -An @dfn{item declaration statement} has a syntactic form identical to an item -declaration within a module. Declaring an item -- a function, object, type or -module -- locally within a statement block is simply a way of restricting its -scope to a narrow region containing all of its uses; it is otherwise identical -in meaning to declaring the item outside the statement block. - -Note: there is no implicit capture of the enclosing function's dynamic environment when -declaring a function-local item. - -@node Ref.Stmt.Decl.Slot -@subsubsection Ref.Stmt.Decl.Slot -@c * Ref.Stmt.Decl.Slot:: Statement declaring a slot. -@cindex Local slot -@cindex Variable, see @i{Local slot} -@cindex Type inference - -A @dfn{slot declaration statement} has one of two forms: - -@itemize -@item @code{let} @var{pattern} @var{optional-init}; -@item @code{let} @var{pattern} : @var{type} @var{optional-init}; -@end itemize - -Where @var{type} is a type expression, @var{pattern} is an irrefutable pattern -(often just the name of a single slot), and @var{optional-init} is an optional -initializer. If present, the initializer consists of either an equals sign -(@code{=}) or move operator (@code{<-}), followed by an expression. - -Both forms introduce a new slot into the containing block scope. The new slot -is visible across the entire scope, but is initialized only at the point -following the declaration statement. - -The former form, with no type annotation, causes the compiler to infer the -static type of the slot through unification with the types of values assigned -to the slot in the remaining code in the block scope. Inference only occurs on -frame-local slots, not argument slots.
Function and object signatures must -always declare types for all argument slots. @xref{Ref.Mem.Slot}. - -@node Ref.Stmt.Expr -@subsection Ref.Stmt.Expr -@c * Ref.Stmt.Expr:: Statement evaluating an expression -@cindex Expression statement - -An @dfn{expression statement} is one that evaluates an expression and drops -its result. The purpose of an expression statement is often to cause the side -effects of the expression's evaluation. - -@page -@node Ref.Expr -@section Ref.Expr -@c * Ref.Expr:: Parsed and primitive expressions. -@cindex Expressions - - -@menu -* Ref.Expr.Copy:: Expression for copying a value. -* Ref.Expr.Move:: Expression for moving a value. -* Ref.Expr.Swap:: Expression for swapping two values. -* Ref.Expr.Assign:: Expression for moving a copy of a value. -* Ref.Expr.Call:: Expression for calling a function. -* Ref.Expr.Bind:: Expression for binding arguments to functions. -* Ref.Expr.Ret:: Expression for stopping and producing a value. -* Ref.Expr.As:: Expression for casting a value to a different type. -* Ref.Expr.Fail:: Expression for causing task failure. -* Ref.Expr.Log:: Expression for logging values to diagnostic buffers. -* Ref.Expr.Note:: Expression for logging values during failure. -* Ref.Expr.While:: Expression for simple conditional looping. -* Ref.Expr.Break:: Expression for terminating a loop. -* Ref.Expr.Cont:: Expression for terminating a single loop iteration. -* Ref.Expr.For:: Expression for looping over strings and vectors. -* Ref.Expr.If:: Expression for simple conditional branching. -* Ref.Expr.Alt:: Expression for branching on pattern matches. -* Ref.Expr.Prove:: Expression for static assertion of typestate. -* Ref.Expr.Check:: Expression for dynamic assertion of typestate. -* Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate. -* Ref.Expr.Assert:: Expression for halting the program if a boolean condition fails to hold. -* Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
-* Ref.Expr.AnonObj:: Expression for extending objects with additional methods. -@end menu - -@node Ref.Expr.Copy -@subsection Ref.Expr.Copy -@c * Ref.Expr.Copy:: Expression for copying a value. -@cindex Copy expression -@cindex Copy operator, see @i{Copy expression} - -A unary @dfn{copy expression} consists of the unary @code{copy} operator -applied to some argument expression. - -Evaluating a copy expression first evaluates the argument expression, then -performs a copy of the resulting value, allocating any memory necessary to -hold the new copy. - -Shared boxes (type @code{@@}) are, as usual, shallow-copied, as they may be -cyclic. Unique boxes, vectors and similar unique types are deep-copied. - -Since the assignment operator @code{=} performs a copy implicitly, the unary -copy operator is typically used only to cause an argument to a function to -be copied, and the copy passed by value. @xref{Ref.Expr.Assign}. - -An example of a copy expression: -@example -fn mutate(vec: [mutable int]) @{ - vec[0] = 10; -@} - -let v = [mutable 1,2,3]; - -mutate(copy v); // Pass a copy - -assert v[0] == 1; // Original was not modified -@end example - - -@node Ref.Expr.Move -@subsection Ref.Expr.Move -@c * Ref.Expr.Move:: Expression for moving a value. -@cindex Move expression -@cindex Move operator, see @i{Move expression} - -A @dfn{move expression} consists of an @emph{lval} followed by a left-pointing -arrow (@code{<-}) and an @emph{rval} expression. @xref{Ref.Expr}. - -Evaluating a move expression causes, as a side effect, the @emph{rval} to be -@emph{moved} into the @emph{lval}. If the @emph{rval} was itself an @emph{lval}, -it must be a local variable, as it will be de-initialized in the process. - -Evaluating a move expression does not affect reference counts, nor does it -cause a deep copy of any unique structure pointed to by the moved -@emph{rval}.
Instead, the move expression represents an indivisible -@emph{transfer of ownership} from the right-hand side to the left-hand side of -the expression. No allocation or destruction is entailed. - -An example of three different move expressions: -@example -x <- a; -x[i] <- b; -x.y <- c; -@end example - - -@node Ref.Expr.Swap -@subsection Ref.Expr.Swap -@c * Ref.Expr.Swap:: Expression for swapping two values. -@cindex Swap expression -@cindex Swap operator, see @i{Swap expression} - -A @dfn{swap expression} consists of an @emph{lval} followed by a bi-directional -arrow (@code{<->}) and another @emph{lval} expression. @xref{Ref.Expr}. - -Evaluating a swap expression causes, as a side effect, the values held in the -left-hand-side and right-hand-side @emph{lvals} to be exchanged indivisibly. - -Evaluating a swap expression does not affect reference counts, nor does it -cause a deep copy of any unique structure pointed to by either -@emph{lval}. Instead, the swap expression represents an indivisible -@emph{exchange of ownership} between the right-hand side and the left-hand side -of the expression. No allocation or destruction is entailed. - -An example of three different swap expressions: -@example -x <-> a; -x[i] <-> b[i]; -x.y <-> a.b; -@end example - - -@node Ref.Expr.Assign -@subsection Ref.Expr.Assign -@c * Ref.Expr.Assign:: Expression for moving a copy of a value. -@cindex Assignment expression -@cindex Assignment operator, see @i{Assignment expression} - -An @dfn{assignment expression} consists of an @emph{lval} expression followed -by an equals-sign (@code{=}) and an @emph{rval} expression. @xref{Ref.Expr}. - -Evaluating an assignment expression is equivalent to evaluating a move -expression combined with a unary copy expression. For example, the following -two expressions have the same effect: - -@example -x = y -x <- copy y -@end example - -The former is simply more terse and familiar. @xref{Ref.Expr.Copy}, -@xref{Ref.Expr.Move}.
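
The relationship between the copy, move, swap and assignment expressions described above can be sketched in present-day Rust. This is only an approximation under stated assumptions: the @code{<-}, @code{<->} and @code{copy} operators of this draft are not used, and the helper names @code{assign_from}, @code{move_into} and @code{swap_values} are hypothetical, introduced purely for illustration.

```rust
use std::mem;

/// Rough stand-in for `x = y`, i.e. `x <- copy y`:
/// make a deep copy of the source, then move that copy into the destination.
fn assign_from(dst: &mut Vec<i32>, src: &[i32]) {
    *dst = src.to_vec(); // allocates a fresh buffer; `src` stays usable
}

/// Rough stand-in for `x <- a`: ownership of `src` is transferred.
/// The elements are not copied; the previous contents of `dst` are dropped.
fn move_into(dst: &mut Vec<i32>, src: Vec<i32>) {
    *dst = src; // the caller can no longer use the moved value
}

/// Rough stand-in for `x <-> a`: the two values trade places.
/// Neither vector's elements are copied.
fn swap_values(a: &mut Vec<i32>, b: &mut Vec<i32>) {
    mem::swap(a, b);
}
```

For instance, after @code{move_into(&mut x, a)} the vector @code{x} reuses the heap buffer that @code{a} owned, whereas @code{assign_from(&mut x, &y)} leaves @code{y} usable and gives @code{x} a buffer of its own.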
- - -@node Ref.Expr.Call -@subsection Ref.Expr.Call -@c * Ref.Expr.Call:: Expression for calling a function. -@cindex Call expression -@cindex Function calls - -A @dfn{call expression} invokes a function, providing a tuple of input slots -and a reference slot to serve as the function's output, bound to the -@var{lval} on the left-hand side of the call. If the function eventually -returns, then the expression completes. - -A call expression statically requires that the precondition declared in the -callee's signature is satisfied by the expression prestate. In this way, -typestates propagate through function boundaries. @xref{Ref.Typestate}. - -An example of a call expression: -@example -let x: int = add(1, 2); -@end example - -@node Ref.Expr.Bind -@subsection Ref.Expr.Bind -@c * Ref.Expr.Bind:: Expression for binding arguments to functions. -@cindex Bind expression -@cindex Closures -@cindex Currying - -A @dfn{bind expression} constructs a new function from an existing -function.@footnote{The @code{bind} expression is analogous to the @code{bind} -expression in the Sather language.} The new function has zero or more of its -arguments @emph{bound} into a new, hidden boxed tuple that holds the -bindings. For each concrete argument passed in the @code{bind} expression, the -corresponding parameter in the existing function is @emph{omitted} as a -parameter of the new function. For each argument passed as the placeholder symbol -@code{_} in the @code{bind} expression, the corresponding parameter of the -existing function is @emph{retained} as a parameter of the new function. - -Any subsequent invocation of the new function with residual arguments causes -invocation of the existing function with the combination of bound arguments -and residual arguments that was specified during the binding.
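
The binding behaviour described above can be modelled, very roughly, with an explicit structure that plays the part of the hidden boxed tuple. The sketch below is written in present-day Rust syntax rather than the syntax of this draft, and the names @code{Bound}, @code{Slot}, @code{bind_first}, @code{bind_second} and @code{sub} are hypothetical. It is a model for illustration, not a description of how the compiler implements @code{bind}.

```rust
/// Which parameter of the two-argument function was bound.
enum Slot {
    First,  // models `bind f(x, _)`
    Second, // models `bind f(_, y)`
}

/// Hypothetical model of the function value produced by a `bind` expression.
struct Bound {
    f: fn(i32, i32) -> i32,
    bound: Box<(i32,)>, // stands in for the hidden boxed tuple of bound arguments
    slot: Slot,
}

impl Bound {
    /// Invoke the existing function with the bound and residual arguments combined.
    fn call(&self, residual: i32) -> i32 {
        match self.slot {
            Slot::First => (self.f)(self.bound.0, residual),
            Slot::Second => (self.f)(residual, self.bound.0),
        }
    }
}

/// Models `bind f(x, _)`: the first parameter is omitted from the new function.
fn bind_first(f: fn(i32, i32) -> i32, x: i32) -> Bound {
    Bound { f, bound: Box::new((x,)), slot: Slot::First }
}

/// Models `bind f(_, y)`: the second parameter is omitted from the new function.
fn bind_second(f: fn(i32, i32) -> i32, y: i32) -> Bound {
    Bound { f, bound: Box::new((y,)), slot: Slot::Second }
}

/// A non-commutative function, so that the position of the placeholder matters.
fn sub(x: i32, y: i32) -> i32 {
    x - y
}
```

With these definitions, @code{bind_first(sub, 10).call(3)} computes @code{sub(10, 3)}, while @code{bind_second(sub, 10).call(3)} computes @code{sub(3, 10)}.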
- -An example of a @code{bind} expression: -@example -fn add(x: int, y: int) -> int @{ - ret x + y; -@} -type single_param_fn = fn(int) -> int; - -let add4: single_param_fn = bind add(4, _); - -let add5: single_param_fn = bind add(_, 5); - -assert (add(4,5) == add4(5)); -assert (add(4,5) == add5(4)); - -@end example - -A @code{bind} expression generally stores a copy of the bound arguments in the -hidden, boxed tuple, owned by the resulting first-class function. For each -bound slot in the bound function's signature, space is allocated in the hidden -tuple and populated with a copy of the bound value. - -The @code{bind} expression is a lightweight mechanism for simulating the more -elaborate construct of @emph{lexical closures} that exist in other -languages. Rust has no support for lexical closures, but many realistic uses -of them can be achieved with @code{bind} expressions. - - -@node Ref.Expr.Ret -@subsection Ref.Expr.Ret -@c * Ref.Expr.Ret:: Expression for stopping and producing a value. -@cindex Return expression - -Executing a @code{ret} expression@footnote{A @code{ret} expression is analogous -to a @code{return} expression in the C family.} copies a value into the output -slot of the current function, destroys the current function activation frame, -and transfers control to the caller frame. - -An example of a @code{ret} expression: -@example -fn max(a: int, b: int) -> int @{ - if a > b @{ - ret a; - @} - ret b; -@} -@end example - - -@node Ref.Expr.As -@subsection Ref.Expr.As -@c * Ref.Expr.As:: Expression for casting a value to a different type. -@cindex As expression -@cindex Cast -@cindex Typecast -@cindex Trivial cast - -Executing an @code{as} expression casts the value on the left-hand side to the -type on the right-hand side. - -A numeric value can be cast to any numeric type. A native pointer value can -be cast to or from any integral type or native pointer type. Any other cast -is unsupported and will fail to compile. 
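
The two permitted kinds of cast can be sketched in present-day Rust, whose @code{as} operator is assumed here to be a reasonable analogue of the one described in this draft. The function names are hypothetical, raw pointers stand in for this manual's native pointers, and the truncation behaviour noted in the comments is that of present-day Rust; this draft does not specify it.

```rust
/// Numeric cast: integer values converted to a floating-point type.
/// (For an empty slice this divides zero by zero and yields NaN.)
fn average(v: &[i32]) -> f64 {
    let sum: i32 = v.iter().sum();
    sum as f64 / v.len() as f64
}

/// Numeric cast between integral types of different width.
/// In present-day Rust the value is truncated to the low 8 bits.
fn narrow(x: i32) -> u8 {
    x as u8
}

/// Pointer casts: pointer to integral type and back again.
/// The pointers are only compared, never dereferenced.
fn address_round_trip(x: &i32) -> bool {
    let p = x as *const i32; // reference to raw pointer
    let addr = p as usize;   // raw pointer to integral type
    let q = addr as *const i32; // integral type back to raw pointer
    p == q
}
```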
- -An example of an @code{as} expression: -@example -fn avg(v: [float]) -> float @{ - let sum: float = sum(v); - let sz: float = std::vec::len(v) as float; - ret sum / sz; -@} -@end example - -A cast is a @emph{trivial cast} iff the type of the cast expression and the -target type are identical after replacing all occurrences of @code{int}, -@code{uint}, @code{float} with their machine type equivalents of the -target architecture in both types. - -@node Ref.Expr.Fail -@subsection Ref.Expr.Fail -@c * Ref.Expr.Fail:: Expression for causing task failure. -@cindex Fail expression -@cindex Failure -@cindex Unwinding - -Executing a @code{fail} expression causes a task to enter the @emph{failing} -state. In the @emph{failing} state, a task unwinds its stack, destroying all -frames and freeing all resources until it reaches its entry frame, at which -point it halts execution in the @emph{dead} state. - -@node Ref.Expr.Log -@subsection Ref.Expr.Log -@c * Ref.Expr.Log:: Expression for logging values to diagnostic buffers. -@cindex Log expression -@cindex Logging - -Executing a @code{log} expression may, depending on runtime configuration, -cause a value to be appended to an internal diagnostic logging buffer provided -by the runtime or emitted to a system console. Log expressions are enabled or -disabled dynamically at run-time on a per-task and per-item -basis. @xref{Ref.Run.Log}. - -Each @code{log} expression must be provided with a @emph{level} argument in -addition to the value to log. The logging level is a @code{u32} value, where -lower levels indicate more-urgent levels of logging. By default, the lowest -four logging levels (@code{0_u32 ... 3_u32}) are predefined as the constants -@code{error}, @code{warn}, @code{info} and @code{debug} in the @code{core} -library. - -Additionally, the macros @code{#error}, @code{#warn}, @code{#info} and -@code{#debug} are defined in the default syntax-extension namespace.
These -expand into calls to the logging facility composed with calls to the -@code{#fmt} string formatting syntax-extension. - -The following examples all produce the same output, logged at the @code{error} -logging level: - -@example -// Full version, logging a value. -log(core::error, "file not found: " + filename); - -// Log-level abbreviated, since core::* is imported by default. -log(error, "file not found: " + filename); - -// Formatting the message using a format-string and #fmt -log(error, #fmt("file not found: %s", filename)); - -// Using the #error macro, which expands to the previous call. -#error("file not found: %s", filename); -@end example - -A @code{log} expression is @emph{not evaluated} when logging at the specified -logging-level, module or task is disabled at runtime. This makes inactive -@code{log} expressions very cheap; they should be used extensively in Rust -code, as diagnostic aids, as they add little overhead beyond a single -integer-compare and branch at runtime. - -Logging is presently implemented as a language built-in feature, as it makes -use of compiler-provided logic for allocating the associated per-module -logging-control structures visible to the runtime, and lazily evaluating -arguments. In the future, as more of the supporting compiler-provided logic is -moved into libraries, logging is likely to move to a component of the core -library. It is best to use the macro forms of logging (@code{#error}, -@code{#debug}, etc.) to minimize disruption to code using the logging facility -when it is changed. - -@node Ref.Expr.Note -@subsection Ref.Expr.Note -@c * Ref.Expr.Note:: Expression for logging values during failure. -@cindex Note expression -@cindex Logging -@cindex Unwinding -@cindex Failure - -A @code{note} expression has no effect during normal execution. The purpose of -a @code{note} expression is to provide additional diagnostic information to the -logging subsystem during task failure.
@xref{Ref.Expr.Log}. Using @code{note} -expressions, normal diagnostic logging can be kept relatively sparse, while -still providing verbose diagnostic ``back-traces'' when a task fails. - -When a task is failing, control frames @emph{unwind} from the innermost frame -to the outermost, and from the innermost lexical block within an unwinding -frame to the outermost. When unwinding a lexical block, the runtime processes -all the @code{note} expressions in the block sequentially, from the first -expression of the block to the last. During processing, a @code{note} -expression has equivalent meaning to a @code{log} expression: it causes the -runtime to append the argument of the @code{note} to the internal logging -diagnostic buffer. - -An example of a @code{note} expression: -@example -fn read_file_lines(path: str) -> [str] @{ - note path; - let r: [str] = []; - let f: file = open_read(path); - lines(f) @{|s| - r += [s]; - @} - ret r; -@} -@end example - -In this example, if the task fails while attempting to open or read a file, -the runtime will log the path name that was being read. If the function -completes normally, the runtime will not log the path. - -A value that is marked by a @code{note} expression is @emph{not} copied aside -when control passes through the @code{note}. In other words, if a @code{note} -expression notes a particular @var{lval}, and code after the @code{note} -mutates that slot, and then a subsequent failure occurs, the @emph{mutated} -value will be logged during unwinding, @emph{not} the original value that was -denoted by the @var{lval} at the moment control passed through the @code{note} -expression. - -@node Ref.Expr.While -@subsection Ref.Expr.While -@c * Ref.Expr.While:: Expression for simple conditional looping. -@cindex While expression -@cindex Loops -@cindex Control-flow - -A @code{while} expression is a loop construct. A @code{while} loop may be -either a simple @code{while} or a @code{do}-@code{while} loop.
- -In the case of a simple @code{while}, the loop begins by evaluating the -boolean loop conditional expression. If the loop conditional expression -evaluates to @code{true}, the loop body block executes and control returns to -the loop conditional expression. If the loop conditional expression evaluates -to @code{false}, the @code{while} expression completes. - -In the case of a @code{do}-@code{while}, the loop begins with an execution of -the loop body. After the loop body executes, it evaluates the loop conditional -expression. If it evaluates to @code{true}, control returns to the beginning -of the loop body. If it evaluates to @code{false}, control exits the loop. - -An example of a simple @code{while} expression: -@example -while (i < 10) @{ - print("hello\n"); - i = i + 1; -@} -@end example - -An example of a @code{do}-@code{while} expression: -@example -do @{ - print("hello\n"); - i = i + 1; -@} while (i < 10); -@end example - -@node Ref.Expr.Break -@subsection Ref.Expr.Break -@c * Ref.Expr.Break:: Expression for terminating a loop. -@cindex Break expression -@cindex Loops -@cindex Control-flow - -Executing a @code{break} expression immediately terminates the innermost loop -enclosing it. It is only permitted in the body of a loop. - -@node Ref.Expr.Cont -@subsection Ref.Expr.Cont -@c * Ref.Expr.Cont:: Expression for terminating a single loop iteration. -@cindex Continue expression -@cindex Loops -@cindex Control-flow - -Executing a @code{cont} expression immediately terminates the current -iteration of the innermost loop enclosing it, returning control to the loop -@emph{head}. In the case of a @code{while} loop, the head is the conditional -expression controlling the loop. In the case of a @code{for} loop, the head is -the vector-element increment controlling the loop. - -A @code{cont} expression is only permitted in the body of a loop. - - -@node Ref.Expr.For -@subsection Ref.Expr.For -@c * Ref.Expr.For:: Expression for looping over strings and vectors. 
-@cindex For expression -@cindex Loops -@cindex Control-flow - -A @dfn{for loop} is controlled by a vector or string. The for loop -bounds-checks the underlying sequence @emph{once} when initiating the loop, -then repeatedly copies each value of the underlying sequence into the element -variable, executing the loop body once per copy. - -An example of a for loop: -@example -let v: [foo] = [a, b, c]; - -for e: foo in v @{ - bar(e); -@} -@end example - - -@node Ref.Expr.If -@subsection Ref.Expr.If -@c * Ref.Expr.If:: Expression for simple conditional branching. -@cindex If expression -@cindex Control-flow - -An @code{if} expression is a conditional branch in program control. The form of -an @code{if} expression is a condition expression, followed by a consequent -block, any number of @code{else if} conditions and blocks, and an optional -trailing @code{else} block. The condition expressions must have type -@code{bool}. If a condition expression evaluates to @code{true}, the -consequent block is executed and any subsequent @code{else if} or @code{else} -block is skipped. If a condition expression evaluates to @code{false}, the -consequent block is skipped and any subsequent @code{else if} condition is -evaluated. If all @code{if} and @code{else if} conditions evaluate to @code{false} -then any @code{else} block is executed. - -@node Ref.Expr.Alt -@subsection Ref.Expr.Alt -@c * Ref.Expr.Alt:: Expression for branching on pattern matches. -@cindex Pattern alt expression -@cindex Control-flow - -An @code{alt} expression branches on a @emph{pattern}. The exact form of -matching that occurs depends on the pattern. Patterns consist of some -combination of literals, destructured tag constructors, records and tuples, -variable binding specifications and placeholders (@code{_}). An @code{alt} -expression has a @emph{head expression}, which is the value to compare to the -patterns. The type of the patterns must equal the type of the head expression.
- -To execute an @code{alt} expression, first the head expression is evaluated, -then its value is sequentially compared to the patterns in the arms until a -match is found. The first arm with a matching pattern is chosen as the branch -target of the @code{alt}, any variables bound by the pattern are assigned to -local slots in the arm's block, and control enters the block. - -An example of an @code{alt} expression: - -@example -tag list @{ nil; cons(int, @@list); @} - -let x: list = cons(10, @@cons(11, @@nil)); - -alt x @{ - cons(a, @@cons(b, _)) @{ - process_pair(a,b); - @} - cons(10, _) @{ - process_ten(); - @} - nil @{ - ret; - @} - _ @{ - fail; - @} -@} -@end example - -Records can also be pattern-matched and their fields bound to variables. -When matching fields of a record, the fields being matched are specified -first, then a placeholder (@code{_}) represents the remaining fields. - -@example -fn main() @{ - let r = @{ - player: "ralph", - stats: load_stats(), - options: @{ - choose: true, - size: "small" - @} - @}; - - alt r @{ - @{options: @{choose: true, _@}, _@} @{ - choose_player(r) - @} - @{player: p, options: @{size: "small", _@}, _@} @{ - log(info, p + " is small"); - @} - _ @{ - next_player(); - @} - @} -@} -@end example - -Multiple alternative patterns may be joined with the @code{|} operator. A -range of values may be specified with @code{to}. For example: - -@example -let message = alt x @{ - 0 | 1 @{ "not many" @} - 2 to 9 @{ "a few" @} - _ @{ "lots" @} -@} -@end example - -Finally, @code{alt} patterns can accept @emph{pattern guards} to further refine the -criteria for matching a case. Pattern guards appear after the pattern and -consist of a bool-typed expression following the @code{if} keyword. A pattern -guard may refer to the variables bound within the pattern it follows.
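
The combination of alternative patterns, ranges and pattern guards can be sketched with the @code{match} expression of present-day Rust. The sketch assumes that later syntax (@code{match}, @code{..=}, @code{=>}) rather than the @code{alt} and @code{to} forms of this draft, and the function names @code{describe} and @code{classify} are hypothetical. As in the description above, arms are tried in order and the first arm whose pattern and guard both succeed is taken.

```rust
/// Alternatives, a range, and a guard over a plain integer.
fn describe(x: u32) -> &'static str {
    match x {
        0 | 1 => "not many",            // alternative patterns joined with `|`
        2..=9 => "a few",               // inclusive range of values
        n if n % 2 == 0 => "lots, and even", // guard refers to the bound variable `n`
        _ => "lots",                    // placeholder matches everything else
    }
}

/// A guard on a destructured constructor, with a fallback arm for the same
/// constructor when the guard fails.
fn classify(maybe_digit: Option<u32>) -> &'static str {
    match maybe_digit {
        Some(x) if x < 10 => "digit",
        Some(_) => "other",
        None => "nothing",
    }
}
```

Here @code{describe(12)} reaches the guarded arm because neither of the first two patterns matches, and @code{classify(Some(10))} falls through to the second arm because the guard on the first arm fails.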
- -@example -let message = alt maybe_digit @{ - some(x) if x < 10 @{ process_digit(x) @} - some(x) @{ process_other(x) @} -@} -@end example - -@node Ref.Expr.Prove -@subsection Ref.Expr.Prove -@c * Ref.Expr.Prove:: Expression for static assertion of typestate. -@cindex Prove expression -@cindex Typestate system - -A @code{prove} expression has no run-time effect. Its purpose is to statically -check (and document) that its argument constraint holds at its expression entry -point. If its argument typestate does not hold, under the typestate algorithm, -the program containing it will fail to compile. - -@node Ref.Expr.Check -@subsection Ref.Expr.Check -@c * Ref.Expr.Check:: Expression for dynamic assertion of typestate. -@cindex Check expression -@cindex Typestate system - -A @code{check} expression connects dynamic assertions made at run-time to the -static typestate system. A @code{check} expression takes a constraint to check -at run-time. If the constraint holds at run-time, control passes through the -@code{check} and on to the next expression in the enclosing block. If the -condition fails to hold at run-time, the @code{check} expression behaves as a -@code{fail} expression. - -The typestate algorithm is built around @code{check} expressions, and in -particular the fact that control @emph{will not pass} a check expression with a -condition that fails to hold. The typestate algorithm can therefore assume -that the (static) postcondition of a @code{check} expression includes the -checked constraint itself. From there, the typestate algorithm can perform -dataflow calculations on subsequent expressions, propagating conditions forward -and statically comparing implied states and their -specifications. @xref{Ref.Typestate}. - -@example -pure fn even(x: int) -> bool @{ - ret x & 1 == 0; -@} - -fn print_even(x: int) : even(x) @{ - print(x); -@} - -fn test() @{ - let y: int = 8; - - // Cannot call print_even(y) here. 
- - check even(y); - - // Can call print_even(y) here, since even(y) now holds. - print_even(y); -@} -@end example - -@node Ref.Expr.Claim -@subsection Ref.Expr.Claim -@c * Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate. -@cindex Claim expression -@cindex Typestate system - -A @code{claim} expression is an unsafe variant of a @code{check} expression -that is not actually checked at runtime. Thus, using a @code{claim} implies a -proof obligation to ensure, without compiler assistance, that an assertion -always holds. - -Setting a runtime flag can turn all @code{claim} expressions -into @code{check} expressions in a compiled Rust program, but the default is to not check the assertion -contained in a @code{claim}. The idea behind @code{claim} is that performance profiling might identify a -few bottlenecks in the code where actually checking a given callee's predicate -is too expensive; @code{claim} allows the code to typecheck without removing -the predicate check at every other call site. - -@node Ref.Expr.IfCheck -@subsection Ref.Expr.IfCheck -@c * Ref.Expr.IfCheck:: Expression for dynamic testing of typestate. -@cindex If check expression -@cindex Typestate system -@cindex Control-flow - -An @code{if check} expression combines an @code{if} expression and a @code{check} -expression in an indivisible unit that can be used to build more complex -conditional control-flow than the @code{check} expression affords. - -In fact, @code{if check} is a ``more primitive'' expression than @code{check}; -instances of the latter can be rewritten as instances of the former.
The -following two examples are equivalent: - -@sp 1 -Example using @code{check}: -@example -check even(x); -print_even(x); -@end example - -@sp 1 -Equivalent example using @code{if check}: -@example -if check even(x) @{ - print_even(x); -@} else @{ - fail; -@} -@end example - -@node Ref.Expr.Assert -@subsection Ref.Expr.Assert -@c * Ref.Expr.Assert:: Expression that halts the program if a boolean condition fails to hold. -@cindex Assertions - -An @code{assert} expression is similar to a @code{check} expression, except -the condition may be any boolean-typed expression, and the compiler makes no -use of the knowledge that the condition holds if the program continues to -execute after the @code{assert}. - -@node Ref.Expr.AnonObj -@subsection Ref.Expr.AnonObj -@c * Ref.Expr.AnonObj:: Expression that extends an object with additional methods. -@cindex Anonymous objects - -An @emph{anonymous object} expression extends an existing object with methods. - -@page -@node Ref.Run -@section Ref.Run -@c * Ref.Run:: Organization of runtime services. -@cindex Runtime library - -The Rust @dfn{runtime} is a relatively compact collection of C and Rust code -that provides fundamental services and datatypes to all Rust tasks at -run-time. It is smaller and simpler than many modern language runtimes. It is -tightly integrated into the language's execution model of memory, tasks, -communication and logging. - -@menu -* Ref.Run.Mem:: Runtime memory management service. -* Ref.Run.Type:: Runtime built-in type services. -* Ref.Run.Comm:: Runtime communication service. -* Ref.Run.Log:: Runtime logging system. -@end menu - -@node Ref.Run.Mem -@subsection Ref.Run.Mem -@c * Ref.Run.Mem:: Runtime memory management service. -@cindex Memory allocation - -The runtime memory-management system is based on a @emph{service-provider -interface}, through which the runtime requests blocks of memory from its -environment and releases them back to its environment when they are no longer -in use. 
The default implementation of the service-provider interface consists -of the C runtime functions @code{malloc} and @code{free}. - -The runtime memory-management system in turn supplies Rust tasks with -facilities for allocating, extending and releasing stacks, as well as -allocating and freeing boxed values. - -@node Ref.Run.Type -@subsection Ref.Run.Type -@c * Ref.Run.Mem:: Runtime built-in type services. -@cindex Built-in types - -The runtime provides C and Rust code to assist with various built-in types, -such as vectors, strings, and the low level communication system (ports, -channels, tasks). - -Support for other built-in types such as simple types, tuples, records, and -tags is open-coded by the Rust compiler. - -@node Ref.Run.Comm -@subsection Ref.Run.Comm -@c * Ref.Run.Comm:: Runtime communication service. -@cindex Communication -@cindex Process -@cindex Thread - -The runtime provides code to manage inter-task communication. This includes -the system of task-lifecycle state transitions depending on the contents of -queues, as well as code to copy values between queues and their recipients and -to serialize values for transmission over operating-system inter-process -communication facilities. - -@node Ref.Run.Log -@subsection Ref.Run.Log -@c * Ref.Run.Log:: Runtime logging system. -@cindex Logging - -The runtime contains a system for directing logging expressions to a logging -console and/or internal logging buffers. @xref{Ref.Expr.Log}. Logging -expressions can be enabled per module. - -Logging output is enabled by setting the @code{RUST_LOG} environment variable. -@code{RUST_LOG} accepts a logging specification that is a comma-separated list -of paths. For each module containing log expressions, if @code{RUST_LOG} -contains the path to that module or a parent of that module, then its logs -will be output to the console. 
The path to a module consists of the crate -name, any parent modules, then the module itself, all separated by double -colons (@code{::}). - -As an example, to see all the logs generated by the compiler, you would set -@code{RUST_LOG} to @code{rustc}, which is the crate name (as specified in its -@code{link} attribute). @xref{Ref.Comp.Crate}. To narrow down the logs to -just crate resolution, you would set it to @code{rustc::metadata::creader}. - -Note that when compiling either .rs or .rc files that don't specify a crate -name, the crate is given a default name that matches the source file, with the -extension removed. In that case, to turn on logging for a program compiled -from, e.g. @code{helloworld.rs}, @code{RUST_LOG} should be set to -@code{helloworld}. - -As a convenience, the logging spec can also be set to a special pseudo-crate, -@code{::help}. In this case, when the application starts, the runtime will -simply output a list of loaded modules containing log expressions, then exit. - -The Rust runtime itself generates logging information. The runtime's logs are -generated for a number of artificial modules in the @code{::rt} pseudo-crate, -and can be enabled just like the logs for any standard module. The full list -of runtime logging modules follows.
- -@itemize -@item @code{::rt::mem} Memory management -@item @code{::rt::comm} Messaging and task communication -@item @code{::rt::task} Task management -@item @code{::rt::dom} Task scheduling -@item @code{::rt::trace} Unused -@item @code{::rt::cache} Type descriptor cache -@item @code{::rt::upcall} Compiler-generated runtime calls -@item @code{::rt::timer} The scheduler timer -@item @code{::rt::gc} Garbage collection -@item @code{::rt::stdlib} Functions used directly by the standard library -@item @code{::rt::kern} The runtime kernel -@item @code{::rt::backtrace} Unused -@item @code{::rt::callback} Unused -@end itemize - -@c ############################################################ -@c end main body of nodes -@c ############################################################ - -@page -@node Index -@chapter Index - -@printindex cp - -@bye - -@c Local Variables: -@c mode: texinfo -@c fill-column: 78; -@c indent-tabs-mode: nil -@c buffer-file-coding-system: utf-8-unix -@c compile-command: "make -C $RBUILD -k 2>&1 | sed -e 's/\\/x\\//x:\\//g'"; -@c End: