old design the TLS held the scheduler struct, and the scheduler struct
held the active task. This posed all sorts of weird problems due to
how we wanted to use the contents of TLS. The cleaner approach is to
leave the active task in TLS and have the task hold the scheduler. To
make this work out the scheduler has to run inside a regular task, and
then once that is the case the context switching code is massively
simplified, as instead of three possible paths there is only one. The
logical flow is also easier to follow, as the scheduler struct acts
somewhat like a "token" indicating what is active.
These changes also necessitated changing a large number of runtime
tests, and rewriting most of the runtime testing helpers.
Polish level is "low", as I will very soon start on more scheduler
changes that will require wiping the polish off. That being said there
should be sufficient comments around anything complex to make this
entirely respectable as a standalone commit.
Change the former repetition::
for 5.times { }
to::
do 5.times { }
.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
Change all users of old-style for with internal iterators to using
`do`-loops.
The code in stackwalk.rs does not actually implement the
looping protocol (no break on return false).
The code in gc.rs does not use loop breaks, nor does any code using it.
We remove the capacity to break from the loops in std::gc and implement
the walks using `do { .. }` expressions.
No behavior change.
.intersection(), .union() etc methods in trait std::container::Set use
internal iters. Remove these methods from the trait.
I reported issue #8154 for the reinstatement of iterator-based set algebra
methods to the Set trait.
For bitv and treemap, that lack Iterator implementations of set
operations, preserve them as methods directly on the types themselves.
For HashSet, these methods are replaced by the present .union_iter()
etc.
This removes a bunch of options from the task builder interface that are irrelevant to the new scheduler and were generally unused anyway. It also bumps the stack size of new scheduler tasks so that there's enough room to run rustc and changes the interface to `Thread` to not implicitly join threads on destruction, but instead require an explicit, and mandatory, call to `join`.
Main logic in ```Implement select() for new runtime pipes.```. The guts of the ```PortOne::try_recv()``` implementation are now split up across several functions, ```optimistic_check```, ```block_on```, and ```recv_ready```.
There is one weird FIXME I left open here, in the "implement select" commit -- an assertion I couldn't get to work in the receive path, on an invariant that for some reason doesn't hold with ```SharedPort```. Still investigating this.
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.
An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.
Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.
Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.
The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.