Mark libserialize functions as inline
Got to thinking: "what if that big pile of tiny functions isn't inlining as it should?"
So a few `replace-regex` later the local perf run says this:
<details>

</details>
Not huge, but still a win, which is interesting. Want to verify with the real perf run, but I understand there's a backlog.
I didn't notice any increase in compile time or binary sizes for rustc/libs.
Two small improvements
In `librustc_apfloat/ieee.rs`, use the iterator.[r]find methods to simplify the code. In `libserialize/json.rs`, make use of the fact that `Vec.last` on an empty `Vec` returns `None` to simplify the code to a single match.
Documented impl From on line 367 of libserialize/json.rs
This is for the impl From mentioned in #51430 assigned to @skade .
Hopefully I didn't miss anything/get anything wrong. I looked over another PR for another part of this same issue to see what the proper formatting was, etc.
Thanks!
In `librustc_apfloat/ieee.rs`, use the iterator.[r]find methods to
simplify the code. In `libserialize/json.rs`, make use of the fact
that `Vec.last` on an empty `Vec` returns `None` to simplify the
code to a single match.
A few cleanups
- change `skip(1).next()` to `nth(1)`
- collapse some `if-else` expressions
- remove a few explicit `return`s
- remove an unnecessary field name
- dereference once instead of matching on multiple references
- prefer `iter().enumerate()` to indexing with `for`
- remove some unnecessary lifetime annotations
- use `writeln!()` instead of `write!()`+`\n`
- remove redundant parentheses
- shorten some enum variant names
- a few other cleanups suggested by `clippy`
Replace push loops with extend() where possible
Or set the vector capacity where I couldn't do it.
According to my [simple benchmark](https://gist.github.com/ljedrz/568e97621b749849684c1da71c27dceb) `extend`ing a vector can be over **10 times** faster than `push`ing to it in a loop:
10 elements (6.1 times faster):
```
test bench_extension ... bench: 75 ns/iter (+/- 23)
test bench_push_loop ... bench: 458 ns/iter (+/- 142)
```
100 elements (11.12 times faster):
```
test bench_extension ... bench: 87 ns/iter (+/- 26)
test bench_push_loop ... bench: 968 ns/iter (+/- 3,528)
```
1000 elements (11.04 times faster):
```
test bench_extension ... bench: 311 ns/iter (+/- 9)
test bench_push_loop ... bench: 3,436 ns/iter (+/- 233)
```
Seems like a good idea to use `extend` as much as possible.
Speed up leb128 encoding and decoding for unsigned values.
Make the implementation for some leb128 functions potentially faster.
@Mark-Simulacrum, could you please trigger a perf.rlo run?
This saves the storage space used by about 32 bits per `Fingerprint`.
On average, this reduces the size of the `/target/{mode}/incremental`
folder by roughly 5%.
Fixes#45875
Replaced by adding extra imports, adding hidden code (`# ...`), modifying
examples to be runnable (sorry Homura), specifying non-Rust code, and
converting to should_panic, no_run, or compile_fail.
Remaining "```ignore"s received an explanation why they are being ignored.
This commit updates the version number to 1.17.0 as we're not on that version of
the nightly compiler, and at the same time this updates src/stage0.txt to
bootstrap from freshly minted beta compiler and beta Cargo.