Introduce private function from_raw_parts_mut for str to factor out the logic.
We want to use raw pointers here instead of duplicating a &mut str, to
be on safer ground w.r.t rust aliasing rules.
This commit stabilizes and deprecates the FCP (final comment period) APIs for
the upcoming 1.7 beta release. The specific APIs which changed were:
Stabilized
* `Path::strip_prefix` (renamed from `relative_from`)
* `path::StripPrefixError` (new error type returned from `strip_prefix`)
* `Ipv4Addr::is_loopback`
* `Ipv4Addr::is_private`
* `Ipv4Addr::is_link_local`
* `Ipv4Addr::is_multicast`
* `Ipv4Addr::is_broadcast`
* `Ipv4Addr::is_documentation`
* `Ipv6Addr::is_unspecified`
* `Ipv6Addr::is_loopback`
* `Ipv6Addr::is_unique_local`
* `Ipv6Addr::is_multicast`
* `Vec::as_slice`
* `Vec::as_mut_slice`
* `String::as_str`
* `String::as_mut_str`
* `<[T]>::clone_from_slice` - the `usize` return value is removed
* `<[T]>::sort_by_key`
* `i32::checked_rem` (and other signed types)
* `i32::checked_neg` (and other signed types)
* `i32::checked_shl` (and other signed types)
* `i32::checked_shr` (and other signed types)
* `i32::saturating_mul` (and other signed types)
* `i32::overflowing_add` (and other signed types)
* `i32::overflowing_sub` (and other signed types)
* `i32::overflowing_mul` (and other signed types)
* `i32::overflowing_div` (and other signed types)
* `i32::overflowing_rem` (and other signed types)
* `i32::overflowing_neg` (and other signed types)
* `i32::overflowing_shl` (and other signed types)
* `i32::overflowing_shr` (and other signed types)
* `u32::checked_rem` (and other unsigned types)
* `u32::checked_shl` (and other unsigned types)
* `u32::saturating_mul` (and other unsigned types)
* `u32::overflowing_add` (and other unsigned types)
* `u32::overflowing_sub` (and other unsigned types)
* `u32::overflowing_mul` (and other unsigned types)
* `u32::overflowing_div` (and other unsigned types)
* `u32::overflowing_rem` (and other unsigned types)
* `u32::overflowing_neg` (and other unsigned types)
* `u32::overflowing_shl` (and other unsigned types)
* `u32::overflowing_shr` (and other unsigned types)
* `ffi::IntoStringError`
* `CString::into_string`
* `CString::into_bytes`
* `CString::into_bytes_with_nul`
* `From<CString> for Vec<u8>`
* `From<CString> for Vec<u8>`
* `IntoStringError::into_cstring`
* `IntoStringError::utf8_error`
* `Error for IntoStringError`
Deprecated
* `Path::relative_from` - renamed to `strip_prefix`
* `Path::prefix` - use `components().next()` instead
* `os::unix::fs` constants - moved to the `libc` crate
* `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize
* `IntoCow` - conflicts with `Into` and may come back later
* `i32::{BITS, BYTES}` (and other integers) - not pulling their weight
* `DebugTuple::formatter` - will be removed
* `sync::Semaphore` - not used enough and confused with system semaphores
Closes#23284
cc #27709 (still lots more methods though)
Closes#27712Closes#27722Closes#27728Closes#27735Closes#27729Closes#27755Closes#27782Closes#27798
This is a bit weird since unsized types can't be used in trait objects,
but Any is *also* used as pure marker trait since Reflect isn't stable.
There are many cases (e.g. TypeMap) where all you need is a TypeId.
r? @aturon
This commit stabilizes and deprecates the FCP (final comment period) APIs for
the upcoming 1.7 beta release. The specific APIs which changed were:
Stabilized
* `Path::strip_prefix` (renamed from `relative_from`)
* `path::StripPrefixError` (new error type returned from `strip_prefix`)
* `Ipv4Addr::is_loopback`
* `Ipv4Addr::is_private`
* `Ipv4Addr::is_link_local`
* `Ipv4Addr::is_multicast`
* `Ipv4Addr::is_broadcast`
* `Ipv4Addr::is_documentation`
* `Ipv6Addr::is_unspecified`
* `Ipv6Addr::is_loopback`
* `Ipv6Addr::is_unique_local`
* `Ipv6Addr::is_multicast`
* `Vec::as_slice`
* `Vec::as_mut_slice`
* `String::as_str`
* `String::as_mut_str`
* `<[T]>::clone_from_slice` - the `usize` return value is removed
* `<[T]>::sort_by_key`
* `i32::checked_rem` (and other signed types)
* `i32::checked_neg` (and other signed types)
* `i32::checked_shl` (and other signed types)
* `i32::checked_shr` (and other signed types)
* `i32::saturating_mul` (and other signed types)
* `i32::overflowing_add` (and other signed types)
* `i32::overflowing_sub` (and other signed types)
* `i32::overflowing_mul` (and other signed types)
* `i32::overflowing_div` (and other signed types)
* `i32::overflowing_rem` (and other signed types)
* `i32::overflowing_neg` (and other signed types)
* `i32::overflowing_shl` (and other signed types)
* `i32::overflowing_shr` (and other signed types)
* `u32::checked_rem` (and other unsigned types)
* `u32::checked_neg` (and other unsigned types)
* `u32::checked_shl` (and other unsigned types)
* `u32::saturating_mul` (and other unsigned types)
* `u32::overflowing_add` (and other unsigned types)
* `u32::overflowing_sub` (and other unsigned types)
* `u32::overflowing_mul` (and other unsigned types)
* `u32::overflowing_div` (and other unsigned types)
* `u32::overflowing_rem` (and other unsigned types)
* `u32::overflowing_neg` (and other unsigned types)
* `u32::overflowing_shl` (and other unsigned types)
* `u32::overflowing_shr` (and other unsigned types)
* `ffi::IntoStringError`
* `CString::into_string`
* `CString::into_bytes`
* `CString::into_bytes_with_nul`
* `From<CString> for Vec<u8>`
* `From<CString> for Vec<u8>`
* `IntoStringError::into_cstring`
* `IntoStringError::utf8_error`
* `Error for IntoStringError`
Deprecated
* `Path::relative_from` - renamed to `strip_prefix`
* `Path::prefix` - use `components().next()` instead
* `os::unix::fs` constants - moved to the `libc` crate
* `fmt::{radix, Radix, RadixFmt}` - not used enough to stabilize
* `IntoCow` - conflicts with `Into` and may come back later
* `i32::{BITS, BYTES}` (and other integers) - not pulling their weight
* `DebugTuple::formatter` - will be removed
* `sync::Semaphore` - not used enough and confused with system semaphores
Closes#23284
cc #27709 (still lots more methods though)
Closes#27712Closes#27722Closes#27728Closes#27735Closes#27729Closes#27755Closes#27782Closes#27798
Add fast path for ASCII in UTF-8 validation
This speeds up the ASCII case (and long stretches of ASCII in otherwise
mixed UTF-8 data) when checking UTF-8 validity.
Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14 (smallish input).
On XML and mostly English language input (en.wikipedia XML dump),
throughput improves by a factor 7 (large input).
On mostly non-ASCII input, performance increases slightly or is the
same.
The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, bounds
checks are statically elided by LLVM and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.
A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.
Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.
Old code is `regular`, this PR is called `fast`.
Datasets:
- `ascii` is just ASCII (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia XML dump
- `enwik8` is 100MB of an en.wikipedia XML dump
- `jawik10` is 10MB of a ja.wikipedia XML dump
```
test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular ... bench: 10,944 ns/iter (+/- 795) = 468 MB/s
test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```
Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
This is a bit weird since unsized types can't be used in trait objects,
but Any is *also* used as pure marker trait since Reflect isn't stable.
There are many cases (e.g. TypeMap) where all you need is a TypeId.
This commit migrates all of the methods on `num::wrapping::OverflowingOps` onto
inherent methods of the integer types. This also fills out some missing gaps in
the saturating and checked departments such as:
* `saturating_mul`
* `checked_{neg,rem,shl,shr}`
This is done in preparation for stabilization,
cc #27755
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.
Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.
[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.
Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.
[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
This speeds up the ascii case (and long stretches of ascii in otherwise
mixed UTF-8 data) when checking UTF-8 validity.
Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14!
On xml and mostly english language input (en.wikipedia xml dump),
throughput increases by a factor 7.
On mostly non-ASCII input, performance increases slightly or is the
same.
The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, they
are statically elided by llvm and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.
A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.
Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.
Old code is `regular`, this PR is called `fast`.
Datasets:
- `ascii` is just ascii (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia xml dump
- `enwik10` is 100MB of an en.wikipedia xml dump
- `jawik10` is 10MB of a ja.wikipedia xml dump
```
test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular ... bench: 12,250 ns/iter (+/- 437) = 418 MB/s
test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```
Co-authored-by: Björn Steinbrink <bsteinbr@gmail.com>
This commit migrates all of the methods on `num::wrapping::OverflowingOps` onto
inherent methods of the integer types. This also fills out some missing gaps in
the saturating and checked departments such as:
* `saturating_mul`
* `checked_{neg,rem,shl,shr}`
This is done in preparation for stabilization,
cc #27755
It was recently realized that we accept defaulted type parameters everywhere, without feature gate, even though the only place that we really *intended* to accept them were on types. This PR adds a lint warning unless the "type-parameter-defaults" feature is enabled. This should eventually become a hard error.
This is a [breaking-change] in that new feature gates are required (or simply removing the defaults, which is probably a better choice as they have little effect at this time). Results of a [crater run][crater] suggest that approximately 5-15 crates are affected. I didn't do the measurement quite right so that run cannot distinguish "true" regressions from "non-root" regressions, but even the upper bound of 15 affected crates seems relatively minimal.
[crater]: https://gist.github.com/nikomatsakis/760c6a67698bd24253bf
cc @rust-lang/lang
r? @pnkfelix
Make `".".parse::<f32>()` and `".".parse::<f64>()` return Err
This fixes#30344.
This is a [breaking-change], which the libs team have classified as a
bug fix.
This makes both of the following return Err:
".".parse::<f32>()
".".parse::<f64>()
This is a [breaking-change], which the libs team have classified as a
bug fix.
The current explanation for scan() is not very clear as to how it works, especially when it compares itself to fold().
I believe these changes makes it all a bit more clear for the reader, and makes it easier to understand the example code.
r? @steveklabnik