`transmute` should also assume non-null pointers
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
optimize slice::ptr_rotate for small rotates
r? `@scottmcm`
This swaps the positions and numberings of algorithms 1 and 2 in `slice::ptr_rotate`, and pulls the entire outer loop into algorithm 3 since it was redundant for the first two. Effectively, `ptr_rotate` now always does the `memcpy`+`memmove`+`memcpy` sequence if the shifts fit into the stack buffer.
With this change, an `IndexMap`-style `move_index` function is optimized correctly.
Assembly comparisons:
- `move_index`, before: https://godbolt.org/z/Kr616KnYM
- `move_index`, after: https://godbolt.org/z/1aoov6j8h
- the code from `#89714`, before: https://godbolt.org/z/Y4zaPxEG6
- the code from `#89714`, after: https://godbolt.org/z/1dPx83axc
related to #89714
some relevant discussion in https://internals.rust-lang.org/t/idea-shift-move-to-efficiently-move-elements-in-a-vec/22184
Behavior tests pass locally. I can't get any consistent microbenchmark results on my machine, but the assembly diffs look promising.
* Renames the methods:
* `get_many_mut` -> `get_disjoint_mut`
* `get_many_unchecked_mut` -> `get_disjoint_unchecked_mut`
* Does not rename the feature flag: `get_many_mut`
* Marks the feature as stable
* Renames some helper stuff:
* `GetManyMutError` -> `GetDisjointMutError`
* `GetManyMutIndex` -> `GetDisjointMutIndex`
* `get_many_mut_helpers` -> `get_disjoint_mut_helpers`
* `get_many_check_valid` -> `get_disjoint_check_valid`
This only touches slice methods.
HashMap's methods and feature gates are not renamed here
(nor are they stabilized).
Improve `select_nth_unstable` documentation clarity
* Instead uses `before` and `after` variable names in the example
where `greater` and `lesser` are flipped.
* Uses `<=` and `>=` instead of "less than or equal to" and "greater
than or equal to" to make the docs more concise.
* General attempt to remove unnecessary words and be more precise. For
example it seems slightly wrong to say "its final sorted position",
since this implies there is only one sorted position for this element.
Improve prose around `as_slice` example of IterMut
I've removed the cryptic message about not being able to call `&mut self` methods while retaining a shared borrow of the iterator, such as `as_slice` produces. This is just normal borrowing rules and does not seem especially relevant here. I can whip up a replacement if someone thinks it has value.
* Instead uses `before` and `after` variable names in the example
where `greater` and `lesser` are flipped.
* Uses `<=` and `>=` instead of "less than or equal to" and "greater
than or equal to" to make the docs more concise.
* General attempt to remove unnecessary words and be more precise. For
example it seems slightly wrong to say "its final sorted position",
since this implies there is only one sorted position for this element.
I've removed the cryptic message about not being able to call `&mut self` methods while retaining a shared borrow of the iterator, such as `as_slice` produces. This is just normal borrowing rules and does not seem especially relevant here. I can whip up a replacement if someone thinks it has value.
Optimize `is_ascii` for `str` and `[u8]` further
Replace the existing optimized function with one that enables auto-vectorization.
This is especially beneficial on x86-64 as `pmovmskb` can be emitted with careful structuring of the code. The instruction can detect non-ASCII characters one vector register width at a time instead of the current `usize` at a time check.
The resulting implementation is completely safe.
`case00_libcore` is the current implementation, `case04_while_loop` is this PR.
```
benchmarks:
ascii::is_ascii_slice::long::case00_libcore 22.25/iter +/- 1.09
ascii::is_ascii_slice::long::case04_while_loop 6.78/iter +/- 0.92
ascii::is_ascii_slice::medium::case00_libcore 2.81/iter +/- 0.39
ascii::is_ascii_slice::medium::case04_while_loop 1.56/iter +/- 0.78
ascii::is_ascii_slice::short::case00_libcore 5.55/iter +/- 0.85
ascii::is_ascii_slice::short::case04_while_loop 3.75/iter +/- 0.22
ascii::is_ascii_slice::unaligned_both_long::case00_libcore 26.59/iter +/- 0.66
ascii::is_ascii_slice::unaligned_both_long::case04_while_loop 5.78/iter +/- 0.16
ascii::is_ascii_slice::unaligned_both_medium::case00_libcore 2.97/iter +/- 0.32
ascii::is_ascii_slice::unaligned_both_medium::case04_while_loop 2.41/iter +/- 0.10
ascii::is_ascii_slice::unaligned_head_long::case00_libcore 23.71/iter +/- 0.79
ascii::is_ascii_slice::unaligned_head_long::case04_while_loop 7.83/iter +/- 1.31
ascii::is_ascii_slice::unaligned_head_medium::case00_libcore 3.69/iter +/- 0.54
ascii::is_ascii_slice::unaligned_head_medium::case04_while_loop 7.05/iter +/- 0.32
ascii::is_ascii_slice::unaligned_tail_long::case00_libcore 24.44/iter +/- 1.41
ascii::is_ascii_slice::unaligned_tail_long::case04_while_loop 5.12/iter +/- 0.18
ascii::is_ascii_slice::unaligned_tail_medium::case00_libcore 3.24/iter +/- 0.40
ascii::is_ascii_slice::unaligned_tail_medium::case04_while_loop 2.86/iter +/- 0.14
```
`unaligned_head_medium` is the main regression in the benchmarks. It is a 32 byte string being sliced `bytes[1..]`.
The first commit can be used to run the benchmarks against the current core implementation.
Previous implementation was done in #74066
---
Two potential drawbacks of this implementation are that it increases instruction count and may regress other platforms/architectures. The benches here may also be too artificial to glean much insight from.
https://rust.godbolt.org/z/G9znGfY36
Improve prose around into_slice example of IterMut
Having a part without modification and one with seems redundant, since `into_slice` is only called for the part without. I have brought the modification into the remaining part, although it perhaps does not add much (or only distracts?).
Rename `elem_offset` to `element_offset`
Tracking issue: #126769
Renames `slice::elem_offset` to `slice::element_offset` and improves the documentation of it and its related methods.
The current documentation can be misinterpreted (as explained [here](https://github.com/rust-lang/rust/issues/126769#issuecomment-2453363897)).