This PR cuts down on a large number of `#[inline(always)]` and `#[inline]`
annotations in libcore for various core functions. The `#[inline(always)]`
annotation is almost never needed and is detrimental to debug build times as it
forces LLVM to perform inlining when it otherwise wouldn't need to in debug
builds. Additionally `#[inline]` is an unnecessary annoation on almost all
generic functions because the function will already be monomorphized into other
codegen units and otherwise rarely needs the extra "help" from us to tell LLVM
to inline something.
Overall this PR cut the compile time of a [microbenchmark][1] by 30% from 1s to
0.7s.
[1]: https://gist.github.com/alexcrichton/a7d70319a45aa60cf36a6a7bf540dd3a
Exposes the swapping logic from PR 40454 as `pub unsafe fn ptr::swap_nonoverlapping` under feature swap_nonoverlapping
This is most helpful for compound types where LLVM didn't vectorize the loop. Highlight: bench slice::rotate_medium_by727_strings gets 38% faster.
Add an in-place rotate method for slices to libcore
A helpful primitive for moving chunks of data around inside a slice.
For example, if you have a range selected and are drag-and-dropping it somewhere else (Example from [Sean Parent's talk](https://youtu.be/qH6sSOr-yk8?t=560)).
(If this should be an RFC instead of a PR, please let me know.)
Edit: changed example
Make RangeInclusive just a two-field struct
Not being an enum improves ergonomics and consistency, especially since NonEmpty variant wasn't prevented from being empty. It can still be iterable without an extra "done" bit by making the range have !(start <= end), which is even possible without changing the Step trait.
Implements merged https://github.com/rust-lang/rfcs/pull/1980; tracking issue https://github.com/rust-lang/rust/issues/28237.
This is definitely a breaking change to anything consuming `RangeInclusive` directly (not as an Iterator) or constructing it without using the sugar. Is there some change that would make sense before this so compilation failures could be compatibly fixed ahead of time?
r? @aturon (as FCP proposer on the RFC)
A helpful primitive for moving chunks of data around inside a slice.
In particular, adding elements to the end of a Vec then moving them
somewhere else, as a way to do efficient multiple-insert. (There's
drain for efficient block-remove, but no easy way to block-insert.)
Talk with another example: <https://youtu.be/qH6sSOr-yk8?t=560>
Not being an enum improves ergonomics, especially since NonEmpty could be Empty. It can still be iterable without an extra "done" bit by making the range have !(start <= end), which is even possible without changing the Step trait.
Implements RFC 1980
Since LLVM doesn't vectorize the loop for us, do unaligned reads
of a larger type and use LLVM's bswap intrinsic to do the
reversing of the actual bytes. cfg!-restricted to x86 and
x86_64, as I assume it wouldn't help on things like ARMv5.
Also makes [u16]::reverse() a more modest 1.5x faster by
loading/storing u32 and swapping the u16s with ROT16.
Thank you ptr::*_unaligned for making this easy :)
Add ptr::offset_to
This PR adds a method to calculate the signed distance (in number of elements) between two pointers. The resulting value can then be passed to `offset` to get one pointer from the other. This is similar to pointer subtraction in C/C++.
There are 2 special cases:
- If the distance is not a multiple of the element size then the result is rounded towards zero. (in C/C++ this is UB)
- ZST return `None`, while normal types return `Some(isize)`. This forces the user to handle the ZST case in unsafe code. (C/C++ doesn't have ZSTs)