Make useless_ptr_null_checks smarter about some std functions
This teaches the `useless_ptr_null_checks` lint that some std functions can't ever return null pointers, because they need to point to valid data, get references as input, etc.
This is achieved by introducing an `#[rustc_never_returns_null_ptr]` attribute and adding it to these std functions (gated behind bootstrap `cfg_attr`).
Later on, the attribute could maybe be used to tell LLVM that the returned pointer is never null. I don't expect much impact of that though, as the functions are pretty shallow and usually the input data is already never null.
Follow-up of PR #113657Fixes#114442
slice::from_raw_parts: mention no-wrap-around condition
Cc https://github.com/rust-lang/rust/issues/83996. This probably needs to be mentioned in more places, so I am not closing that issue, but this here should help at least.
Update runtime guarantee for `select_nth_unstable`
#106933 changed the runtime guarantee for `select_nth_unstable` from O(n) to O(n log n), since the old guarantee wasn't actually met by the implementation at the time. Now with #107522, `select_nth_unstable` should be truly linear in runtime, so we can revert its runtime guarantee to O(n). Since #106933 was considered a bug fix, this will probably need an FCP because it counts as a new API guarantee.
r? `@Amanieu`
Add Median of Medians fallback to introselect
Fixes#102451.
This PR is a follow up to #106997. It adds a Fast Deterministic Selection implementation as a fallback to the introselect algorithm used by `select_nth_unstable`. This allows it to guarantee O(n) worst case running time, while maintaining good performance in all cases.
This would fix#102451, which was opened because the `select_nth_unstable` docs falsely claimed that it had O(n) worst case performance, even though it was actually quadratic in the worst case. #106997 improved the worst case complexity to O(n log n) by using heapsort as a fallback, and this PR further improves it to O(n) (this would also make #106933 unnecessary).
It also improves the actual runtime if the fallback gets called: Using a pathological input of size `1 << 19` (see the playground link in #102451), calculating the median is roughly 3x faster using fast deterministic selection as a fallback than it is using heapsort.
The downside to this is less code reuse between the sorting and selection algorithms, but I don't think it's that bad. The additional algorithms are ~250 LOC with no `unsafe` blocks (I tried using unsafe to avoid bounds checks but it didn't noticeably improve the performance).
I also let it fuzz for a while against the current `select_nth_unstable` implementation to ensure correctness, and it seems to still fulfill all the necessary postconditions.
cc `@scottmcm` who reviewed #106997
Use code with reliable branchless code-gen for slice::sort merge
The recent LLVM 16 update changes code-gen to be not branchless anymore, in the slice::sort implementation merge function. This improves performance by 30% for random patterns, restoring the performance to the state with LLVM 15.
Fixes#111559
The recent LLVM 16 update changes code-gen to be not branchless anymore, in the
slice::sort implementation merge function. This improves performance by 30% for
random patterns, restoring the performance to the state with LLVM 15.
Stabilize const slice::split_at
This stabilizes the use of the following method in const context:
```rust
impl<T> [T] {
pub const fn split_at(&self, mid: usize) -> (&[T], &[T]);
}
```
cc tracking issue #101158
Currently, slice iterators over ZSTs store `end = start.wrapping_byte_add(len)`.
That's slightly convenient for `is_empty`, but kinda annoying for pretty much everything else -- see bugs like 42789, for example.
This PR instead changes it to just `end = ptr::invalid(len)` instead.
That's easier to think about (IMHO, at least) as well as easier to represent.
Remove some `assume`s from slice iterators that don't do anything
Because the start pointer is iterators is already a `NonNull`, we emit the appropriate `!nonnull` metadata when loading the pointer to tell LLVM that it's non-null.
Probably the best way to see that it's the metadata that's important (and not the `assume`) is to observe that LLVM actually *removes* the `assume` from the optimized IR: <https://rust.godbolt.org/z/KhE6G963n>.
(I also checked that, yes, the if-not-ZST `assume` on `end` is still doing something: it's how there's a `!nonnull` metadata on its load, even though it's an ordinary raw pointer. The codegen test added in this PR fails if the other `assume` is removed.)
Always const-evaluate the GCD in `slice::align_to_offsets`
Use an inline `const`-block to force the compiler to calculate the GCD at compile time, even in debug mode. This shouldn't affect the behavior of the program at all, but it drastically cuts down on the number of instructions emitted with optimizations disabled.
With the current implementation, a single `slice::align_to` instantiation (specifically `<[u8]>::align_to::<u128>()`) generates 676 instructions (on x86-64). Forcing the GCD computation to be const cuts it down to 327 instructions, so just over 50% less. This is obviously not representative of actual runtime gains, but I still see it as a significant win as long as it doesn't degrade compile times.
Not having to worry about LLVM const-evaluating the GCD function also allows it to use the textbook recursive euclidean algorithm instead of a much more complicated iterative implementation with multiple `unsafe`-blocks.
Improve internal field comments on `slice::Iter(Mut)`
I wrote these in a previous PR that I ended up withdrawing, so might as well submit them separately.
`@bors` rollup=always
Spelling library
Split per https://github.com/rust-lang/rust/pull/110392
I can squash once people are happy w/ the changes. It's really uncommon for large sets of changes to be perfectly acceptable w/o at least some changes.
I probably won't have time to respond until tomorrow or the next day
Fix no_global_oom_handling build
`provide_sorted_batch` in core is incorrectly marked with `#[cfg(not(no_global_oom_handling))]` which prevents core from building with the cfg enabled.
Nothing in `core` allocates memory (including this function). The `cfg` gate is incorrect.
cc ``@dpaoliello``
r? ``@wesleywiser``
The cfg was added by #107191
More `IS_ZST` in `library`
I noticed that `post_inc_start` and `pre_dec_end` were doing this check in different ways
d19b64fb54/library/core/src/slice/iter/macros.rs (L76-L93)
so started making this PR, then added a few more I found since I was already making changes anyway.
I noticed that `post_inc_start` and `pre_dec_end` were doing this check in different ways
d19b64fb54/library/core/src/slice/iter/macros.rs (L76-L93)
so started making this PR, then added a few more I found since I was already making changes anyway.
`provide_sorted_batch` in core is incorrectly marked with
`#[cfg(not(no_global_oom_handling))]` which prevents core
from building with the cfg enabled.
Nothing in core allocates memory including this function, so
the `cfg` gate is incorrect.