Optimize Ord trait implementation for bool
Casting the booleans to `i8`s and converting their difference into `Ordering` generates better assembly than casting them to `u8`s and comparing them.
Fixes#66780
#### Comparison([Godbolt link](https://rust.godbolt.org/z/PjBpvF))
##### Old assembly:
```asm
example::boolean_cmp:
mov ecx, edi
xor ecx, esi
test esi, esi
mov eax, 255
cmove eax, ecx
test edi, edi
cmovne eax, ecx
ret
```
##### New assembly:
```asm
example::boolean_cmp:
mov eax, edi
sub al, sil
ret
```
##### Old LLVM-MCA statistics:
```
Iterations: 100
Instructions: 800
Total Cycles: 234
Total uOps: 1000
Dispatch Width: 6
uOps Per Cycle: 4.27
IPC: 3.42
Block RThroughput: 1.7
```
##### New LLVM-MCA statistics:
```
Iterations: 100
Instructions: 300
Total Cycles: 110
Total uOps: 500
Dispatch Width: 6
uOps Per Cycle: 4.55
IPC: 2.73
Block RThroughput: 1.0
```
libstd miri tests: avoid warnings
Ignore tests in a way that all the code still gets compiled, to get rid of all the "unused" warnings that otherwise show up when running the test suite in Miri.
These three lines are from c82da7a54b in
2015.
They cause problems when applying rustfmt to the codebase, because
reordering wildcard imports can trigger new unused import warnings.
As a minimized example, the following program compiles successfully:
#![deny(unused_imports)]
use std::fmt::Debug;
use std::marker::Send;
pub mod repro {
use std::prelude::v1::*;
use super::*;
pub type D = dyn Debug;
pub type S = dyn Send;
}
pub type S = dyn Send;
but putting it through rustfmt produces a program that fails to compile:
#![deny(unused_imports)]
use std::fmt::Debug;
use std::marker::Send;
pub mod repro {
use super::*;
use std::prelude::v1::*;
pub type D = dyn Debug;
pub type S = dyn Send;
}
pub type S = dyn Send;
The error is:
error: unused import: `std::prelude::v1::*`
--> src/main.rs:8:9
|
8 | use std::prelude::v1::*;
| ^^^^^^^^^^^^^^^^^^^
Add `cmp::{min_by, min_by_key, max_by, max_by_key}`
This adds the following functions to `core::cmp`:
- `min_by`
- `min_by_key`
- `max_by`
- `max_by_key`
`min_by` and `max_by` are somewhat trivial to implement, but not entirely because `min_by` returns the first value in case the two are equal (and `max_by` the second). `min` and `max` can be implemented in terms of `min_by` and `max_by`, but not as easily the other way around.
To give an example of why I think these functions could be useful: the `Iterator::{min_by, min_by_key, max_by, max_by_key}` methods all currently hard-code the behavior mentioned above which is an ever so small duplication of logic. If we delegate them to `cmp::{min_by, max_by}` methods instead, we get the correct behavior for free. (edit: this is now included in the PR)
I added `min_by_key` / `max_by_key` for consistency's sake but I wouldn't mind removing them. I don't have a particular use case in mind for them, and `min_by` / `max_by` seem to be more useful.
Tracking issue: #64460
Override `StepBy::{try_fold, try_rfold}`
Previous PR: https://github.com/rust-lang/rust/pull/51435
The previous PR was closed in favor of https://github.com/rust-lang/rust/pull/51601, which was later reverted. I don't think these implementations will make it harder to specialize `StepBy<Range<_>>` later, so we should be able to land this without any consequences.
This should fix https://github.com/rust-lang/rust/issues/57517 – in my benchmarks `iter` and `iter.step_by(1)` now perform equally well, provided internal iteration is used.
Add Iterator comparison methods that take a comparison function
This PR adds `Iterator::{cmp_by, partial_cmp_by, eq_by, ne_by, lt_by, le_by, gt_by, ge_by}`. We already have `Iterator::{cmp, partial_cmp, ...}` which are less general (but not any simpler) than the ones I'm proposing here.
I'm submitting this PR now because #61505 has been merged, so this change should not have a noticeable effect on the `Iterator` docs page size.
The diff is quite messy, here's what I changed:
- The logic of `cmp` / `partial_cmp` / `eq` is moved to `cmp_by` / `partial_cmp_by` / `eq_by` respectively, changing `x.cmp(&y)` to `cmp(&x, &y)` in the `cmp` method where `cmp` is the given comparison function (and similar for `partial_cmp_by` and `eq_by`).
- `ne_by` / `lt_by` / `le_by` / `gt_by` / `ge_by` are each implemented in terms of one of the three methods above.
- The existing comparison methods are each forwarded to their `_by` counterpart, passing one of `Ord::cmp` / `PartialOrd::partial_cmp` / `PartialEq::eq` as the comparison function.
The corresponding `_by_key` methods aren't included because they're not as fundamental as the `_by` methods and can easily be implemented in terms of them. Is that reasonable, or would adding the `_by_key` methods be desirable for the sake of completeness?
I didn't add any tests – I couldn't think of any that weren't already covered by our existing tests. Let me know if there's a particular test that would be useful to add.
Because of a compiler bug that adding `Self: ExactSizeIterator` makes
the compiler forget `Self::Item` is `<I as Iterator>::Item`, we remove
this specialization for now.
Fix bug in iter::Chain::size_hint
`Chain::size_hint` currently ignores `self.state`, which means that the size hints of the underlying iterators are always combined regardless of the iteration state. This, of course, should only happen when the state is `ChainState::Both`.
Override Cycle::try_fold
It's not very pretty, but I believe this is the simplest way to correctly implement `Cycle::try_fold`. The following may seem correct:
```rust
loop {
acc = self.iter.try_fold(acc, &mut f)?;
self.iter = self.orig.clone();
}
```
...but this loops infinitely in case `self.orig` is empty, as opposed to returning `acc`. So we first have to fully iterate `self.orig` to check whether it is empty or not, and before _that_, we have to iterate the remaining elements of `self.iter`.
This should always call `self.orig.clone()` the same amount of times as repeated `next()` calls would.
r? @scottmcm
enable flt2dec tests in Miri
With ldexp implemented (thanks to @christianpoveda), we can finally enable these tests in Miri. Well, most of them -- some are just too slow.
Improve `ptr_rotate` performance, tests, and benches
The corresponding issue is #61784. I am not actually sure if miri can handle the test, but I can change the commit if necessary.
bump rand in libcore/liballoc test suites
This pulls in the fix for https://github.com/rust-random/rand/issues/779, which trips Miri when running these test suites.
`SmallRng` (formerly used by libcore) is no longer built by default, it needs a feature gate. I opted to switch to `StdRng` instead. Or should I enable the feature gate?
Use internal iteration in the Sum and Product impls of Result and Option
This PR adds internal iteration to the `ResultShunt` iterator type underlying the `Sum` and `Product` impls of `Result`. I had to change `ResultShunt` to hold a mutable reference to an error instead, similar to `itertools::ProcessResults`, in order to be able to pass the `ResultShunt` itself by value (which is necessary for internal iteration).
`ResultShunt::process` can unfortunately no longer be an associated function because that would make it generic over the lifetime of the error reference, which wouldn't work, so I turned it into the free function `process_results`.
I removed the `OptionShunt` type and forwarded the `Sum` and `Product` impls of `Option` to their respective impls of `Result` instead, to avoid having to repeat the internal iteration logic.
fix UB in a test
We used to compare two mutable references that were supposed to point to the same thing. That's no good.
Compare them as raw pointers instead.
Implement DoubleEndedIterator for iter::{StepBy, Peekable, Take}
Now that `DoubleEndedIterator::nth_back` has landed, `StepBy` and `Take` can have an efficient `DoubleEndedIterator` implementation. I don't know if there was any particular reason for `Peekable` not having a `DoubleEndedIterator` implementation, but it's quite trivial and I don't see any drawbacks to having it.
I'm not very happy about the implementation of `Peekable::try_rfold`, but I didn't see another way to only take the value out of `self.peeked` in case `self.iter.try_rfold` didn't exit early.
I only added `Peekable::rfold` (in addition to `try_rfold`) because its `Iterator` implementation has both `fold` and `try_fold` (and for similar reasons I added `Take::try_rfold` but not `Take::rfold`). Do we have any guidelines on whether we want both? If we do want both, maybe we should investigate which iterator adaptors override `try_fold` but not `fold` and add the missing implementations. At the moment I think that it's better to always have iterator adaptors implement both, because some iterators have a simpler `fold` implementation than their `try_fold` implementation.
The tests that I added may not be sufficient because they're all just existing tests where `next`/`nth`/`fold`/`try_fold` are replaced by their DEI counterparts, but I do think all paths are covered. Is there anything in particular that I should probably also test?
squash of all commits for nth_back on ChunksMut
wip nth_back for chunks_mut
working chunksmut
fixed nth_back for chunksmut
Signed-off-by: wizAmit <amitforfriends_dns@yahoo.com>
r? @timvermeulen
r? @scottmcm