Use step_unchecked more liberally in range iter impls
Without these `_unchecked`, these operations on iterators of `char` fail to optimize out the unreachable panicking condition on overflow.
cc @cuviper https://github.com/rayon-rs/rayon/pull/771 where this was discovered.
The `E` type parameter was unnecessary, so it's now removed. The folding
closure now has reduced parametricity on just `T = Self::Item`, rather
than the whole `Self` iterator type. There's otherwise no functional
change in this.
Resolve overflow behavior for RangeFrom
This specifies a documented unspecified implementation detail of `RangeFrom` and makes it consistently implement the specified behavior.
Specifically, `(u8::MAX).next()` is defined to cause an overflow, and resolve that overflow in the same manner as the `Step::forward` implementation.
The inconsistency that has existed is `<RangeFrom as Iterator>::nth`. The existing behavior should be plain to see after #69659: the skipping part previously always panicked if it caused an overflow, but the final step (to set up the state for further iteration) has always been debug-checked.
The inconsistency, then, is that `RangeFrom::nth` does not implement the same behavior as the naive (and default) implementation of just calling `next` multiple times. This PR aligns `RangeFrom::nth` to have identical behavior to the naive implementation. It also lines up with the standard behavior of primitive math in Rust everywhere else in the language: debug checked overflow.
cc @Amanieu
---
Followup to #69659. Closes#25708 (by documenting the panic as intended).
The documentation wording is preliminary and can probably be improved.
This will probably need an FCP, as it changes observable stable behavior.
Add Extend::{extend_one,extend_reserve}
This adds new optional methods on `Extend`: `extend_one` add a single
element to the collection, and `extend_reserve` pre-allocates space for
the predicted number of incoming elements. These are used in `Iterator`
for `partition` and `unzip` as they shuffle elements one-at-a-time into
their respective collections.
This adds new optional methods on `Extend`: `extend_one` add a single
element to the collection, and `extend_reserve` pre-allocates space for
the predicted number of incoming elements. These are used in `Iterator`
for `partition` and `unzip` as they shuffle elements one-at-a-time into
their respective collections.
impl Step for char (make Range*<char> iterable)
[[irlo thread]](https://internals.rust-lang.org/t/mini-rfc-make-range-char-work/12392?u=cad97) [[godbolt asm example]](https://rust.godbolt.org/z/fdveKo)
Add an implementation of the `Step` trait for `char`, which has the effect of making `RangeInclusive<char>` (and the other range types) iterable.
I've used the surrogate range magic numbers as magic numbers here rather than e.g. a `const SURROGATE_RANGE = 0xD800..0xE000` because these numbers appear to be used as magic numbers elsewhere and there doesn't exist constants for them yet. These files definitely aren't where surrogate range constants should live.
`ExactSizeIterator` is not implemented because `0x10FFFF` is bigger than fits in a `usize == u16`. However, given we already provide some `ExactSizeIterator` that are not correct on 16 bit targets, we might still want to consider providing it for `Range`[`Inclusive`]`<char>`, as it is definitely _very_ convenient. (At the very least, we want to make sure `.count()` doesn't bother iterating the range.)
The second commit in this PR changes a call to `Step::forward` to use `Step::forward_unchecked` in `RangeInclusive::next`. This is because without this patch, iteration over all codepoints (`'\0'..=char::MAX`) does not successfully optimize out the panicking branch. This was mentioned in the PR that updated `Step` to its current design, but was deemed not yet necessary as it did not impact codegen for integral types.
More of `Range*`'s implementations' calls to `Step` methods will probably want to see if they can use the `_unchecked` version as (if) we open up `Step` to being implemented on more types.
---
cc @rust-lang/libs, this is insta-stable and a fairly significant addition to `Range*`'s capabilities; this is the first instance of a noncontinuous domain being iterable with `Range` (or, well, anything other than primitive integers). I don't think this needs a full RFC, but it should definitely get some decent eyes on it.
Add Peekable::next_if
Prior art:
`rust_analyzer` uses [`Parser::eat`](50f4ae798b/crates/ra_parser/src/parser.rs (L94)), which is `next_if` specialized to `|y| self.next_if(|x| x == y)`.
Basically every other parser I've run into in Rust has an equivalent of `Parser::eat`; see for example
- [cranelift](94190d5724/cranelift/reader/src/parser.rs (L498))
- [rcc](a8159c3904/src/parse/mod.rs (L231))
- [crunch](8521874fab/crates/crunch-parser/src/parser/mod.rs (L213-L241))
Possible extensions: A specialization of `next_if` to using `Eq::eq`. The only difficulty here is the naming - maybe `next_if_eq`?
Alternatives:
- Instead of `func: impl FnOnce(&I::Item) -> bool`, use `func: impl FnOnce(I::Item) -> Option<I::Item>`. This has the advantage that `func` can move the value if necessary, but means that there is no guarantee `func` will return the same value it was given.
- Instead of `fn next_if(...) -> Option<I::Item>`, use `fn next_if(...) -> bool`. This makes the common case of `iter.next_if(f).is_some()` easier, but makes the unusual case impossible.
Bikeshedding on naming:
- `next_if` could be renamed to `consume_if` (to match `eat`, but a little more formally)
- `next_if_eq` could be renamed to `consume`. This is more concise but less self-explanatory if you haven't written a lot of parsers.
- Both of the above, but with `consume` replaced by `eat`.
Enables Range<char> to be iterable
Note: https://rust.godbolt.org/z/fdveKo
An iteration over all char ('\0'..=char::MAX)
includes unreachable panic code currently.
Updating RangeInclusive::next to call
Step::forward_unchecked (which is safe to do
but not done yet becuase it wasn't necessary)
successfully removes the panic from this iteration.
`fold` is currently implemented via `try_fold`, but implementing it
directly results in slightly less LLVM IR being generated, speeding up
compilation of some benchmarks.
(And likewise for `rfold`.)
The commit adds `fold` implementations to all the iterators that lack
one but do have a `try_fold` implementation. Most of these just call the
`try_fold` implementation directly.
Rework the std::iter::Step trait
Previous attempts: #43127#62886#68807
Tracking issue: #42168
This PR reworks the `Step` trait to be phrased in terms of the *successor* and *predecessor* operations. With this, `Step` hopefully has a consistent identity that can have a path towards stabilization. The proposed trait:
```rust
/// Objects that have a notion of *successor* and *predecessor* operations.
///
/// The *successor* operation moves towards values that compare greater.
/// The *predecessor* operation moves towards values that compare lesser.
///
/// # Safety
///
/// This trait is `unsafe` because its implementation must be correct for
/// the safety of `unsafe trait TrustedLen` implementations, and the results
/// of using this trait can otherwise be trusted by `unsafe` code to be correct
/// and fulful the listed obligations.
pub unsafe trait Step: Clone + PartialOrd + Sized {
/// Returns the number of *successor* steps required to get from `start` to `end`.
///
/// Returns `None` if the number of steps would overflow `usize`
/// (or is infinite, or if `end` would never be reached).
///
/// # Invariants
///
/// For any `a`, `b`, and `n`:
///
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::forward(&a, n) == Some(b)`
/// * `steps_between(&a, &b) == Some(n)` if and only if `Step::backward(&a, n) == Some(a)`
/// * `steps_between(&a, &b) == Some(n)` only if `a <= b`
/// * Corollary: `steps_between(&a, &b) == Some(0)` if and only if `a == b`
/// * Note that `a <= b` does _not_ imply `steps_between(&a, &b) != None`;
/// this is the case wheen it would require more than `usize::MAX` steps to get to `b`
/// * `steps_between(&a, &b) == None` if `a > b`
fn steps_between(start: &Self, end: &Self) -> Option<usize>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, m).and_then(|x| Step::forward_checked(x, n))`
///
/// For any `a`, `n`, and `m` where `n + m` does not overflow:
///
/// * `Step::forward_checked(a, n).and_then(|x| Step::forward_checked(x, m)) == Step::forward_checked(a, n + m)`
///
/// For any `a` and `n`:
///
/// * `Step::forward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::forward_checked(&x, 1))`
/// * Corollary: `Step::forward_checked(&a, 0) == Some(a)`
fn forward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::forward(Step::forward(a, n), m) == Step::forward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_checked(a, n) == Some(Step::forward(a, n))`
/// * `Step::forward(a, n) == (0..n).fold(a, |x, _| Step::forward(x, 1))`
/// * Corollary: `Step::forward(a, 0) == a`
/// * `Step::forward(a, n) >= a`
/// * `Step::backward(Step::forward(a, n), n) == a`
fn forward(start: Self, count: usize) -> Self {
Step::forward_checked(start, count).expect("overflow in `Step::forward`")
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `forward` or `forward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b > a`, it is safe to call `Step::forward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&a, &b) == Some(n)`,
/// it is safe to call `Step::forward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::forward_unchecked(a, n)` is equivalent to `Step::forward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn forward_unchecked(start: Self, count: usize) -> Self {
Step::forward(start, count)
}
/// Returns the value that would be obtained by taking the *successor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`, returns `None`.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`:
///
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == n.checked_add(m).and_then(|x| Step::backward_checked(a, x))`
/// * `Step::backward_checked(a, n).and_then(|x| Step::backward_checked(x, m)) == try { Step::backward_checked(a, n.checked_add(m)?) }`
///
/// For any `a` and `n`:
///
/// * `Step::backward_checked(a, n) == (0..n).try_fold(a, |x, _| Step::backward_checked(&x, 1))`
/// * Corollary: `Step::backward_checked(&a, 0) == Some(a)`
fn backward_checked(start: Self, count: usize) -> Option<Self>;
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// If this would overflow the range of values supported by `Self`,
/// this function is allowed to panic, wrap, or saturate.
/// The suggested behavior is to panic when debug assertions are enabled,
/// and to wrap or saturate otherwise.
///
/// Unsafe code should not rely on the correctness of behavior after overflow.
///
/// # Invariants
///
/// For any `a`, `n`, and `m`, where no overflow occurs:
///
/// * `Step::backward(Step::backward(a, n), m) == Step::backward(a, n + m)`
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_checked(a, n) == Some(Step::backward(a, n))`
/// * `Step::backward(a, n) == (0..n).fold(a, |x, _| Step::backward(x, 1))`
/// * Corollary: `Step::backward(a, 0) == a`
/// * `Step::backward(a, n) <= a`
/// * `Step::forward(Step::backward(a, n), n) == a`
fn backward(start: Self, count: usize) -> Self {
Step::backward_checked(start, count).expect("overflow in `Step::backward`")
}
/// Returns the value that would be obtained by taking the *predecessor*
/// of `self` `count` times.
///
/// # Safety
///
/// It is undefined behavior for this operation to overflow the
/// range of values supported by `Self`. If you cannot guarantee that this
/// will not overflow, use `backward` or `backward_checked` instead.
///
/// # Invariants
///
/// For any `a`:
///
/// * if there exists `b` such that `b < a`, it is safe to call `Step::backward_unchecked(a, 1)`
/// * if there exists `b`, `n` such that `steps_between(&b, &a) == Some(n)`,
/// it is safe to call `Step::backward_unchecked(a, m)` for any `m <= n`.
///
/// For any `a` and `n`, where no overflow occurs:
///
/// * `Step::backward_unchecked(a, n)` is equivalent to `Step::backward(a, n)`
#[unstable(feature = "unchecked_math", reason = "niche optimization path", issue = "none")]
unsafe fn backward_unchecked(start: Self, count: usize) -> Self {
Step::backward(start, count)
}
}
```
Note that all of these are associated functions and not callable via method syntax; the calling syntax is always `Step::forward(start, n)`. This version of the trait additionally changes the stepping functions to talk their arguments by value.
As opposed to previous attempts which provided a "step by one" method directly, this version of the trait only exposes "step by n". There are a few reasons for this:
- `Range*`, the primary consumer of `Step`, assumes that the "step by n" operation is cheap. If a single step function is provided, it will be a lot more enticing to implement "step by n" as n repeated calls to "step by one". While this is not strictly incorrect, this behavior would be surprising for anyone used to using `Range<{primitive integer}>`.
- With a trivial default impl, this can be easily added backwards-compatibly later.
- The debug-wrapping "step by n" needs to exist for `RangeFrom` to be consistent between "step by n" and "step by one" operation. (Note: the behavior is not changed by this PR, but making the behavior consistent is made tenable by this PR.)
Three "kinds" of step are provided: `_checked`, which returns an `Option` indicating attempted overflow; (unsuffixed), which provides "safe overflow" behavior (is allowed to panic, wrap, or saturate, depending on what is most convenient for a given type); and `_unchecked`, which is a version which assumes overflow does not happen.
Review is appreciated to check that:
- The invariants as described on the `Step` functions are enough to specify the "common sense" consistency for successor/predecessor.
- Implementation of `Step` functions is correct in the face of overflow and the edges of representable integers.
- Added tests of `Step` functions are asserting the correct behavior (and not just the implemented behavior).
Currently it uses `for x in self`, which seems dubious within an
iterator method. Furthermore, `self.next()` is used in all the other
iterator methods.
- Remove a type parameter from `[A]RcFromIter`.
- Remove an implementation of `[A]RcFromIter` that didn't actually
specialize anything.
- Remove unused implementation of `IsZero` for `Option<&mut T>`.
- Change specializations of `[A]RcEqIdent` to use a marker trait version
of `Eq`.
- Remove `BTreeClone`. I couldn't find a way to make this work with
`min_specialization`.
- Add `rustc_unsafe_specialization_marker` to `Copy` and `TrustedLen`.
Document unsafety in core::{panicking, alloc::layout, hint, iter::adapters::zip}
Helps with #66219.
r? @Mark-Simulacrum do you want to continue reading safety comments? :D