Add ExactChunks::remainder and ExactChunks::into_remainder
These allow to get the leftover items of the slice that are not being
iterated as part of the iterator due to not filling a complete chunk.
The mutable version consumes the slice because otherwise we would either
a) have to borrow the iterator instead of taking the lifetime of
the underlying slice, which is not what *any* of the other iterator
functions is doing, or
b) would allow returning multiple mutable references to the same data
The current behaviour of consuming the iterator is consistent with
IterMut::into_slice for the normal iterator.
----
This is related to https://github.com/rust-lang/rust/issues/47115#issuecomment-392685177 and the following comments.
While there the discussion was first about a way to get the "tail" of the iterator (everything from the slice that is still not iterated yet), this gives kind of unintuitive behaviour and is inconsistent with how the other slice iterators work.
Unintuitive because the `next_back` would have no effect on the tail (or otherwise the tail could not include the remainder items), inconsistent because a) generally the idea of the slice iterators seems to be to only ever return items that were not iterated yet (and don't provide a way to access the same item twice) and b) we would return a "flat" `&[T]` slice but the iterator's shape is `&[[T]]` instead, c) the mutable variant would have to borrow from the iterator instead of the underlying slice (all other iterator functions borrow from the underlying slice!)
As such, I've only implemented functions to get the remainder. This also allows the implementation to be completely safe still (and around slices instead of raw pointers), while getting the tail would either be inefficient or would have to be implemented around raw pointers.
CC @kerollmops
Implement always-fallible TryFrom for usize/isize conversions that are infallible on some platforms
This reverts commit 837d6c7023 "Remove TryFrom impls that might become conditionally-infallible with a portability lint".
This fixes#49415 by adding (restoring) missing `TryFrom` impls for integer conversions to or from `usize` or `isize`, by making them always fallible at the type system level (that is, with `Error=TryFromIntError`) even though they happen to be infallible on some platforms (for some values of `size_of::<usize>()`).
They had been removed to allow the possibility to conditionally having some of them be infallible `From` impls instead, depending on the platforms, and have the [portability lint](https://github.com/rust-lang/rfcs/pull/1868) warn when they are used in code that is not already opting into non-portability. For example `#[allow(some_lint)] usize::from(x: u64)` would be valid on code that only targets 64-bit platforms.
This PR gives up on this possiblity for two reasons:
* Based on discussion with @aturon, it seems that the portability lint is not happening any time soon. It’s better to have the conversions be available *at all* than keep blocking them for so long. Portability-lint-gated platform-specific APIs can always be added separately later.
* For code that is fine with fallibility, the alternative would force it to opt into "non-portability" even though there would be no real portability issue.
Stabilize Iterator::flatten in 1.29, fixes#48213.
This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days, I don't think there's enough time to land it in that time, but let's see...).
Tracking issue is: #48213.
cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)
Document round-off error in `.mod_euc()`-method, see issue #50179
Due to a round-off error the method `.mod_euc()` of both `f32` and `f64` can produce mathematical invalid outputs. If `self` in magnitude is much small than the modulus `rhs` and negative, `self + rhs` in the first branch cannot be represented in the given precision and results into `rhs`. In the mathematical strict sense, this breaks the definition. But given the limitation of floating point arithmetic it can be thought of the closest representable value to the true result, although it is not strictly in the domain `[0.0, rhs)` of the function. It is rather the left side asymptotical limit. It would be desirable that it produces the mathematical more sound approximation of `0.0`, the right side asymptotical limit. But this breaks the property, that `self == self.div_euc(rhs) * rhs + a.mod_euc(rhs)`.
The discussion in issue #50179 did not find an satisfying conclusion to which property is deemed more important. But at least we can document the behaviour. Which this pull request does.
Optimize sum of Durations by using custom function
The current `impl Sum for Duration` uses `fold` to perform several `add`s (or really `checked_add`s) of durations. In doing so, it has to guarantee the number of nanoseconds is valid after every addition. If you squeese the current implementation into a single function it looks kind of like this:
````rust
fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration {
let mut sum = Duration::new(0, 0);
for rhs in iter {
if let Some(mut secs) = sum.secs.checked_add(rhs.secs) {
let mut nanos = sum.nanos + rhs.nanos;
if nanos >= NANOS_PER_SEC {
nanos -= NANOS_PER_SEC;
if let Some(new_secs) = secs.checked_add(1) {
secs = new_secs;
} else {
panic!("overflow when adding durations");
}
}
sum = Duration { secs, nanos }
} else {
panic!("overflow when adding durations");
}
}
sum
}
````
We only need to check if `nanos` is in the correct range when giving our final answer so we can have a more optimized version like so:
````rust
fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration {
let mut total_secs: u64 = 0;
let mut total_nanos: u64 = 0;
for entry in iter {
total_secs = total_secs
.checked_add(entry.secs)
.expect("overflow in iter::sum over durations");
total_nanos = match total_nanos.checked_add(entry.nanos as u64) {
Some(n) => n,
None => {
total_secs = total_secs
.checked_add(total_nanos / NANOS_PER_SEC as u64)
.expect("overflow in iter::sum over durations");
(total_nanos % NANOS_PER_SEC as u64) + entry.nanos as u64
}
};
}
total_secs = total_secs
.checked_add(total_nanos / NANOS_PER_SEC as u64)
.expect("overflow in iter::sum over durations");
total_nanos = total_nanos % NANOS_PER_SEC as u64;
Duration {
secs: total_secs,
nanos: total_nanos as u32,
}
}
````
We now only convert `total_nanos` to `total_secs` (1) if `total_nanos` overflows and (2) at the end of the function when we have to output a valid `Duration`. This gave a 5-22% performance improvement when I benchmarked it, depending on how big the `nano` value of the `Duration`s in `iter` were.
Specialize StepBy<Range(Inclusive)>
Part of #51557, related to #43064, #31155
As discussed in the above issues, `step_by` optimizes very badly on ranges which is related to
1. the special casing of the first `StepBy::next()` call
2. the need to do 2 additions of `n - 1` and `1` inside the range's `next()`
This PR eliminates both by overriding `next()` to always produce the current element and also step ahead by `n` elements in one go. The generated code is much better, even identical in the case of a `Range` with constant `start` and `end` where `start+step` can't overflow. Without constant bounds it's a bit longer than the manual loop. `RangeInclusive` doesn't optimize as nicely but is still much better than the original asm.
Unsigned integers optimize better than signed ones for some reason.
See the following two links for a comparison.
[godbolt: specialization for ..](https://godbolt.org/g/haHLJr)
[godbolt: specialization for ..=](https://godbolt.org/g/ewyMu6)
`RangeFrom`, the only other range with an `Iterator` implementation can't be specialized like this without changing behaviour due to overflow. There is no way to save "finished-ness".
The approach can not be used in general, because it would produce side effects of the underlying iterator too early.
May obsolete #51435, haven't checked.
the originally generated code was highly suboptimal
this brings it close to the same code or even exactly the same as a
manual while-loop by eliminating a branch and the
double stepping of n-1 + 1 steps
The intermediate trait lets us circumvent the specialization
type inference bugs
Add Ref/RefMut map_split method
As proposed [here](https://internals.rust-lang.org/t/make-refcell-support-slice-splitting/7707).
TLDR: Add a `map_split` method that allows multiple `RefMut`s to exist simultaneously so long as they refer to non-overlapping regions of the original `RefCell`. This is useful for things like the slice `split_at_mut` method.
These allow to get the leftover items of the slice that are not being
iterated as part of the iterator due to not filling a complete chunk.
The mutable version consumes the slice because otherwise we would either
a) have to borrow the iterator instead of taking the lifetime of
the underlying slice, which is not what *any* of the other iterator
functions is doing, or
b) would allow returning multiple mutable references to the same data
The current behaviour of consuming the iterator is consistent with
IterMut::into_slice for the normal iterator.
Improve `Debug` impl of `time::Duration`
Hi there!
For a long time now, I was getting annoyed by the derived `Debug` impl of `Duration`. Usually, I use `Duration` to either do quick'n'dirty benchmarking or measuring the time of some operation in general. The output of the derived Debug impl is hard to parse for humans: is { secs: 0, nanos: 968360102 } or { secs: 0, nanos 98507324 } longer?
So after running into the annoyance several times (sometimes building my own function to print the Duration properly), I decided to tackle this. Now the output looks like this:
```
Duration::new(1, 0) => 1s
Duration::new(1, 1) => 1.000000001s
Duration::new(1, 300) => 1.0000003s
Duration::new(1, 4000) => 1.000004s
Duration::new(1, 600000) => 1.0006s
Duration::new(1, 7000000) => 1.007s
Duration::new(0, 0) => 0ns
Duration::new(0, 1) => 1ns
Duration::new(0, 300) => 300ns
Duration::new(0, 4001) => 4.001µs
Duration::new(0, 600300) => 600.3µs
Duration::new(0, 7000000) => 7ms
```
Note that I implemented the formatting manually and didn't use floats. No information is "lost" when printing. So `Duration::new(123_456_789_000, 900_000_001)` prints as `123456789000.900000001s`.
~~This is not yet finished~~, but I wanted to open the PR now already in order to get some feedback (maybe everyone likes the derived impl).
### Still ToDo:
- [x] Respect precision ~~and width~~ parameter of the formatter (see [this comment](https://github.com/rust-lang/rust/pull/50364#issuecomment-386107107))
### Alternatives/Decisions
- Should large durations displayed in minutes, hours, days, ...? For now, I decided not to because the current formatting is close the how a `Duration` is stored. From this new `Debug` output, you can still easily see what the values of `secs` and `nanos` are. A formatting like `3h 27m 12s 9ms` might be more appropriate for a `Display` impl?
- Should this rather be a `Display` impl and should `Debug` be derived? Maybe this formatting is too fancy for `Debug`? In my opinion it's not and, as already mentioned, from the current format one can still very easily determine the values for `secs` and `nanos`.
- Whitespace between the number and the unit?
### Notes for reviewers
- ~~The combined diff sucks. Rather review both commits individually.~~
- ~~In the unit test, I am building my own type implementing `fmt::Write` to test the output. Maybe there is already something like that which I can use?~~
- My `Debug` impl block is marked as `#[stable(...)]`... but that's fine since the derived Debug impl was stable already, right?
---
~~Apart from the main change, I moved all `time` unit tests into the `tests` directory. All other `libcore` tests are there, so I guess it was simply an oversight. Prior to this change, the `time` tests weren't run, so I guess this is kind of a bug fix. If my `Debug` impl is rejected, I can of course just send the fix as PR.~~ (this was already merged in #50466)
Escape combining characters in char::Debug
Although combining characters are technically printable, they make little sense to print on their own with `Debug`: it'd be better to escape them like non-printable characters.
This is a breaking change, but I imagine the fact `escape_debug` is rare and almost certainly primarily used for debugging that this is an acceptable change.
Resolves#41922.
r? @alexcrichton
cc @clarcharr
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
Previously, the code would panic for high precision values. Now it
has the same behavior as printing normal floating point values: if
a high precision is specified, '0's are added.
Rounding is done like for printing floating point numbers. If the
first digit which isn't printed (due to the precision parameter) is
larger than '4', the number is rounded up.
Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)
So one day I was writing something in my codebase that basically amounted to `impl SliceIndex for (Bound<usize>, Bound<usize>)`, and I said to myself:
*Boy, gee, golly! I never realized bounds checking was so tricky!*
At some point when I had around 60 lines of tests for it, I decided to go see how the standard library does it to see if I missed any edge cases. ...That's when I discovered that libcore only had about 40 lines of tests for slicing altogether, and none of them even used `..=`.
---
This PR includes:
* **Literally the first appearance of the word `get_unchecked_mut` in any directory named `test` or `tests`.**
* Likewise the first appearance of `get_mut` used with _any type of range argument_ in these directories.
* Tests for the panics on overflow with `..=`.
* I wanted to test on `[(); usize::MAX]` as well but that takes linear time in debug mode </3
* A horrible and ugly test-generating macro for the `should_panic` tests that increases the DRYness by a single order of magnitude (which IMO wasn't enough, but I didn't want to go any further and risk making the tests inaccessible to next guy).
* Same stuff for str!
* Actually, the existing `str` tests were pretty good. I just helped filled in the holes.
* [A fix for the bug it caught](https://github.com/rust-lang/rust/issues/50002). (only one ~~sadly~~)
Prior to this, Duration simply derived Debug. Since Duration doesn't
implement `Display`, the only way to inspect its value is to use
`Debug`. Unfortunately, the derived `Debug` impl is far from optimal
for humans. In many cases, Durations are used for some quick'n'dirty
benchmarking (or in general: measuring the time of some code). Correctly
understanding the output of Duration's Debug impl is not easy (e.g.
is "{ secs: 0, nanos: 968360102 }" or "{ secs: 0, nanos 98507324 }"
shorter?).
This commit replaces the derived impl with a manual one. It prints
the duration as seconds (i.e. "3.1803s") if the duration is longer than
a second, otherwise it prints it in either ms, µs or ns (depending on
the duration's length). This already helps readability a lot and it
never omits information that is stored.
This `Debug` impl does *not* respect the following formatting parameters:
- fill/align/padding: difficult to implement, probably not worth it
- alternate # flag: not clear what this should do
All other tests of libcore reside in the tests/ directory,
too. Apparently the tests of `time.rs` weren't run before, at
least not by `x.py test src/libcore`.