Commit graph

2237 commits

Author SHA1 Message Date
bors
b4b1e5ece2 Auto merge of #38049 - frewsxcv:libunicode, r=alexcrichton
Rename 'librustc_unicode' crate to 'libstd_unicode'.

Fixes https://github.com/rust-lang/rust/issues/26554.
2016-12-12 13:19:33 +00:00
bors
dedd985084 Auto merge of #38192 - stjepang:faster-sort-algorithm, r=bluss
Implement a faster sort algorithm

Hi everyone, this is my first PR.

I've made some changes to the standard sort algorithm, starting out with a few tweaks here and there, but in the end this endeavour became a complete rewrite of it.

#### Summary

Changes:

* Improved performance, especially on partially sorted inputs.
* Performs fewer comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.

Benchmark:

```
 name                                        out1 ns/iter          out2 ns/iter          diff ns/iter   diff %
 slice::bench::sort_large_ascending          85,323 (937 MB/s)     8,970 (8918 MB/s)          -76,353  -89.49%
 slice::bench::sort_large_big_ascending      2,135,297 (599 MB/s)  355,955 (3595 MB/s)     -1,779,342  -83.33%
 slice::bench::sort_large_big_descending     2,266,402 (564 MB/s)  416,479 (3073 MB/s)     -1,849,923  -81.62%
 slice::bench::sort_large_big_random         3,053,031 (419 MB/s)  1,921,389 (666 MB/s)    -1,131,642  -37.07%
 slice::bench::sort_large_descending         313,181 (255 MB/s)    14,725 (5432 MB/s)        -298,456  -95.30%
 slice::bench::sort_large_mostly_ascending   287,706 (278 MB/s)    243,204 (328 MB/s)         -44,502  -15.47%
 slice::bench::sort_large_mostly_descending  415,078 (192 MB/s)    271,028 (295 MB/s)        -144,050  -34.70%
 slice::bench::sort_large_random             545,872 (146 MB/s)    521,559 (153 MB/s)         -24,313   -4.45%
 slice::bench::sort_large_random_expensive   30,321,770 (2 MB/s)   23,533,735 (3 MB/s)     -6,788,035  -22.39%
 slice::bench::sort_medium_ascending         616 (1298 MB/s)       155 (5161 MB/s)               -461  -74.84%
 slice::bench::sort_medium_descending        1,952 (409 MB/s)      202 (3960 MB/s)             -1,750  -89.65%
 slice::bench::sort_medium_random            3,646 (219 MB/s)      3,421 (233 MB/s)              -225   -6.17%
 slice::bench::sort_small_ascending          39 (2051 MB/s)        34 (2352 MB/s)                  -5  -12.82%
 slice::bench::sort_small_big_ascending      96 (13333 MB/s)       96 (13333 MB/s)                  0    0.00%
 slice::bench::sort_small_big_descending     248 (5161 MB/s)       243 (5267 MB/s)                 -5   -2.02%
 slice::bench::sort_small_big_random         501 (2554 MB/s)       490 (2612 MB/s)                -11   -2.20%
 slice::bench::sort_small_descending         95 (842 MB/s)         63 (1269 MB/s)                 -32  -33.68%
 slice::bench::sort_small_random             372 (215 MB/s)        354 (225 MB/s)                 -18   -4.84%
```

#### Background

First, let me just do a quick brain dump to discuss what I learned along the way.

The official documentation says that the standard sort in Rust is a stable sort. This constraint is thus set in stone and immediately rules out many popular sorting algorithms. Essentially, the only algorithms we might even take into consideration are:

1. [Merge sort](https://en.wikipedia.org/wiki/Merge_sort)
2. [Block sort](https://en.wikipedia.org/wiki/Block_sort) (famous implementations are [WikiSort](https://github.com/BonzaiThePenguin/WikiSort) and [GrailSort](https://github.com/Mrrl/GrailSort))
3. [TimSort](https://en.wikipedia.org/wiki/Timsort)

Actually, all of those are just merge sort flavors. :) The current standard sort in Rust is a simple iterative merge sort. It has three problems. First, it's slow on partially sorted inputs (even though #29675 helped quite a bit). Second, it always makes around `log(n)` iterations copying the entire array between buffers, no matter what. Third, it allocates huge amounts of temporary memory (a buffer of size `2*n`, where `n` is the size of input).

The problem of auxiliary memory allocation is a tough one. Ideally, it would be best for our sort to allocate `O(1)` additional memory. This is what block sort (and its variants) does. However, it's often very complicated (look at [this](https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.cpp)) and even then performs rather poorly. The author of WikiSort claims good performance, but that must be taken with a grain of salt. It performs well in comparison to `std::stable_sort` in C++. It can even beat `std::sort` on partially sorted inputs, but on random inputs it's always far worse. My rule of thumb is: high performance, low memory overhead, stability - choose two.

TimSort is another option. It allocates a buffer of size `n/2`, which is not great, but acceptable. It performs extremely well on partially sorted inputs. However, it seems pretty much all implementations suck on random inputs. I benchmarked implementations in [Rust](https://github.com/notriddle/rust-timsort), [C++](https://github.com/gfx/cpp-TimSort), and [D](fd518eb310/std/algorithm/sorting.d (L2062)). The results were a bit disappointing. It seems bad performance is due to complex galloping procedures in hot loops. Galloping noticeably improves performance on partially sorted inputs, but worsens it on random ones.

#### The new algorithm

Choosing the best algorithm is not easy. Plain merge sort is bad on partially sorted inputs. TimSort is bad on random inputs and block sort is even worse. However, if we take the main ideas from TimSort (intelligent merging strategy of sorted runs) and drop galloping, then we'll have great performance on random inputs and it won't be bad on partially sorted inputs either.

That is exactly what this new algorithm does. I can't call it TimSort, since it steals just a few of its ideas. Complete TimSort would be a much more complex and elaborate implementation. In case we in the future figure out how to incorporate more of its ideas into this implementation without crippling performance on random inputs, it's going to be very easy to extend. I also made several other minor improvements, such as reworking insertion sort to make it faster.
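The run-based idea described above can be illustrated with a small sketch. This is not the code from this PR: it assumes a much simpler strategy (collect natural non-descending runs, then merge neighbouring runs pairwise), ignores descending runs and TimSort's run-stack invariants, and uses plain `Vec` buffers instead of a single `n/2` scratch buffer. The names `run_merge_sort` and `merge` are made up for illustration.

```rust
/// Merges two sorted vectors into one. Ties are taken from `left` first,
/// which is what keeps the overall sort stable.
fn merge<T: Ord>(left: Vec<T>, right: Vec<T>) -> Vec<T> {
    let mut out = Vec::with_capacity(left.len() + right.len());
    let mut l = left.into_iter().peekable();
    let mut r = right.into_iter().peekable();
    loop {
        let take_left = match (l.peek(), r.peek()) {
            (Some(a), Some(b)) => a <= b,
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break,
        };
        let next = if take_left { l.next() } else { r.next() };
        out.extend(next);
    }
    out
}

/// Illustrative stable sort: split the input into already-sorted runs,
/// then merge neighbouring runs until a single run remains.
fn run_merge_sort<T: Ord>(v: &mut Vec<T>) {
    // Step 1: collect maximal non-descending runs.
    let mut runs: Vec<Vec<T>> = Vec::new();
    for x in v.drain(..) {
        let extends_run = match runs.last().and_then(|run| run.last()) {
            Some(last) => *last <= x,
            None => false,
        };
        if extends_run {
            runs.last_mut().unwrap().push(x);
        } else {
            runs.push(vec![x]);
        }
    }

    // Step 2: merge adjacent runs pairwise, pass after pass.
    while runs.len() > 1 {
        let mut merged = Vec::with_capacity((runs.len() + 1) / 2);
        let mut iter = runs.into_iter();
        while let Some(left) = iter.next() {
            match iter.next() {
                Some(right) => merged.push(merge(left, right)),
                None => merged.push(left),
            }
        }
        runs = merged;
    }

    if let Some(sorted) = runs.pop() {
        *v = sorted;
    }
}
```

On an already sorted input the sketch finds a single run and does no merging at all, which is the effect behind the large gains on the `ascending` benchmarks.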

There are also new, more thorough benchmarks and panic safety tests.

The final code is not terribly complex and has less unsafe code than I anticipated, but there's still plenty of it that should be carefully reviewed. I did my best at documenting non-obvious code.

I'd like to notify several people of this PR, since they might be interested and have useful insights:

1. @huonw because he wrote the [original merge sort](https://github.com/rust-lang/rust/pull/11064).
2. @alexcrichton because he was involved in multiple discussions of it.
3. @veddan because he wrote [introsort](https://github.com/veddan/rust-introsort) in Rust.
4. @notriddle because he wrote [TimSort](https://github.com/notriddle/rust-timsort) in Rust.
5. @bluss because he had an attempt at writing WikiSort in Rust.
6. @gnzlbg, @rkruppe, and @mark-i-m because they were involved in discussion #36318.

**P.S.** [quickersort](https://github.com/notriddle/quickersort) describes itself as being universally [faster](https://github.com/notriddle/quickersort/blob/master/perf.txt) than the standard sort, which is true. However, if this PR gets merged, things might [change](https://gist.github.com/stjepang/b9f0c3eaa0e1f1280b61b963dae19a30) a bit. ;)
2016-12-09 10:00:25 +00:00
Stjepan Glavina
c0e150a2a6 Inline nested fn collapse
Since merge_sort is generic and collapse isn't, that means calls to
collapse won't be inlined. Therefore, we must stick an
`#[inline]` above `fn collapse`.
2016-12-08 22:37:36 +01:00
bors
7537f953e2 Auto merge of #38182 - bluss:more-vec-extend, r=alexcrichton
Specialization for Extend<&T> for vec

Specialize to use copy_from_slice when extending a Vec with &[T] where
T: Copy.

This specialization results in `.clone()` not being called in `extend_from_slice` and `extend` when the element is `Copy`.
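As a usage-level sketch (the helper name `append_twice` is hypothetical), both calls below go through `Extend<&T>` or `extend_from_slice` with a `Copy` element type, which is the case the specialization targets. The sketch only shows the public API; whether a bulk copy is actually used is an internal detail.

```rust
/// Appends `src` to `dst` twice: once through `Extend<&T>` and once through
/// `extend_from_slice`. For `T: Copy` both can be served by a slice copy.
fn append_twice(dst: &mut Vec<u8>, src: &[u8]) {
    dst.extend(src); // iterates `&u8` items, i.e. `Extend<&'a u8> for Vec<u8>`
    dst.extend_from_slice(src);
}
```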

Fixes #38021
2016-12-08 15:39:39 +00:00
Stjepan Glavina
c8d73ea68a Implement a faster sort algorithm
This is a complete rewrite of the standard sort algorithm. The new algorithm
is a simplified variant of TimSort. In summary, the changes are:

* Improved performance, especially on partially sorted inputs.
* Performs fewer comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.
2016-12-07 21:35:07 +01:00
bors
5938eba4e3 Auto merge of #38149 - bluss:is-empty, r=alexcrichton
Forward more ExactSizeIterator methods and `is_empty` edits

- Forward ExactSizeIterator methods in more places, like `&mut I` and `Box<I>` iterator impls.
- Improve `VecDeque::is_empty` itself (see commit 4)
- All the collections iterators now have `len` or `is_empty` forwarded if doing so is a benefit. In the remaining cases, they already use a simple size hint (using something like a stored `usize` value), which is sufficient for the default implementation of len and is_empty.
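A small sketch of what forwarding means for callers, using only `len` so that it does not rely on any feature-gated method. The function names are hypothetical; the point is that `&mut I` and `Box<I>` can be passed where an `ExactSizeIterator` is required.

```rust
/// Generic helper that needs an exact length from whatever it is given.
fn remaining<I: ExactSizeIterator>(iter: I) -> usize {
    iter.len()
}

/// Calls `remaining` through a `&mut` reference and through a `Box`.
fn lens_through_wrappers() -> (usize, usize) {
    let data = [1, 2, 3, 4];

    let mut it = data.iter();
    it.next(); // consume one element, three remain
    let via_ref = remaining(&mut it);

    let boxed: Box<std::slice::Iter<'_, i32>> = Box::new(data.iter());
    let via_box = remaining(boxed);

    (via_ref, via_box)
}
```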
2016-12-07 07:15:31 +00:00
Ulrik Sverdrup
02bf1ce9cc vec: More specialization for Extend<&T> for vec
Specialize to use copy_from_slice when extending a Vec with &[T] where
T: Copy.
2016-12-06 07:58:56 +01:00
Ulrik Sverdrup
28852c3c7c binary_heap: Forward ExactSizeIterator::is_empty 2016-12-04 15:46:36 +01:00
Ulrik Sverdrup
343b4c321d collections: Simplify VecDeque::is_empty
Improve is_empty on the VecDeque and its iterators by just comparing
tail and head; this saves a few instructions (to remove the
`& (size - 1)` computation on its own, the compiler would have to know that
size is a power of two).
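To make the difference concrete, here is a sketch with a hypothetical `Ring` type that only tracks indices. It assumes the capacity is a power of two, as the message implies; the real `VecDeque` internals are not reproduced here.

```rust
/// Hypothetical index-only ring buffer; `cap` is assumed to be a power of two.
struct Ring {
    tail: usize,
    head: usize,
    cap: usize,
}

impl Ring {
    /// Number of stored elements, using the wrap-around mask.
    fn len(&self) -> usize {
        self.head.wrapping_sub(self.tail) & (self.cap - 1)
    }

    /// Emptiness derived from `len`: pays for the subtraction and the mask.
    fn is_empty_via_len(&self) -> bool {
        self.len() == 0
    }

    /// Emptiness by direct comparison: a single compare, no mask.
    fn is_empty(&self) -> bool {
        self.tail == self.head
    }
}
```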
2016-12-04 15:46:36 +01:00
Clar Charr
4dd590ac8c Remove redundant assertion near is_char_boundary. 2016-12-03 12:14:39 -05:00
Clar Charr
cbf734f9ab Add String::split_off. 2016-11-30 23:24:57 -05:00
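A minimal usage sketch of `String::split_off` (the wrapper function is only for illustration): the string keeps the bytes before the index and returns the rest as a new `String`. The index must lie on a `char` boundary.

```rust
/// Splits "Hello, World!" at byte index 7 and returns both halves.
fn split_greeting() -> (String, String) {
    let mut hello = String::from("Hello, World!");
    let world = hello.split_off(7);
    (hello, world)
}
```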
Corey Farwell
274777a158 Rename 'librustc_unicode' crate to 'libstd_unicode'.
Fixes #26554.
2016-11-30 01:24:01 -05:00
bors
f8614c3973 Auto merge of #36340 - sfackler:slice-get-slice, r=alexcrichton
Implement RFC 1679

cc #35729

r? @alexcrichton
2016-11-26 18:47:06 -06:00
bors
9003e1ab6a Auto merge of #38008 - bluss:rustbuild-benches, r=alexcrichton
Add rustbuild command `bench`

Add command bench to rustbuild, so that `./x.py bench <path>` can compile and run benchmarks.

`./x.py bench --stage 1 src/libcollections` and `./x.py bench --stage 1 src/libstd` should both compile well. Just `./x.py bench` runs all benchmarks for the libstd crates.

Fixes #37897
2016-11-26 12:32:19 -06:00
Steven Fackler
5377b5e9c4 Overload get{,_mut}{,_unchecked} 2016-11-26 10:07:39 -08:00
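With the overloaded `get`, a range can be passed where previously only a single index was accepted. A short sketch, with `middle` as a hypothetical helper:

```rust
/// Returns elements 1 and 2 as a subslice, or `None` if the slice is too short.
fn middle(v: &[i32]) -> Option<&[i32]> {
    v.get(1..3)
}
```

Single-index calls such as `v.get(0)` keep working as before and still return `Option<&i32>`.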
Seo Sanghyeon
6ffcdff06c Rollup merge of #37967 - sfackler:enumset-issue, r=sfackler
Add a tracking issue for enum_set

I totally forgot this even existed!
2016-11-26 22:02:14 +09:00
Ulrik Sverdrup
42e66344b5 rustbuild: Point to core and collections's external benchmarks. 2016-11-25 23:10:43 +01:00
Steven Fackler
8560991cb0 Add a tracking issue for enum_set 2016-11-23 10:55:44 -08:00
Ulrik Sverdrup
74cde120e5 core, collections: Implement better .is_empty() for slice and vec iterators
These iterators can use a pointer comparison instead of computing the length.
2016-11-23 02:31:41 +01:00
Ulrik Sverdrup
c36edc7261 vec: Use less code bloat specialized Vec::from_iter
Vec::from_iter's general case allocates the vector up front;
this is redundant for the TrustedLen case, and can then be avoided
to reduce the size of the code.
2016-11-13 01:30:42 +01:00
Ulrik Sverdrup
2b3a37bd2e Restore Vec::from_iter() specialization
Since I said "no intentional functional change" in the previous commit,
I guess it was inevitable there were unintentional changes. Not
functional, but optimization-wise. This restores the extend
specialization's use in Vec::from_iter.
2016-11-13 00:13:09 +01:00
Ulrik Sverdrup
5058e58676 vec: Write the .extend() specialization in cleaner style
As far as possible, use regular `default fn` specialization in favour of
ad-hoc conditionals.
2016-11-11 12:54:10 +01:00
Alex Crichton
5bce6ad16a Rollup merge of #37587 - ollie27:to_mut, r=alexcrichton
Remove recursive call from Cow::to_mut

It seems to prevent it from being inlined.
2016-11-05 10:50:25 -07:00
Alex Crichton
727f1d3f16 Rollup merge of #37585 - leodasvacas:change_into_to_from, r=alexcrichton
Change `Into<Vec<u8>> for String` and `Into<OsString> for PathBuf` to From

Fixes #37561. First contribution, happy with any and all feedback!
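A brief sketch of the two conversions written with `From` (function names are hypothetical). Because `From` implies `Into`, existing `.into()` call sites keep compiling.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Reuses the string's buffer as a byte vector.
fn to_bytes(s: String) -> Vec<u8> {
    Vec::from(s)
}

/// Converts an owned path into an owned OS string.
fn to_os_string(p: PathBuf) -> OsString {
    OsString::from(p)
}
```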
2016-11-05 10:50:25 -07:00
Alex Crichton
638436e55f Rollup merge of #37574 - ollie27:cow_add, r=alexcrichton
Fix issues with the Add/AddAssign impls for Cow<str>

* Correct the stability attributes.
* Make Add and AddAssign actually behave the same.
* Use String::with_capacity when allocating a new string.
* Fix the tests.
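
The second point can be checked from the outside with a sketch like the following, where `concat_both_ways` is a hypothetical helper that builds the same string once with `+` and once with `+=`:

```rust
use std::borrow::Cow;

/// Concatenates `a` and `b` with `Add` and with `AddAssign`.
fn concat_both_ways<'a>(a: &'a str, b: &'a str) -> (Cow<'a, str>, Cow<'a, str>) {
    let added = Cow::Borrowed(a) + b;

    let mut assigned = Cow::Borrowed(a);
    assigned += b;

    (added, assigned)
}
```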
2016-11-05 10:50:24 -07:00
Oliver Middleton
775d399da8 Remove recursive call from Cow::to_mut
It seems to prevent it from being inlined.
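For context, a usage sketch of `Cow::to_mut` (the helper `push_bang` is hypothetical): on a borrowed value it clones into an owned one first, then hands out a mutable reference.

```rust
use std::borrow::Cow;

/// Appends '!' to the text; the borrowed input is cloned on first mutation.
fn push_bang(text: &str) -> Cow<'_, str> {
    let mut c = Cow::Borrowed(text);
    c.to_mut().push('!');
    c
}
```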
2016-11-04 18:47:32 +00:00
leonardo.yvens
3e4bd88438 Change Into<Vec<u8>> for String and Into<OsString> for PathBuf to From impls 2016-11-04 15:54:08 -02:00
bors
81601cd3a3 Auto merge of #37306 - bluss:trusted-len, r=alexcrichton
Add Iterator trait TrustedLen to enable better FromIterator / Extend

This trait attempts to improve FromIterator / Extend code by enabling it to trust the iterator to produce an exact number of elements, which means that reallocation needs to happen only once and is moved out of the loop.

`TrustedLen` differs from `ExactSizeIterator` in that it attempts to include _more_ iterators by allowing for the case that the iterator's len does not fit in `usize`. Consumers must check for this case (for example they could panic, since they can't allocate a collection of that size).

For example, chain can be TrustedLen and all numerical ranges can be TrustedLen. All they need to do is to report an exact size if it fits in `usize`, and `None` as the upper bound otherwise.

The trait describes its contract like this:

```
An iterator that reports an accurate length using size_hint.

The iterator reports a size hint where it is either exact
(lower bound is equal to upper bound), or the upper bound is `None`.
The upper bound must only be `None` if the actual iterator length is
larger than `usize::MAX`.

The iterator must produce exactly the number of elements it reported.

This trait must only be implemented when the contract is upheld.
Consumers of this trait must inspect `.size_hint()`’s upper bound.
```
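
The two cases in the contract can be observed through `size_hint` alone. The sketch below does not use the `TrustedLen` trait itself; it only inspects the hints that `chain` over ranges reports, and the function names are hypothetical.

```rust
/// Twenty elements in total: the hint is exact (lower == upper).
fn exact_hint() -> (usize, Option<usize>) {
    (0..10).chain(10..20).size_hint()
}

/// The true length does not fit in `usize`, so the upper bound is `None`.
fn overflowing_hint() -> (usize, Option<usize>) {
    (0..usize::MAX).chain(0..usize::MAX).size_hint()
}
```

A consumer that sees `None` as the upper bound knows the collection cannot be allocated and can bail out before the loop.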

Fixes #37232
2016-11-04 10:40:30 -07:00
Oliver Middleton
1e40c80cf5 Fix issues with the Add/AddAssign impls for Cow<str>
* Correct the stability attributes.
* Make Add and AddAssign actually behave the same.
* Use String::with_capacity when allocating a new string.
* Fix the tests.
2016-11-04 01:07:00 +00:00
Ulrik Sverdrup
f0e6b90790 Link the tracking issue for TrustedLen 2016-11-04 01:00:55 +01:00
iirelu
e593c3b893 Changed most vec! invocations to use square braces
Most of the Rust community agrees that the vec! macro is clearer when
called using square brackets [] instead of regular brackets (). Most of
these occurrences are from before macros allowed using different types of
brackets.

There is one left unchanged in a pretty-print test, as the pretty
printer still wants it to have regular brackets.
2016-10-31 22:51:40 +00:00
Ulrik Sverdrup
5dc9db541e vec: Remove the Vec specialization for .extend()
This now produces as good code (with optimizations) using the TrustedLen
codepath.
2016-10-27 01:45:31 +02:00
Ulrik Sverdrup
2411be5cae impl TrustedLen for vec::IntoIter 2016-10-27 00:18:13 +02:00
bors
c59cb71d97 Auto merge of #37419 - GuillaumeGomez:rollup, r=GuillaumeGomez
Rollup of 7 pull requests

- Successful merges: #36206, #37144, #37391, #37394, #37396, #37398, #37414
- Failed merges:
2016-10-26 14:58:16 -07:00
bors
3a25b65c1f Auto merge of #37315 - bluss:fold-more, r=alexcrichton
Implement Iterator::fold for .chain(), .cloned(), .map() and the VecDeque iterators.

Chain can do something interesting here where it passes on the fold
into its inner iterators.

This lets the underlying iterator's custom fold() be used, and skips the
regular chain logic in next.

Also implement .fold() specifically for .map() and .cloned() so that any
inner fold improvements are available through map and cloned.

The same way, a VecDeque iterator fold can be turned into two slice folds.

These changes lend the power of the slice iterator's loop codegen to
VecDeque, and to chains of slice iterators, and so on.
It's an improvement for .sum() and .product(), and other uses of fold.
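
As a rough sketch of the idea, written against the public API rather than the internal implementation: a fold over a `VecDeque` can be expressed as two slice folds, and a fold over a chain simply threads the accumulator through both halves. The function names are hypothetical.

```rust
use std::collections::VecDeque;

/// Sums a deque as two consecutive slice folds.
fn sum_deque(d: &VecDeque<u32>) -> u32 {
    let (front, back) = d.as_slices();
    let acc = front.iter().fold(0, |acc, &x| acc + x);
    back.iter().fold(acc, |acc, &x| acc + x)
}

/// Sums two slices through a single fold over their chain.
fn sum_chain(a: &[u32], b: &[u32]) -> u32 {
    a.iter().chain(b).fold(0, |acc, &x| acc + x)
}
```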
2016-10-26 11:43:32 -07:00
Duncan
09227b17f4 Vec docs: fix broken links and make quoting consistent 2016-10-26 06:24:52 +13:00
Ulrik Sverdrup
15a95866b4 Special case .fold() for VecDeque's iterators 2016-10-25 15:50:52 +02:00
bors
40f79ba8c9 Auto merge of #37327 - aidanhs:aphs-bytes-iter-doc, r=alexcrichton
`as_bytes` is not the iterator on String, `bytes` is

r? @steveklabnik
2016-10-22 23:02:24 -07:00
bors
febfe7683b Auto merge of #37326 - SimonSapin:from-cow, r=alexcrichton
Implement `From<Cow<str>> for String` and `From<Cow<[T]>> for Vec<T>`.

Motivation: the `selectors` crate is generic over a string type, in order to support all of `String`,  `string_cache::Atom`, and `gecko_string_cache::Atom`. Multiple trait bounds are used for the various operations done with these strings. One of these operations is creating a string (as efficiently as possible, re-using an existing memory allocation if possible) from `Cow<str>`.

The `std::convert::From` trait seems natural for this, but the relevant implementation was missing before this PR. To work around this I’ve added a `FromCowStr` trait in `selectors`, but with trait coherence that means one of `selectors` or `string_cache` needs to depend on the other to implement this trait. Using a trait from `std` would solve this.

The `Vec<T>` implementation is just added for consistency. I also tried a more general `impl<'a, O, B: ?Sized + ToOwned<Owned=O>> From<Cow<'a, B>> for O`, but (the compiler thinks?) it conflicts with the `From<T> for T` impl (after moving all of `collections::borrow` into `core::borrow` to work around trait coherence).
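
A short sketch of the two conversions in use (function names are hypothetical). An owned `Cow` hands over its allocation, while a borrowed one is copied into a new one.

```rust
use std::borrow::Cow;

/// Converts either variant of `Cow<str>` into a `String`.
fn into_string(c: Cow<'_, str>) -> String {
    String::from(c)
}

/// Converts either variant of `Cow<[u8]>` into a `Vec<u8>`.
fn into_vec(c: Cow<'_, [u8]>) -> Vec<u8> {
    Vec::from(c)
}
```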
2016-10-22 19:47:05 -07:00
Guillaume Gomez
18f9758c4f Rollup merge of #37043 - GuillaumeGomez:vec_urls, r=frewsxcv
Add missing urls on Vec docs

r? @steveklabnik
2016-10-22 01:21:58 +02:00
Aidan Hobson Sayers
dceb2c9cd2 as_bytes is not the iterator, bytes is 2016-10-21 18:28:02 +01:00
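The distinction the doc fix is about, as a small sketch (`byte_views` is a hypothetical helper): `bytes()` returns an iterator over `u8`, while `as_bytes()` returns a borrowed `&[u8]` without iterating.

```rust
/// Returns the same bytes twice: collected from the iterator, and as a slice view.
fn byte_views(s: &str) -> (Vec<u8>, &[u8]) {
    let via_iter: Vec<u8> = s.bytes().collect();
    let via_slice: &[u8] = s.as_bytes();
    (via_iter, via_slice)
}
```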
Ulrik Sverdrup
ee84ec1fa1 vec: Add a debug assertion where TrustedLen is used 2016-10-21 19:18:08 +02:00
Simon Sapin
7e603d4e3b Implement From<Cow<str>> for String and From<Cow<[T]>> for Vec<T>.
Motivation: the `selectors` crate is generic over a string type,
in order to support all of `String`, `string_cache::Atom`, and
`gecko_string_cache::Atom`. Multiple trait bounds are used
for the various operations done with these strings.
One of these operations is creating a string (as efficiently as possible,
re-using an existing memory allocation if possible) from `Cow<str>`.

The `std::convert::From` trait seems natural for this, but
the relevant implementation was missing before this PR.
To work around this I’ve added a `FromCowStr` trait in `selectors`,
but with trait coherence that means one of `selectors` or `string_cache`
needs to depend on the other to implement this trait.
Using a trait from `std` would solve this.

The `Vec<T>` implementation is just added for consistency.
I also tried a more general
`impl<'a, O, B: ?Sized + ToOwned<Owned=O>> From<Cow<'a, B>> for O`,
but (the compiler thinks?) it conflicts with the `From<T> for T` impl
(after moving all of `collections::borrow` into `core::borrow`
to work around trait coherence).
2016-10-21 17:42:29 +02:00
Ulrik Sverdrup
622f24f6d9 vec: Use Vec::extend specializations in extend_from_slice and more
The new Vec::extend covers the duties of .extend_from_slice() and some
previous specializations.
2016-10-21 14:06:38 +02:00
Ulrik Sverdrup
49557112d6 Use TrustedLen for Vec's FromIterator and Extend 2016-10-20 14:07:31 +02:00
Guillaume Gomez
599ad8ed61 Add missing urls on Vec docs 2016-10-20 11:59:47 +02:00
Guillaume Gomez
ce369bfa11 Rollup merge of #37187 - frewsxcv:cow-doc-example, r=kmcallister
Improve doc example for `std::borrow::Cow`.

None
2016-10-19 23:15:00 +02:00
Florian Diebold
187ddf30b0 Update comment in Vec::dedup_by 2016-10-16 14:41:01 +02:00
bors
030bc49bb4 Auto merge of #37094 - fhartwig:spec-extend-from-slice, r=alexcrichton
Specialize Vec::extend to Vec::extend_from_slice

I tried using the existing `SpecExtend` as a helper trait for this, but the instances would always conflict with the instances higher up in the file, so I created a new helper trait.

Benchmarking `extend` vs `extend_from_slice` with a slice of 1000 `u64`s gives the following results:

```
before:

running 2 tests
test tests::bench_extend_from_slice ... bench:         166 ns/iter (+/- 78)
test tests::bench_extend_trait      ... bench:       1,187 ns/iter (+/- 697)

after:
running 2 tests
test tests::bench_extend_from_slice ... bench:         149 ns/iter (+/- 87)
test tests::bench_extend_trait      ... bench:         138 ns/iter (+/- 70)
```
2016-10-15 01:48:42 -07:00
Corey Farwell
a8dc2975fd Improve doc example for std::borrow::Cow. 2016-10-15 00:55:46 -04:00