user0/rust

Author	SHA1	Message	Date
Aaron Turon	9a5cef4de5	Address fallout	2016-12-16 19:42:17 -08:00
bors	b4b1e5ece2	Auto merge of #38049 - frewsxcv:libunicode, r=alexcrichton Rename 'librustc_unicode' crate to 'libstd_unicode'. Fixes https://github.com/rust-lang/rust/issues/26554.	2016-12-12 13:19:33 +00:00
bors	dedd985084	Auto merge of #38192 - stjepang:faster-sort-algorithm, r=bluss Implement a faster sort algorithm Hi everyone, this is my first PR. I've made some changes to the standard sort algorithm, starting out with a few tweaks here and there, but in the end this endeavour became a complete rewrite of it. #### Summary Changes: * Improved performance, especially on partially sorted inputs. * Performs fewer comparisons on both random and partially sorted inputs. * Decreased the size of temporary memory: the new sort allocates 4x less. Benchmark: ``` name out1 ns/iter out2 ns/iter diff ns/iter diff % slice::bench::sort_large_ascending 85,323 (937 MB/s) 8,970 (8918 MB/s) -76,353 -89.49% slice::bench::sort_large_big_ascending 2,135,297 (599 MB/s) 355,955 (3595 MB/s) -1,779,342 -83.33% slice::bench::sort_large_big_descending 2,266,402 (564 MB/s) 416,479 (3073 MB/s) -1,849,923 -81.62% slice::bench::sort_large_big_random 3,053,031 (419 MB/s) 1,921,389 (666 MB/s) -1,131,642 -37.07% slice::bench::sort_large_descending 313,181 (255 MB/s) 14,725 (5432 MB/s) -298,456 -95.30% slice::bench::sort_large_mostly_ascending 287,706 (278 MB/s) 243,204 (328 MB/s) -44,502 -15.47% slice::bench::sort_large_mostly_descending 415,078 (192 MB/s) 271,028 (295 MB/s) -144,050 -34.70% slice::bench::sort_large_random 545,872 (146 MB/s) 521,559 (153 MB/s) -24,313 -4.45% slice::bench::sort_large_random_expensive 30,321,770 (2 MB/s) 23,533,735 (3 MB/s) -6,788,035 -22.39% slice::bench::sort_medium_ascending 616 (1298 MB/s) 155 (5161 MB/s) -461 -74.84% slice::bench::sort_medium_descending 1,952 (409 MB/s) 202 (3960 MB/s) -1,750 -89.65% slice::bench::sort_medium_random 3,646 (219 MB/s) 3,421 (233 MB/s) -225 -6.17% slice::bench::sort_small_ascending 39 (2051 MB/s) 34 (2352 MB/s) -5 -12.82% slice::bench::sort_small_big_ascending 96 (13333 MB/s) 96 (13333 MB/s) 0 0.00% slice::bench::sort_small_big_descending 248 (5161 MB/s) 243 (5267 MB/s) -5 -2.02% slice::bench::sort_small_big_random 501 (2554 MB/s) 490 (2612 MB/s) -11 -2.20% slice::bench::sort_small_descending 95 (842 MB/s) 63 (1269 MB/s) -32 -33.68% slice::bench::sort_small_random 372 (215 MB/s) 354 (225 MB/s) -18 -4.84% ``` #### Background First, let me just do a quick brain dump to discuss what I learned along the way. The official documentation says that the standard sort in Rust is a stable sort. This constraint is thus set in stone and immediately rules out many popular sorting algorithms. Essentially, the only algorithms we might even take into consideration are: 1. [Merge sort](https://en.wikipedia.org/wiki/Merge_sort) 2. [Block sort](https://en.wikipedia.org/wiki/Block_sort) (famous implementations are [WikiSort](https://github.com/BonzaiThePenguin/WikiSort) and [GrailSort](https://github.com/Mrrl/GrailSort)) 3. [TimSort](https://en.wikipedia.org/wiki/Timsort) Actually, all of those are just merge sort flavors. :) The current standard sort in Rust is a simple iterative merge sort. It has three problems. First, it's slow on partially sorted inputs (even though #29675 helped quite a bit). Second, it always makes around `log(n)` iterations copying the entire array between buffers, no matter what. Third, it allocates huge amounts of temporary memory (a buffer of size `2n`, where `n` is the size of input). The problem of auxiliary memory allocation is a tough one. Ideally, it would be best for our sort to allocate `O(1)` additional memory. This is what block sort (and its variants) does. However, it's often very complicated (look at [this](https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.cpp)) and even then performs rather poorly. The author of WikiSort claims good performance, but that must be taken with a grain of salt. It performs well in comparison to `std::stable_sort` in C++. It can even beat `std::sort` on partially sorted inputs, but on random inputs it's always far worse. My rule of thumb is: high performance, low memory overhead, stability - choose two. TimSort is another option. It allocates a buffer of size `n/2`, which is not great, but acceptable. It performs extremely well on partially sorted inputs. However, it seems pretty much all implementations suck on random inputs. I benchmarked implementations in [Rust](https://github.com/notriddle/rust-timsort), [C++](https://github.com/gfx/cpp-TimSort), and [D](`fd518eb310/std/algorithm/sorting.d (L2062)`). The results were a bit disappointing. It seems bad performance is due to complex galloping procedures in hot loops. Galloping noticeably improves performance on partially sorted inputs, but worsens it on random ones. #### The new algorithm Choosing the best algorithm is not easy. Plain merge sort is bad on partially sorted inputs. TimSort is bad on random inputs and block sort is even worse. However, if we take the main ideas from TimSort (intelligent merging strategy of sorted runs) and drop galloping, then we'll have great performance on random inputs and it won't be bad on partially sorted inputs either. That is exactly what this new algorithm does. I can't call it TimSort, since it steals just a few of its ideas. Complete TimSort would be a much more complex and elaborate implementation. In case we in the future figure out how to incorporate more of its ideas into this implementation without crippling performance on random inputs, it's going to be very easy to extend. I also made several other minor improvements, such as reworking insertion sort to make it faster. There are also new, more thorough benchmarks and panic safety tests. The final code is not terribly complex and has less unsafe code than I anticipated, but there's still plenty of it that should be carefully reviewed. I did my best at documenting non-obvious code. I'd like to notify several people of this PR, since they might be interested and have useful insights: 1. @huonw because he wrote the [original merge sort](https://github.com/rust-lang/rust/pull/11064). 2. @alexcrichton because he was involved in multiple discussions of it. 3. @veddan because he wrote [introsort](https://github.com/veddan/rust-introsort) in Rust. 4. @notriddle because he wrote [TimSort](https://github.com/notriddle/rust-timsort) in Rust. 5. @bluss because he had an attempt at writing WikiSort in Rust. 6. @gnzlbg, @rkruppe, and @mark-i-m because they were involved in discussion #36318. P.S. [quickersort](https://github.com/notriddle/quickersort) describes itself as being universally [faster](https://github.com/notriddle/quickersort/blob/master/perf.txt) than the standard sort, which is true. However, if this PR gets merged, things might [change](https://gist.github.com/stjepang/b9f0c3eaa0e1f1280b61b963dae19a30) a bit. ;)	2016-12-09 10:00:25 +00:00
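The core idea described in the #38192 message (find already-sorted runs, then merge them without galloping) can be sketched roughly as follows. This is not the code from the PR: the function names are hypothetical, it requires `T: Clone` instead of using unsafe code, it merges runs pairwise rather than following a TimSort-like merge schedule, and it omits the insertion-sort step for short runs.

```rust
/// Stable sort that detects natural runs and then merges them.
/// Illustrative sketch only; not the standard library implementation.
pub fn natural_merge_sort<T: Ord + Clone>(v: &mut [T]) {
    // Step 1: split the input into sorted runs, recording each run's end index.
    let mut ends = Vec::new();
    let mut start = 0;
    while start < v.len() {
        let mut end = start + 1;
        if end < v.len() && v[end] < v[start] {
            // Strictly descending run: reversing it cannot reorder equal elements.
            while end < v.len() && v[end] < v[end - 1] {
                end += 1;
            }
            v[start..end].reverse();
        } else {
            // Non-descending run.
            while end < v.len() && v[end] >= v[end - 1] {
                end += 1;
            }
        }
        ends.push(end);
        start = end;
    }

    // Step 2: merge neighbouring runs pairwise until a single run remains.
    while ends.len() > 1 {
        let mut merged_ends = Vec::new();
        let mut lo = 0;
        let mut i = 0;
        while i < ends.len() {
            if i + 1 < ends.len() {
                merge(v, lo, ends[i], ends[i + 1]);
                lo = ends[i + 1];
                merged_ends.push(lo);
                i += 2;
            } else {
                // Odd run out: carry it over unchanged.
                merged_ends.push(ends[i]);
                i += 1;
            }
        }
        ends = merged_ends;
    }
}

/// Merges the sorted ranges `v[lo..mid]` and `v[mid..hi]`,
/// copying only the left run into a temporary buffer.
fn merge<T: Ord + Clone>(v: &mut [T], lo: usize, mid: usize, hi: usize) {
    let left: Vec<T> = v[lo..mid].to_vec();
    let (mut i, mut j, mut k) = (0, mid, lo);
    while i < left.len() && j < hi {
        // Take from the right run only when strictly smaller; this keeps the sort stable.
        if v[j] < left[i] {
            v[k] = v[j].clone();
            j += 1;
        } else {
            v[k] = left[i].clone();
            i += 1;
        }
        k += 1;
    }
    // Any remaining right-run elements are already in place.
    while i < left.len() {
        v[k] = left[i].clone();
        i += 1;
        k += 1;
    }
}
```

On fully ascending or strictly descending input, step 1 finds a single run and step 2 does nothing, which matches the intuition for why run detection helps on partially sorted data.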
Stjepan Glavina	c8d73ea68a	Implement a faster sort algorithm This is a complete rewrite of the standard sort algorithm. The new algorithm is a simplified variant of TimSort. In summary, the changes are: * Improved performance, especially on partially sorted inputs. * Performs fewer comparisons on both random and partially sorted inputs. * Decreased the size of temporary memory: the new sort allocates 4x less.	2016-12-07 21:35:07 +01:00
Ulrik Sverdrup	343b4c321d	collections: Simplify VecDeque::is_empty Improve is_empty on the VecDeque and its iterators by just comparing tail and head; this saves a few instructions (to be able to remove the `& (size - 1)` computation, it would have to know that size is a power of two).	2016-12-04 15:46:36 +01:00
Clar Charr	cbf734f9ab	Add String::split_off.	2016-11-30 23:24:57 -05:00
Corey Farwell	274777a158	Rename 'librustc_unicode' crate to 'libstd_unicode'. Fixes #26554.	2016-11-30 01:24:01 -05:00
bors	217f57c0b5	Auto merge of #37943 - bluss:exact-is-empty, r=alexcrichton Implement better .is_empty() for slice and vec iterators These iterators can use a pointer comparison instead of computing the length.	2016-11-24 03:37:44 -06:00
Ulrik Sverdrup	74cde120e5	core, collections: Implement better .is_empty() for slice and vec iterators These iterators can use a pointer comparison instead of computing the length.	2016-11-23 02:31:41 +01:00
bors	fc2373c5a2	Auto merge of #37888 - bluss:chars-count, r=alexcrichton Improve .chars().count() Use a simpler loop to count the `char` of a string: count the number of non-continuation bytes. Use `count += <conditional>` which the compiler understands well and can apply loop optimizations to. benchmark descriptions and results for two configurations: - ascii: ascii text - cy: cyrillic text - jp: japanese text - words ascii: counting each split_whitespace item from the ascii text - words jp: counting each split_whitespace item from the jp text ``` x86-64 rustc -Copt-level=3 name orig_ ns/iter cmov_ ns/iter diff ns/iter diff % count_ascii 1,453 (1755 MB/s) 1,398 (1824 MB/s) -55 -3.79% count_cy 5,990 (856 MB/s) 2,545 (2016 MB/s) -3,445 -57.51% count_jp 3,075 (1169 MB/s) 1,772 (2029 MB/s) -1,303 -42.37% count_words_ascii 4,157 (521 MB/s) 1,797 (1205 MB/s) -2,360 -56.77% count_words_jp 3,337 (1071 MB/s) 1,772 (2018 MB/s) -1,565 -46.90% x86-64 rustc -Ctarget-feature=+avx -Copt-level=3 name orig_ ns/iter cmov_ ns/iter diff ns/iter diff % count_ascii 1,444 (1766 MB/s) 763 (3343 MB/s) -681 -47.16% count_cy 5,871 (874 MB/s) 1,527 (3360 MB/s) -4,344 -73.99% count_jp 2,874 (1251 MB/s) 1,073 (3351 MB/s) -1,801 -62.67% count_words_ascii 4,131 (524 MB/s) 1,871 (1157 MB/s) -2,260 -54.71% count_words_jp 3,253 (1099 MB/s) 1,331 (2686 MB/s) -1,922 -59.08% ``` I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time), but the code in this PR was always winning `count_words_ascii` in particular (counting many small strings); this solution is an improvement without tradeoffs.	2016-11-20 17:06:53 -06:00
Oliver Middleton	9e86e18092	Optimise CharIndices::last() The default implementation of last() goes through the entire iterator but that's not needed here.	2016-11-20 00:37:48 +00:00
Ulrik Sverdrup	5a3aa2f73c	str: Improve .chars().count() Use a simpler loop to count the `char` of a string: count the number of non-continuation bytes. Use `count += <conditional>` which the compiler understands well and can apply loop optimizations to.	2016-11-19 23:46:39 +01:00
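The counting approach from the `.chars().count()` commit can be illustrated with a minimal sketch. The function name `count_chars` is hypothetical and this is not the library's code; it relies only on the fact that UTF-8 continuation bytes have the bit pattern `10xxxxxx`.

```rust
/// Counts the `char`s in a string by counting the bytes that are
/// not UTF-8 continuation bytes. Illustrative sketch only.
pub fn count_chars(s: &str) -> usize {
    let mut count = 0;
    for &byte in s.as_bytes() {
        // `count += <conditional>`: a branch-free form the optimizer handles well.
        count += ((byte & 0xC0) != 0x80) as usize;
    }
    count
}
```

Every `char` contributes exactly one non-continuation byte (its leading byte), so the result agrees with `s.chars().count()` for any valid `&str`.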
Oliver Middleton	de2f61740d	Optimise Chars::last() The default implementation of last() goes through the entire iterator but that's not needed here.	2016-11-19 18:43:41 +00:00
Oliver Middleton	1e40c80cf5	Fix issues with the Add/AddAssign impls for Cow<str> * Correct the stability attributes. * Make Add and AddAssign actually behave the same. * Use String::with_capacity when allocating a new string. * Fix the tests.	2016-11-04 01:07:00 +00:00
bors	0da37c585e	Auto merge of #37212 - srinivasreddy:libcollectionstest, r=nrc run rustfmt on libcollectionstest	2016-10-27 22:02:31 -07:00
Srinivas Reddy Thatiparthy	e820a866bc	run rustfmt on libcollectionstest	2016-10-25 21:59:22 +05:30
Simon Sapin	7e603d4e3b	Implement `From<Cow<str>> for String` and `From<Cow<[T]>> for Vec<T>`. Motivation: the `selectors` crate is generic over a string type, in order to support all of `String`, `string_cache::Atom`, and `gecko_string_cache::Atom`. Multiple trait bounds are used for the various operations done with these strings. One of these operations is creating a string (as efficiently as possible, re-using an existing memory allocation if possible) from `Cow<str>`. The `std::convert::From` trait seems natural for this, but the relevant implementation was missing before this PR. To work around this I’ve added a `FromCowStr` trait in `selectors`, but with trait coherence that means one of `selectors` or `string_cache` needs to depend on the other to implement this trait. Using a trait from `std` would solve this. The `Vec<T>` implementation is just added for consistency. I also tried a more general `impl<'a, O, B: ?Sized + ToOwned<Owned=O>> From<Cow<'a, B>> for O`, but (the compiler thinks?) it conflicts with the `From<T> for T` impl (after moving all of `collections::borrow` into `core::borrow` to work around trait coherence).	2016-10-21 17:42:29 +02:00
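A short sketch of how the `From<Cow<str>>` conversion can be used from generic code. The helper names are hypothetical; only the `From` impls named in the commit message are assumed.

```rust
use std::borrow::Cow;

/// Builds any string-like type that can be created from a `Cow<str>`.
/// With the impls from this commit, `S = String` satisfies the bound.
pub fn make_string<'a, S: From<Cow<'a, str>>>(text: Cow<'a, str>) -> S {
    S::from(text)
}

/// The `Vec<T>` counterpart, added for consistency.
pub fn make_vec(items: Cow<[u8]>) -> Vec<u8> {
    Vec::from(items)
}
```

An owned `Cow` can hand over its existing allocation, while a borrowed one has to be copied into a new `String` or `Vec`.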
bors	07b86d0d4d	Auto merge of #37162 - matklad:static-mut-lint, r=jseyfried Lint against lowercase static mut Closes #37145. Lint for non mut statics was added in https://github.com/rust-lang/rust/pull/7523, and it explicitly did not cover mut statics. I am not sure why.	2016-10-17 04:32:15 -07:00
Aleksey Kladov	72399f2db7	Rename static mut to upper case	2016-10-14 17:21:11 +03:00
bors	17af6b94b2	Auto merge of #36743 - SimonSapin:dedup-by, r=alexcrichton Add Vec::dedup_by and Vec::dedup_by_key	2016-10-13 19:56:53 -07:00
Guillaume Gomez	0b7fe4d67c	Rollup merge of #36699 - bluss:repeat-str, r=alexcrichton Add method str::repeat(self, usize) -> String It is relatively simple to repeat a string n times: `(0..n).map(|_| s).collect::<String>()`. It becomes slightly more complicated to do it “right” (sizing the allocation up front), which warrants a method that does it for us. This method is useful in writing testcases, or when generating text. `format!()` can be used to repeat single characters, but not repeating strings like this.	2016-10-11 17:51:26 +02:00
Simon Sapin	be34bac1ab	Add Vec::dedup_by and Vec::dedup_by_key	2016-10-11 14:39:14 +02:00
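For reference, a small usage sketch of `Vec::dedup_by` and `Vec::dedup_by_key`. The wrapper function names are hypothetical; the two methods themselves are the ones this commit adds.

```rust
/// Removes consecutive words that are equal when ASCII case is ignored.
pub fn dedup_ignore_case(mut words: Vec<&str>) -> Vec<&str> {
    // When the closure returns true, the later of the two elements is removed.
    words.dedup_by(|a, b| a.eq_ignore_ascii_case(b));
    words
}

/// Removes consecutive numbers that fall into the same group of ten.
pub fn dedup_by_tens(mut nums: Vec<i32>) -> Vec<i32> {
    nums.dedup_by_key(|n| *n / 10);
    nums
}
```

As with `Vec::dedup`, only adjacent duplicates are removed, so a value that reappears later in the vector is kept.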
Ulrik Sverdrup	2b7222d3ec	Add method str::repeat(self, usize) -> String It is relatively simple to repeat a string n times: `(0..n).map(|_| s).collect::<String>()`. It becomes slightly more complicated to do it “right” (sizing the allocation up front), which warrants a method that does it for us. This method is useful in writing testcases, or when generating text. `format!()` can be used to repeat single characters, but not repeating strings like this.	2016-10-11 00:24:23 +02:00
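The two approaches contrasted in the `str::repeat` message can be sketched side by side. Both helper names are hypothetical, and the pre-sized version only illustrates the idea of sizing the allocation up front; it is not the library's implementation.

```rust
/// The simple version from the commit message: collect `n` copies.
pub fn repeat_naive(s: &str, n: usize) -> String {
    (0..n).map(|_| s).collect::<String>()
}

/// Doing it "right": reserve the full capacity once, then append.
pub fn repeat_presized(s: &str, n: usize) -> String {
    let mut out = String::with_capacity(s.len() * n);
    for _ in 0..n {
        out.push_str(s);
    }
    out
}
```

With the method from this commit, the same result is available as `s.repeat(n)`.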
bors	7a26aeca77	Auto merge of #36815 - alexcrichton:stabilize-1.13, r=aturon std: Stabilize and deprecate APIs for 1.13 This commit is intended to be backported to the 1.13 branch, and works with the following APIs: Stabilized * `i32::checked_abs` * `i32::wrapping_abs` * `i32::overflowing_abs` * `RefCell::try_borrow` * `RefCell::try_borrow_mut` Deprecated * `BinaryHeap::push_pop` * `BinaryHeap::replace` * `SipHash13` * `SipHash24` * `SipHasher` - use `DefaultHasher` instead in the `std::collections::hash_map` module Closes #28147 Closes #34767 Closes #35057 Closes #35070	2016-10-03 11:00:03 -07:00
Alex Crichton	10c3134da0	std: Stabilize and deprecate APIs for 1.13 This commit is intended to be backported to the 1.13 branch, and works with the following APIs: Stabilized * `i32::checked_abs` * `i32::wrapping_abs` * `i32::overflowing_abs` * `RefCell::try_borrow` * `RefCell::try_borrow_mut` * `DefaultHasher` * `DefaultHasher::new` * `DefaultHasher::default` Deprecated * `BinaryHeap::push_pop` * `BinaryHeap::replace` * `SipHash13` * `SipHash24` * `SipHasher` - use `DefaultHasher` instead in the `std::collections::hash_map` module Closes #28147 Closes #34767 Closes #35057 Closes #35070	2016-10-03 10:34:34 -07:00
Brian Anderson	9c4a01ee9e	Ignore lots and lots of std tests on emscripten	2016-09-30 14:02:48 -07:00
bors	c717cfa7c1	Auto merge of #36430 - llogiq:cow_add, r=alexcrichton impl Add<{str, Cow<str>}> for Cow<str> cc #35837	2016-09-29 15:50:32 -07:00
Andre Bogus	dd13a80344	impl {Add, AddAssign}<{str, Cow<str>}> for Cow<str> This does not actually add anything that wasn't there; it is merely an optimization for the given cases, which would otherwise have incurred an additional heap allocation when adding empty strings, and it improves the ergonomics of `Cow` with strings.	2016-09-29 14:56:58 +02:00
tormol	13a2dd96fe	[breaking-change] std: change `encode_utf{8,16}()` to take a buffer and return a slice They panic if the buffer is too small.	2016-09-28 09:03:30 +02:00
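A brief sketch of the buffer-taking style introduced by the `encode_utf{8,16}()` change. It is written against the current stable signatures (`char::encode_utf8` returning `&mut str`, `char::encode_utf16` returning `&mut [u16]`), which may differ in detail from the 2016 version; the helper names are hypothetical.

```rust
/// Number of bytes `c` occupies when encoded as UTF-8.
pub fn utf8_len(c: char) -> usize {
    // Four bytes is enough for any `char`; a smaller buffer could panic.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).len()
}

/// Number of 16-bit units `c` occupies when encoded as UTF-16.
pub fn utf16_len(c: char) -> usize {
    // Two units is enough for any `char`.
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf).len()
}
```

The caller owns the buffer, so encoding needs no heap allocation, and the returned slice covers only the part of the buffer that was written.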
Simon Sapin	f14f4db6e8	Move Vec::dedup tests from slice.rs to vec.rs	2016-09-26 18:17:38 +02:00
Simon Sapin	dc973417a8	Remove duplicate test. test_dedup_shared has been exactly the same as test_dedup_unique since `6f16df4aa`, three years ago.	2016-09-26 18:15:50 +02:00
knight42	ebda77072a	Add tests for str::replacen	2016-09-13 10:16:31 +08:00
Andrew Paseltiner	ef4952e739	Address FIXME in libcollectionstest/btree/set.rs	2016-08-28 18:52:21 -04:00
Alex Crichton	afeeadeae5	std: Stabilize APIs for the 1.12 release Stabilized * `Cell::as_ptr` * `RefCell::as_ptr` * `IpAddr::is_{unspecified,loopback,multicast}` * `Ipv6Addr::octets` * `LinkedList::contains` * `VecDeque::contains` * `ExitStatusExt::from_raw` - both on Unix and Windows * `Receiver::recv_timeout` * `RecvTimeoutError` * `BinaryHeap::peek_mut` * `PeekMut` * `iter::Product` * `iter::Sum` * `OccupiedEntry::remove_entry` * `VacantEntry::into_key` Deprecated * `Cell::as_unsafe_cell` * `RefCell::as_unsafe_cell` * `OccupiedEntry::remove_pair` Closes #27708 cc #27709 Closes #32313 Closes #32630 Closes #32713 Closes #34029 Closes #34392 Closes #34285 Closes #34529	2016-08-19 11:59:56 -07:00
bors	7ac11cad3f	Auto merge of #35747 - jonathandturner:rollup, r=jonathandturner Rollup of 23 pull requests - Successful merges: #34370, #35415, #35595, #35610, #35613, #35614, #35621, #35660, #35663, #35670, #35671, #35672, #35681, #35686, #35690, #35695, #35707, #35708, #35713, #35722, #35725, #35726, #35731 - Failed merges: #35395	2016-08-17 09:49:34 -07:00
bors	76fa5875c6	Auto merge of #35733 - apasel422:issue-35721, r=alexcrichton Make `vec::IntoIter` covariant again Closes #35721 r? @alexcrichton	2016-08-17 06:25:56 -07:00
Jonathan Turner	3dd060f065	Rollup merge of #35707 - frewsxcv:vec-into-iter-debug, r=alexcrichton Implement `Debug` for `std::vec::IntoIter`. Display all the remaining items of the iterator, similar to the `Debug` implementation for `core::slice::Iter`: `f0bab98695/src/libcore/slice.rs (L930-L937)` Using the `as_slice` method that was added in: https://github.com/rust-lang/rust/pull/35447	2016-08-17 06:25:26 -07:00
bors	9376da6f77	Auto merge of #35559 - frewsxcv:slice-iter-as-ref, r=alexcrichton Implement `AsRef<[T]>` for `std::slice::Iter`. `AsRef` is designed for conversions that are "cheap" (as per the API docs). It is the case that retrieving the underlying data of `std::slice::Iter` is cheap. In my opinion, there's no ambiguity about what slice data will be returned, otherwise, I would be more cautious about implementing `AsRef`.	2016-08-16 19:44:10 -07:00
Andrew Paseltiner	7e148cd062	Make `vec::IntoIter` covariant again Closes #35721	2016-08-16 20:45:07 -04:00
bors	514d4cef24	Auto merge of #35354 - tomgarcia:covariant-drain, r=alexcrichton Made vec_deque::Drain, hash_map::Drain, and hash_set::Drain covariant Fixed the rest of the Drain iterators.	2016-08-16 13:26:15 -07:00
Corey Farwell	dc22186efb	Add basic unit test for `std::slice::Iter::as_slice`.	2016-08-16 11:20:43 -04:00
Corey Farwell	3808dc3560	Implement `AsRef<[T]>` for `std::slice::Iter`. `AsRef` is designed for conversions that are "cheap" (as per the API docs). It is the case that retrieving the underlying data of `std::slice::Iter` is cheap. In my opinion, there's no ambiguity about what slice data will be returned, otherwise, I would be more cautious about implementing `AsRef`.	2016-08-16 11:14:52 -04:00
Corey Farwell	bc52bdcedc	Implement `Debug` for `std::vec::IntoIter`. Display all the remaining items of the iterator, similar to the `Debug` implementation for `core::slice::Iter`: `f0bab98695/src/libcore/slice.rs (L930-L937)` Using the `as_slice` method that was added in: https://github.com/rust-lang/rust/pull/35447	2016-08-15 23:45:12 -04:00
Corey Farwell	01a766e521	Introduce `as_mut_slice` method on `std::vec::IntoIter` struct.	2016-08-11 16:49:01 -04:00
Corey Farwell	d099e30e48	Introduce `as_slice` method on `std::vec::IntoIter` struct. Similar to the `as_slice` method on `core::slice::Iter` struct.	2016-08-11 16:48:43 -04:00
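A minimal usage sketch for the `as_slice` and `as_mut_slice` methods on `std::vec::IntoIter`; the helper function names are hypothetical.

```rust
/// Returns the items that remain after consuming the first one.
pub fn remaining_after_first(v: Vec<i32>) -> Vec<i32> {
    let mut iter = v.into_iter();
    iter.next();
    // `as_slice` views the not-yet-yielded items without consuming them.
    iter.as_slice().to_vec()
}

/// Doubles the remaining items in place before collecting them.
pub fn double_remaining(v: Vec<i32>) -> Vec<i32> {
    let mut iter = v.into_iter();
    iter.next();
    for x in iter.as_mut_slice() {
        *x *= 2;
    }
    iter.collect()
}
```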
Thomas Garcia	bf592cefde	Made vec_deque::Drain, hash_map::Drain, and hash_set::Drain covariant	2016-08-04 21:33:57 -07:00
Manish Goregaokar	96e3972707	Rollup merge of #35049 - knight42:add-test, r=alexcrichton Add a test for AddAssign on String Fix #35047	2016-07-30 13:44:46 +05:30
bors	d1df3fecdf	Auto merge of #34485 - tbu-:pr_unicode_debug_str, r=alexcrichton Escape fewer Unicode codepoints in `Debug` impl of `str` Use the same procedure as Python to determine whether a character is printable, described in [PEP 3138]. In particular, this means that the following character classes are escaped: - Cc (Other, Control) - Cf (Other, Format) - Cs (Other, Surrogate), even though they can't appear in Rust strings - Co (Other, Private Use) - Cn (Other, Not Assigned) - Zl (Separator, Line) - Zp (Separator, Paragraph) - Zs (Separator, Space), except for the ASCII space `' '` `0x20` This allows for user-friendly inspection of strings that are not English (e.g. compare `"\u{e9}\u{e8}\u{ea}"` to `"éèê"`). Fixes #34318. CC #34422. [PEP 3138]: https://www.python.org/dev/peps/pep-3138/	2016-07-28 11:20:33 -07:00
Tobias Bucher	3d09b4a0d5	Rename `char::escape` to `char::escape_debug` and add tracking issue	2016-07-28 02:20:49 +02:00
Knight	6ac83de691	Add test for string AddAssign	2016-07-28 06:08:56 +08:00

214 commits