Some i128 tests
* Add some FFI tests for i128 on architectures where we have sort of working "C" FFI support. On all other architectures we ignore the test.
* enhance the u128 overflow tests
Fix transmute::<T, U> where T requires a bigger alignment than U
For transmute::<T, U> we simply pointercast the destination from a U
pointer to a T pointer, without providing any alignment information,
thus LLVM assumes that the destination is aligned to hold a value of
type T, which is not necessarily true. This can lead to LLVM emitting
machine instructions that assume said alignment, and thus cause aborts.
To fix this, we need to provide the actual alignment to store_operand()
and in turn to store() so they can set the proper alignment information
on the stores and LLVM can emit the proper machine instructions.
Fixes#32947
prefer hyphens in test files named after issue numbers
We have a lot of tests with filenames honoring particular issues by
number. Typically, these are called issue-${issue_no}.rs (note the
hyphen):
```
$ find . -regextype posix-egrep -regex '.*/issue-[0-9]*.rs' | wc
1289 1289 35935
```
We also had a much smaller number of files that are like this, but don't
have a hyphen in between the substring `issue` and the number:
```
$ find . -regextype posix-egrep -regex '.*/issue[0-9]*.rs'
./debuginfo/issue14411.rs
./debuginfo/issue12886.rs
./debuginfo/issue13213.rs
./debuginfo/issue22656.rs
./debuginfo/issue7712.rs
./compile-fail/issue32829.rs
./run-pass/issue24353.rs
./run-pass/issue34796.rs
./run-pass/issue18173.rs
./run-pass/issue22346.rs
./run-pass/auxiliary/issue13507.rs
./run-pass/issue26127.rs
./run-pass/issue22008.rs
./run-pass/issue34569.rs
./run-pass/issue29927.rs
./run-pass/issue36260.rs
```
Some would argue that the inconsistency is æsthetically displeasing,
hence this trivial patch. (Note that run-pass/auxiliary/issue13507.rs
has an excuse; it's `use`d in run-pass/issue-13507-2.rs; the matter of
there being two different compile-fail tests with different name
conventions for issue no. 32829 is also neglected here for the sake of
keeping this trivial cleanup patch as trivial as possible for ease of
review.)
For transmute::<T, U> we simply pointercast the destination from a U
pointer to a T pointer, without providing any alignment information,
thus LLVM assumes that the destination is aligned to hold a value of
type T, which is not necessarily true. This can lead to LLVM emitting
machine instructions that assume said alignment, and thus cause aborts.
To fix this, we need to provide the actual alignment to store_operand()
and in turn to store() so they can set the proper alignment information
on the stores and LLVM can emit the proper machine instructions.
Fixes#32947
adaptation to rustbuild for openbsd
Since the switch to rustbuild, the build for openbsd is broken:
- [X] `ar` inference based on compiler name is wrong (OpenBSD usually use `egcc`, but `ear` doesn't exist)
- [X] `make` isn't GNU-make under OpenBSD (and others BSD platforms)
- [x] `stdc++` isn't the right stdc++ library to link with (it should be `estdc++`)
- [x] corrects tests that don't pass anymore (problems related to rustbuild)
r? @alexcrichton
We have a lot of tests with filenames honoring particular issues by
number. Typically, these are called issue-${issue_no}.rs (note the
hyphen):
```
$ find . -regextype posix-egrep -regex '.*/issue-[0-9]*.rs' | wc
1289 1289 35935
```
We also had a much smaller number of files that are like this, but don't
have a hyphen in between the substring `issue` and the number:
```
$ find . -regextype posix-egrep -regex '.*/issue[0-9]*.rs'
./debuginfo/issue14411.rs
./debuginfo/issue12886.rs
./debuginfo/issue13213.rs
./debuginfo/issue22656.rs
./debuginfo/issue7712.rs
./compile-fail/issue32829.rs
./run-pass/issue24353.rs
./run-pass/issue34796.rs
./run-pass/issue18173.rs
./run-pass/issue22346.rs
./run-pass/auxiliary/issue13507.rs
./run-pass/issue26127.rs
./run-pass/issue22008.rs
./run-pass/issue34569.rs
./run-pass/issue29927.rs
./run-pass/issue36260.rs
```
Some would argue that the inconsistency is æsthetically displeasing,
hence this trivial patch. (Note that run-pass/auxiliary/issue13507.rs
has an excuse; it's `use`d in run-pass/issue-13507-2.rs; the matter of
there being two different compile-fail tests with different name
conventions for issue #32829 is also neglected here for the sake of
keeping this trivial cleanup patch as trivial as possible for ease of
review.)
struct field reordering and optimization
This is work in progress. The goal is to divorce the order of fields in source code from the order of fields in the LLVM IR, then optimize structs (and tuples/enum variants)by always ordering fields from least to most aligned. It does not work yet. I intend to check compiler memory usage as a benchmark, and a crater run will probably be required.
I don't know enough of the compiler to complete this work unaided. If you see places that still need updating, please mention them. The only one I know of currently is debuginfo, which I'm putting off intentionally until a bit later.
r? @eddyb
- The discriminant must be first in all variants.
- The loop responsible for patching enum variants when the discriminant is enlarged was nonfunctional.
erase lifetimes when translating specialized substs
Projections can generate lifetime variables with equality constraints,
that will not be resolved by `resolve_type_vars_if_possible`, so substs
need to be lifetime-erased after that.
Fixes#36848.
Consider provided trait methods in middle::reachable
Fixes https://github.com/rust-lang/rust/issues/38226 by also considering trait methods with default implementation instead of just methods provided in an impl.
r? @alexcrichton
cc @panicbit
Implement a faster sort algorithm
Hi everyone, this is my first PR.
I've made some changes to the standard sort algorithm, starting out with a few tweaks here and there, but in the end this endeavour became a complete rewrite of it.
#### Summary
Changes:
* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.
Benchmark:
```
name out1 ns/iter out2 ns/iter diff ns/iter diff %
slice::bench::sort_large_ascending 85,323 (937 MB/s) 8,970 (8918 MB/s) -76,353 -89.49%
slice::bench::sort_large_big_ascending 2,135,297 (599 MB/s) 355,955 (3595 MB/s) -1,779,342 -83.33%
slice::bench::sort_large_big_descending 2,266,402 (564 MB/s) 416,479 (3073 MB/s) -1,849,923 -81.62%
slice::bench::sort_large_big_random 3,053,031 (419 MB/s) 1,921,389 (666 MB/s) -1,131,642 -37.07%
slice::bench::sort_large_descending 313,181 (255 MB/s) 14,725 (5432 MB/s) -298,456 -95.30%
slice::bench::sort_large_mostly_ascending 287,706 (278 MB/s) 243,204 (328 MB/s) -44,502 -15.47%
slice::bench::sort_large_mostly_descending 415,078 (192 MB/s) 271,028 (295 MB/s) -144,050 -34.70%
slice::bench::sort_large_random 545,872 (146 MB/s) 521,559 (153 MB/s) -24,313 -4.45%
slice::bench::sort_large_random_expensive 30,321,770 (2 MB/s) 23,533,735 (3 MB/s) -6,788,035 -22.39%
slice::bench::sort_medium_ascending 616 (1298 MB/s) 155 (5161 MB/s) -461 -74.84%
slice::bench::sort_medium_descending 1,952 (409 MB/s) 202 (3960 MB/s) -1,750 -89.65%
slice::bench::sort_medium_random 3,646 (219 MB/s) 3,421 (233 MB/s) -225 -6.17%
slice::bench::sort_small_ascending 39 (2051 MB/s) 34 (2352 MB/s) -5 -12.82%
slice::bench::sort_small_big_ascending 96 (13333 MB/s) 96 (13333 MB/s) 0 0.00%
slice::bench::sort_small_big_descending 248 (5161 MB/s) 243 (5267 MB/s) -5 -2.02%
slice::bench::sort_small_big_random 501 (2554 MB/s) 490 (2612 MB/s) -11 -2.20%
slice::bench::sort_small_descending 95 (842 MB/s) 63 (1269 MB/s) -32 -33.68%
slice::bench::sort_small_random 372 (215 MB/s) 354 (225 MB/s) -18 -4.84%
```
#### Background
First, let me just do a quick brain dump to discuss what I learned along the way.
The official documentation says that the standard sort in Rust is a stable sort. This constraint is thus set in stone and immediately rules out many popular sorting algorithms. Essentially, the only algorithms we might even take into consideration are:
1. [Merge sort](https://en.wikipedia.org/wiki/Merge_sort)
2. [Block sort](https://en.wikipedia.org/wiki/Block_sort) (famous implementations are [WikiSort](https://github.com/BonzaiThePenguin/WikiSort) and [GrailSort](https://github.com/Mrrl/GrailSort))
3. [TimSort](https://en.wikipedia.org/wiki/Timsort)
Actually, all of those are just merge sort flavors. :) The current standard sort in Rust is a simple iterative merge sort. It has three problems. First, it's slow on partially sorted inputs (even though #29675 helped quite a bit). Second, it always makes around `log(n)` iterations copying the entire array between buffers, no matter what. Third, it allocates huge amounts of temporary memory (a buffer of size `2*n`, where `n` is the size of input).
The problem of auxilliary memory allocation is a tough one. Ideally, it would be best for our sort to allocate `O(1)` additional memory. This is what block sort (and it's variants) does. However, it's often very complicated (look at [this](https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.cpp)) and even then performs rather poorly. The author of WikiSort claims good performance, but that must be taken with a grain of salt. It performs well in comparison to `std::stable_sort` in C++. It can even beat `std::sort` on partially sorted inputs, but on random inputs it's always far worse. My rule of thumb is: high performance, low memory overhead, stability - choose two.
TimSort is another option. It allocates a buffer of size `n/2`, which is not great, but acceptable. Performs extremelly well on partially sorted inputs. However, it seems pretty much all implementations suck on random inputs. I benchmarked implementations in [Rust](https://github.com/notriddle/rust-timsort), [C++](https://github.com/gfx/cpp-TimSort), and [D](fd518eb310/std/algorithm/sorting.d (L2062)). The results were a bit disappointing. It seems bad performance is due to complex galloping procedures in hot loops. Galloping noticeably improves performance on partially sorted inputs, but worsens it on random ones.
#### The new algorithm
Choosing the best algorithm is not easy. Plain merge sort is bad on partially sorted inputs. TimSort is bad on random inputs and block sort is even worse. However, if we take the main ideas from TimSort (intelligent merging strategy of sorted runs) and drop galloping, then we'll have great performance on random inputs and it won't be bad on partially sorted inputs either.
That is exactly what this new algorithm does. I can't call it TimSort, since it steals just a few of it's ideas. Complete TimSort would be a much more complex and elaborate implementation. In case we in the future figure out how to incorporate more of it's ideas into this implementation without crippling performance on random inputs, it's going to be very easy to extend. I also did several other minor improvements, like reworked insertion sort to make it faster.
There are also new, more thorough benchmarks and panic safety tests.
The final code is not terribly complex and has less unsafe code than I anticipated, but there's still plenty of it that should be carefully reviewed. I did my best at documenting non-obvious code.
I'd like to notify several people of this PR, since they might be interested and have useful insights:
1. @huonw because he wrote the [original merge sort](https://github.com/rust-lang/rust/pull/11064).
2. @alexcrichton because he was involved in multiple discussions of it.
3. @veddan because he wrote [introsort](https://github.com/veddan/rust-introsort) in Rust.
4. @notriddle because he wrote [TimSort](https://github.com/notriddle/rust-timsort) in Rust.
5. @bluss because he had an attempt at writing WikiSort in Rust.
6. @gnzlbg, @rkruppe, and @mark-i-m because they were involved in discussion #36318.
**P.S.** [quickersort](https://github.com/notriddle/quickersort) describes itself as being universally [faster](https://github.com/notriddle/quickersort/blob/master/perf.txt) than the standard sort, which is true. However, if this PR gets merged, things might [change](https://gist.github.com/stjepang/b9f0c3eaa0e1f1280b61b963dae19a30) a bit. ;)
This is a complete rewrite of the standard sort algorithm. The new algorithm
is a simplified variant of TimSort. In summary, the changes are:
* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.