Commit graph

13 commits

Author SHA1 Message Date
Nikita Popov
acb5ee2f84 Disable append-elements.rs test with debug assertions
The IR is a bit different (in particular wrt naming) if
debug-assertions-std is enabled. Peculiarly, the issue goes away
if overflow-check-std is also enabled, which is why CI did not
catch this.
2026-01-30 13:01:22 +01:00
Stuart Cook
1c892e829c
Rollup merge of #147436 - okaneco:eq_ignore_ascii_autovec, r=scottmcm
slice/ascii: Optimize `eq_ignore_ascii_case` with auto-vectorization

- Refactor the current functionality into a helper function
- Use `as_chunks` to encourage auto-vectorization in the optimized chunk processing function
- Add a codegen test checking for vectorization and no panicking
- Add benches for `eq_ignore_ascii_case`

---

The optimized function is initially only enabled for x86_64 which has `sse2` as part of its baseline, but none of the code is platform specific. Other platforms with SIMD instructions may also benefit from this implementation.

Performance improvements only manifest for slices of 16 bytes or longer, so the optimized path is gated behind a length check for greater than or equal to 16.

Benchmarks - Cases below 16 bytes are unaffected, cases above all show sizeable improvements.
```
before:
    str::eq_ignore_ascii_case::bench_large_str_eq         4942.30ns/iter +/- 48.20
    str::eq_ignore_ascii_case::bench_medium_str_eq         632.01ns/iter +/- 16.87
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        16.28ns/iter  +/- 0.45
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        35.23ns/iter  +/- 2.28
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq       7.56ns/iter  +/- 0.22
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq    2.64ns/iter  +/- 0.06
after:
    str::eq_ignore_ascii_case::bench_large_str_eq         611.63ns/iter +/- 28.29
    str::eq_ignore_ascii_case::bench_medium_str_eq         77.10ns/iter +/- 19.76
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        3.49ns/iter  +/- 0.39
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        3.50ns/iter  +/- 0.27
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq      7.27ns/iter  +/- 0.09
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq   2.60ns/iter  +/- 0.05
```
2026-01-27 17:36:35 +11:00
The 8472
2b8f4a562f avoid phi node for pointers flowing into Vec appends 2026-01-18 21:03:14 +01:00
Jieyou Xu
cd79ff2e2c
Revert "avoid phi node for pointers flowing into Vec appends #130998"
This reverts PR <https://github.com/rust-lang/rust/pull/130998> because
the added test seems to be flaky / non-deterministic, and has been
failing in unrelated PRs during merge CI.
2026-01-15 09:37:16 +08:00
The 8472
468eb45b3f avoid phi node for pointers flowing into Vec appends 2026-01-12 02:54:30 +01:00
Camille Gillot
a02dc3487a Bless codegen. 2025-12-14 20:34:56 +00:00
Stuart Cook
ac729a4b18
Rollup merge of #149207 - EFanZh:add-ilog10-result-range-hints, r=Mark-Simulacrum
Add `ilog10` result range hints

This PR adds hints that the return value of `T::ilog10` will never exceed `T::MAX.ilog10()`.

This works because `ilog10` is a monotonically nondecreasing function, the maximum return value is reached at the max input value.
2025-12-08 11:46:23 +11:00
Scott McMurray
6bd9d7625d Assume the returned value in .filter(…).count()
Similar to how this helps in `slice::Iter::position`, LLVM sometimes loses track of how high this can get, so for `TrustedLen` iterators tell it what the upper bound is.
2025-12-01 21:32:04 -08:00
EFanZh
4046385d60 Add ilog10 result range hints 2025-11-29 09:52:19 +08:00
okaneco
a5ba24843d Relocate bench and use str corpora for data
Add #[inline(always)] to inner function and check not for filecheck test
2025-10-08 14:42:33 -04:00
The 8472
99ab27f90c specialize slice::fill to use memset when possible
LLVM generally can do this on its own, but it helps miri and other backends.
2025-10-08 20:14:24 +02:00
okaneco
b031c51664 slice/ascii: Optimize eq_ignore_ascii_case with auto-vectorization
Refactor the current functionality into a helper function
Use `as_chunks` to encourage auto-vectorization in the optimized chunk processing function
Add a codegen test
Add benches for `eq_ignore_ascii_case`

The optimized function is initially only enabled for x86_64 which has `sse2` as
part of its baseline, but none of the code is platform specific. Other
platforms with SIMD instructions may also benefit from this implementation.

Performance improvements only manifest for slices of 16 bytes or longer, so the
optimized path is gated behind a length check for greater than or equal to 16.
2025-10-07 05:52:37 -04:00
Guillaume Gomez
a27f3e3fd1 Rename tests/codegen into tests/codegen-llvm 2025-07-22 14:28:48 +02:00