rust/library
Stuart Cook a6e8a31b86
Rollup merge of #151611 - bonega:improve-is-slice-is-ascii-performance, r=folkertdev
Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics

# Summary

Improves `slice::is_ascii` performance for SSE2 target roughly 1.5-2x on larger inputs.
AVX-512 keeps similiar performance characteristics.

This is building on the work already merged in rust-lang/rust#151259.
In particular this PR improves the default SSE2 performance, I don't consider this a temporary fix anymore.
Thanks to @folkertdev for pointing me to consider `as_chunk` again.

# The implementation:
- Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together
- Extracts the MSB mask with a single `pmovmskb` instruction
- Falls back to usize-at-a-time SWAR for inputs < 64 bytes

# Performance impact (vs before rust-lang/rust#151259):
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

  <details>
  <summary>Benchmark Results (click to expand)</summary>

  Benchmarked on AMD Ryzen 9 9950X (AVX-512 capable). Values show relative performance (1.00 = fastest).
  Tops out at 139GB/s for large inputs.

  ### early_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | 1.01 | **1.00** | 13.45 | 1.13 |
  | 1024 | 1.01 | **1.00** | 13.53 | 1.14 |
  | 65536 | 1.01 | **1.00** | 13.99 | 1.12 |
  | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 |

  ### late_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | **1.00** | 1.01 | 13.37 | 1.13 |
  | 1024 | 1.10 | **1.00** | 42.42 | 1.95 |
  | 65536 | **1.00** | 1.06 | 42.22 | 1.73 |
  | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 |

  ### pure_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 4 | 1.03 | **1.00** | 1.75 | 1.32 |
  | 8 | **1.00** | 1.14 | 3.89 | 2.06 |
  | 16 | **1.00** | 1.04 | 1.13 | 1.62 |
  | 32 | 1.07 | 1.19 | 5.11 | **1.00** |
  | 64 | **1.00** | 1.13 | 13.32 | 1.57 |
  | 128 | **1.00** | 1.01 | 19.97 | 1.55 |
  | 256 | **1.00** | 1.02 | 27.77 | 1.61 |
  | 1024 | **1.00** | 1.02 | 41.34 | 1.84 |
  | 4096 | 1.02 | **1.00** | 45.61 | 1.98 |
  | 16384 | 1.01 | **1.00** | 48.67 | 2.04 |
  | 65536 | **1.00** | 1.03 | 43.86 | 1.77 |
  | 262144 | **1.00** | 1.06 | 41.44 | 1.79 |
  | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 |

  </details>

## Reproduction / Test Projects

Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

- `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
- `fuzz/` - Compares old/new implementations with libfuzzer

Relates to: https://github.com/llvm/llvm-project/issues/176906
2026-01-26 14:36:21 +11:00
..
alloc Auto merge of #151337 - the8472:bail-before-memcpy2, r=Mark-Simulacrum 2026-01-25 19:45:35 +00:00
alloctests Move assert_matches to planned stable path 2026-01-21 23:17:24 +01:00
backtrace@b65ab935fb Update the backtrace submodule 2025-06-16 07:00:13 +00:00
compiler-builtins compiler-builtins: Enable AArch64 __chkstk for MinGW 2026-01-08 05:01:35 -05:00
core Rollup merge of #151611 - bonega:improve-is-slice-is-ascii-performance, r=folkertdev 2026-01-26 14:36:21 +11:00
coretests Rollup merge of #151489 - bend-n:constify-all-boolean-methods-under-feature-gate-const-bool, r=jhpratt 2026-01-24 08:18:07 +01:00
panic_abort Use core via rustc-std-workspace-core in library/panic* 2025-07-31 22:47:24 +00:00
panic_unwind Fix new function_casts_as_integer lint errors in core, std, panic_unwind and compiler crates 2025-11-10 16:38:28 +01:00
portable-simd Remove more #[must_use] from portable-simd 2025-11-11 13:27:04 -08:00
proc_macro Fix review comments 2026-01-24 14:44:03 +00:00
profiler_builtins Fix profiler_builtins build script to handle full path to profiler lib 2025-04-11 16:57:38 +02:00
rtstartup Update cfg(bootstrap) 2025-07-01 10:55:49 -07:00
rustc-std-workspace-alloc Disable unit tests for stdlib packages that don't contain any 2025-07-24 09:15:28 +00:00
rustc-std-workspace-core Use core via rustc-std-workspace-core in library/panic* 2025-07-31 22:47:24 +00:00
rustc-std-workspace-std Disable unit tests for stdlib packages that don't contain any 2025-07-24 09:15:28 +00:00
std Rollup merge of #150842 - PaulDance:patches/fix-win7-sleep, r=Mark-Simulacrum 2026-01-25 07:42:59 +01:00
std_detect Replace version placeholders with 1.94 2026-01-20 21:17:10 -05:00
stdarch Revised yield hints 2026-01-24 17:29:25 +00:00
sysroot Remove std_detect_file_io and std_detect_dlsym_getauxval features 2026-01-08 14:50:12 +00:00
test Don't use default build-script fingerprinting in test 2026-01-23 11:10:53 -08:00
unwind Correct hexagon "unwinder_private_data_size" 2025-12-29 15:21:09 -06:00
windows_targets Rollup merge of #144399 - bjorn3:stdlib_tests_separate_packages, r=Mark-Simulacrum 2025-07-28 08:36:53 +02:00
Cargo.lock Update literal-escaper version to 0.0.7 2026-01-08 14:10:33 +01:00
Cargo.toml Rephrase embed-bitcode comment 2026-01-20 16:51:41 +00:00