Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics # Summary Improves `slice::is_ascii` performance for SSE2 target roughly 1.5-2x on larger inputs. AVX-512 keeps similiar performance characteristics. This is building on the work already merged in rust-lang/rust#151259. In particular this PR improves the default SSE2 performance, I don't consider this a temporary fix anymore. Thanks to @folkertdev for pointing me to consider `as_chunk` again. # The implementation: - Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together - Extracts the MSB mask with a single `pmovmskb` instruction - Falls back to usize-at-a-time SWAR for inputs < 64 bytes # Performance impact (vs before rust-lang/rust#151259): - AVX-512: 34-48x faster - SSE2: 1.5-2x faster <details> <summary>Benchmark Results (click to expand)</summary> Benchmarked on AMD Ryzen 9 9950X (AVX-512 capable). Values show relative performance (1.00 = fastest). Tops out at 139GB/s for large inputs. ### early_non_ascii | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 | |------------|------------|----------|------------|----------| | 64 | 1.01 | **1.00** | 13.45 | 1.13 | | 1024 | 1.01 | **1.00** | 13.53 | 1.14 | | 65536 | 1.01 | **1.00** | 13.99 | 1.12 | | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 | ### late_non_ascii | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 | |------------|------------|----------|------------|----------| | 64 | **1.00** | 1.01 | 13.37 | 1.13 | | 1024 | 1.10 | **1.00** | 42.42 | 1.95 | | 65536 | **1.00** | 1.06 | 42.22 | 1.73 | | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 | ### pure_ascii | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 | |------------|------------|----------|------------|----------| | 4 | 1.03 | **1.00** | 1.75 | 1.32 | | 8 | **1.00** | 1.14 | 3.89 | 2.06 | | 16 | **1.00** | 1.04 | 1.13 | 1.62 | | 32 | 1.07 | 1.19 | 5.11 | **1.00** | | 64 | **1.00** | 1.13 | 13.32 | 1.57 | | 128 | **1.00** | 1.01 | 19.97 | 1.55 | | 256 | **1.00** | 1.02 | 27.77 | 1.61 | | 1024 | **1.00** | 1.02 | 41.34 | 1.84 | | 4096 | 1.02 | **1.00** | 45.61 | 1.98 | | 16384 | 1.01 | **1.00** | 48.67 | 2.04 | | 65536 | **1.00** | 1.03 | 43.86 | 1.77 | | 262144 | **1.00** | 1.06 | 41.44 | 1.79 | | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 | </details> ## Reproduction / Test Projects Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation - `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison - `fuzz/` - Compares old/new implementations with libfuzzer Relates to: https://github.com/llvm/llvm-project/issues/176906 |
||
|---|---|---|
| .github | ||
| compiler | ||
| library | ||
| LICENSES | ||
| src | ||
| tests | ||
| .clang-format | ||
| .editorconfig | ||
| .git-blame-ignore-revs | ||
| .gitattributes | ||
| .gitignore | ||
| .gitmodules | ||
| .ignore | ||
| .mailmap | ||
| bootstrap.example.toml | ||
| Cargo.lock | ||
| Cargo.toml | ||
| CODE_OF_CONDUCT.md | ||
| configure | ||
| CONTRIBUTING.md | ||
| COPYRIGHT | ||
| INSTALL.md | ||
| LICENSE-APACHE | ||
| license-metadata.json | ||
| LICENSE-MIT | ||
| package.json | ||
| README.md | ||
| RELEASES.md | ||
| REUSE.toml | ||
| rust-bors.toml | ||
| rustfmt.toml | ||
| triagebot.toml | ||
| typos.toml | ||
| x | ||
| x.ps1 | ||
| x.py | ||
| yarn.lock | ||
This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.
Why Rust?
-
Performance: Fast and memory-efficient, suitable for critical services, embedded devices, and easily integrated with other languages.
-
Reliability: Our rich type system and ownership model ensure memory and thread safety, reducing bugs at compile-time.
-
Productivity: Comprehensive documentation, a compiler committed to providing great diagnostics, and advanced tooling including package manager and build tool (Cargo), auto-formatter (rustfmt), linter (Clippy) and editor support (rust-analyzer).
Quick Start
Read "Installation" from The Book.
Installing from Source
If you really want to install from source (though this is not recommended), see INSTALL.md.
Getting Help
See https://www.rust-lang.org/community for a list of chat platforms and forums.
Contributing
See CONTRIBUTING.md.
License
Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.
Trademark
The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the "Rust Trademarks").
If you want to use these names or brands, please read the Rust language trademark policy.
Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.