Remove x86_64 assembly test for is_ascii

The SSE2 helper function is not inlined across crate boundaries,
so we cannot verify its codegen in an assembly test. The fix is
still verified by the absence of a performance regression.
Andreas Liljeqvist 2026-01-25 09:44:04 +01:00
parent a72f68e801
commit cbcd8694c6


@@ -1,32 +1,12 @@
-//@ revisions: X86_64 LA64
+//@ revisions: LA64
 //@ assembly-output: emit-asm
 //@ compile-flags: -C opt-level=3
 //
-//@ [X86_64] only-x86_64
-//@ [X86_64] compile-flags: -C target-cpu=znver4
-//@ [X86_64] compile-flags: -C llvm-args=-x86-asm-syntax=intel
-//
 //@ [LA64] only-loongarch64
 #![crate_type = "lib"]
 /// Verify `is_ascii` generates efficient code on different architectures:
 ///
-/// - x86_64: Must NOT use `kshiftrd`/`kshiftrq` (broken AVX-512 auto-vectorization).
-/// The fix uses explicit SSE2 intrinsics (`pmovmskb`/`vpmovmskb`).
-/// See: https://github.com/llvm/llvm-project/issues/176906
-///
 /// - loongarch64: Should use `vmskltz.b` instruction for the fast-path.
 /// This architecture still relies on LLVM auto-vectorization.
-// X86_64-LABEL: test_is_ascii
-// X86_64-NOT: kshiftrd
-// X86_64-NOT: kshiftrq
-// Verify explicit SSE2/AVX intrinsics are used:
-// - pmovmskb/vpmovmskb: efficient mask extraction from the MSBs
-// - vpor/por: OR-combining of 4x 16-byte loads (2x unrolled, 64-byte chunks)
-// X86_64: {{vpmovmskb|pmovmskb}}
-// X86_64: {{vpor|por}}
 // LA64-LABEL: test_is_ascii
 // LA64: vmskltz.b
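
For context, the x86_64 checks being removed were asserting a `pmovmskb`-style fast path: test the most-significant bit of every byte in a chunk at once, since a byte is ASCII exactly when its top bit is clear. Below is a portable sketch of that idea using plain 64-bit words instead of SSE2 intrinsics; the function name `is_ascii_chunked` and the 8-byte chunk width are illustrative assumptions, not the standard library's actual implementation:

```rust
/// Portable sketch of an `is_ascii` fast path: OR the input together in
/// 8-byte chunks, then test the top bit of every byte in one operation
/// (the word-sized analogue of what `pmovmskb` does for a 16-byte vector).
fn is_ascii_chunked(bytes: &[u8]) -> bool {
    // A set bit here means the corresponding byte is >= 0x80, i.e. non-ASCII.
    const HIGH_BITS: u64 = 0x8080_8080_8080_8080;

    let mut chunks = bytes.chunks_exact(8);
    let mut acc: u64 = 0;
    for chunk in &mut chunks {
        // OR whole words together; any non-ASCII byte leaves its high bit set.
        acc |= u64::from_le_bytes(chunk.try_into().unwrap());
    }
    // Handle the tail (fewer than 8 bytes) with a scalar check.
    let tail_ok = chunks.remainder().iter().all(|&b| b < 0x80);
    acc & HIGH_BITS == 0 && tail_ok
}

fn main() {
    assert!(is_ascii_chunked(b"hello, world: plain ASCII text!"));
    assert!(!is_ascii_chunked("h\u{e9}llo".as_bytes())); // 0xC3 0xA9 fails
    assert!(is_ascii_chunked(b""));
    println!("ok");
}
```

The actual fix checked by the old revision used explicit SSE2 intrinsics for wider chunks, but the mask-the-high-bits structure is the same; the point of the commit is that this helper's body is no longer visible to an assembly test compiled in a separate crate.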