Commit graph

108 commits

Author SHA1 Message Date
Jonathan Brouwer
e6ca590153
Rollup merge of #152404 - durin42:llvm-23-instcombine-shrink-constant, r=Mark-Simulacrum
tests: adapt align-offset.rs for InstCombine improvements in LLVM 23

Upstream [has improved InstCombine](8d2078332c) so that it can shrink added constants using known zeroes, which caused a little bit of change in this test. As far as I can tell either output is fine, so we just accept both.

@rustbot label: +llvm-main
2026-02-14 22:11:54 +01:00
Jacob Pratt
b1b6533077
Rollup merge of #142680 - beetrees:sparc64-float-struct-abi, r=tgross35
Fix passing/returning structs with the 64-bit SPARC ABI

Fixes the 64-bit SPARC part of rust-lang/rust#115609 by replacing the current implementation with a new implementation modelled on the RISC-V calling convention code ([SPARC ABI reference](https://sparc.org/wp-content/uploads/2014/01/SCD.2.4.1.pdf.gz)).

Pinging `sparcv9-sun-solaris` target maintainers: @psumbera @kulikjak
Fixes rust-lang/rust#115336
Fixes rust-lang/rust#115399
Fixes rust-lang/rust#122620
Fixes https://github.com/rust-lang/rust/issues/147883
r? @workingjubilee
2026-02-12 00:41:05 -05:00
Folkert de Vries
c9b5c934ca
Fix passing/returning structs with the 64-bit SPARC ABI
Co-authored-by: beetrees <b@beetr.ee>
2026-02-10 12:39:45 +01:00
Augie Fackler
aefb9a9ae2 tests: adapt align-offset.rs for InstCombine improvements in LLVM 23
Upstream has improved InstCombine so that it can shrink added constants
using known zeroes, which caused a little bit of change in this test. As
far as I can tell either output is fine, so we just accept both.
2026-02-09 15:53:38 -05:00
Eddy (Eduard) Stefes
51affa0394 add tests for s390x-unknown-none-softfloat
tests will check:
- correct emit of assembly for softfloat target
- incompatible set features will emit warnings/errors
- incompatible target triples in crates will not link
2026-02-09 09:29:16 +01:00
Eddy (Eduard) Stefes
2b1dc3144b add a new s390x-unknown-none-softfloat target
This target is intended to be used for kernel development. Because float
and vector registers overlap on s390x, we have to disable the vector extension.

The default s390x-unknown-linux-gnu target will not allow use of
softfloat.

Co-authored-by: Jubilee <workingjubilee@gmail.com>
2026-02-09 09:28:54 +01:00
ltdk
28feae0c87 Move bigint helper tracking issues 2026-02-02 18:45:26 -05:00
Nikita Popov
e015fc820d Adjust loongarch assembly test
This generates different code on loongarch32r now.
2026-01-27 12:09:39 +01:00
Jonathan Pallant
6ecb3f33f0
Adds two new Tier 3 targets - aarch64v8r-unknown-none and aarch64v8r-unknown-none-softfloat.
The existing `aarch64-unknown-none` target assumes Armv8.0-A as a baseline. However, Arm recently released the Arm Cortex-R82 processor which is the first to implement the Armv8-R AArch64 mode architecture. This architecture is similar to Armv8-A AArch64, however it has a different set of mandatory features, and is based off of Armv8.4. It is largely unrelated to the existing Armv8-R architecture target (`armv8r-none-eabihf`), which only operates in AArch32 mode.

The second `aarch64v8r-unknown-none-softfloat` target allows for possible Armv8-R AArch64 CPUs with no FPU, or for use-cases where FPU register stacking is not desired. As with the existing `aarch64-unknown-none` target we have coupled FPU support and Neon support together - there is no 'has FPU but does not have NEON' target proposed even though the architecture technically allows for it.

This PR was developed by Ferrous Systems on behalf of Arm. Arm is the owner of these changes.
2026-01-26 12:43:52 +00:00
Stuart Cook
a6e8a31b86
Rollup merge of #151611 - bonega:improve-is-slice-is-ascii-performance, r=folkertdev
Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics

# Summary

Improves `slice::is_ascii` performance for SSE2 targets by roughly 1.5-2x on larger inputs.
AVX-512 keeps similar performance characteristics.

This is building on the work already merged in rust-lang/rust#151259.
In particular, this PR improves the default SSE2 performance, so I no longer consider this a temporary fix.
Thanks to @folkertdev for pointing me to consider `as_chunk` again.

# The implementation:
- Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together
- Extracts the MSB mask with a single `pmovmskb` instruction
- Falls back to usize-at-a-time SWAR for inputs < 64 bytes
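
The three steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the code merged into the standard library: the function names `is_ascii_sketch`, `chunk64_is_ascii`, and `is_ascii_swar` are hypothetical, and the non-x86_64 path simply reuses the SWAR fallback so the sketch compiles everywhere.

```rust
/// SWAR fallback: tests one `usize` worth of bytes per step.
fn is_ascii_swar(bytes: &[u8]) -> bool {
    const WORD: usize = std::mem::size_of::<usize>();
    // 0x80 repeated in every byte lane of a usize.
    const HIGH_BITS: usize = usize::MAX / 0xFF * 0x80;

    let mut words = bytes.chunks_exact(WORD);
    for w in &mut words {
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(w);
        // Any set high bit means a byte >= 0x80, i.e. non-ASCII.
        if usize::from_ne_bytes(buf) & HIGH_BITS != 0 {
            return false;
        }
    }
    // Fewer than WORD bytes remain; check them one by one.
    words.remainder().iter().all(|&b| b < 0x80)
}

/// Checks one 64-byte chunk with four 16-byte SSE2 loads OR'd together.
#[cfg(target_arch = "x86_64")]
fn chunk64_is_ascii(chunk: &[u8]) -> bool {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8, _mm_or_si128};
    assert_eq!(chunk.len(), 64);
    // SAFETY: SSE2 is part of the x86_64 baseline, and the four unaligned
    // 16-byte loads stay inside the 64-byte chunk checked above.
    unsafe {
        let p = chunk.as_ptr() as *const __m128i;
        let a = _mm_loadu_si128(p);
        let b = _mm_loadu_si128(p.add(1));
        let c = _mm_loadu_si128(p.add(2));
        let d = _mm_loadu_si128(p.add(3));
        let combined = _mm_or_si128(_mm_or_si128(a, b), _mm_or_si128(c, d));
        // The mask collects the top bit of each byte; zero means all ASCII.
        _mm_movemask_epi8(combined) == 0
    }
}

/// Portable stand-in so the sketch builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn chunk64_is_ascii(chunk: &[u8]) -> bool {
    is_ascii_swar(chunk)
}

/// Hypothetical entry point combining the chunked path and the fallback.
fn is_ascii_sketch(bytes: &[u8]) -> bool {
    if bytes.len() < 64 {
        return is_ascii_swar(bytes);
    }
    let mut chunks = bytes.chunks_exact(64);
    for chunk in &mut chunks {
        if !chunk64_is_ascii(chunk) {
            return false;
        }
    }
    is_ascii_swar(chunks.remainder())
}
```

OR-ing the four vectors before extracting the mask means a single `_mm_movemask_epi8` call covers all 64 bytes, since only the per-byte top bits matter for the ASCII test.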

# Performance impact (vs before rust-lang/rust#151259):
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

  <details>
  <summary>Benchmark Results (click to expand)</summary>

  Benchmarked on AMD Ryzen 9 9950X (AVX-512 capable). Values show relative performance (1.00 = fastest).
  Tops out at 139GB/s for large inputs.

  ### early_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | 1.01 | **1.00** | 13.45 | 1.13 |
  | 1024 | 1.01 | **1.00** | 13.53 | 1.14 |
  | 65536 | 1.01 | **1.00** | 13.99 | 1.12 |
  | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 |

  ### late_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | **1.00** | 1.01 | 13.37 | 1.13 |
  | 1024 | 1.10 | **1.00** | 42.42 | 1.95 |
  | 65536 | **1.00** | 1.06 | 42.22 | 1.73 |
  | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 |

  ### pure_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 4 | 1.03 | **1.00** | 1.75 | 1.32 |
  | 8 | **1.00** | 1.14 | 3.89 | 2.06 |
  | 16 | **1.00** | 1.04 | 1.13 | 1.62 |
  | 32 | 1.07 | 1.19 | 5.11 | **1.00** |
  | 64 | **1.00** | 1.13 | 13.32 | 1.57 |
  | 128 | **1.00** | 1.01 | 19.97 | 1.55 |
  | 256 | **1.00** | 1.02 | 27.77 | 1.61 |
  | 1024 | **1.00** | 1.02 | 41.34 | 1.84 |
  | 4096 | 1.02 | **1.00** | 45.61 | 1.98 |
  | 16384 | 1.01 | **1.00** | 48.67 | 2.04 |
  | 65536 | **1.00** | 1.03 | 43.86 | 1.77 |
  | 262144 | **1.00** | 1.06 | 41.44 | 1.79 |
  | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 |

  </details>

## Reproduction / Test Projects

Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

- `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
- `fuzz/` - Compares old/new implementations with libfuzzer

Relates to: https://github.com/llvm/llvm-project/issues/176906
2026-01-26 14:36:21 +11:00
Andreas Liljeqvist
dbc870afec Mark is_ascii_sse2 as #[inline] 2026-01-25 20:05:08 +01:00
Andreas Liljeqvist
cbcd8694c6 Remove x86_64 assembly test for is_ascii
The SSE2 helper function is not inlined across crate boundaries,
so we cannot verify the codegen in an assembly test. The fix is
still verified by the absence of performance regression.
2026-01-25 09:44:04 +01:00
Andreas Liljeqvist
a72f68e801 Fix is_ascii performance on x86_64 with explicit SSE2 intrinsics
Use explicit SSE2 intrinsics to avoid LLVM's broken AVX-512
auto-vectorization which generates ~31 kshiftrd instructions.

Performance
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

Improves on the earlier PR.
2026-01-24 22:03:58 +01:00
Matthias Krüger
c11be675f4
Rollup merge of #151571 - androm3da:bcain/cstr_merge, r=tgross35
Fix cstring-merging test for Hexagon target

Hexagon assembler uses `.string` directive instead of `.asciz` for null-terminated strings. Both are equivalent but the test was only checking for `.asciz`.

Update the CHECK patterns to accept both directives using `.{{asciz|string}}` regex pattern.
2026-01-24 21:04:17 +01:00
Jonathan 'theJPster' Pallant
96897f016e Add ARMv6 bare-metal targets
Three targets, covering A32 and T32 instructions, and soft-float and
hard-float ABIs. Hard-float not available in Thumb mode. Atomics
in Thumb mode require __sync* functions from compiler-builtins.
2026-01-24 17:29:25 +00:00
Jonathan Brouwer
13f0399a57
Rollup merge of #151259 - bonega:fix-is-ascii-avx512, r=folkertdev
Fix is_ascii performance regression on AVX-512 CPUs when compiling with -C target-cpu=native

## Summary

This PR fixes a severe performance regression in `slice::is_ascii` on AVX-512 CPUs when compiling with `-C target-cpu=native`.

On affected systems, the current implementation achieves only ~3 GB/s for large inputs, compared to ~60–70 GB/s previously (≈20–24× regression). This PR restores the original performance characteristics.

This change is intended as a **temporary workaround** for poor upstream LLVM codegen. Once the underlying LLVM issue is fixed and Rust is able to consume that fix, this workaround should be reverted.

  ## Problem

  When `is_ascii` is compiled with AVX-512 enabled, LLVM's auto-vectorization generates ~31 `kshiftrd` instructions to extract mask bits one-by-one, instead of using the efficient `pmovmskb`
  instruction. This causes a **~22x performance regression**.

  Because `is_ascii` is marked `#[inline]`, it gets inlined and recompiled with the user's target settings, affecting anyone using `-C target-cpu=native` on AVX-512 CPUs.

## Root cause (upstream)

The underlying issue appears to be an LLVM vectorizer/backend bug affecting certain AVX-512 patterns.

An upstream issue has been filed by @folkertdev to track the root cause: llvm/llvm-project#176906

Until this is resolved in LLVM and picked up by rustc, this PR avoids triggering the problematic codegen pattern.

  ## Solution

  Replace the counting loop with explicit SSE2 intrinsics (`_mm_movemask_epi8`) that force `pmovmskb` codegen regardless of CPU features.

  ## Godbolt Links (Rust 1.92)

  | Pattern | Target | Link | Result |
  |---------|--------|------|--------|
  | Counting loop (old) | Default SSE2 | https://godbolt.org/z/sE86xz4fY | `pmovmskb` |
  | Counting loop (old) | AVX-512 (znver4) | https://godbolt.org/z/b3jvMhGd3 | 31x `kshiftrd` (broken) |
  | SSE2 intrinsics (fix) | Default SSE2 | https://godbolt.org/z/hMeGfeaPv | `pmovmskb` |
  | SSE2 intrinsics (fix) | AVX-512 (znver4) | https://godbolt.org/z/Tdvdqjohn | `vpmovmskb` (fixed) |

  ## Benchmark Results

  **CPU:** AMD Ryzen 5 7500F (Zen 4 with AVX-512)

  ### Default Target (SSE2) — Mixed

  | Size | Before | After | Change |
  |------|--------|-------|--------|
  | 4 B | 1.8 GB/s | 2.0 GB/s | **+11%** |
  | 8 B | 3.2 GB/s | 5.8 GB/s | **+81%** |
  | 16 B | 5.3 GB/s | 8.5 GB/s | **+60%** |
  | 32 B | 17.7 GB/s | 15.8 GB/s | -11% |
  | 64 B | 28.6 GB/s | 25.1 GB/s | -12% |
  | 256 B | 51.5 GB/s | 48.6 GB/s | ~same |
  | 1 KB | 64.9 GB/s | 60.7 GB/s | ~same |
  | 4 KB+ | ~68-70 GB/s | ~68-72 GB/s | ~same |

  ### Native Target (AVX-512) — Up to 24x Faster

  | Size | Before | After | Speedup |
  |------|--------|-------|---------|
  | 4 B | 1.2 GB/s | 2.0 GB/s | **1.7x** |
  | 8 B | 1.6 GB/s | 5.0 GB/s | **3.3x** |
  | 16 B | ~7 GB/s | ~7 GB/s | ~same |
  | 32 B | 2.9 GB/s | 14.2 GB/s | **4.9x** |
  | 64 B | 2.9 GB/s | 23.2 GB/s | **8x** |
  | 256 B | 2.9 GB/s | 47.2 GB/s | **16x** |
  | 1 KB | 2.8 GB/s | 60.0 GB/s | **21x** |
  | 4 KB+ | 2.9 GB/s | ~68-70 GB/s | **23-24x** |

  ### Summary

  - **SSE2 (default):** Small inputs (4-16 B) 11-81% faster; 32-64 B ~11% slower; large inputs unchanged
  - **AVX-512 (native):** 21-24x faster for inputs ≥1 KB, peak ~70 GB/s (was ~3 GB/s)

  Note: this is the pure ascii path, but the story is similar for the others.
  See linked bench project.

  ## Test Plan

  - [x] Assembly test (`slice-is-ascii-avx512.rs`) verifies no `kshiftrd` with AVX-512
  - [x] Existing codegen test updated to `loongarch64`-only (auto-vectorization still used there)
  - [x] Fuzz testing confirms old/new implementations produce identical results (~53M iterations)
  - [x] Benchmarks confirm performance improvement
  - [x] Tidy checks pass

  ## Reproduction / Test Projects

  Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

  - `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
  - `fuzz/` - Compares old/new implementations with libfuzzer

  ## Related Issues
  - issue opened by @folkertdev llvm/llvm-project#176906
  - Regression introduced in https://github.com/rust-lang/rust/pull/130733
2026-01-24 08:18:05 +01:00
Jonathan Brouwer
42c3cae5e7
Rollup merge of #150556 - thejpster:add-thumbv7a-thumbv7r-thumbv8r, r=petrochenkov
Add Tier 3 Thumb-mode targets for Armv7-A, Armv7-R and Armv8-R

We currently have targets for bare-metal Armv7-R, Armv7-A and Armv8-R, but only in Arm mode. This PR adds five new targets enabling bare-metal support on these architectures in Thumb mode.

This has been tested using https://github.com/rust-embedded/aarch32/compare/main...thejpster:aarch32:support-thumb-mode-v7-v8?expand=1 and they all seem to work as expected.

However, I wasn't sure what to do with the maintainer lists as these are five new targets, but they share the docs page with the existing Arm versions. I can ask the Embedded Devices WG Arm Team about taking on these ones too, but whether Arm themselves want to take them on I guess is a bigger question.
2026-01-24 08:18:05 +01:00
Brian Cain
e558544565 Fix cstring-merging test for Hexagon target
Hexagon assembler uses `.string` directive instead of `.asciz` for
null-terminated strings. Both are equivalent but the test was only
checking for `.asciz`.

Update the CHECK patterns to accept both directives using
`.{{asciz|string}}` regex pattern.
2026-01-23 23:45:36 -06:00
Jonathan Brouwer
dec8d6ebcf
Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
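
As a rough model of the rule described above, the sketch below classifies a static by size. It is only an illustration: `Addressing` and `addressing_for` are hypothetical names, not part of rustc or LLVM, and the treatment of data whose size equals the threshold exactly is an assumption, since the description only covers "smaller" and "larger".

```rust
/// Default threshold quoted in the description: 65536 bytes (64KB).
const DEFAULT_LARGE_DATA_THRESHOLD: u64 = 65_536;

/// Hypothetical label for how a piece of static data would be addressed
/// under the medium code model.
#[derive(Debug, PartialEq, Eq)]
enum Addressing {
    /// RIP-relative addressing with a 32-bit offset.
    RipRelative32,
    /// Absolute 64-bit addressing (data placed in a large data section).
    Absolute64,
}

/// Hypothetical helper modelling the size rule from the description.
fn addressing_for(size_bytes: u64, threshold: u64) -> Addressing {
    // Assumption: data exactly at the threshold is treated as large.
    if size_bytes < threshold {
        Addressing::RipRelative32
    } else {
        Addressing::Absolute64
    }
}
```

With the default threshold, a 1 KiB static would stay RIP-relative in this model, while a 1 MiB static would be addressed absolutely.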
2026-01-23 11:07:55 +01:00
Andreas Liljeqvist
c609cce8cf Merge is_ascii codegen tests using revisions
Combine the x86_64 and loongarch64 is_ascii tests into a single file
using compiletest revisions. Both now test assembly output:

- X86_64: Verifies no broken kshiftrd/kshiftrq instructions (AVX-512 fix)
- LA64: Verifies vmskltz.b instruction is used (auto-vectorization)
2026-01-22 22:18:00 +01:00
Jonathan 'theJPster' Pallant
96647dde77 Add Thumb-mode targets for Armv7-R, Armv7-A and Armv8-R. 2026-01-22 18:37:52 +00:00
Jakob Koschel
c222a00e79 Create x86_64-unknown-linux-gnuasan target which enables ASAN by default
As suggested, in order to distribute sanitizer instrumented standard
libraries without introducing new rustc flags, this adds a new dedicated
target. With the target, we can distribute the instrumented standard
libraries through a separate rustup component.
2026-01-20 09:21:53 +00:00
Andreas Liljeqvist
a0f9a15b4a Fix is_ascii performance regression on AVX-512 CPUs
When `[u8]::is_ascii()` is compiled with `-C target-cpu=native` on
AVX-512 CPUs, LLVM generates inefficient code. Because `is_ascii` is
marked `#[inline]`, it gets inlined and recompiled with the user's
target settings. The previous implementation used a counting loop that
LLVM auto-vectorizes to `pmovmskb` on SSE2, but with AVX-512 enabled,
LLVM uses k-registers and extracts bits individually with ~31
`kshiftrd` instructions.

This fix replaces the counting loop with explicit SSE2 intrinsics
(`_mm_loadu_si128`, `_mm_or_si128`, `_mm_movemask_epi8`) for x86_64.
`_mm_movemask_epi8` compiles to `pmovmskb`, forcing efficient codegen
regardless of CPU features.

Benchmark results on AMD Ryzen 5 7500F (Zen 4 with AVX-512):
- Default build: ~73 GB/s → ~74 GB/s (no regression)
- With -C target-cpu=native: ~3 GB/s → ~67 GB/s (22x improvement)

The loongarch64 implementation retains the original counting loop
since it doesn't have this issue.

Regression from: https://github.com/rust-lang/rust/pull/130733
2026-01-17 17:38:51 +01:00
Jonathan Brouwer
002b68d628
Rollup merge of #150826 - s390x-asm-f16-vector, r=uweigand,tgross35
Add `f16` inline ASM support for s390x

tracking issue: https://github.com/rust-lang/rust/issues/116909
cc https://github.com/rust-lang/rust/issues/125398

Support the `f16x8` type in inline assembly. Only with the `nnp-assist` feature are there any instructions that make use of this type. Based on the riscv implementation I now cast to `i16x8` when that feature is not enabled.

As far as I'm aware there are no instructions operating on `f16` scalar values. Should we still add support for using them in inline assembly?

r? @tgross35
cc @uweigand
2026-01-13 09:01:29 +01:00
Matthias Krüger
f417f55e62
Rollup merge of #150368 - minicore-ordering, r=workingjubilee
adding Ordering enum to minicore.rs, importing minicore in "tests/assembly-llvm/rust-abi-arg-attr.rs" test file

This adds the `Ordering` enum to `minicore.rs`.

Consequently, this updates `tests/assembly-llvm/rust-abi-arg-attr.rs` to import `minicore` directly. Previously, this test file contained traits like `Copy`, `Clone`, and `PointeeSized`, which caused a duplicate lang item error, so those are replaced by importing `minicore`.
2026-01-11 09:56:38 +01:00
Folkert de Vries
6f12b86e9c
s390x: support f16 and f16x8 in inline assembly 2026-01-09 18:42:46 +01:00
paradoxicalguy
484ea769d3 adding minicore to test file to avoid duplicating lang error 2026-01-09 02:30:33 +00:00
Farid Zakaria
93f2e80f4a Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data
in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses
RIP-relative addressing (32-bit offsets), while larger data uses
absolute 64-bit addressing. This allows the compiler to generate more
efficient code for smaller data while still supporting data larger than
2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang.
The default threshold is 65536 bytes (64KB) if not specified, matching
LLVM's default behavior.
2026-01-07 11:57:48 -08:00
Folkert de Vries
76d0843f8d
naked functions: emit .private_extern on macos 2026-01-06 16:48:04 +01:00
Kjetil Kjeka
746acc47a1 Nvptx: Use llbc as default linker 2025-12-19 21:39:48 +01:00
Matthias Krüger
eb0f57507c
Rollup merge of #149815 - is57primenumber:add-slp-vectorize-test, r=chenyukang
Add regression test for #120189

This PR adds regression tests for rust-lang/rust#120189.
I added tests to verify vectorization of loops inside closures.
2025-12-19 09:25:25 +01:00
Jonathan Brouwer
9890981c30
Rollup merge of #148849 - saethlin:windows-stack-protectors, r=wesleywiser
Set -Cpanic=abort in windows-msvc stack protector tests

I ran into a test failure with the 32-bit windows test on https://github.com/rust-lang/rust/pull/117192. One of the tests has been incorrectly passing (until my change!) because it is picking up the stack protector from another function. I've tried to prevent that happening again by adding CHECK-DAGs for the start and end of each function.

I've also done my best to correct the comments. Some were based on the fact that we used to run these tests with unwinding panics, but LLVM doesn't add protectors to functions with SEH funclets, so it's much more straightforward for these tests to use `-Cpanic=abort`.
2025-12-18 18:37:14 +01:00
Kevaundray Wedderburn
1fe0a85df7 Add rv64IM 2025-12-15 12:17:55 +00:00
is57primenumber
2bda6713c7 Add regression test for closure loop vectorization 2025-12-09 23:02:14 +09:00
Stuart Cook
4e3b7a1e31
Rollup merge of #149409 - cezarbbb:stable_ssp, r=SparrowLii
Test the coexistence of 'stack-protector' and 'safe-stack'

This is a test to detect the coexistence of 'stack-protector' and 'safe-stack', and it's a supplement to PR rust-lang/rust#147115. After the solution to issue rust-lang/rust#149340, I rewrote a version using minicore to circumvent the 'abi_mismatch' error.

r? `@SparrowLii` (Do you have time to review it?)
2025-11-29 21:12:26 +11:00
cezarbbb
7c24c9a908 Test the coexistence of 'stack-protector' and 'safe-stack' 2025-11-29 14:17:58 +08:00
Stuart Cook
0a712d2b14
Rollup merge of #147115 - cezarbbb:stable_ssp, r=SparrowLii
More robust stack protector testing

I've added some tests related to the stack protector. These tests were originally in the LLVM stack protector test project.
These tests were written for the "Stabilize stack-protector" proposal, so the "stack-protector=basic" test option was removed, as this stack protector was considered ineffective in Rust.
For the proposal, see: rust-lang/rust#146369
For the discussion, see zulip: https://rust-lang.zulipchat.com/#narrow/channel/233931-t-compiler.2Fmajor-changes/topic/Proposal.20for.20Adapt.20Stack.20Protector.20for.20Ru.E2.80.A6.20compiler-team.23841

I have opened an issue to discuss the 'abi_mismatch' issue I encountered while writing tests for the coexistence of 'stack-protector' and 'safe-stack': https://github.com/rust-lang/rust/issues/149340

r? `@wesleywiser` (feel free to reassign)
cc `@nikic`, `@rcvalle`, `@davidtwco`, `@arielb1`, `@Darksonn`, `@Noratrieb`, `@SparrowLii`
2025-11-27 12:36:47 +11:00
cezarbbb
be28e7fdd1 Add 'stack-protector' tests for Linux/Win32/Win64. 2025-11-26 17:04:37 +08:00
Matthias Krüger
422b83aeee
Rollup merge of #147173 - androm3da:bcain/hexagon_qurt, r=davidtwco,tgross35
Add support for hexagon-unknown-qurt target

MCP: https://github.com/rust-lang/compiler-team/issues/919
Fixes https://github.com/rust-lang/rust/issues/148982.
2025-11-20 11:15:51 +01:00
David Wood
ff00110543
sess: default to v0 symbol mangling
Rust's current mangling scheme depends on compiler internals; loses
information about generic parameters (and other things) which makes for
a worse experience when using external tools that need to interact with
Rust symbol names; is inconsistent; and can contain `.` characters
which aren't universally supported. Therefore, Rust has defined its own
symbol mangling scheme which is defined in terms of the Rust language,
not the compiler implementation; encodes information about generic
parameters in a reversible way; has a consistent definition; and
generates symbols that only use the characters `A-Z`, `a-z`, `0-9`, and
`_`.
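
The character-set property described above can be checked mechanically. The helper below is a minimal sketch with a hypothetical name (`uses_only_portable_chars`); it only tests the alphabet and does not parse or validate the mangling grammar. The symbol strings in the usage note are made up for illustration and were not produced by a compiler.

```rust
/// Returns true if `symbol` is non-empty and uses only the characters the
/// commit message lists for the new scheme: `A-Z`, `a-z`, `0-9`, and `_`.
fn uses_only_portable_chars(symbol: &str) -> bool {
    !symbol.is_empty()
        && symbol
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_')
}
```

For example, an illustrative name such as `_RNvCs1234_7mycrate3foo` passes this check, while a legacy-style name carrying a `.llvm.123` suffix does not, because of the `.` characters.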

Support for the new Rust symbol mangling scheme has been added to
upstream tools that will need to interact with Rust symbols (e.g.
debuggers).

This commit changes the default symbol mangling scheme from the legacy
scheme to the new Rust mangling scheme.

Signed-off-by: David Wood <david.wood@huawei.com>
2025-11-19 11:55:09 +00:00
Zalathar
31902f3838 Remove the "wasm32-bare" alias for wasm32-unknown-unknown
There is no compelling reason to use this alias instead of the full target
name.
2025-11-17 14:11:07 +11:00
Brian Cain
ecfc64207a Add support for hexagon-unknown-qurt target 2025-11-16 18:30:37 -06:00
Folkert de Vries
ddebb6269f
add assembly test for infinite recursion with become 2025-11-13 16:57:02 +01:00
Ben Kimock
a5f677b665 Try to fix i686 2025-11-12 23:40:47 -05:00
Ben Kimock
519785671b Set -Cpanic=abort in windows-msvc stack protector tests 2025-11-12 18:45:06 -05:00
Folkert de Vries
7516645928
stabilize s390x_target_feature_vector 2025-11-06 12:49:48 +01:00
Folkert de Vries
0645ac31cb
extract s390x vector and friends to their own rust feature 2025-11-06 12:49:04 +01:00
Stuart Cook
c33d51b9d8
Rollup merge of #147355 - sayantn:masked-loads, r=RalfJung,bjorn3
Add alignment parameter to `simd_masked_{load,store}`

This PR adds an alignment parameter in `simd_masked_load` and `simd_masked_store`, in the form of a const-generic enum `core::intrinsics::simd::SimdAlign`. This represents the alignment of the `ptr` argument in these intrinsics as follows

 - `SimdAlign::Unaligned` - `ptr` is unaligned/1-byte aligned
 - `SimdAlign::Element` - `ptr` is aligned to the element type of the SIMD vector (default behavior in the old signature)
 - `SimdAlign::Vector` - `ptr` is aligned to the SIMD vector type
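
To make the three cases concrete, here is a small model of what each variant promises about `ptr`. It is a sketch only: the enum below is a local stand-in that mirrors the variant names, not the real `core::intrinsics::simd::SimdAlign`, and `required_alignment` is a hypothetical helper. The vector alignment is assumed to be the vector's size in bytes rounded up to a power of two, which may not hold on every target.

```rust
use std::mem::{align_of, size_of};

/// Local stand-in mirroring the three variants named in the PR description.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SimdAlign {
    Unaligned,
    Element,
    Vector,
}

/// Hypothetical helper: the alignment in bytes that `ptr` is promised to
/// have for a vector of `N` lanes of `T`.
fn required_alignment<T, const N: usize>(align: SimdAlign) -> usize {
    match align {
        // Byte-aligned: no promise beyond 1.
        SimdAlign::Unaligned => 1,
        // Aligned to the element type.
        SimdAlign::Element => align_of::<T>(),
        // Assumption: vector alignment equals its size rounded up to a
        // power of two.
        SimdAlign::Vector => (size_of::<T>() * N).next_power_of_two(),
    }
}

/// Hypothetical check that a pointer satisfies the promised alignment.
fn is_suitably_aligned<T, const N: usize>(ptr: *const T, align: SimdAlign) -> bool {
    (ptr as usize) % required_alignment::<T, N>(align) == 0
}
```

Under these assumptions, a four-lane `f32` vector needs 1, `align_of::<f32>()`, and 16 bytes of alignment for the three variants respectively.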

The main motive for this is stdarch - most vector loads are either fully aligned (to the vector size) or unaligned (byte-aligned), so the previous signature doesn't cut it.

Now, stdarch will mostly use `SimdAlign::Unaligned` and `SimdAlign::Vector`, whereas portable-simd will use `SimdAlign::Element`.

 - [x] `cg_llvm`
 - [x] `cg_clif`
 - [x] `miri`/`const_eval`

## Alternatives

Using a const-generic/"const" `u32` parameter as alignment (and we error during codegen if this argument is not a power of two). This alternative, although more flexible than the enum, has a few drawbacks:

 - If we use a const-generic argument, then portable-simd somehow needs to pass `align_of::<T>()` as the alignment, which isn't possible without GCE
 - "const" function parameters are just an ugly hack, and a pain to deal with in non-LLVM backends

We can remedy the problem with the const-generic `u32` parameter by adding a special rule for the element alignment case (e.g. `0` can mean "use the alignment of the element type"), but I feel like this is not as expressive as the enum approach, although I am open to suggestions.

cc `@workingjubilee` `@RalfJung` `@BoxyUwU`
2025-11-05 10:59:18 +11:00
sayantn
75de619159
Add alignment parameter to simd_masked_{load,store} 2025-11-04 02:30:59 +05:30
Paul Murphy
bb9d800b78 Stabilize -Zjump-tables=<bool> into -Cjump-table=<bool> 2025-11-03 08:12:16 -06:00