It's described as a "backwards compatibility hack to keep the diff
small". Removing it requires only a modest amount of churn, and the
resulting code is clearer without the invisible derefs.
Introduces `BackendRepr::ScalableVector`, corresponding to scalable vector
types annotated with `repr(scalable)`, which lower to scalable vector types
in LLVM.
Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
Overhaul filename handling for cross-compiler consistency
This PR overhauls the way we handle filenames in the compiler and `rmeta` in order to achieve cross-compiler consistency (i.e. having the same path regardless of whether the filename was created in the current compiler session or is coming from `rmeta`).
This is required as some parts of the compiler rely on consistent paths for the soundness of generated code (see rust-lang/rust#148328).
In order to achieve consistency, this PR takes multiple steps:
- making `RealFileName` immutable
- only having `SourceMap::to_real_filename` create `RealFileName`
  - currently a `RealFileName` can be created from any `Path` and remapped afterwards, which creates consistency issues
- also making `RealFileName` hold its working directory, embeddable name, and the remapped scopes (a sketch of this shape follows below)
  - this removes the need for a `Session` to know the current(!) scopes and cwd, which would be invalid as they may not match the scopes used when the filename was created
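A minimal sketch of the shape this gives `RealFileName`; the field names and types below are illustrative assumptions, not the actual rustc definitions:

```rust
use std::path::PathBuf;

/// Placeholder for whatever type records the remapping scopes in effect
/// when the name was created (an assumption, not the real rustc type).
#[derive(Clone, Copy)]
struct RemapScopes(u8);

/// Illustrative only: an immutable name that carries everything needed to
/// display or embed it, so no `Session` lookup is required later.
struct RealFileName {
    /// The path as it exists on the local filesystem, if known.
    local_path: Option<PathBuf>,
    /// The remapped name that is safe to embed in rmeta and debuginfo.
    embeddable_name: PathBuf,
    /// The working directory captured at creation time.
    working_dir: PathBuf,
    /// The remapping scopes that were applied when the name was created.
    scopes: RemapScopes,
}
```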
In order for `SourceMap::to_real_filename` to know which scopes to apply, `FilePathMapping` now takes the current remapping scopes. This makes `FileNameDisplayPreference` and company redundant, so they are removed.
This PR is split up into multiple commits (unfortunately not atomic ones), which should help with reviewing the changes.
Unblocks https://github.com/rust-lang/rust/pull/147611
Fixes https://github.com/rust-lang/rust/issues/148328
Update `rustc_codegen_gcc` rotate operation documentation
## Description
This PR resolves a TODO comment in the `rustc_codegen_gcc` backend by documenting that the rotate operations (`rotate_left` and `rotate_right`) already implement the optimized branchless algorithm the comment referenced.
The existing implementation already uses the optimal branchless rotation pattern:
- For left rotation: `(x << n) | (x >> (-n & (width-1)))`
- For right rotation: `(x >> n) | (x << (-n & (width-1)))`
This pattern avoids branches and generates efficient machine code across different platforms, which was the goal mentioned in the original TODO.
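For reference, here is the same pattern written as plain Rust functions (a minimal sketch of the shape; the backend builds the equivalent GCC expression tree rather than calling code like this):

```rust
// Branchless rotation of a 32-bit value, following the pattern above.
fn rotate_left_branchless(x: u32, n: u32) -> u32 {
    let mask = u32::BITS - 1; // 31, i.e. width - 1
    // `n.wrapping_neg() & mask` is the complementary shift amount; masking both
    // shifts keeps them in range, so there is no branch and no shift-by-width UB.
    (x << (n & mask)) | (x >> (n.wrapping_neg() & mask))
}

fn rotate_right_branchless(x: u32, n: u32) -> u32 {
    let mask = u32::BITS - 1;
    (x >> (n & mask)) | (x << (n.wrapping_neg() & mask))
}
```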
## Changes
- Removed the TODO comment that suggested implementing the algorithm from https://blog.regehr.org/archives/1063
Remove -Zoom=panic
There are major questions remaining about the reentrancy that this allows. It doesn't have any users on GitHub outside of a single panic=abort project that uses it to show backtraces. It can still be emulated through `#[alloc_error_handler]` or `set_alloc_error_hook`, depending on whether you use the standard library. And finally, it makes it harder to do various improvements to the allocator shim.
With this PR, the sole remaining symbol in the allocator shim that is not effectively emulating weak symbols is the one that prevents skipping the allocator shim on stable even when it would otherwise be empty because libstd is used together with `#[global_allocator]`.
Closes https://github.com/rust-lang/rust/issues/43596
Fixes https://github.com/rust-lang/rust/issues/126683
This register is only supported on the powerpc*spe targets. It is only
recognized by LLVM; gcc does not accept it as a clobber, nor does it
support these targets.
This is a volatile register, so it is included with clobber_abi.
This implements a new unstable compiler flag `-Zannotate-moves` that makes
move and copy operations visible in profilers by creating synthetic debug
information. This is achieved with zero runtime cost by manipulating debug
info scopes to make moves/copies appear as calls to `compiler_move<T, SIZE>`
and `compiler_copy<T, SIZE>` marker functions in profiling tools.
This allows developers to identify expensive move/copy operations in their
code using standard profiling tools, without requiring specialized tooling
or runtime instrumentation.
The implementation works at codegen time. When processing MIR operands
(`Operand::Move` and `Operand::Copy`), the codegen creates an `OperandRef`
with an optional `move_annotation` field containing an `Instance` of the
appropriate profiling marker function. When storing the operand,
`store_with_annotation()` wraps the store operation in a synthetic debug
scope that makes it appear inlined from the marker.
Two marker functions (`compiler_move` and `compiler_copy`) are defined
in `library/core/src/profiling.rs`. These are never actually called -
they exist solely as debug info anchors.
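A minimal sketch of what such marker definitions could look like; the exact attributes and signatures in `library/core/src/profiling.rs` are assumptions here:

```rust
// Hedged sketch only; the real definitions in core may differ.
// The empty bodies are the point: these functions are never called, and only
// their debug info entries are referenced by the synthetic inlined-at scopes.
#[inline(never)]
pub fn compiler_move<T, const SIZE: usize>() {}

#[inline(never)]
pub fn compiler_copy<T, const SIZE: usize>() {}
```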
Operations are only annotated if the type:
- Meets the size threshold (default: 65 bytes, configurable via
`-Zannotate-moves=SIZE`)
- Has a non-scalar backend representation (scalars use registers,
not memcpy)
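As a hedged illustration (the type names are made up, not from the PR), the first move below would be annotated while the second would not:

```rust
// `Big` is 128 bytes: above the default 65-byte threshold and memory-backed,
// so moving it would surface as compiler_move<Big, 128> in a profiler.
struct Big([u8; 128]);

fn consume_big(_b: Big) {}
fn consume_small(_n: u64) {}

fn main() {
    let b = Big([0; 128]);
    consume_big(b); // annotated move
    let n: u64 = 42;
    consume_small(n); // not annotated: scalar repr, stays in a register
}
```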
This has a very small impact on object file size. With the default
limit it's well under 0.1%, and even with a very small limit of 8 bytes
it's still only ~1.5%. This could be enabled by default.
Fix ICE on offsetted ZST pointer
I'm not sure this is the *right* fix, but it's simple enough and does roughly what I'd expect. Like with the previous optimization to codegen usize rather than a zero-sized static, there's no guarantee that we continue returning a particular value from the offsetting.
A grep for `const_usize.*align` found the same code copied into rustc_codegen_gcc and cranelift, but a quick skim didn't find other cases of similar 'optimization'. That said, I'm not convinced I caught everything; it's not trivial to search for this.
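For context, a hedged guess at the shape of code involved (the actual reproducer in the linked issue may differ):

```rust
// Hedged guess at the pattern; not the exact reproducer from rust-lang/rust#147516.
static ZST: () = ();

// A constant pointer offset past the zero-sized static. Codegen represents the
// ZST static as just a suitably aligned usize, and applying an offset to that
// value is the path touched by this fix.
const OFFSET_PTR: *const () = (&ZST as *const ()).wrapping_byte_add(8);

fn main() {
    println!("{:p}", OFFSET_PTR);
}
```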
Closes rust-lang/rust#147516
Restrict sysroot crate imports to those defined in this repo.
It's common to import dependencies from the sysroot via `extern crate` rather than use an explicit cargo dependency, when it's necessary to use the same dependency version as used by rustc itself. However, this is dangerous for crates.io crates, since rustc may not pull in the dependency on some targets, or may pull in multiple versions. In both cases, the `extern crate` fails to resolve.
To address this, re-export all such dependencies from the appropriate `rustc_*` crates, and use this alias from crates which would otherwise need to use `extern crate`.
See https://github.com/rust-lang/rust/pull/143492 for an example of the kind of issue that can occur.
Add vsx register support for ppc inline asm, and implement preserves_flag option
This should address the last(?) missing pieces of inline asm for ppc:
* Explicit VSX register support. ISA 2.06 (POWER7) added a 64x128b register overlay extending the FPRs to 128b and unifying them with the VMX (AltiVec) registers. Implementation details within gcc/llvm percolate up and require using the `x` template modifier; I have updated the inline asm to include this implicitly for VSX arguments which do not specify it. ~~Support for the gcc codegen backend is still a todo.~~
* Implement the `preserves_flags` option. All ABIs and ISAs store their flags in `cr`, and the carry bit lives inside `xer`. The other status registers hold sticky bits or control bits which do not affect branch instructions. A usage sketch follows below.
There is some interest in the e500 (powerpcspe) port. Architecturally, it has a very different FP ISA and includes a SIMD extension called SPE (which is not IBM's Cell SPE). Notably, it does not have altivec/fpr/vsx registers. It also has an SPE accumulator register which its ABI marks as volatile, but I am not sure if the compiler uses it.
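As a hedged usage sketch (not taken from the PR's tests; PowerPC inline asm is still nightly-only behind `asm_experimental_arch`), `preserves_flags` promises that the asm block leaves `cr` and the `xer` carry bit untouched, which holds for `addi` but would not for a compare or a carrying add:

```rust
#![feature(asm_experimental_arch)] // powerpc inline asm is still unstable

#[cfg(target_arch = "powerpc64")]
fn add_one(mut x: u64) -> u64 {
    unsafe {
        core::arch::asm!(
            "addi {x}, {x}, 1",           // addi writes no condition or carry bits
            x = inout(reg) x,
            options(nomem, nostack, preserves_flags),
        );
    }
    x
}

fn main() {}
```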
Implemented preserves_flags on powerpc by making it do
nothing. This prevents having two different ways to mark
`cr0` as clobbered. clang and gcc alias `cr0` to `cc`.
The gcc inline documentation does not state what this does
on powerpc* targets, but inspection of the source shows
it is equivalent to condition register field `cr0`, so it
should not be added.
Where supported, VSX is a 64x128b register set which encompasses
both the floating point and vector registers.
In the type tests, xvsqrtdp is used as it is the only two-argument
vsx opcode supported by all targets on llvm. If you need to copy
a vsx register, the preferred way is "xxlor xt, xa, xa".
In the future this should make it easier to use weak symbols for the
allocator shim on platforms that properly support weak symbols. And it
would allow reusing the allocator shim code for handling default
implementations of the upcoming externally implementable items feature
on platforms that don't properly support weak symbols.