This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.
For relatively tight loops implemented with tail calling (`become`) each
of the function with the regular calling convention is still responsible
for restoring the initial value of the preserved registers. So it is not
unusual to end up with a situation where each step in the tail call loop
is spilling and reloading registers, along the lines of:
foo:
push r12
; do things
pop r12
jmp next_step
This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or other uses.
I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)
Much of the compiler calls functions on Align projected from AbiAlign.
AbiAlign impls Deref to its inner Align, so we can simplify these away.
Also, it will minimize disruption when AbiAlign is removed.
For now, preserve usages that might resolve to PartialOrd or PartialEq,
as those have odd inference.
Apply ABI attributes on return types in `rustc_codegen_cranelift`
- The [x86-64 System V ABI standard](https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build) doesn't sign/zero-extend integer arguments or return types.
- But the de-facto standard as implemented by Clang and GCC is to sign/zero-extend arguments to 32 bits (but not return types).
- Additionally, Apple targets [sign/zero-extend both arguments and return values to 32 bits](https://developer.apple.com/documentation/xcode/writing-64-bit-intel-code-for-apple-platforms#Pass-arguments-to-functions-correctly).
- However, the `rustc_target` ABI adjustment code currently [unconditionally extends both arguments and return values to 32 bits](https://github.com/rust-lang/rust/blame/e703dff8fe220b78195c53478e83fb2f68d8499c/compiler/rustc_target/src/callconv/x86_64.rs#L240) on all targets.
- This doesn't cause a miscompilation when compiling with LLVM as LLVM will ignore the `signext`/`zeroext` attribute when applied to return types on non-Apple x86-64 targets.
- Cranelift, however, does not have a similar special case, requiring `rustc` to set the argument extension attribute correctly.
- However, `rustc_codegen_cranelift` doesn't currently apply ABI attributes to return types at all, meaning `rustc_codegen_cranelift` will currently miscompile `i8`/`u8`/`i16`/`u16` returns on x86-64 Apple targets as those targets require sign/zero-extension of return types.
This PR fixes the bug(s) by making the `rustc_target` x86-64 System V ABI only mark return types as sign/zero-extended on Apple platforms, while also making `rustc_codegen_cranelift` apply ABI attributes to return types. The RISC-V and s390x C ABIs also require sign/zero extension of return types, so this will fix those targets when building with `rustc_codegen_cranelift` too.
r? `````@bjorn3`````
Currently, this error emit a diagnostic with no context like:
error: `compiler_builtins` cannot call functions through upstream monomorphizations; encountered invalid call from `<math::libm::support::hex_float::Hexf<i32> as core::fmt::LowerHex>::fmt` to `core::fmt::num::<impl core::fmt::LowerHex for i32>::fmt`
With this change, it at least usually points to the problematic
function:
error: `compiler_builtins` cannot call functions through upstream monomorphizations; encountered invalid call from `<math::libm::support::hex_float::Hexf<i32> as core::fmt::LowerHex>::fmt` to `core::fmt::num::<impl core::fmt::LowerHex for i32>::fmt`
--> src/../libm/src/math/support/hex_float.rs:270:5
|
270 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
Windows x86: Change i128 to return via the vector ABI
Clang and GCC both return `i128` in xmm0 on windows-msvc and windows-gnu. Currently, Rust returns the type on the stack. Add a calling convention adjustment so we also return scalar `i128`s using the vector ABI, which makes our `i128` compatible with C.
In the future, Clang may change to return `i128` on the stack for its `-msvc` targets (more at [1]). If this happens, the change here will need to be adjusted to only affect MinGW.
Link: https://github.com/rust-lang/rust/issues/134288 (does not fix) [1]
try-job: x86_64-msvc
try-job: x86_64-msvc-ext1
try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
Clang and GCC both return `i128` in xmm0 on windows-msvc and
windows-gnu. Currently, Rust returns the type on the stack. Add a
calling convention adjustment so we also return scalar `i128`s using the
vector ABI, which makes our `i128` compatible with C.
In the future, Clang may change to return `i128` on the stack for its
`-msvc` targets (more at [1]). If this happens, the change here will
need to be adjusted to only affect MinGW.
Link: https://github.com/rust-lang/rust/issues/134288
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.
Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
Add a hook for `should_codegen_locally`
This PR lifts the module-local function `should_codegen_locally` to `TyCtxt` as a hook.
In addition to monomorphization, this function is used for checking the dependency of `compiler_builtins` on other libraries. Moving this function to the hooks also makes overriding it possible for the tools that use the rustc interface.
Clean up a few minor refs in `format!` macro, as it has a performance cost. Apparently the compiler is unable to inline `format!("{}", &variable)`, and does a run-time double-reference instead (format macro already does one level referencing). Inlining format args prevents accidental `&` misuse.
This replaces the drop_in_place reference with null in vtables. On
librustc_driver.so, this drops about ~17k dynamic relocations from the
output, since many vtables can now be placed in read-only memory, rather
than having a relocated pointer included.
This makes a tradeoff by adding a null check at vtable call sites.
That's hard to avoid without changing the vtable format (e.g., to use a
pc-relative relocation instead of an absolute address, and avoid the
dynamic relocation that way). But it seems likely that the check is
cheap at runtime.