Move computation of allocator shim contents to cg_ssa
In the future this should make it easier to use weak symbols for the allocator shim on platforms that properly support weak symbols. And it would allow reusing the allocator shim code for handling default implementations of the upcoming externally implementable items feature on platforms that don't properly support weak symbols.
In addition to make this possible, the alloc error handler is now handled in a way such that it is possible to avoid using the allocator shim when liballoc is compiled without `no_global_oom_handling` if you use `#[alloc_error_handler]`. Previously this was only possible if you avoided liballoc entirely or compiled it with `no_global_oom_handling`. You still need to avoid libstd and to define the symbol that indicates that avoiding the allocator shim is unstable.
std: improve handling of timed condition variable waits on macOS
Fixesrust-lang/rust#37440 (for good).
This fixes two issues with `Condvar::wait_timeout` on macOS:
Apple's implementation of `pthread_cond_timedwait` internally converts the absolute timeout to a relative one, measured in nanoseconds, but fails to consider overflow when doing so. This results in `wait_timeout` returning much earlier than anticipated when passed a duration that is slightly longer than `u64::MAX` nanoseconds (around 584 years). The existing clamping introduced by rust-lang/rust#42604 to address rust-lang/rust#37440 unfortunately used a maximum duration of 1000 years and thus still runs into the bug when run on older macOS versions (or with `PTHREAD_MUTEX_USE_ULOCK` set to a value other than "1"). See https://github.com/rust-lang/rust/issues/37440#issuecomment-3285958326 for context.
Reducing the maximum duration alone however would not be enough to make the implementation completely correct. As macOS does not support `pthread_condattr_setclock`, the deadline passed to `pthread_cond_timedwait` is measured against the wall-time clock. `std` currently calculates the deadline by retrieving the current time and adding the duration to that, only for macOS to convert the deadline back to a relative duration by [retrieving the current time itself](1ebf56b3a7/src/pthread_cond.c (L802-L819)) (this conversion is performed before the aforementioned problematic one). Thus, if the wall-time clock is adjusted between the `std` lookup and the system lookup, the relative duration could have changed, possibly even to a value larger than $2^{64}\ \textrm{ns}$. Luckily however, macOS supports the non-standard, tongue-twisting `pthread_cond_timedwait_relative_np` function which avoids the wall-clock-time roundtrip by taking a relative timeout. Even apart from that, this function is perfectly suited for `std`'s purposes: it is public (albeit badly-documented) API, [available since macOS 10.4](1ebf56b3a7/include/pthread/pthread.h (L555-L559)) (that's way below our minimum of 10.12) and completely resilient against wall-time changes as all timeouts are [measured against the monotonic clock](e3723e1f17/bsd/kern/sys_ulock.c (L741)) inside the kernel.
Thus, this PR switches `Condvar::wait_timeout` to `pthread_cond_timedwait_relative_np`, making sure to clamp the duration to a maximum of $2^{64} - 1 \ \textrm{ns}$. I've added a miri shim as well, so the only thing missing is a definition of `pthread_cond_timedwait_relative_np` inside `libc`.
Implemented preserves_flags on powerpc by making it do
nothing. This prevents having two different ways to mark
`cr0` as clobbered. clang and gcc alias `cr0` to `cc`.
The gcc inline documentation does not state what this does
on powerpc* targets, but inspection of the source shows
it is equivalent to condition register field `cr0`, so it
should not be added.
Where supported, VSX is a 64x128b register set which encompasses
both the floating point and vector registers.
In the type tests, xvsqrtdp is used as it is the only two-argument
vsx opcode supported by all targets on llvm. If you need to copy
a vsx register, the preferred way is "xxlor xt, xa, xa".
Hide vendoring and copyright in GHA group
These two steps are currently the most verbose steps in a dist-linux build, which makes it harder to find more interesting parts. Hide them in a group like most things.
For example, see https://github.com/rust-lang/rust/actions/runs/18462295959/job/52596384752
Avoid redundant UB check in RangeFrom slice indexing
I noticed this while picking through the IR we generate for https://github.com/rust-lang/rust/pull/134938. I think we just forgot to apply this trick to `RangeFrom`?
std: implement `pal::os::exit` for VEXos
This PR provides a more "proper" implementation of process exiting in VEXos programs by going through `vexSystemExitRequest` rather than calling `intrinsics::abort`, which exits using an undefined instruction trap. This matches the existing implementation of `rt::abort_internal` and therefore makes `std::process::exit` have the same behavior as returning from main on VEXos targets.
fix 2 search graph bugs
wooooooooops, i should really run the fuzzer even when not changing the structure of the search graph as a whole :3 fixes the `ml-kem` ICE in the next-solver crater run
r? ````@BoxyUwU````
Enable `u64` limbs in `core::num::bignum`
Since `u128` is stable now, I guess, we can safely add this, not sure if this requires tests or anything
cc https://github.com/rust-lang/rust/issues/137887
only call polymorphic array iter drop machinery when the type requires it
I saw a bunch of dead, empty `<[core::mem::maybe_uninit::MaybeUninit<T>; N] as core::array::iter::iter_inner::PartialDrop>::partial_drop` functions when compiling with more than 1 CGU.
Let's see if we can help optimizations to eliminate stuff earlier.
r? ghost