Instead disable creation of assumptions during inlining using an
LLVM opt flag.
The -Z arg-align-attributes option which previously controlled this
behavior is removed.
Search other library paths when loking for link objects
Support the case when link objects are not located in Rust sysroot
but in other locations which could be specify through library paths.
Bump minimum required LLVM version to 6.0
Based on the discussion in #55842, while the overall position of Rust wrt LLVM continues to be contentious, there does seem to be a consensus that there is no need for continued support of LLVM 5. This PR bumps our version requirement to LLVM 6.0 and makes Travis run against that.
I hope that this is going to unblock #52694. If I understand correctly, while this issue still exists in LLVM 6, Ubuntu has backported the relevant patch.
r? @alexcrichton
rustc: Add the `cmpxchg16b` target feature on x86/x86_64
This appears to be called `cx16` in LLVM and a few other locations, but
the Intel Intrinsic Guide doesn't have a name for this and the CPU
manual from Intel only mentions `cmpxchg16b`, so that's the name chosen
here.
This appears to be called `cx16` in LLVM and a few other locations, but
the Intel Intrinsic Guide doesn't have a name for this and the CPU
manual from Intel only mentions `cmpxchg16b`, so that's the name chosen
here.
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
rustc: Add an unstable `simd_select_bitmask` intrinsic
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
Unconditionally emit the target-cpu LLVM attribute.
This PR makes `rustc` always emit the `target-cpu` LLVM attribute for functions. The goal is to allow for cross-language inlining of functions defined in `libstd`. So far `libstd` functions were the only function without a `target-cpu` attribute, so in whole-crate-graph cross-lang LTO scenarios they were not eligible for inlining into foreign code.
r? @alexcrichton
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
Overhaul `FileSearch` and `SearchPaths`
`FileSearch::search()` traverses one or more directories. For each
directory it generates a `Vec<PathBuf>` containing one element per file
in that directory.
In some benchmarks this occurs enough that the allocations done for the
`PathBuf`s are significant, and in practice a small number of
directories are being traversed over and over again. For example, when
compiling the `tokio-webpush-simple` benchmark, two directories are
traversed 58 times each. Each of these directories have more than 100
files.
We can do all the necessary traversals up front, when `Session` is created,
and get the `Vec<PathBuf>`s then.
This reduces instruction counts on several benchmarks by 1--5%.
r? @alexcrichton
CC @eddyb, @michaelwoerister, @nikomatsakis
This was intended to land way back in 1.24, but it was backed out due to
breakage which has long since been fixed. An unstable `#[unwind]`
attribute can be used to tweak the behavior here, but this is currently
simply switching rustc's internal default to abort-by-default if an
`extern` function panics, making our codegen sound primarily (as
currently you can produce UB with safe code)
Closes#52652
Instead of maybe storing its own sysroot and maybe deferring to the one
in `Session::opts`, just clone the latter when necessary so one is
always directly available. This removes the need for the getter.
codegen: Fix va_list - aarch64 iOS/Windows
## Summary
Fix code generated for `VaList` on Aarch64 iOS/Windows.
## Details
According to the [Apple - ARM64 Function Calling Conventions]:
> ... the type va_list is an alias for char * rather than for the struct
> type specified in the generic PCS.
The current implementation uses the generic Aarch64 structure for `VaList`
for Aarch64 iOS. Switch to using the `char *` variant of the `VaList`
and use the corresponding `emit_ptr_va_arg` for the `va_arg` intrinsic.
Windows always uses the `char *` variant of the `VaList`. Update the `va_arg`
intrinsic to use `emit_ptr_va_arg`.
[Apple - ARM64 Function Calling Conventions]: https://developer.apple.com/library/archive/documentation/Xcode/Conceptual/iPhoneOSABIReference/Articles/ARM64FunctionCallingConventions.html
According to the Apple developer docs:
> The type va_list is an alias for char * rather than for the struct
> type specified in the generic PCS.
The current implementation uses the generic Aarch64 structure for VaList
for Aarch64 iOS.
Windows always uses the char * variant of the va_list.
codegen_llvm_back: improve allocations
This commit was split out from https://github.com/rust-lang/rust/pull/54864. Last time it was causing an LLVM OOM, which was most probably caused by not collecting the globals.
- preallocate vectors of known length
- `extend` instead of `append` where the argument is consumable
- turn 2 `push` loops into `extend`s
- create a vector from a function producing one instead of using `extend_from_slice` on it
- consume `modules` when no longer needed
- ~~return an `impl Iterator` from `generate_lto_work`~~
- ~~don't `collect` `globals`, as they are iterated over and consumed right afterwards~~
While I'm hoping it won't cause an OOM anymore, I would still consider this a "high-risk" PR and not roll it up.
Enable -mergefunc-use-aliases
If the Rust LLVM fork is used, enable the -mergefunc-use-aliases
flag, which will create aliases for merged functions, rather than
inserting a call from one to the other.
A number of codegen tests needed to be adjusted, because functions
that previously fell below the thunk limit are now being merged.
Merging is prevented in various ways now.
I expect that this is going to break something, somewhere, because
it isn't able to deal with aliases properly, but we won't find out
until we try :)
This fixes#52651.
r? @rkruppe
name-anon-globals should always be run at the very end of the pass
pipeline, as optimization passes (in particular mergefunc) may
introduce new anonymous globals.
I believe we did not run into this earlier because it requires the
rather specific combination of a) mergefunc merging two weak functions
b) compilation not using thinlto.
If the Rust LLVM fork is used, enable the -mergefunc-use-aliases
flag, which will create aliases for merged functions, rather than
inserting a call from one to the other.
A number of codegen tests needed to be adjusted, because functions
that previously fell below the thunk limit are now being merged.
Merging is prevented either using -C no-prepopulate-passes, or by
making the functions non-identical.
I expect that this is going to break something, somewhere, because
it isn't able to deal with aliases properly, but we won't find out
until we try :)
This fixes#52651.
libcore: Add VaList and variadic arg handling intrinsics
## Summary
- Add intrinsics for `va_start`, `va_end`, `va_copy`, and `va_arg`.
- Add `core::va_list::VaList` to `libcore`.
Part 1 of (at least) 3 for #44930
Comments and critiques are very much welcomed 😄