Cosmetic improvements to doc comments
This has been factored out from https://github.com/rust-lang/rust/pull/58036 to only include changes to documentation comments (throughout the rustc codebase).
r? @steveklabnik
Once you're happy with this, maybe we could get it through with r=1, so it doesn't constantly get invalidated? (I'm not sure this will be an issue, but just in case...) Anyway, thanks for your advice so far!
Use LLVM intrinsics for saturating add/sub
Use the `[su](add|sub).sat` LLVM intrinsics, if we're compiling against LLVM 8, as they should optimize and codegen better than IR based on `[su](add|sub).with.overlow`.
For the fallback for LLVM < 8 I'm using the same expansion that target lowering in LLVM uses, which is not the same as Rust currently uses (in particular due to the use of selects rather than branches).
Fixes#55286.
Fixes#52203.
Fixes#44500.
r? @nagisa
Implement optimize(size) and optimize(speed) attributes
This PR implements both `optimize(size)` and `optimize(speed)` attributes.
While the functionality itself works fine now, this PR is not yet complete: the code might be messy in places and, most importantly, the compiletest must be improved with functionality to run tests with custom optimization levels. Otherwise the new attribute cannot be tested properly. Oh, and not all of the RFC is implemented – attribute propagation is not implemented for example.
# TODO
* [x] Improve compiletest so that tests can be written;
* [x] Assign a proper error number (E9999 currently, no idea how to allocate a number properly);
* [ ] Perhaps reduce the duplication in LLVM attribute assignment code…
The new git submodule src/llvm-project is a monorepo replacing src/llvm
and src/tools/{clang,lld,lldb}. This also serves as a rebase for these
projects to the new 8.x branch from trunk.
The src/llvm-emscripten fork is unchanged for now.
Don't ICE when logging unusual types
MonoItemExt#to_string is used for both debug logging and LLVM symbol
name generation. When debugging, we want to print out any type we
encounter, even if it's something weird like GeneratorWitness. However,
during codegen, we still want to error if we encounter an unexpected
type when generating a name.
To resolve this issue, this commit introduces a new 'debug' parameter to
the relevant methods. When set to 'true', it allows any type to be
printed - when set to 'false', it 'bug!'s when encountering an
unexpected type.
This prevents an ICE when enabling debug logging (via RUST_LOG) while
running rustc on generator-related code.
Remove quote_*! macros
This deletes a considerable amount of test cases, some of which we may want to keep. I'm not entirely certain what the primary intent of many of them was; if we should keep them I can attempt to edit each case to continue compiling without the quote_*! macros involved.
Fixes#46849.
Fixes#12265.
Fixes#12266.
Fixes#26994.
r? @Manishearth
Add intrinsic to create an integer bitmask from a vector mask
This PR adds a new simd intrinsic: `simd_bitmask(vector) -> unsigned integer` that creates an integer bitmask from a vector mask by extracting one bit of each vector lane.
This is required to implement: https://github.com/rust-lang-nursery/packed_simd/issues/166 .
EDIT: the reason we need an intrinsics for this is that we have to truncate the vector lanes to an `<i1 x N>` vector, and then bitcast that to an `iN` integer (while making sure that we only materialize `i8`, ... , `i64` - that is, no `i1`, `i2`, `i4`, types), and we can't do any of that in a Rust library.
r? @rkruppe
MonoItemExt#to_string is used for both debug logging and LLVM symbol
name generation. When debugging, we want to print out any type we
encounter, even if it's something weird like GeneratorWitness. However,
during codegen, we still want to error if we encounter an unexpected
type when generating a name.
To resolve this issue, this commit introduces a new 'debug' parameter to
the relevant methods. When set to 'true', it allows any type to be
printed - when set to 'false', it 'bug!'s when encountering an
unexpected type.
This prevents an ICE when enabling debug logging (via RUST_LOG) while
running rustc on generator-related code.
Issue 57762 points out a compiler crash when the compiler was built
using a stock LLVM 7. LLVM 7 was released without a necessary fix for
a bug in the DWARF discriminant code.
This patch changes rustc to use the fallback mode on (non-Rust) LLVM 7.
Closes#57762
Add a target option "merge-functions", and a corresponding -Z flag (works around #57356)
This commit adds a target option "merge-functions", which takes values in ("disabled", "trampolines", or "aliases" (default is "aliases")), to allow targets to opt out of the MergeFunctions LLVM pass. Additionally, the latest commit also adds an optional -Z flag, "merge-functions", which takes the same values and has precedence over the target option when both are specified.
This works around https://github.com/rust-lang/rust/issues/57356.
cc @eddyb @japaric @oli-obk @nox @nagisa
Also thanks to @denzp and @gnzlbg for discussing this on rust-cuda!
### Motivation
Basically, the problem is that the MergeFunctions pass, which rustc currently enables by default at -O2 and -O3 [1], and `extern "ptx-kernel"` functions (specific to the NVPTX target) are currently not compatible with each other. If the MergeFunctions pass is allowed to run, rustc can generate invalid PTX assembly (i.e. a PTX file that is not accepted by the native PTX assembler `ptxas`). Therefore we would like a way to opt out of the MergeFunctions pass, which is what our target option does.
### Related work
The current behavior of rustc is to enable MergeFunctions at -O2 and -O3 [1], and also to enable the use of function aliases within MergeFunctions [2] [3]. MergeFunctions seems to have some benefits, such as reducing code size and fixing a crash [4], which is why it is enabled. However, MergeFunctions both with and without function aliases is incompatible with the NVPTX target; a more detailed example for both cases is given below.
clang's "solution" is to have a "-fmerge-functions" flag that opts in to the MergeFunctions pass, but it is not enabled by default.
### Examples/more details
Consider an example Rust lib using `extern "ptx-kernel"` functions: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore.rs. If we try to compile this with nightly rustc, we get the following compiler error:
LLVM ERROR: Module has aliases, which NVPTX does not support.
This error happens because: (1) functions `foo` and `bar` have the same body, so are candidates to be merged by MergeFunctions; and (2) rustc configures MergeFunctions to generate function aliases using the "mergefunc-use-aliases" LLVM option [2] [3], but the NVPTX backend does not support those aliases.
Okay, so we can try omitting "mergefunc-use-aliases", and then rustc will happily emit PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-mergefunc-nousealiases-bad.ptx. However, this PTX is invalid! When we try to assemble it with `ptxas` (I'm on the CUDA 9.2 toolchain), we get an assembler error:
ptxas nocore-mergefunc-nousealiases-bad.ptx, line 38; error : Illegal call target, device function expected
ptxas fatal : Ptx assembly aborted due to errors
What's happening is that MergeFunctions rewrites the `bar` function to call `foo`. However, directly calling an `extern "ptx-kernel"` function from another `extern "ptx-kernel"` is wrong.
If we disable the MergeFunctions pass from running at all, rustc generates correct PTX assembly: https://github.com/peterhj/nvptx-mergefunc-bug/blob/master/nocore-nomergefunc-ok.ptx
[1] a36b960df6/src/librustc_codegen_ssa/back/write.rs (L155)
[2] a36b960df6/src/librustc_codegen_llvm/llvm_util.rs (L64)
[3] https://github.com/rust-lang/rust/pull/56358
[4] https://github.com/rust-lang/rust/pull/49479
This was originally attempted in #57048 but it was realized that we
could fully remove the crate via the `"unadjusted"` ABI on intrinsics.
This means that all intrinsics in stdsimd are implemented directly
against LLVM rather than using the abstraction layer provided here. That
ends up meaning that this crate is no longer used at all.
This crate developed long ago to implement the SIMD intrinsics, but we
didn't end up using it in the long run. In that case let's remove it!
"trampolines", or "aliases (the default)) to allow targets to opt out of
the MergeFunctions LLVM pass. Also add a corresponding -Z option with
the same name and values.
This works around: https://github.com/rust-lang/rust/issues/57356
Motivation:
Basically, the problem is that the MergeFunctions pass, which rustc
currently enables by default at -O2 and -O3, and `extern "ptx-kernel"`
functions (specific to the NVPTX target) are currently not compatible
with each other. If the MergeFunctions pass is allowed to run, rustc can
generate invalid PTX assembly (i.e. a PTX file that is not accepted by
the native PTX assembler ptxas). Therefore we would like a way to opt
out of the MergeFunctions pass, which is what our target option does.
Related work:
The current behavior of rustc is to enable MergeFunctions at -O2 and -O3,
and also to enable the use of function aliases within MergeFunctions.
MergeFunctions both with and without function aliases is incompatible with
the NVPTX target.
clang's "solution" is to have a "-fmerge-functions" flag that opts in to
the MergeFunctions pass, but it is not enabled by default.
This flag inserts `mcount` function call to the beginning of every function
after inline processing. So tracing tools like uftrace [1] (or ftrace for
Linux kernel modules) have a chance to examine function calls.
It is similar to the `-pg` flag provided by gcc or clang, but without
generating a `__gmon_start__` function for executables. If a program
runs without being traced, no `gmon.out` will be written to disk.
Under the hood, it simply adds `"instrument-function-entry-inlined"="mcount"`
attribute to every function. The `post-inline-ee-instrument` LLVM pass does
the actual job.
[1]: https://github.com/namhyung/uftrace
librustc_codegen_llvm: Don't eliminate empty structs in C ABI on linux-sparc64
This is in accordance with the SPARC Compliance Definition 2.4.1,
Page 3P-12. It says that structs of up to 8 bytes (which applies
to empty structs as well) are to be passed in one register.
This is in accordance with the SPARC Compliance Definition 2.4.1,
Page 3P-12. It says that structs of up to 8 bytes (which applies
to empty structs as well) are to be passed in one register.
Instead disable creation of assumptions during inlining using an
LLVM opt flag.
The -Z arg-align-attributes option which previously controlled this
behavior is removed.
Search other library paths when loking for link objects
Support the case when link objects are not located in Rust sysroot
but in other locations which could be specify through library paths.
Bump minimum required LLVM version to 6.0
Based on the discussion in #55842, while the overall position of Rust wrt LLVM continues to be contentious, there does seem to be a consensus that there is no need for continued support of LLVM 5. This PR bumps our version requirement to LLVM 6.0 and makes Travis run against that.
I hope that this is going to unblock #52694. If I understand correctly, while this issue still exists in LLVM 6, Ubuntu has backported the relevant patch.
r? @alexcrichton
rustc: Add the `cmpxchg16b` target feature on x86/x86_64
This appears to be called `cx16` in LLVM and a few other locations, but
the Intel Intrinsic Guide doesn't have a name for this and the CPU
manual from Intel only mentions `cmpxchg16b`, so that's the name chosen
here.
This appears to be called `cx16` in LLVM and a few other locations, but
the Intel Intrinsic Guide doesn't have a name for this and the CPU
manual from Intel only mentions `cmpxchg16b`, so that's the name chosen
here.
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
rustc: Add an unstable `simd_select_bitmask` intrinsic
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
Unconditionally emit the target-cpu LLVM attribute.
This PR makes `rustc` always emit the `target-cpu` LLVM attribute for functions. The goal is to allow for cross-language inlining of functions defined in `libstd`. So far `libstd` functions were the only function without a `target-cpu` attribute, so in whole-crate-graph cross-lang LTO scenarios they were not eligible for inlining into foreign code.
r? @alexcrichton
This is going to be required for binding a number of AVX-512 intrinsics
in the `stdsimd` repository, and this intrinsic is the same as
`simd_select` except that it takes a bitmask as the first argument
instead of a SIMD vector. This bitmask is then transmuted into a `<NN x
i8>` argument, depending on how many bits it is.
cc rust-lang-nursery/stdsimd#310
Overhaul `FileSearch` and `SearchPaths`
`FileSearch::search()` traverses one or more directories. For each
directory it generates a `Vec<PathBuf>` containing one element per file
in that directory.
In some benchmarks this occurs enough that the allocations done for the
`PathBuf`s are significant, and in practice a small number of
directories are being traversed over and over again. For example, when
compiling the `tokio-webpush-simple` benchmark, two directories are
traversed 58 times each. Each of these directories have more than 100
files.
We can do all the necessary traversals up front, when `Session` is created,
and get the `Vec<PathBuf>`s then.
This reduces instruction counts on several benchmarks by 1--5%.
r? @alexcrichton
CC @eddyb, @michaelwoerister, @nikomatsakis