rustc: Implement ThinLTO
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
Fnty args rustdoc
Fixes#44570.
cc @QuietMisdreavus
cc @rust-lang/dev-tools
Considering the impact on the `hir` libs, I'll put @eddyb as reviewer.
r? @eddyb
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
Improve resolution of associated types in declarative macros 2.0
Make various identifier comparisons for associated types (and sometimes other associated items) hygienic.
Now declarative macros 2.0 can use `Self::AssocTy`, `TyParam::AssocTy`, `Trait<AssocTy = u8>` where `AssocTy` is an associated type of a trait `Trait` visible from the macro. Also, `Trait` can now be implemented inside the macro and specialization should work properly (fixes https://github.com/rust-lang/rust/pull/40847#issuecomment-310867299).
r? @jseyfried or @eddyb
incr compilation struct_defs.rs
I am prematurely openeing this as I need mentoring help from @michaelwoerister (also pinged @nikomatsakis)
First, is this the right approach for these changes?
Second, I'm a bit confused by the results so far.
- Changing `TupleStructFieldType(i32)` -> `...(u32)` changes only Hir and HirBody, not TypeOfItem
- Chaning `TupleStructAddField(i32)` -> `...(i32, u32)` *does* change TypeOfItem
This seems wrong. I feel like it should change TypeOfItem in both cases. Is this a bug in incr compilation or is it expected?
incr.comp.: Switch to red/green change tracking, remove legacy system.
This PR finally switches incremental compilation to [red/green tracking](https://github.com/rust-lang/rust/issues/42293) and completely removes the legacy dependency graph implementation -- which includes a few quite costly passes that are simply not needed with the new system anymore.
There's still some documentation to be done and there's certainly still lots of optimizing and tuning ahead -- but the foundation for red/green is in place with this PR. This has been in the making for a long time `:)`
r? @nikomatsakis
cc @alexcrichton, @rust-lang/compiler
let htmldocck.py check for directories
Since i messed this up during https://github.com/rust-lang/rust/pull/44613, i wanted to codify this into the rustdoc tests to make sure that doesn't happen again.
MIR borrowck: move span_label to `borrowck_errors.rs`
The calls to `span_label` are moved and factorized for:
* E0503 (`cannot_use_when_mutably_borrowed()`)
* E0506 (`cannot_assign_to_borrowed()`)
Additionnally, the error E0594 (`cannot_assign_static()`) has been factorized between `check_loan.rs` and `borrowc_check.rs`.
Part of #44596
Fix native main() signature on 64bit
Hello,
in LLVM-IR produced by rustc on x86_64-linux-gnu, the native main() function had incorrect types for the function result and argc parameter: i64, while it should be i32 (really c_int). See also #20064, #29633.
So I've attempted a fix here. I tested it by checking the LLVM IR produced with --target x86_64-unknown-linux-gnu and i686-unknown-linux-gnu. Also I tried running the tests (`./x.py test`), however I'm getting two failures with and without the patch, which I'm guessing is unrelated.
rustc: Enable LTO and multiple codegen units
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
adding E0623 for return types - both parameters are anonymous
This is a fix for #44018
```
error[E0621]: explicit lifetime required in the type of `self`
--> $DIR/ex3-both-anon-regions-return-type-is-anon.rs:17:5
|
16 | fn foo<'a>(&self, x: &i32) -> &i32 {
| ---- ----
| |
| this parameter and the return type are
declared with different lifetimes...
17 | x
| ^ ...but data from `x` is returned here
error: aborting due to previous error
```
It also works for the below case where we have self as anonymous
```
error[E0623]: lifetime mismatch
--> src/test/ui/lifetime-errors/ex3-both-anon-regions-self-is-anon.rs:17:19
|
16 | fn foo<'a>(&self, x: &Foo) -> &Foo {
| ---- ----
| |
| this parameter and the return type are
declared with different lifetimes...
17 | if true { x } else { self }
| ^ ...but data from `x` is returned here
error: aborting due to previous error
```
r? @nikomatsakis
Currently, I have enabled E0621 where return type and self are anonymous, hence WIP.
First step toward implementing impl Trait in argument position
First step implementing #44721.
Add a flag to hir and ty TypeParameterDef and raise an error when using
explicit type parameters when calling a function using impl Trait in
argument position.
I don't know if there is a procedure to add an error code so I just took an available code. Is that ok ?
r? @nikomatsakis
rustc: Default 32 codegen units at O0
This commit changes the default of rustc to use 32 codegen units when compiling
in debug mode, typically an opt-level=0 compilation. Since their inception
codegen units have matured quite a bit, gaining features such as:
* Parallel translation and codegen enabling codegen units to get worked on even
more quickly.
* Deterministic and reliable partitioning through the same infrastructure as
incremental compilation.
* Global rate limiting through the `jobserver` crate to avoid overloading the
system.
The largest benefit of codegen units has forever been faster compilation through
parallel processing of modules on the LLVM side of things, using all the cores
available on build machines that typically have many available. Some downsides
have been fixed through the features above, but the major downside remaining is
that using codegen units reduces opportunities for inlining and optimization.
This, however, doesn't matter much during debug builds!
In this commit the default number of codegen units for debug builds has been
raised from 1 to 32. This should enable most `cargo build` compiles that are
bottlenecked on translation and/or code generation to immediately see speedups
through parallelization on available cores.
Work is being done to *always* enable multiple codegen units (and therefore
parallel codegen) but it requires #44841 at least to be landed and stabilized,
but stay tuned if you're interested in that aspect!
Point at signature on unused lint
```
warning: struct is never used: `Struct`
--> $DIR/unused-warning-point-at-signature.rs:22:1
|
22 | struct Struct {
| ^^^^^^^^^^^^^
```
Fix#33961.