This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
rustc: Default 32 codegen units at O0
This commit changes the default of rustc to use 32 codegen units when compiling
in debug mode, typically an opt-level=0 compilation. Since their inception
codegen units have matured quite a bit, gaining features such as:
* Parallel translation and codegen enabling codegen units to get worked on even
more quickly.
* Deterministic and reliable partitioning through the same infrastructure as
incremental compilation.
* Global rate limiting through the `jobserver` crate to avoid overloading the
system.
The largest benefit of codegen units has forever been faster compilation through
parallel processing of modules on the LLVM side of things, using all the cores
available on build machines that typically have many available. Some downsides
have been fixed through the features above, but the major downside remaining is
that using codegen units reduces opportunities for inlining and optimization.
This, however, doesn't matter much during debug builds!
In this commit the default number of codegen units for debug builds has been
raised from 1 to 32. This should enable most `cargo build` compiles that are
bottlenecked on translation and/or code generation to immediately see speedups
through parallelization on available cores.
Work is being done to *always* enable multiple codegen units (and therefore
parallel codegen) but it requires #44841 at least to be landed and stabilized,
but stay tuned if you're interested in that aspect!
This commit changes the default of rustc to use 32 codegen units when compiling
in debug mode, typically an opt-level=0 compilation. Since their inception
codegen units have matured quite a bit, gaining features such as:
* Parallel translation and codegen enabling codegen units to get worked on even
more quickly.
* Deterministic and reliable partitioning through the same infrastructure as
incremental compilation.
* Global rate limiting through the `jobserver` crate to avoid overloading the
system.
The largest benefit of codegen units has forever been faster compilation through
parallel processing of modules on the LLVM side of things, using all the cores
available on build machines that typically have many available. Some downsides
have been fixed through the features above, but the major downside remaining is
that using codegen units reduces opportunities for inlining and optimization.
This, however, doesn't matter much during debug builds!
In this commit the default number of codegen units for debug builds has been
raised from 1 to 32. This should enable most `cargo build` compiles that are
bottlenecked on translation and/or code generation to immediately see speedups
through parallelization on available cores.
Work is being done to *always* enable multiple codegen units (and therefore
parallel codegen) but it requires #44841 at least to be landed and stabilized,
but stay tuned if you're interested in that aspect!
Make `-Z borrowck-mir` imply that `EndRegion`'s should be emitted.
Before this change, the `-Z borrowck-mir` flag is useless if you do not also pass `-Z emit-end-regions`.
So, in the same spirit as f2892ad281, make `-Z borrowck-mir` also emit `EndRegion` statements. (This will hopefully avoid some initial speed bumps for new-comers helping out with NLL.)
`--cap-lints allow` switches off `can_emit_warnings`
This boolean field on the error `Handler` is toggled to silence
warnings when `-A warnings` is passed. (This is actually a separate
mechanism from the global lint level—whether there's some redundancy
to be factored away here is an important question, but not one we
concern ourselves with in this commit.) But the same rationale
applies for `--cap-lints allow`. In particular, this makes the "soft"
feature-gate warning introduced in 8492ad24 (which is not a lint, but
just calls `struct_span_warn`) not pollute the builds of dependent
crates.
Thanks to @kennytm for pointing out the potential of
`can_emit_warnings` for this purpose.
Resolves#44213.
Add proper help line for `-C inline threshold`
Looks like someone accidentally some words when adding this.
This also remove a period on a different help line for consistency, as no options have a period.
This boolean field on the error `Handler` is toggled to silence
warnings when `-A warnings` is passed. (This is actually a separate
mechanism from the global lint level—whether there's some redundancy
to be factored away here is an important question, but not one we
concern ourselves with in this commit.) But the same rationale
applies for `--cap-lints allow`. In particular, this makes the "soft"
feature-gate warning introduced in 8492ad24 (which is not a lint, but
just calls `struct_span_warn`) not pollute the builds of dependent
crates.
Thanks to @kennytm for pointing out the potential of
`can_emit_warnings` for this purpose.
Resolves#44213.
This commit removes the `dep_graph` field from the `Session` type according to
issue #44390. Most of the fallout here was relatively straightforward and the
`prepare_session_directory` function was rejiggered a bit to reuse the results
in the later-called `load_dep_graph` function.
Closes#44390
The main intent is to fix cases where EndRegion emission is believed
to be causing excess peak memory pressure.
It may also be a welcome change to people inspecting the MIR output
who find the EndRegions to be a distraction.
Compact display of static lib dependencies
Fixes#33173
Instead of displaying one dependency per line, I've changed the format to display them all in one line.
As a bonus they're in format of linker flags (`-lfoo`), so the output can be copy&pasted if one is actually going to link as suggested.
This feature allows targets to opt in to full support of the crt-static
feature. Currently, crt-static is allowed on all targets, even those
that really can't or really shouldn't support it. This works because it
is very loose in the specification of its effects. Changing the behavior
of crt-static to be more strict in how it chooses libraries and links
executables would likely cause compilation to fail on these platforms.
To avoid breaking existing uses of crt-static, whitelist targets that
support the new, stricter behavior. For all other targets, this changes
crt-static from being "mostly a no-op" to "explicitly a no-op".
One can either use `-Z borrowck-mir` or add the `#[rustc_mir_borrowck]` attribute
to opt into MIR based borrow checking.
Note that regardless of whether one opts in or not, AST-based borrow
check will still run as well. The errors emitted from AST-based
borrow check will include a "(Ast)" suffix in their error message,
while the errors emitted from MIR-based borrow check will include a
"(Mir)" suffix.
post-rebase: removed check for intra-statement mutual conflict;
replaced with assertion checking that at most one borrow is generated
per statement.
post-rebase: removed dead code: `IdxSet::pairs` and supporting stuff.
In preparation for incremental compilation this commit refactors the lint
handling infrastructure in the compiler to be more "eager" and overall more
incremental-friendly. Many passes of the compiler can emit lints at various
points but before this commit all lints were buffered in a table to be emitted
at the very end of compilation. This commit changes these lints to be emitted
immediately during compilation using pre-calculated lint level-related data
structures.
Linting today is split into two phases, one set of "early" lints run on the
`syntax::ast` and a "late" set of lints run on the HIR. This commit moves the
"early" lints to running as late as possible in compilation, just before HIR
lowering. This notably means that we're catching resolve-related lints just
before HIR lowering. The early linting remains a pass very similar to how it was
before, maintaining context of the current lint level as it walks the tree.
Post-HIR, however, linting is structured as a method on the `TyCtxt` which
transitively executes a query to calculate lint levels. Each request to lint on
a `TyCtxt` will query the entire crate's 'lint level data structure' and then go
from there about whether the lint should be emitted or not.
The query depends on the entire HIR crate but should be very quick to calculate
(just a quick walk of the HIR) and the red-green system should notice that the
lint level data structure rarely changes, and should hopefully preserve
incrementality.
Overall this resulted in a pretty big change to the test suite now that lints
are emitted much earlier in compilation (on-demand vs only at the end). This in
turn necessitated the addition of many `#![allow(warnings)]` directives
throughout the compile-fail test suite and a number of updates to the UI test
suite.
Add MIR Validate statement
This adds statements to MIR that express when types are to be validated (following [Types as Contracts](https://internals.rust-lang.org/t/types-as-contracts/5562)). Obviously nothing is stabilized, and in fact a `-Z` flag has to be passed for behavior to even change at all.
This is meant to make experimentation with Types as Contracts in miri possible. The design is definitely not final.
Cc @nikomatsakis @aturon
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.
[RFC 1974]: https://github.com/rust-lang/rfcs/pull/197
The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.
cc #27389
Prior to this PR, when we aborted because a "critical pass" failed, we
displayed the number of errors from that critical pass. While that's the
number of errors that caused compilation to abort in *that place*,
that's not what people really want to know. Instead, always report the
total number of errors, and don't bother to track the number of errors
from the last pass that failed.
This changes the compiler driver API to handle errors more smoothly,
and therefore is a compiler-api-[breaking-change].
Fixes#42793.
This commit integrates the `jobserver` crate into the compiler. The crate was
previously integrated in to Cargo as part of rust-lang/cargo#4110. The purpose
here is to two-fold:
* Primarily the compiler can cooperate with Cargo on parallelism. When you run
`cargo build -j4` then this'll make sure that the entire build process between
Cargo/rustc won't use more than 4 cores, whereas today you'd get 4 rustc
instances which may all try to spawn lots of threads.
* Secondarily rustc/Cargo can now integrate with a foreign GNU `make` jobserver.
This means that if you call cargo/rustc from `make` or another
jobserver-compatible implementation it'll use foreign parallelism settings
instead of creating new ones locally.
As the number of parallel codegen instances in the compiler continues to grow
over time with the advent of incremental compilation it's expected that this'll
become more of a problem, so this is intended to nip concurrent concerns in the
bud by having all the tools to cooperate!
Note that while rustc has support for itself creating a jobserver it's far more
likely that rustc will always use the jobserver configured by Cargo. Cargo today
will now set a jobserver unconditionally for rustc to use.
This commit deletes the in-tree `getopts` crate in favor of the crates.io-based
`getopts` crate. The main difference here is with a new builder-style API, but
otherwise everything else remains relatively standard.
MIR EndRegion Statements (was MIR dataflow for Borrows)
This PR adds an `EndRegion` statement to MIR (where the `EndRegion` statement is what terminates a borrow).
An earlier version of the PR implemented a dataflow analysis on borrow expressions, but I am now factoring that into a follow-up PR so that reviewing this one is easier. (And also because there are some revisions I want to make to that dataflow code, but I want this PR to get out of WIP status...)
This is a baby step towards MIR borrowck. I just want to get the review process going while I independently work on the remaining steps.
save-analysis: remove a lot of stuff
This commits us to the JSON format and the more general def/ref style of output, rather than also supporting different data formats for different data structures. This does not affect the RLS at all, but will break any clients of the CSV form - AFAIK there are none (beyond a few of my own toy projects) - DXR stopped working long ago.
r? @eddyb