43 commits

Author SHA1 Message Date
Bastian Köcher
a8a9a05abb Convert codegen-unit tests to use start instead of main
The new Termination trait brings in the unwinding machinery and that
blows up the number of required `TRANS_ITEM`s.
2017-12-26 12:26:39 +01:00
Bastian Köcher
8f539b09df Fixes codegen-units tests 2017-12-26 12:26:39 +01:00
Michael Woerister
6ae60ea94e Add regression tests for non-instantiation of inline and const fns. 2017-11-07 08:54:38 +01:00
Michael Woerister
7bb9353dd5 Update codegen-unit tests. 2017-11-07 08:54:38 +01:00
Alex Crichton
5d415e8d08 rustc: Handle #[inline(always)] at -O0
This commit updates the handling of `#[inline(always)]` functions at -O0 to
ensure that it's always inlined regardless of the number of codegen units used.

Closes #45201
2017-10-11 17:12:29 -07:00
Alex Crichton
4b2bdf7b54 rustc: Don't inline in CGUs at -O0
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.

The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overall work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.

Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.

One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collide, so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.

[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
2017-10-07 19:09:46 -07:00
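The difference between the two placement strategies described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions, not rustc's partitioning code: `InlinePlacement`, `place_inline_fns`, and `total_copies` are hypothetical names, and picking the first referencing codegen unit (in sorted order) as the single home is an assumption made for illustration, since the commit message does not say how the home unit is chosen.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model of the two strategies named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InlinePlacement {
    /// Old behavior: translate an `#[inline]` function into every
    /// codegen unit that references it (with `internal` linkage).
    EveryReferencingUnit,
    /// New `-O0` behavior: translate it into one codegen unit only;
    /// the other units reference that single copy.
    SingleUnit,
}

/// `uses` maps a codegen unit name to the inline functions it references.
/// Returns, for each inline function, the units that hold a translated copy.
fn place_inline_fns<'a>(
    uses: &BTreeMap<&'a str, BTreeSet<&'a str>>,
    mode: InlinePlacement,
) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut copies: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for (cgu, fns) in uses {
        for f in fns {
            let homes = copies.entry(*f).or_default();
            // Assumption: in single-unit mode the first referencing unit
            // (in sorted order) becomes the home of the only copy.
            if mode == InlinePlacement::EveryReferencingUnit || homes.is_empty() {
                homes.push(*cgu);
            }
        }
    }
    copies
}

/// Total number of function bodies that would be translated.
fn total_copies(copies: &BTreeMap<&str, Vec<&str>>) -> usize {
    copies.values().map(|homes| homes.len()).sum()
}
```

In this model, an inline function referenced from three codegen units costs three translations under `EveryReferencingUnit` and one under `SingleUnit`, which mirrors the reduction in work the commit is aiming for.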
Alex Crichton
4ca1b19fde rustc: Implement ThinLTO
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.

This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:

* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
  single compilation. That is, we won't load upstream rlibs, but we'll instead
  just perform ThinLTO amongst all codegen units produced by the compiler for
  the local crate. This is intended to emulate a desired end point where we have
  codegen units turned on by default for all crates and ThinLTO allows us to do
  this without performance loss.

* In another mode, like full LTO today, we'll optimize all upstream dependencies
  in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
  should finish much more quickly.

There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:

* Controlling parallelism means we can use the existing jobserver support to
  avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
  integrates with our own incremental strategy, but this is yet to be
  determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
  having it tailored to fit our needs for the time being.
* Finally, this allows us to reuse some artifacts such as our `TargetMachine`
  creation, where the options we use today aren't necessarily all supported by
  upstream LLVM yet.

My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
2017-10-07 08:17:52 -07:00
Michael Woerister
c93e62b2c5 Adapt cgu-partitioning tests to pre-trans symbol internalization. 2017-07-13 13:29:25 +02:00
Clar Charr
b9c8e99955 Move Fn to module. 2017-06-09 19:07:25 -04:00
Ariel Ben-Yehuda
f2c7917402 translate drop glue using MIR
Drop of arrays is now translated in trans::block in an ugly way that I
should clean up in a later PR, and does not handle panics in the middle
of an array drop, but this commit & PR are growing too big.
2017-03-18 02:53:08 +02:00
Ariel Ben-Yehuda
bf80fec326 translate function shims using MIR 2017-03-18 02:53:06 +02:00
Michael Woerister
5f90947c2c trans: Treat generics like regular functions, not like #[inline] functions during CGU partitioning. 2017-01-09 10:06:58 -05:00
Jeffrey Seyfried
dfa69be38a Fix fallout in tests. 2016-09-27 06:43:51 +00:00
bors
1cf592fa40 Auto merge of #36551 - eddyb:meta-games, r=nikomatsakis
Refactor away RBML from rustc_metadata.

RBML and `ty{en,de}code` have had their long-overdue purge. Summary of changes:
* Metadata is now a tree encoded in post-order and with relative backward references pointing to children nodes. With auto-deriving and type safety, this makes maintenance and adding new information to metadata painless and bug-free by default. It's also more compact and cache-friendly (cache misses should be proportional to the depth of the node being accessed, not the number of siblings as in EBML/RBML).
* Metadata sizes have been reduced: for `libcore` it went down 16% (`8.38MB` -> `7.05MB`) and for `libstd` 14% (`3.53MB` -> `3.03MB`), while encoding more or less the same information.
* Specialization is used in the bundled `libserialize` (crates.io `rustc_serialize` remains unaffected) to customize the encoding (and more importantly, decoding) of various types, most notably those interned in the `TyCtxt`. Some of this abuses a soundness hole pending a fix (cc @aturon), but when that fix arrives, we'll move to macros 1.1 `#[derive]` and custom `TyCtxt`-aware serialization traits.
* Enumerating children of modules from other crates is now orthogonal to describing those items via `Def` - this is a step towards bridging crate-local HIR and cross-crate metadata
* `CrateNum` has been moved to `rustc` and both it and `NodeId` are now newtypes instead of `u32` aliases, for specializing their decoding. This is `[syntax-breaking]` (cc @Manishearth).

cc @rust-lang/compiler
2016-09-21 19:17:24 -07:00
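The "post-order tree with relative backward references" layout mentioned in this PR can be sketched with a toy encoder. This is a minimal illustration under assumptions, not the actual `rustc_metadata` format: the `Node` type, the flat `u32` word layout (value, child count, then one backward distance per child), and the `encode`/`decode` names are all hypothetical.

```rust
/// Hypothetical tree node used only for this illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Node {
    value: u32,
    children: Vec<Node>,
}

/// Writes `node` into `out` in post-order and returns its position.
/// Children are written first, so a parent can refer to each child by
/// the distance back from its own position (a relative backward reference).
fn encode(node: &Node, out: &mut Vec<u32>) -> usize {
    let child_positions: Vec<usize> =
        node.children.iter().map(|c| encode(c, out)).collect();
    let pos = out.len();
    out.push(node.value);
    out.push(child_positions.len() as u32);
    for child_pos in child_positions {
        out.push((pos - child_pos) as u32);
    }
    pos
}

/// Reads the node stored at `pos`, following backward references to children.
/// Panics on malformed input; a real decoder would validate offsets.
fn decode(buf: &[u32], pos: usize) -> Node {
    let value = buf[pos];
    let count = buf[pos + 1] as usize;
    let children = (0..count)
        .map(|i| {
            let back = buf[pos + 2 + i] as usize;
            decode(buf, pos - back)
        })
        .collect();
    Node { value, children }
}
```

Because the root is written last, decoding starts from the position returned by the top-level `encode` call, and reaching a node touches only its ancestors rather than scanning its siblings.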
bors
5cc6c6b1b7 Auto merge of #36524 - michaelwoerister:trans-inline-only-on-demand, r=nikomatsakis
trans: Only instantiate #[inline] functions in codegen units referencing them

This PR changes how `#[inline]` functions are translated. Before, there was one "master instance" of the function with `external` linkage and a number of on-demand instances with `available_externally` linkage in each codegen unit that referenced the function. This had two downsides:

* Public functions marked with `#[inline]` would be present in machine code of libraries unnecessarily (see #36280 for an example)
* LLVM would crash on `i686-pc-windows-msvc` due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like `available_externally` there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with `internal` linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the `internal` instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler
2016-09-21 01:33:37 -07:00
Eduard Burtescu
4ac30013c3 rustc_trans: don't do on-demand drop glue instantiation. 2016-09-20 20:30:55 +03:00
Michael Woerister
cf976fe2cd Adapt codegen-unit test cases to new behaviour 2016-09-15 22:09:49 -04:00
Michael Woerister
1b3a588f55 trans: Let the collector find drop-glue for all vtables, not just VTableImpl. 2016-09-13 22:11:01 -04:00
Eduard Burtescu
b354ae95a2 rustc: move the SelfSpace before TypeSpace in Substs. 2016-08-17 05:50:57 +03:00
Michael Woerister
09e73a5b03 Make the translation item collector handle *uses* of 'const' items instead of declarations. 2016-08-12 12:07:51 -04:00
Michael Woerister
1c03bfe3b4 trans: Adjust linkage assignment so that we don't need weak linkage. 2016-07-08 10:42:48 -04:00
Michael Woerister
6c8c94b848 Improve linkage assignment in trans::partitioning. 2016-07-08 10:42:47 -04:00
Michael Woerister
2cd8cf92fc Ignore closure-related translation item collection tests. 2016-07-08 10:42:46 -04:00
Ariel Ben-Yehuda
68129a682a fix codegen-units fallout 2016-06-16 09:26:44 +03:00
Ariel Ben-Yehuda
4248269f8a fix fallout in tests 2016-06-04 13:26:37 +03:00
Michael Woerister
4386d19185 trans::collector: Remove some redundant calls to erase_regions(). 2016-05-23 10:21:50 -04:00
Michael Woerister
64bc3c266c trans: Make collector handle the drop_in_place() intrinsic. 2016-05-11 14:30:33 -04:00
James Miller
f4dd4be86a Add test for collecting items in statics 2016-05-11 13:59:14 -04:00
Michael Woerister
85b155f6f1 trans: Don't try to place declarations during codegen unit partitioning. 2016-05-11 13:58:23 -04:00
Steve Klabnik
aa63f54e37 Rollup merge of #33438 - birkenfeld:dup-words, r=steveklabnik
Fix some some duplicate words.
2016-05-07 15:35:19 -04:00
Niko Matsakis
8b1941a783 s/aux/auxiliary, because windows
For legacy reasons (presumably), Windows does not permit files named aux.
2016-05-06 16:24:48 -04:00
Niko Matsakis
fbc082dcc6 move auxiliary builds to a test-relative aux
Instead of finding aux-build files in `auxiliary`, we now search for an
`aux` directory relative to the test. So if your test is
`compile-fail/foo.rs`, we would look in `compile-fail/aux`.  Similarly,
we ignore the `aux` directory when searching for tests.
2016-05-06 16:24:48 -04:00
Georg Brandl
26eb2bef25 Fix some some duplicate words. 2016-05-05 21:12:37 +02:00
Michael Woerister
a4128e5950 Fix a race condition caused by concurrently executed codegen unit tests. 2016-05-01 13:53:39 -04:00
Michael Woerister
0fc9f9a200 Make the codegen unit partitioner also emit item declarations. 2016-04-28 16:53:00 -04:00
Michael Woerister
7f04d35cc6 Add FixedUnitCount codegen unit partitioning strategy. 2016-04-28 14:36:34 -04:00
Michael Woerister
c61f22932d Let the translation item collector make a distinction between drop-glue kinds 2016-04-28 14:36:34 -04:00
James Miller
0e3b37a52e Fix codegen-units tests
I'm not sure what the significance of `drop-glue i8` is, nor why one of
the tests had it appear while the others had it disappear. Either way it
doesn't seem like the presence or absence of it is the focus of the
tests.
2016-04-28 13:18:51 +12:00
James Miller
869172305f Fixup tests
The drop glue for `i8` is no longer generated as a trans item
2016-04-28 13:18:51 +12:00
Michael Woerister
e8441b6784 Add initial version of codegen unit partitioning for incremental compilation. 2016-04-15 10:05:53 -04:00
Michael Woerister
a2217ddb58 Move translation-item-collection tests into subfolder. 2016-04-14 14:00:58 -04:00
Niko Matsakis
6056c5fbed fallout: update codegen-units tests 2016-03-25 14:07:19 -04:00
Michael Woerister
862911df9a Implement the translation item collector.
The purpose of the translation item collector is to find all monomorphic instances of functions, methods and statics that need to be translated into LLVM IR in order to compile the current crate.
So far these instances have been discovered lazily during the trans path. For incremental compilation we want to know the set of these instances in advance, and that is what the trans::collect module provides.
In the future, incremental and regular translation will be driven by the collector implemented here.
2016-01-26 10:17:45 -05:00
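The idea of discovering all monomorphic instances up front can be sketched as a worklist traversal over a toy call graph. This is an illustrative model under assumptions, not the code in this commit: `Instance`, `TyArg`, `Call`, and `collect` are hypothetical names, and function bodies are reduced to a list of calls whose type arguments are either concrete or refer to the caller's own type parameters.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// A monomorphic instance: a function name plus concrete type arguments.
type Instance = (String, Vec<String>);

/// A type argument at a call site inside a (possibly generic) function body.
enum TyArg {
    /// A concrete type such as `u32`.
    Concrete(&'static str),
    /// The caller's own type parameter with the given index.
    Param(usize),
}

/// One call found in a function body.
struct Call {
    callee: &'static str,
    args: Vec<TyArg>,
}

/// Starting from `roots`, finds every instance reachable through calls,
/// substituting the caller's concrete types for its type parameters.
fn collect(
    bodies: &HashMap<&'static str, Vec<Call>>,
    roots: &[Instance],
) -> BTreeSet<Instance> {
    let mut seen: BTreeSet<Instance> = BTreeSet::new();
    let mut queue: VecDeque<Instance> = VecDeque::new();
    for root in roots {
        if seen.insert(root.clone()) {
            queue.push_back(root.clone());
        }
    }
    while let Some((name, tys)) = queue.pop_front() {
        // Functions without a known body (e.g. external ones) are leaves.
        let calls = match bodies.get(name.as_str()) {
            Some(calls) => calls,
            None => continue,
        };
        for call in calls {
            let args: Vec<String> = call
                .args
                .iter()
                .map(|arg| match arg {
                    TyArg::Concrete(t) => t.to_string(),
                    // Panics if the index is out of range for the caller.
                    TyArg::Param(i) => tys[*i].clone(),
                })
                .collect();
            let instance = (call.callee.to_string(), args);
            if seen.insert(instance.clone()) {
                queue.push_back(instance);
            }
        }
    }
    seen
}
```

For example, if `main` calls `wrap::<i8>` and `id::<u32>`, and `wrap<T>` calls `id::<T>`, the traversal yields `main`, `wrap::<i8>`, `id::<i8>`, and `id::<u32>`, so the full set of items to translate is known before any code generation starts.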