rework the README.md for rustc and add other readmes
OK, so, long ago I committed to the idea of trying to write some high-level documentation for rustc. This has proved to be much harder for me to get done than I thought it would! This PR is far from as complete as I had hoped, but I wanted to open it so that people can give me feedback on the conventions that it establishes. If this seems like a good way forward, we can land it and I will open an issue with a good check-list of things to write (and try to take down some of them myself).
Here are the conventions I established on which I would like feedback.
**Use README.md files**. First off, I'm aiming to keep most of the high-level docs in `README.md` files, rather than entries on forge. My thought is that such files are (a) more discoverable than forge and (b) closer to the code, and hence can be edited in a single PR. However, since they are not *in the code*, they will naturally get out of date, so the intention is to focus on the highest-level details, which are least likely to bitrot. I've included a few examples of common functions and so forth, but never tried to (e.g.) exhaustively list the names of functions and so forth.
- I would like to use the tidy scripts to try and check that these do not go out of date. Future work.
**librustc/README.md as the main entrypoint.** This seems like the most natural place people will look first. It lays out how the crates are structured and **is intended** to give pointers to the main data structures of the compiler (I didn't update that yet; the existing material is terribly dated).
**A glossary listing abbreviations and things.** It's much harder to read code if you don't know what some obscure set of letters like `infcx` stands for.
**Major modules each have their own README.md that documents the high-level idea.** For example, I wrote some stuff about `hir` and `ty`. Both of them have many missing topics, but I think that is roughly the level of depth that would be good. The idea is to give people a "feeling" for what the code does.
What is missing primarily here is lots of content. =) Here are some things I'd like to see:
- A description of what a QUERY is and how to define one
- Some comments for `librustc/ty/maps.rs`
- An overview of how compilation proceeds now (i.e., the hybrid demand-driven and forward model) and how we would like to see it going in the future (all demand-driven)
- Some coverage of how incremental will work under red-green
- An updated list of the major IRs in use of the compiler (AST, HIR, TypeckTables, MIR) and major bits of interesting code (typeck, borrowck, etc)
- More advice on how to use `x.py`, or at least pointers to that
- Good choice for `config.toml`
- How to use `RUST_LOG` and other debugging flags (e.g., `-Zverbose`, `-Ztreat-err-as-bug`)
- Helpful conventions for `debug!` statement formatting
cc @rust-lang/compiler @mgattozzi
Fix a typo in rustc help menu
Change from native-static-deps to native-static-libs.
Fix a typo introduced by this merged pull request: https://github.com/rust-lang/rust/pull/43067
Right now the HIR contains raw `syntax::ast::Attribute` structure but nowadays
these can contain arbitrary tokens. One variant of the `Token` enum is an
"interpolated" token which basically means to shove all the tokens for a
nonterminal in this position. A "nonterminal" in this case is roughly analagous
to a macro argument:
macro_rules! foo {
($a:expr) => {
// $a is a nonterminal as an expression
}
}
Currently nonterminals contain namely items and expressions, and this poses a
problem for incremental compilation! With incremental we want a stable hash of
all HIR items, but this means we may transitively need a stable hash *of the
entire AST*, which is certainly not stable w/ node ids and whatnot. Hence today
there's a "bug" where the "stable hash" of an AST is just the raw hash value of
the AST, and this only arises with interpolated nonterminals. The downside of
this approach, however, is that a bunch of errors get spewed out during
compilation about how this isn't a great idea.
This PR is focused at fixing these warnings, basically deleting them from the
compiler. The implementation here is to alter attributes as they're lowered from
the AST to HIR, expanding all nonterminals in-place as we see them. This code
for expanding a nonterminal to a token stream already exists for the
`proc_macro` crate, so we basically just reuse the same implementation there.
After this PR it's considered a bug to have an `Interpolated` token and hence
the stable hash implementation simply uses `bug!` in this location.
Closes#40946
Add proper help line for `-C inline threshold`
Looks like someone accidentally some words when adding this.
This also remove a period on a different help line for consistency, as no options have a period.
incr.comp.: Compute fingerprint for all query results.
This PR enables query result fingerprinting in incremental mode. This is an essential piece of infrastructure for red/green tracking. We don't do anything with the fingerprints yet but merging the infrastructure should protect it from bit-rotting and will make it easier to start measuring its performance impact (and thus let us determine if we should switch to a faster hashing algorithm rather sooner than later).
Note, this PR also includes the changes from https://github.com/rust-lang/rust/pull/43887 which I'm therefore closing. No need to re-review the first commit though.
r? @nikomatsakis
- Don't hash traits in scope as part of HIR hashing any more.
- Some queries returned DefIndexes from other crates.
- Provide a generic way of stably hashing maps (not used everywhere yet).
This makes sure that we don't introduce strange cases where we have
nodes outside the query system that could break red/green tracking
and it will allow to keep red/green neatly encapsulated within the
DepGraph implementation.
This commit moves the actual code generation in the compiler behind a query
keyed by a codegen unit's name. This ended up entailing quite a few internal
refactorings to enable this, along with a few cut corners:
* The `OutputFilenames` structure is now tracked in the `TyCtxt` as it affects a
whole bunch of trans and such. This is now behind a query and threaded into
the construction of the `TyCtxt`.
* The `TyCtxt` now has a channel "out the back" intended to send data to worker
threads in rustc_trans. This is used as a sort of side effect of the codegen
query but morally what's happening here is the return value of the query
(currently unit but morally a path) is only valid once the background threads
have all finished.
* Dispatching work items to the codegen threads was refactored to only rely on
data in `TyCtxt`, which mostly just involved refactoring where data was
stored, moving it from the translation thread to the controller thread's
`CodegenContext` or the like.
* A new thread locals was introduced in trans to work around the query
system. This is used in the implementation of `assert_module_sources` which
looks like an artifact of the old query system and will presumably go away
once red/green is up and running.
This commit attaches a channel to the LLVM workers to the `TyCtxt` which will
later be used during the codegen query to actually send work to LLVM workers.
Otherwise this commit is just plumbing this channel throughout the compiler to
ensure it reaches the right consumers.
This commit removes the `crate_trans_items` field from the `CrateContext` of
trans. This field, a big map, was calculated during partioning and was a set of
all translation items. This isn't quite incremental-friendly because the map may
change a lot but not have much effect on downstream consumers.
Instead a new query was added for the one location this map was needed, along
with a new comment explaining what the location is doing!
This is a big map that ends up inside of a `CrateContext` during translation for
all codegen units. This means that any change to the map may end up causing an
incremental recompilation of a codegen unit! In order to reduce the amount of
dependencies here between codegen units and the actual input crate this commit
refactors dealing with exported symbols and such into various queries.
The new queries are largely based on existing queries with filled out
implementations for the local crate in addition to external crates, but the main
idea is that while translating codegen untis no unit needs the entire set of
exported symbols, instead they only need queries about particulare `DefId`
instances every now and then.
The linking stage, however, still generates a full list of all exported symbols
from all crates, but that's going to always happen unconditionally anyway, so no
news there!
Otherwise we may emit double errors related to the `#[export_name]` attribute,
for example, and using a query should ensure that it's only emitted at most
once.
This commit moves the `collect_and_partition_translation_items` function into a
query on `TyCtxt` instead of a free function in trans, allowing us to track
dependencies and such of the function.
This commit moves the definition of the `ExportedSymbols` structure to the
`rustc` crate and then creates a query that'll be used to construct the
`ExportedSymbols` set. This in turn uses the reachablity query exposed in the
previous commit.
rustc: Preallocate when building the dep graph
This commit alters the `query` function in the dep graph module to preallocate
memory using `with_capacity` instead of relying on automatic growth. Discovered
in #44576 it was found that for the syntex_syntax clean incremental benchmark
the peak memory usage was found when the dep graph was being saved, particularly
the `DepGraphQuery` data structure itself. PRs like #44142 which add more
queries end up just making this much larger!
I didn't see an immediately obvious way to reduce the size of the
`DepGraphQuery` object, but it turns out that `with_capacity` helps quite a bit!
Locally 831 MB was used [before] this commit, and 770 MB is in use at the peak
of the compiler [after] this commit. That's a nice 7.5% improvement! This won't
quite make up for the losses in #44142 but I figured it's a good start.
[before]: https://gist.github.com/alexcrichton/2d2b9c7a65503761925c5a0bcfeb0d1e
[before]: https://gist.github.com/alexcrichton/6da51f2a6184bfb81694cc44f06deb5b
bring TyCtxt into scope
got comments both from @eddyb and @nikomatsakis (via https://github.com/rust-lang/rust/pull/44505) that we should always put `TyCtxt` in scope
should I just go and import it at other places in the codebase or we just keep doing small improvements?
Individualize feature gates for const fn invocation
This PR changes the meaning of `#![feature(const_fn)]` so it is only required to declare a const fn but not to call one. Based on discussion at #24111. I was hoping we could have an FCP here in order to move that conversation forward.
This sets the stage for future stabilization of the constness of several functions in the standard library (listed below), so could someone please tag the lang team for review.
- `std::cell`
- `Cell::new`
- `RefCell::new`
- `UnsafeCell::new`
- `std::mem`
- `size_of`
- `align_of`
- `std::ptr`
- `null`
- `null_mut`
- `std::sync`
- `atomic`
- `Atomic{Bool,Ptr,Isize,Usize}::new`
- `once`
- `Once::new`
- primitives
- `{integer}::min_value`
- `{integer}::max_value`
Some other functions are const but they are also unstable or hidden, e.g. `Unique::new` so they don't have to be considered at this time.
After this stabilization, the following `*_INIT` constants in the standard library can be deprecated. I wasn't sure whether to include those deprecations in the current PR.
- `std::sync`
- `atomic`
- `ATOMIC_{BOOL,ISIZE,USIZE}_INIT`
- `once`
- `ONCE_INIT`
This commit alters the `query` function in the dep graph module to preallocate
memory using `with_capacity` instead of relying on automatic growth. Discovered
in #44576 it was found that for the syntex_syntax clean incremental benchmark
the peak memory usage was found when the dep graph was being saved, particularly
the `DepGraphQuery` data structure itself. PRs like #44142 which add more
queries end up just making this much larger!
I didn't see an immediately obvious way to reduce the size of the
`DepGraphQuery` object, but it turns out that `with_capacity` helps quite a bit!
Locally 831 MB was used [before] this commit, and 770 MB is in use at the peak
of the compiler [after] this commit. That's a nice 7.5% improvement! This won't
quite make up for the losses in #44142 but I figured it's a good start.
[before]: https://gist.github.com/alexcrichton/2d2b9c7a65503761925c5a0bcfeb0d1e
[before]: https://gist.github.com/alexcrichton/6da51f2a6184bfb81694cc44f06deb5b
This commit removes the `dep_graph` field from the `Session` type according to
issue #44390. Most of the fallout here was relatively straightforward and the
`prepare_session_directory` function was rejiggered a bit to reuse the results
in the later-called `load_dep_graph` function.
Closes#44390
Remove deprecated lang items
They have been deprecated for years and there is no trace left of them in the compiler. Also removed `require_owned_box` which is dead code and other small refactorings.