Speedup `copy_src_dirs` in bootstrap
I was kinda offended by how slow it was. Just the `copy_src_dirs` part took ~3s locally in the `x dist rustc-src` step. In release mode it was just 1s, but that's kind of cheating (I wonder if we should build bootstrap in release mode on CI though...).
Did some basic optimizations to bring it down to ~1s also in debug mode.
Maybe it's overkill, due to https://github.com/rust-lang/rust/pull/145455. Up to you whether we should merge it or close it :)
r? `````````@jieyouxu`````````
Fix outdated doc comment
This updates the documentation comment for `Type::is_doc_subtype_of` to more accurately describe its purpose as a subtyping check, rather than equality
fixesrust-lang/rust#138572
r? ````````````@tgross35````````````
const-eval: full support for pointer fragments
This fixes https://github.com/rust-lang/const-eval/issues/72 and makes `swap_nonoverlapping` fully work in const-eval by enhancing per-byte provenance tracking with tracking of *which* of the bytes of the pointer this one is. Later, if we see all the same bytes in the exact same order, we can treat it like a whole pointer again without ever risking a leak of the data bytes (that encode the offset into the allocation). This lifts the limitation that was discussed quite a bit in https://github.com/rust-lang/rust/pull/137280.
For a concrete piece of code that used to fail and now works properly consider this example doing a byte-for-byte memcpy in const without using intrinsics:
```rust
use std::{mem::{self, MaybeUninit}, ptr};
type Byte = MaybeUninit<u8>;
const unsafe fn memcpy(dst: *mut Byte, src: *const Byte, n: usize) {
let mut i = 0;
while i < n {
*dst.add(i) = *src.add(i);
i += 1;
}
}
const _MEMCPY: () = unsafe {
let ptr = &42;
let mut ptr2 = ptr::null::<i32>();
// Copy from ptr to ptr2.
memcpy(&mut ptr2 as *mut _ as *mut _, &ptr as *const _ as *const _, mem::size_of::<&i32>());
assert!(*ptr2 == 42);
};
```
What makes this code tricky is that pointers are "opaque blobs" in const-eval, we cannot just let people look at the individual bytes since *we don't know what those bytes look like* -- that depends on the absolute address the pointed-to object will be placed at. The code above "breaks apart" a pointer into individual bytes, and then puts them back together in the same order elsewhere. This PR implements the logic to properly track how those individual bytes relate to the original pointer, and to recognize when they are in the right order again.
We still reject constants where the final value contains a not-fully-put-together pointer: I have no idea how one could construct an LLVM global where one byte is defined as "the 3rd byte of a pointer to that other global over there" -- and even if LLVM supports this somehow, we can leave implementing that to a future PR. It seems unlikely to me anyone would even want this, but who knows.^^
This also changes the behavior of Miri, by tracking the order of bytes with provenance and only considering a pointer to have valid provenance if all bytes are in the original order again. This is related to https://github.com/rust-lang/unsafe-code-guidelines/issues/558. It means one cannot implement XOR linked lists with strict provenance any more, which is however only of theoretical interest. Practically I am curious if anyone will show up with any code that Miri now complains about - that would be interesting data. Cc `@rust-lang/opsem`
Revert "Partially outline code inside the panic! macro".
This reverts https://github.com/rust-lang/rust/pull/115670
Without any tests/benchmarks that show some improvement, it's hard to know whether the change had any positive effect. (And if it did, whether that effect is still achieved today.)
Do not copy files in `copy_src_dirs` in dry run
This reduces the time to run the current 9 dist snapshot tests from ~24s to ~2s on my PC.
r? `@jieyouxu`
Fix tracing debug representation of steps without arguments in bootstrap
I was wondering why I see `lainSourceTarbal` in tracing logs...
r? `@jieyouxu`
Remove duplicated tracing span in bootstrap
`trace_cmd` is now called also in the `stream` method, so including it also here was duplicating command spans.
r? `@jieyouxu`
Enhance UI test output handling for runtime errors
When a UI test runs a compiled binary and an error/forbid pattern check fails, the failure message previously only showed compiler output, hiding the executed programs stdout/stderr. This makes it harder to see near-miss or unexpected runtime lines.
Fixedrust-lang/rust#141531
Supersedes rust-lang/rust#141977
bootstrap: Reduce dependencies
Eliminate the `fd-lock` dependency by using the new native locking in std.
Eliminate the `xattr` dependency by turning off a feature flag in `tar`, since
the tarballs that we extract with bootstrap don't need it.
Deduplicate -L search paths
For each -L passed to the compiler, we eagerly scan the whole directory. If it has a lot of files, that results in a lot of allocations. So it's needless to do this if some -L paths are actually duplicated (which can happen e.g. in the situation in the linked issue).
This PR both deduplicates the args, and also teaches rustdoc not to pass duplicated args to merged doctests.
Fixes: https://github.com/rust-lang/rust/issues/145375
Split codegen backend check step into two and don't run it with `x check compiler`
This reduces the amount of work that is done during `x check compiler`. We still check both backends during `x check` by defaut, even if they are not in `rust.codegen-backends`, as just checking them shouldn't require expensive preparations, like building GCC.
r? `@jieyouxu`
Reduce usage of `compiler_for` in bootstrap
While working on refactoring/fixing `dist` steps, I realized that `build.full-bootstrap` does much more than it should, and that it its documentation is wrong. It seems that the main purpose of this option should be to enable/disable stdlib/compiler uplifting (https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Purpose.20of.20.60build.2Efull-bootstrap.60/with/533985624), but currently it also affects staging, or more precisely which compiler will be used to build selected steps, because this option is used in the cursed `compiler_for` function.
I would like to change the option it so that it *only* affects uplifting, and doesn't affect stage selection, which I (partially) did in this PR. I removed the usage of `compiler_for` from the `Std` and `Rustc` steps, and explicitly implemented uplifting, without going through `compiler_for`.
The only remaining usages of `compiler_for` are in dist steps (which I'm currently refactoring, will send a PR later) and test steps (which I will take a look at after dist). After that we can finally remove the function.
I tried to document the case when uplifting was happening during cross-compilation, which was very implicit before. I also did a slight change in the uplifting logic for rustc when cross-compiling. Before, we would attempt to uplift a stage1 rustc, but that is not really a thing when cross-compiling.
r? `@jieyouxu`
When a UI test runs a compiled binary and an error/forbid pattern
check fails, the failure message previously only showed compiler output,
hiding the executed programs stdout/stderr. This makes it harder to
see near-miss or unexpected runtime lines.
Implement autodiff using intrinsics
This PR aims to move autodiff logic to `autodiff` intrinsic. Allowing us to delete a great part of our frontend code and overall, simplify the compilation pipeline of autodiff functions.
Change the desugaring of `assert!` for better error output
In the desugaring of `assert!`, we now expand to a `match` expression instead of `if !cond {..}`.
The span of incorrect conditions will point only at the expression, and not the whole `assert!` invocation.
```
error[E0308]: mismatched types
--> $DIR/issue-14091.rs:2:13
|
LL | assert!(1,1);
| ^ expected `bool`, found integer
```
We no longer mention the expression needing to implement the `Not` trait.
```
error[E0308]: mismatched types
--> $DIR/issue-14091-2.rs:15:13
|
LL | assert!(x, x);
| ^ expected `bool`, found `BytePos`
```
Now `assert!(val)` desugars to:
```rust
match val {
true => {},
_ => $crate::panic::panic_2021!(),
}
```
Fix#122159.
bootstrap: Fix jemalloc 64K page support for aarch64 tools
Resolvesrust-lang/rust#133748
The prior page size fix only targeted the compile build step, not the tools step: https://github.com/rust-lang/rust/pull/135081
Also note that since `miri` always uses jemalloc, I didn't copy the `builder.config.jemalloc(target)` check to the tools section.
Tested by running `strings` on the compiled `miri` binary to see the LG_PAGE value.
Before:
```
> strings miri | grep '^LG_PAGE'
LG_PAGE 14
```
After:
```
> strings miri | grep '^LG_PAGE'
LG_PAGE 16
```
May also need a separate fix for the standalone miri repository: https://github.com/rust-lang/miri/issues/4514 (likely a change needed in miri-script?)
Rename and document `ONLY_HOSTS` in bootstrap
Everytime I examined the `ONLY_HOSTS` flag of bootstrap steps, I was utterly confused. Why is it called ONLY_HOSTS? How does the fact that it is skipped if `--target` is passed, but `--host` is not (which was not accurate) help me?
The reality of the flag is that if it is true, the targets for which the given Step will be built is determined based on the `--host` flag, while if it is false, it is determined based on the `--target` flag, that's pretty much it. The previous comment was just a (not very helpful and not even accurate) corollary of that.
I clarified the comment, and also renamed the flag to `IS_HOST` (happy to brainstorm better names, but the doc. comment change is IMO the main improvement).
r? ``@jieyouxu``
Improve tracing in bootstrap
I was annoyed that bootstrap had like 5 separate ways of debugging/tracing/profiling, and it was hard for me to understand how are individual steps executed. This PR tries to unify severla things behind `BOOTSTRAP_TRACING`, and improve tracing/profiling in general:
- All generated tracing outputs are now stored in a single directory to make it easier to examine them, plus bootstrap prepares a `latest` symlink to the latest generated tracing output directory for convenience.
- All executed spans are now logged automatically (without requiring usage of `#[tracing::instrument]`).
- A custom span/event formatter was implemented, to provide domain-specific output (like location of executed commands or spans) and hopefully also to reduce visual clutter.
- `tracing_forest` was removed. While it did some useful postprocessing, it didn't expose enough information for making the dynamic step spans work.
- You can now explicitly log steps (`STEP=info`) and/or commands (`COMMAND=info`), to have more granular control over what gets logged.
- `print-step-timings` also show when a step starts its execution (not just when it ends it), so that when some step fails in CI, we can actually see what step it was (before we would only see the end of the previous step).
- The rustc-dev-guide page on debugging/profiling bootstrap was updated.
There are still some things that work outside of tracing (`print-step-timings` and `dump-bootstrap-shims`), but I think that for now this improvement is good enough.
I removed the `> step`, `< step` verbose output, because I found it unusable, as verbose bootstrap output also enables verbose Cargo output, and then you simply drown in too much data, and because I think that the new tracing system makes it obsolete (although it does require recompilation with the `tracing` feature). If you want to keep it, happy to revert 690c781475. And the information about cached steps is now also shown in the Graphviz step dependency graph.
We can modify the tracing output however we want, as we now implement it ourselves. Notably, we could also show exit logs for step spans, currently I only show enter spans. Maybe creating indents for each executed nested command is also not needed. Happy to hear feedback!
Some further improvements could be to print step durations, if we decide to also log step exit events. We could also try to enable tracing in CI logs, but it might be too verbose.
Best reviewed commit-by-commit.
r? ``@jieyouxu``
CC ``@Shourya742``
Avoid abbreviating "numerator" as "numer", to allow catching typo "numer" elsewhere
`typos.toml` has an exception for "numer", to avoid flagging its use as an abbreviation for "numerator". Remove the use of that abbrevation, spelling out "numerator" instead, and remove the exception, so that typo checks can find future instances of "numer" as a typo for "number".
fix(unicode-table-generator): fix duplicated unique indices
unicode-table-generator panicked while populating `distinct_indices` because of duplicated indices. This was introduced in rust-lang/rust#144134, where the order of `canonical_words.push(...)` and `canonical_words.len()` was swapped.
Fixes: rust-lang/rust#144134
Add tracing to resolve-related functions
Resolve-related functions are not called often but still make up for ~3% of execution time for non-repetitive programs (as seen in the first table below, obtained from running the rust snippet at the bottom with `n=1`). On the other hand, for repetitive programs they become less relevant (I tested the same snippet but with `n=100` and got ~1.5%), and it appears that only `try_resolve` is called more often (see the last two tables).
The first table was obtained by opening the trace file in https://ui.perfetto.dev and running the following query:
```sql
select "TOTAL PROGRAM DURATION" as name, count(*), max(ts + dur) as "sum(dur)", 100.0 as "%", null as "min(dur)", null as "max(dur)", null as "avg(dur)", null as "stddev(dur)" from slices union select "TOTAL OVER ALL SPANS (excluding events)" as name, count(*), sum(dur), cast(cast(sum(dur) as float) / (select max(ts + dur) from slices) * 1000 as int) / 10.0 as "%", min(dur), max(dur), cast(avg(dur) as int) as "avg(dur)", cast(sqrt(avg(dur*dur)-avg(dur)*avg(dur)) as int) as "stddev(dur)" from slices where parent_id is null and name != "frame" and name != "step" and dur > 0 union select name, count(*), sum(dur), cast(cast(sum(dur) as float) / (select max(ts + dur) from slices) * 1000 as int) / 10.0 as "%", min(dur), max(dur), cast(avg(dur) as int) as "avg(dur)", cast(sqrt(avg(dur*dur)-avg(dur)*avg(dur)) as int) as "stddev(dur)" from slices where parent_id is null and name != "frame" and name != "step" group by name order by sum(dur) desc, count(*) desc
```
<img width="1687" height="242" alt="image" src="https://github.com/user-attachments/assets/4d4bd890-869b-40f3-a473-8e4c42b02da4" />
The following two tables show how many `resolve` spans there per subname/subcategory, and how much time is spent in each. The first is for `n=1` and the second for `n=100`. The query that was used is:
```sql
select args.string_value as name, count(*), max(dur), avg(dur), sum(dur) from slices inner join args USING (arg_set_id) where args.key = "args." || slices.name and name = "resolve" group by args.string_value
```
<img width="1688" height="159" alt="image" src="https://github.com/user-attachments/assets/a8749856-c099-492e-a86e-6d67b146af9c" />
<img width="1688" height="159" alt="image" src="https://github.com/user-attachments/assets/ce3ac1b5-5c06-47d9-85a6-9b921aea348e" />
The snippet I tested with Miri to obtain the above traces is:
```rust
fn main() {
let n: usize = std::env::args().nth(1).unwrap().parse().unwrap();
let mut v = (0..n).into_iter().collect::<Vec<_>>();
for i in &mut v {
*i += 1;
}
}
```