Don't try to evaluate const blocks during constant promotion
As of https://github.com/rust-lang/rust/pull/138499, trying to evaluate a const block in anything depended on by borrow-checking will result in a query cycle. Since that could happen in constant promotion, this PR adds a check for const blocks there to stop them from being evaluated.
Admittedly, this is a hack. See https://github.com/rust-lang/rust/issues/124328 for discussion of a more principled fix: removing cases like this from constant promotion altogether. To simplify the conditions under which promotion can occur, we probably shouldn't be implicitly promoting division or array indexing at all if possible. That would likely require an FCW and migration period, so I figure we may as well patch up the cycle now and simplify later.
Fixes rust-lang/rust#150464
I'll also lang-nominate this for visibility. I'm not sure there's much to discuss about this PR specifically, but it does represent a change in semantics. In Rust 1.87, the code below compiled. In Rust 1.88, it became a query cycle error. After this PR, it fails to borrow-check because the temporaries can no longer be promoted.
```rust
let (x, y, z);
// We only promote array indexing if the index is known to be in-bounds.
x = &([0][const { 0 }] & 0);
// We only promote integer division if the divisor is known not to be zero.
y = &(1 / const { 1 });
// Furthermore, if the divisor is `-1`, we only promote if the dividend is
// known not to be `int::MIN`.
z = &(const { 1 } / -1);
// The borrowed temporaries can't be promoted, so they were dropped at the ends
// of their respective statements.
(x, y, z);
```
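For contrast, here is a minimal sketch of two cases I would expect to keep compiling: promotion where the operands are plain literals, and hoisting the whole expression into a named `const` item so that no implicit promotion of the division is needed. The names `promoted_literals`, `hoisted_const_block`, and `Y` are illustrative and not part of this PR.

```rust
/// Literal operands: the divisor is visibly non-zero, so the temporaries
/// are promoted and the borrows can be `'static`.
fn promoted_literals() -> (&'static i32, &'static i32) {
    let sum: &'static i32 = &(1 + 2);
    let quotient: &'static i32 = &(6 / 2);
    (sum, quotient)
}

/// Hoisting the expression into a `const` item: the const block is evaluated
/// as part of the item, and borrowing the named constant is promoted as a unit.
fn hoisted_const_block() -> &'static i32 {
    const Y: i32 = 1 / const { 1 };
    &Y
}
```

Code that relied on the old behavior could likely be migrated along the lines of `hoisted_const_block`.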
Reintroduce `QueryStackFrame` split.
I tried removing it in rust-lang/rust#151203, to replace it with something simpler. But a couple of fuzzing failures have come up and I don't have a clear picture on how to fix them. So I'm reverting the main part of rust-lang/rust#151203.
This commit also adds the two fuzzing tests.
Fixes rust-lang/rust#151226, rust-lang/rust#151358.
r? @oli-obk
slice/ascii: Optimize `eq_ignore_ascii_case` with auto-vectorization
- Refactor the current functionality into a helper function
- Use `as_chunks` to encourage auto-vectorization in the optimized chunk processing function
- Add a codegen test checking for vectorization and no panicking
- Add benches for `eq_ignore_ascii_case`
---
The optimized function is initially only enabled for x86_64 which has `sse2` as part of its baseline, but none of the code is platform specific. Other platforms with SIMD instructions may also benefit from this implementation.
Performance improvements only manifest for slices of 16 bytes or longer, so the optimized path is gated behind a length check for greater than or equal to 16.
Benchmarks - Cases below 16 bytes are unaffected, cases above all show sizeable improvements.
```
before:
str::eq_ignore_ascii_case::bench_large_str_eq 4942.30ns/iter +/- 48.20
str::eq_ignore_ascii_case::bench_medium_str_eq 632.01ns/iter +/- 16.87
str::eq_ignore_ascii_case::bench_str_17_bytes_eq 16.28ns/iter +/- 0.45
str::eq_ignore_ascii_case::bench_str_31_bytes_eq 35.23ns/iter +/- 2.28
str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq 7.56ns/iter +/- 0.22
str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq 2.64ns/iter +/- 0.06
after:
str::eq_ignore_ascii_case::bench_large_str_eq 611.63ns/iter +/- 28.29
str::eq_ignore_ascii_case::bench_medium_str_eq 77.10ns/iter +/- 19.76
str::eq_ignore_ascii_case::bench_str_17_bytes_eq 3.49ns/iter +/- 0.39
str::eq_ignore_ascii_case::bench_str_31_bytes_eq 3.50ns/iter +/- 0.27
str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq 7.27ns/iter +/- 0.09
str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq 2.60ns/iter +/- 0.05
```
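The chunked approach can be sketched roughly as follows. This is not the code from the PR: it uses `chunks_exact` as a stand-in for `as_chunks`, and the function name and the chunk size constant are assumptions for illustration. The idea is that the inner loop has a fixed trip count and no early exit, which gives the optimizer a chance to vectorize it.

```rust
/// Compares two byte slices, ignoring ASCII case, in fixed-size chunks.
/// Hypothetical sketch; the real implementation lives in `core`.
fn eq_ignore_ascii_case_chunked(a: &[u8], b: &[u8]) -> bool {
    const CHUNK: usize = 16;
    if a.len() != b.len() {
        return false;
    }
    let a_chunks = a.chunks_exact(CHUNK);
    let b_chunks = b.chunks_exact(CHUNK);
    // The remainders are the trailing bytes that do not fill a whole chunk.
    let (a_rem, b_rem) = (a_chunks.remainder(), b_chunks.remainder());

    for (ca, cb) in a_chunks.zip(b_chunks) {
        // Accumulate without branching inside the chunk; only bail out
        // once per chunk.
        let mut equal = true;
        for (x, y) in ca.iter().zip(cb) {
            equal &= x.eq_ignore_ascii_case(y);
        }
        if !equal {
            return false;
        }
    }
    // Compare the tail byte by byte.
    a_rem.iter().zip(b_rem).all(|(x, y)| x.eq_ignore_ascii_case(y))
}
```

Whether a given compiler version actually vectorizes this loop is not guaranteed, which is why the PR adds a codegen test.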
Try to reduce rustdoc GUI tests flakiness
Should help with https://github.com/rust-lang/rust/issues/93784.
I replaced a use of the `puppeteer.wait` function with a loop (like the rest of `browser-ui-test`).
r? @jieyouxu
lint: Use rustc_apfloat for `overflowing_literals`, add f16 and f128
Switch to parsing float literals for overflow checks using `rustc_apfloat` rather than host floats. This avoids small variations in platform support and makes it possible to start checking `f16` and `f128` as well.
Using APFloat matches what we try to do elsewhere to avoid platform inconsistencies.
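For context, a host-float style overflow check looks roughly like the sketch below: parse the literal with the host's float parser and treat an infinite result as overflow. The helper names are hypothetical, and this illustrates the approach being moved away from, not the `rustc_apfloat` code. It also shows why `f16` and `f128` were harder to cover this way, since there is no stable host type to parse into.

```rust
/// Returns true if the decimal literal is too large for `f32`.
/// Assumes `literal` is a finite literal as written in source
/// (so never the strings "inf" or "NaN").
fn overflows_f32(literal: &str) -> bool {
    literal.parse::<f32>().map(f32::is_infinite).unwrap_or(false)
}

/// Same check for `f64`.
fn overflows_f64(literal: &str) -> bool {
    literal.parse::<f64>().map(f64::is_infinite).unwrap_or(false)
}
```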
Adds two new Tier 3 targets - `aarch64v8r-unknown-none{,-softfloat}`
## New Tier 3 targets - `aarch64v8r-unknown-none` and `aarch64v8r-unknown-none-softfloat`
This PR adds two new Tier 3 targets - `aarch64v8r-unknown-none` and `aarch64v8r-unknown-none-softfloat`.
The existing `aarch64-unknown-none` target assumes Armv8.0-A as a baseline. However, Arm recently released the Arm Cortex-R82 processor which is the first to implement the Armv8-R AArch64 mode architecture. This architecture is similar to Armv8-A AArch64, however it has a different set of mandatory features, and is based off of Armv8.4. It is largely unrelated to the existing Armv8-R architecture target (`armv8r-none-eabihf`), which only operates in AArch32 mode.
The second `aarch64v8r-unknown-none-softfloat` target allows for possible Armv8-R AArch64 CPUs with no FPU, or for use-cases where FPU register stacking is not desired. As with the existing `aarch64-unknown-none` target we have coupled FPU support and Neon support together - there is no 'has FPU but does not have NEON' target proposed even though the architecture technically allows for it.
These targets are in support of firmware development on upcoming systems using the Arm Cortex-R82, particularly safety-critical firmware development. For now, it can be tested using Arm's Armv8-R AArch64 Fixed Virtual Platform emulator, which we have used to test this target. We are also in the process of testing this target with the full compiler test suite as part of Ferrocene, in the same way we test `aarch64-unknown-none` to a safety-qualified standard. We have not identified any issues as yet, but if we do, we will send the fixes upstream to you.
## Ownership
This PR was developed by Ferrous Systems on behalf of Arm. Arm is the owner of these changes.
## Tier 3 Policy Notes
To cover off the Tier 3 requirements:
> A tier 3 target must have a designated developer or developers
Arm will maintain this target, and I have presumed the Embedded Devices Working Group will also take an interest, as they maintain the existing Arm bare-metal targets.
> Targets must use naming consistent with any existing targets
We prefix this target with `aarch64` because it generates A64 machine code (like `arm*` generates A32 and `thumb*` generates T32). In an ideal world I'd get to rename the existing target `aarch64v8a-unknown-none` but that's basically impossible at this point. You can assume `v6` for any `arm*` target where unspecified, and you can assume `v8a` for any `aarch64*` target where not specified.
> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.
It works just like the existing AArch64 bare-metal target.
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
Noted.
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate.
It's a bare-metal target, offering libcore and liballoc.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible.
Done
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target.
AArch64 is a Tier 1 architecture, so I don't expect this target to cause any issues.
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
Noted.
> Tier 3 targets must be able to produce assembly using at least one of rustc's supported backends from any host target.
It's AArch64 and so works with LLVM.
checksum-freshness: Fix invalid checksum calculation for binary files
Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
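To illustrate why a String-backed representation is awkward for binary files, here is a small sketch. It is only an assumption about the general class of problem, not a description of the exact bug fixed here: arbitrary bytes do not survive a lossy conversion to `String`, so a checksum computed over the string would not match one computed over the file on disk. The function name is hypothetical.

```rust
/// Returns true if converting `bytes` to a `String` (lossily) and back
/// yields the same bytes, i.e. a checksum over either form would agree.
fn lossy_round_trip_preserves(bytes: &[u8]) -> bool {
    String::from_utf8_lossy(bytes).as_bytes() == bytes
}
```

Valid UTF-8 source text round-trips unchanged, while invalid sequences are replaced with U+FFFD and therefore change the byte content.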
Tracking issue: https://github.com/rust-lang/cargo/issues/14136
Closes: rust-lang/rust#151090
Suggest changing `iter`/`into_iter` when the other was meant
When encountering a call to `iter` that should have been `into_iter` and vice-versa, provide a structured suggestion:
```
error[E0271]: type mismatch resolving `<IntoIter<{integer}, 3> as IntoIterator>::Item == &{integer}`
--> $DIR/into_iter-when-iter-was-intended.rs:5:37
|
LL | let _a = [0, 1, 2].iter().chain([3, 4, 5].into_iter());
| ----- ^^^^^^^^^^^^^^^^^^^^^ expected `&{integer}`, found integer
| |
| required by a bound introduced by this call
|
note: the method call chain might not have had the expected associated types
--> $DIR/into_iter-when-iter-was-intended.rs:5:47
|
LL | let _a = [0, 1, 2].iter().chain([3, 4, 5].into_iter());
| --------- ^^^^^^^^^^^ `IntoIterator::Item` is `{integer}` here
| |
| this expression has type `[{integer}; 3]`
note: required by a bound in `std::iter::Iterator::chain`
--> $SRC_DIR/core/src/iter/traits/iterator.rs:LL:COL
help: consider not consuming the `[{integer}, 3]` to construct the `Iterator`
|
LL - let _a = [0, 1, 2].iter().chain([3, 4, 5].into_iter());
LL + let _a = [0, 1, 2].iter().chain([3, 4, 5].iter());
|
```
Finish addressing the original case in rust-lang/rust#68095. Only the case of chaining a `Vec` or `[]` is left unhandled.
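For reference, the code after applying the suggestion looks like the sketch below. Both sides of the `chain` yield `&i32`, so the `Item` types agree. The function name and the final `copied().collect()` are additions for illustration only.

```rust
/// Chains two borrowed array iterators; both yield `&i32`.
fn chained_refs() -> Vec<i32> {
    [0, 1, 2]
        .iter()
        .chain([3, 4, 5].iter())
        .copied()
        .collect()
}
```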
Add a `documentation` remapping path scope for rustdoc usage
This PR adds a new remapping path scope for rustdoc usage: `documentation`, instead of rustdoc abusing the other scopes for its usage.
Like remapping paths in rustdoc, this scope is unstable. (rustdoc doesn't yet even have an equivalent to [rustc `--remap-path-scope`](https://doc.rust-lang.org/nightly/rustc/remap-source-paths.html#--remap-path-scope)).
I also took the opportunity to add a bit of documentation in rustdoc book.
Based on earlier work by León Orell Valerian Liehr.
Co-authored-by: León Orell Valerian Liehr <me@fmease.dev>
Signed-off-by: Usman Akinyemi <uniqueusman@archlinux>
LoongArch: Fix direct-access-external-data test
On LoongArch targets, `-Cdirect-access-external-data` defaults to `no`. Since copy relocations are not supported, `dso_local` is not emitted under `-Crelocation-model=static`, unlike on other targets.
Fix suppression of `unused_assignment` in binding of `unused_variable`
Unused assignments to an unused variable should trigger only the `unused_variables` lint and not also the `unused_assignments` lint. This was previously implemented by checking whether the span of the assignee was within the span of the binding pattern; however, that failed to capture situations where the assignee's span was imported from elsewhere (e.g. from the input token stream of a proc-macro that generates the binding pattern).
By comparing the span of the assignee to those of the variable introductions instead, a reported stable-to-stable regression is resolved.
This fix also impacted some other preexisting tests, which had (undesirably) been triggering both the `unused_variables` and `unused_assignments` lints on the same initializing assignment; those tests have therefore now been updated to expect only the former lint.
Fixes rust-lang/rust#151514
r? cjgillot (as author of reworked liveness testing in rust-lang/rust#142390)
The SSE2 helper function is not inlined across crate boundaries,
so we cannot verify the codegen in an assembly test. The fix is
still verified by the absence of performance regression.
x86 soft-float feature: mark it as forbidden rather than unstable
I am not sure why I made it "unstable" in f755f4cd1a; I think at the time "forbidden" did not work for some reason.
Making it "forbidden" instead has no significant effect on `-Ctarget-feature` use, it just changes the warning. It *does* have the effect that one cannot query this using `cfg(target_feature)` on nightly any more, but that seems fine to me. It only ever worked as an accidental side-effect of f755f4cd1a anyway.
r? @workingjubilee
add CSE optimization tests for iterating over slice
This PR adds a regression test for issue rust-lang/rust#119573.
The test verifies that Common Subexpression Elimination (CSE) is correctly applied across various slice iteration patterns.
std: avoid tearing `dbg!` prints
Fixes https://github.com/rust-lang/rust/issues/136703.
This is an alternative to rust-lang/rust#149859. Instead of formatting everything into a string, this PR makes multi-expression `dbg!` expand into multiple nested matches, with the final match containing a single `eprint!`. By using macro recursion and relying on hygiene, this allows naming every bound value in that `eprint!`.
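The shape of that expansion can be sketched for the two-expression case as below. This is a hand-written approximation, not the macro from this PR: `dbg_pair!` and `demo` are hypothetical names, and the real macro handles any number of expressions through recursion. The bindings `a` and `b` are introduced by the macro, so hygiene keeps them from clashing with names at the call site.

```rust
/// Two-expression sketch: nested matches bind each value, then a single
/// `eprint!` call prints both lines at once.
macro_rules! dbg_pair {
    ($a:expr, $b:expr) => {
        match $a {
            a => match $b {
                b => {
                    eprint!(
                        "[{file}:{line}] {} = {:#?}\n[{file}:{line}] {} = {:#?}\n",
                        stringify!($a),
                        &a,
                        stringify!($b),
                        &b,
                        file = file!(),
                        line = line!(),
                    );
                    (a, b)
                }
            },
        }
    };
}

/// Evaluates both expressions, prints them in one go, and returns the values.
fn demo() -> (i32, &'static str) {
    dbg_pair!(1 + 1, "two")
}
```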
CC @orlp
r? libs
abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.
For relatively tight loops implemented with tail calling (`become`), each of the functions with the regular calling convention is still responsible for restoring the initial value of the preserved registers. So it is not unusual to end up with a situation where each step in the tail call loop is spilling and reloading registers, along the lines of:
    foo:
        push r12
        ; do things
        pop r12
        jmp next_step
This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or for other uses.
I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come up with a great name (`rust-cold` is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)
Fix cstring-merging test for Hexagon target
Hexagon assembler uses `.string` directive instead of `.asciz` for null-terminated strings. Both are equivalent but the test was only checking for `.asciz`.
Update the CHECK patterns to accept both directives using `.{{asciz|string}}` regex pattern.
add `simd_splat` intrinsic
Add `simd_splat` which lowers to the LLVM canonical splat sequence.
```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```
Right now we try to fake it using one of
```rust
fn splat(x: u32) -> u32x8 {
u32x8::from_array([x; 8])
}
```
or (in `stdarch`)
```rust
fn splat(value: $elem_type) -> $name {
#[derive(Copy, Clone)]
#[repr(simd)]
struct JustOne([$elem_type; 1]);
let one = JustOne([value]);
// SAFETY: 0 is always in-bounds because we're shuffling
// a simd type with exactly one element.
unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```
Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:
- https://github.com/rust-lang/rust/issues/60637
- https://github.com/rust-lang/rust/issues/137407
- https://github.com/rust-lang/rust/issues/122623
- https://github.com/rust-lang/rust/issues/97804
---
As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.
Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.
Currently this just adds the intrinsic, it does not actually use it anywhere yet.
Add new Tier 3 targets for ARMv6
Adds three new targets to support ARMv6 processors running bare-metal:
* `armv6-none-eabi` - Arm ISA, soft-float
* `armv6-none-eabihf` - Arm ISA, hard-float
* `thumbv6-none-eabi` - Thumb-1 ISA, soft-float
There is no `thumbv6-none-eabihf` target because, as far as I can tell, hard-float isn't supported with the Thumb-1 instruction set (and you need the ARMv6T2 extension to enable Thumb-2 support).
The targets require ARMv6K as a minimum, which allows the two Arm ISA targets to have full CAS atomics. LLVM has a bug which means it emits some ARMv6K instructions even if you only call for ARMv6, and as no-one else has noticed the bug, and because basically all ARMv6 processors have ARMv6K, I think this is fine. The Thumb target also doesn't have any kind of atomics, just like the Armv5TE and Armv4 targets, because LLVM was emitting library calls to emulate them.
Testing will be added to https://github.com/rust-embedded/aarch32 once the target is accepted. I already have tests for the other non-M arm-none-eabi targets, and those tests pass on these targets.
> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)
I have listed myself. If accepted, I'll talk to the Embedded Devices Working Group about adding this one to the roster with all the others they support.
> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.
You might prefer `arm-none-eabi`, because `arm-unknown-linux-gnu` is an ARMv6 target - the implicit rule seems to be that if the Arm architecture version isn't specified, it's assumed to be v6. However, `armv6-none-eabi` seemed to fit better between `armv5te-none-eabi` and `armv7a/armv7r-none-eabi`.
The Hamming distance between `thumbv6-none-eabi` and `thumbv6m-none-eabi` is unfortunately low, but I don't know how to make it better. They *are* the ARMv6 and ARMv6-M targets, and it's perhaps not worse than `armv7a-none-eabi` and `armv7r-none-eabi`.
> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.
No different to any other arm-none-eabi target.
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
Noted.
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate...
Same as other arm-none-eabi targets.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible.
Same as other arm-none-eabi targets.
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via @) to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
Noted.
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
Noted
> Tier 3 targets must be able to produce assembly using at least one of rustc's supported backends from any host target. (Having support in a fork of the backend is not sufficient, it must be upstream.)
Noted
Three targets, covering A32 and T32 instructions, and soft-float and
hard-float ABIs. Hard-float not available in Thumb mode. Atomics
in Thumb mode require __sync* functions from compiler-builtins.
THIR patterns: Explicitly distinguish `&pin` from plain `&`/`&mut`
Currently, `thir::PatKind::Deref` is used for ordinary `&`/`&mut` patterns, and also for `&pin const` and `&pin mut` patterns under `feature(pin_ergonomics)`. The only way to distinguish between them is by inspecting the `Ty` attached to the pattern node.
That's non-obvious, making it easy to miss, and is also a bit confusing to read when it does occur.
This PR therefore adds an explicit `pin: hir::Pinnedness` field to `thir::PatKind::Deref`, to explicitly distinguish pin-deref nodes from ordinary builtin-deref nodes.
(I'm not deeply familiar with the future of pin-patterns, so I'm not sure whether that information is best carried as a field or as a separate `PatKind`, but I think this approach is at least an improvement over the status quo.)
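As a toy model of the idea (heavily simplified, and not the actual THIR definitions), the sketch below carries pinnedness as an explicit field so that consumers can match on it directly instead of inspecting a type. The variant names of `Pinnedness` follow the compiler's enum, but the cut-down `PatKind` and the helper `count_pin_derefs` are assumptions for illustration.

```rust
/// Whether a deref pattern goes through `&pin`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Pinnedness {
    Not,
    Pinned,
}

/// Drastically reduced stand-in for a pattern tree.
#[derive(Debug)]
enum PatKind {
    Wild,
    Deref { pin: Pinnedness, subpattern: Box<PatKind> },
}

/// Counts pin-deref nodes by matching on the explicit field.
fn count_pin_derefs(pat: &PatKind) -> usize {
    match pat {
        PatKind::Wild => 0,
        PatKind::Deref { pin, subpattern } => {
            let here = if *pin == Pinnedness::Pinned { 1 } else { 0 };
            here + count_pin_derefs(subpattern)
        }
    }
}
```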
r? Nadrieril (or compiler)
`const` blocks as a `mod` item
Tracking issue: rust-lang/rust#149226
This adds support for writing `const { ... }` as an item in a module. In the current implementation, this is a unique AST item that gets lowered to `const _: () = const { ... };` in HIR.
rustfmt support included.
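For illustration, the lowered form can already be written by hand on stable Rust, which gives a feel for what the new item syntax is shorthand for. The assertion inside the block is just an example, and `checked_size` is a hypothetical helper so the snippet has something to call.

```rust
use std::mem::size_of;

// Hand-written equivalent of what a module-level `const { ... }` item
// is described as lowering to: an anonymous unit constant whose
// initializer is the const block.
const _: () = const {
    assert!(size_of::<u32>() == 4);
};

/// Returns the size that the compile-time assertion above has checked.
fn checked_size() -> usize {
    size_of::<u32>()
}
```

With the feature from this PR, the same check would presumably be spelled as a bare `const { ... }` at module level.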
TODO:
- `pub const { ... }` does not make sense (see rust-lang/rust#147136). Reject it. Should this be rejected by the parser or smth?
- Improve diagnostics (preferably they should not mention the fake `_` ident).