In `BTreeMap::eq`, do not compare the elements if the sizes are different.
Reverts rust-lang/rust#147101 in library/alloc/src/btree/
rust-lang/rust#147101 replaced some instances of code like `a.len() == b.len() && a.iter().eq(&b)` with just `a.iter().eq(&b)`. However, the optimization that PR introduced only applies to `TrustedLen` iterators, and `BTreeMap`'s iterators are not `TrustedLen`. This theoretically regressed perf for comparing large `BTreeMap`/`BTreeSet`s with unequal lengths but equal prefixes. It also means that comparing two different-length `BTreeMap`/`BTreeSet`s whose elements have `PartialEq` impls that can panic can now panic, though this is not a "promised" behaviour either way (cc rust-lang/rust#149122).
Given that `TrustedLen` is an unsafe trait, I opted to not implement it for `BTreeMap`'s iterators, and instead just revert the change. If someone else wants to audit `BTreeMap`'s iterators to make sure they always return the right number of items (even in the face of incorrect user `Ord` impls) and then implement `TrustedLen` for them so that the optimization works for them, then this can be closed in favor of that (or if the perf regression is deemed too theoretical, this can be closed outright).
Example of theoretical perf regression: https://play.rust-lang.org/?version=beta&mode=release&edition=2024&gist=a37e3d61e6bf02669b251315c9a44fe2 (very rough estimates, using `Instant::elapsed`).
In release mode on stable the comparison takes ~23.68µs.
In release mode on beta/nightly the comparison takes ~48.351057ms.
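As a rough illustration of the difference (a minimal sketch, not the code in `library/alloc`): `Tracked`, `COMPARISONS`, `eq_elementwise` and `eq_len_first` are hypothetical names used only to count how often the element `PartialEq` impl runs.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `Tracked::eq` (illustrative only).
static COMPARISONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical value type whose `PartialEq` impl records each call.
struct Tracked(u32);

impl PartialEq for Tracked {
    fn eq(&self, other: &Self) -> bool {
        COMPARISONS.fetch_add(1, Ordering::Relaxed);
        self.0 == other.0
    }
}

/// Element-wise comparison only: walks the shared prefix even when the
/// lengths already differ.
fn eq_elementwise(a: &BTreeMap<u32, Tracked>, b: &BTreeMap<u32, Tracked>) -> bool {
    a.iter().eq(b.iter())
}

/// Length check first: maps of different sizes never compare elements.
fn eq_len_first(a: &BTreeMap<u32, Tracked>, b: &BTreeMap<u32, Tracked>) -> bool {
    a.len() == b.len() && a.iter().eq(b.iter())
}
```

With two maps that share a long prefix but differ in length, `eq_len_first` returns `false` without touching `Tracked::eq`, while `eq_elementwise` has to compare the whole shared prefix before it notices the difference.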
At least on OVMF, some files copied over from a Linux file system seem
to have an invalid time (year = 1980 and everything else 0). Since Rust
allows the time to be optional and we can return an error, that seems to
be the way to go for now.
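
A minimal sketch of the kind of check this implies, assuming a simplified
stand-in for the UEFI time structure. `UefiTime` and `validate` are
hypothetical names, not the ones used in std, and the check only covers
basic field ranges (for example, month and day must be at least 1):

```rust
/// Hypothetical, simplified stand-in for a UEFI time value.
#[derive(Clone, Copy, Debug, PartialEq)]
struct UefiTime {
    year: u16,
    month: u8,
    day: u8,
    hour: u8,
    minute: u8,
    second: u8,
}

/// Returns `None` when a field is outside its basic valid range, so the
/// caller can report the timestamp as unavailable instead of converting it.
fn validate(t: UefiTime) -> Option<UefiTime> {
    let ok = (1900..=9999).contains(&t.year)
        && (1..=12).contains(&t.month)
        && (1..=31).contains(&t.day)
        && t.hour <= 23
        && t.minute <= 59
        && t.second <= 59;
    ok.then_some(t)
}
```

A value with year = 1980 and every other field 0 is rejected because
month and day are both 0.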
Signed-off-by: Ayush Singh <ayush@beagleboard.org>
chore: Update annotate-snippets to 0.12.10
This PR updates `annotate-snippets` to `0.12.10`, which [includes a fix](756366223c/CHANGELOG.md (fixed)) that modifies some test output.
Bring back i686-pc-windows-gnullvm target
rust-lang/rust#148751 inadvertently removed the i686-pc-windows-gnullvm std build when migrating to native CI runners. Since this change was not agreed upon, we should bring back prebuilt std builds for that target.
There are a few runners that could do it: dist-aarch64-llvm-mingw, dist-x86_64-llvm-mingw, dist-various-1 and dist-various-2.
dist-x86_64-llvm-mingw already takes slightly over 2 hours, so the faster dist-aarch64-llvm-mingw is a better choice.
We could also use one of the dist-various-x jobs; they don't have the llvm-mingw toolchain, but it's trivial to install one.
As discussed extensively in libs-api, the initialized-bytes tracking
primarily benefits calls to `read_buf` that end up initializing the
buffer and calling `read`, at the expense of calls to `read_buf` that
*don't* need to initialize the buffer. Essentially, this optimizes for
the past at the expense of the future. If people observe performance
issues using `read_buf` (or something that calls it) with a given `Read`
impl, they can fix those performance issues by implementing `read_buf`
for that `Read`.
Update the documentation to stop talking about initialized-but-unfilled
bytes.
Remove all functions that just deal with those bytes and their tracking,
and remove usage of those methods.
Remove `BorrowedCursor::advance` as there's no longer a safe case for
advancing within initialized-but-unfilled bytes. Rename
`BorrowedCursor::advance_unchecked` to `advance`.
Update tests.
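
To make the resulting shape concrete, here is a deliberately simplified
model of a buffer that tracks only its filled length. `FilledOnlyBuf` and
its methods are hypothetical and are not the standard library's
`BorrowedBuf`/`BorrowedCursor` API; the sketch only shows that, without
initialized-bytes tracking, advancing past bytes becomes an unsafe
promise by the caller.

```rust
use std::mem::MaybeUninit;

/// Hypothetical buffer that records how many bytes are filled and nothing
/// about initialized-but-unfilled bytes.
struct FilledOnlyBuf<'a> {
    data: &'a mut [MaybeUninit<u8>],
    filled: usize,
}

impl<'a> FilledOnlyBuf<'a> {
    fn new(data: &'a mut [MaybeUninit<u8>]) -> Self {
        Self { data, filled: 0 }
    }

    fn capacity(&self) -> usize {
        self.data.len()
    }

    /// The filled prefix, which is known to be initialized.
    fn filled(&self) -> &[u8] {
        let filled = &self.data[..self.filled];
        // SAFETY: bytes in `..self.filled` were written by `append` or
        // vouched for by a caller of `advance`.
        unsafe { std::slice::from_raw_parts(filled.as_ptr().cast::<u8>(), filled.len()) }
    }

    /// The unfilled tail; nothing is assumed about its contents.
    fn unfilled_mut(&mut self) -> &mut [MaybeUninit<u8>] {
        &mut self.data[self.filled..]
    }

    /// Safe path: copy `src` into the unfilled tail and mark it filled.
    fn append(&mut self, src: &[u8]) {
        assert!(src.len() <= self.capacity() - self.filled, "not enough space");
        for (slot, byte) in self.unfilled_mut().iter_mut().zip(src) {
            slot.write(*byte);
        }
        self.filled += src.len();
    }

    /// Marks `n` more bytes as filled.
    ///
    /// # Safety
    /// The caller must have initialized the next `n` unfilled bytes.
    unsafe fn advance(&mut self, n: usize) {
        assert!(n <= self.capacity() - self.filled);
        self.filled += n;
    }
}
```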
Rollup of 8 pull requests
Successful merges:
- rust-lang/rust#145628 ([std][BTree] Fix behavior of `::append` to match documentation, `::insert`, and `::extend`)
- rust-lang/rust#149241 (Fix armv4t- and armv5te- bare metal targets)
- rust-lang/rust#149470 (compiletest: Prepare ignore/only conditions once in advance, without a macro)
- rust-lang/rust#149507 (Mark windows-gnu* as lacking build with assertions)
- rust-lang/rust#149508 (Prefer helper functions to identify MinGW targets)
- rust-lang/rust#149516 (Stop adding MSYS2 to PATH)
- rust-lang/rust#149525 (debuginfo/macro-stepping test: extend comments)
- rust-lang/rust#149526 (Add myself (mati865) to the review rotation)
r? `@ghost`
`@rustbot` modify labels: rollup
debuginfo/macro-stepping test: extend comments
Those `#locN` markers look like they are debuginfo compiletest magic (since `#break` is debuginfo compiletest magic). However, they are actually just magic strings used by the `check` commands in the test itself. This threw me off when I looked at the test (prompted by a CI failure), so let's leave a comment for the next poor soul that ends up lost in this test.
Also, for some reason the lldb instructions do not check for `#loc6`, unlike the gdb instructions. I do not know of an lldb version that actually makes the test pass (do we even run it with lldb at all on CI?), so I won't try to add a check for loc6, but let's at least add a comment to increase the chance that someone more knowledgeable about lldb and our test suite notices this in the future.
Mark windows-gnu* as lacking build with assertions
Knowing that `x86_64-pc-windows-gnu` has no builds with assertions, I just copied its entry for `x86_64-pc-windows-gnullvm` and called it a day. Obviously it should have been `false`, sorry for that.
While at it, also fix `x86_64-pc-windows-gnu`.
compiletest: Prepare ignore/only conditions once in advance, without a macro
Compiletest has historically handled `ignore-*` and `only-*` directives in an extremely confusing way that makes the code hard to understand and hard to modify.
This PR therefore takes an important step away from that older design by instead evaluating a set of named boolean "conditions" in advance, and then using those conditions to help determine whether a particular directive should cause its test to be ignored or not.
As usual, there's more cleanup that I want to do here, but I've left most of it for future work to help keep this PR manageable.
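The general idea can be sketched as follows. This is a hypothetical, heavily simplified model and not compiletest's actual code: `PreparedConditions`, `prepare` and `should_ignore` are made-up names, and real directives have more forms than the two handled here.

```rust
use std::collections::HashSet;

/// Hypothetical set of condition names that hold for the current test run,
/// computed once up front.
struct PreparedConditions {
    true_names: HashSet<String>,
}

impl PreparedConditions {
    /// Evaluate every named condition once, in advance.
    fn prepare(target: &str, os: &str, arch: &str) -> Self {
        let mut true_names = HashSet::new();
        for name in [target, os, arch] {
            true_names.insert(name.to_owned());
        }
        Self { true_names }
    }

    /// Decide whether a directive causes the test to be ignored, using only
    /// the precomputed conditions.
    fn should_ignore(&self, directive: &str) -> bool {
        if let Some(name) = directive.strip_prefix("ignore-") {
            self.true_names.contains(name)
        } else if let Some(name) = directive.strip_prefix("only-") {
            !self.true_names.contains(name)
        } else {
            false
        }
    }
}
```

The point of the design is that each directive check becomes a plain lookup against named booleans, rather than re-deriving the condition at every directive.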
r? jieyouxu
Fix armv4t- and armv5te- bare metal targets
These two targets currently force on the LLVM feature `+atomics-32`. LLVM doesn't appear to actually be able to emit 32-bit load/store atomics for these targets despite this feature, and emits calls to a shim function called `__sync_lock_test_and_set_4`, which nothing in the Rust standard library supplies.
See [#t-compiler/arm > __sync_lock_test_and_set_4 on Armv5TE](https://rust-lang.zulipchat.com/#narrow/channel/242906-t-compiler.2Farm/topic/__sync_lock_test_and_set_4.20on.20Armv5TE/with/553724827) for more details.
Experimenting with clang and gcc (as logged in that zulip thread) shows that C code cannot do atomic load/stores on that architecture either (at least, not without a library call inserted).
So, the safest thing to do is probably turn off `+atomics-32` for these two Tier 3 targets.
I asked `@Lokathor` and he said he didn't even use atomics on the `armv4t-none-eabi`/`thumbv4t-none-eabi` target he maintains.
I was unable to reach `@QuinnPainter` for comment for `armv5te-none-eabi`/`thumbv5te-none-eabi`.
The second commit renames the base target spec `spec::base::thumb` to `spec::base::arm_none` and changes `armv4t-none-eabi`/`thumbv4t-none-eabi` and `armv5te-none-eabi`/`thumbv5te-none-eabi` to use it. This harmonises the frame-pointer and linker options across the bare-metal Arm EABI and EABIHF targets.
You could make an argument for harmonising `armv7a-none-*`, `armv7r-none*` and `armv8r-none-*` as well, but that can be another PR.
Compute jump threading opportunities in a single pass
The current implementation of jump threading walks the MIR CFG backwards from each `SwitchInt` terminator. This PR replaces this with a single postorder traversal of MIR. In theory, we could do a full fixpoint dataflow analysis, but this has low returns as we forbid threading through a loop header.
The second commit in this PR modifies the carried state to a lighter data structure. The current implementation uses some kind of `IndexVec<ValueIndex, &[Condition]>`. This is needlessly heavy, as the state rarely ever carries more than a few `Condition`s. This commit replaces this state with a simpler `&[Condition]`, and puts the corresponding `ValueIndex` inside `Condition`.
The three later commits are perf tweaks.
The sixth commit is the main change. Instead of carrying the goto target inside the condition, we maintain a set of conditions associated with each block, and their consequences in following blocks. Think: if this condition is fulfilled in this block, then that condition is fulfilled in that block. This makes the threading algorithm much easier to implement, without the extra bookkeeping of `ThreadingOpportunity` we had.
Later commits modify that algorithm to shrink the set of duplicated blocks, by propagating fulfilled conditions down the CFG and trimming costly threads.
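For readers unfamiliar with the traversal order, here is a toy sketch of a postorder walk over a control-flow graph. It is not rustc's implementation: the graph is a plain adjacency list and `postorder` is a hypothetical helper.

```rust
/// Toy control-flow graph: `succs[b]` lists the successor blocks of block `b`.
/// This is an illustrative stand-in, not rustc's MIR data structure.
fn postorder(succs: &[Vec<usize>], entry: usize) -> Vec<usize> {
    fn visit(succs: &[Vec<usize>], block: usize, visited: &mut [bool], order: &mut Vec<usize>) {
        visited[block] = true;
        for &next in &succs[block] {
            if !visited[next] {
                visit(succs, next, visited, order);
            }
        }
        // A block is emitted only after its not-yet-visited successors.
        order.push(block);
    }

    let mut visited = vec![false; succs.len()];
    let mut order = Vec::with_capacity(succs.len());
    visit(succs, entry, &mut visited, &mut order);
    order
}
```

For a diamond where block 0 branches to blocks 1 and 2, which both jump to block 3, this yields 3, 1, 2, 0. In a graph without back edges, every block is therefore processed after all of its successors, which is what allows information about later blocks to be available in one pass.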