Update cargo
7 commits in 71cd3a926f0cf41eeaf9f2a7f2194b2aff85b0f6..9b13310ca596020a737aaa47daa4ed9ff8898a2f
2023-11-20 15:30:57 +0000 to 2023-11-24 16:20:51 +0000
- feat: Add `CARGO_RUSTC_CURRENT_DIR` (unstable) (rust-lang/cargo#12996)
- Exited with hard error when custom build file no existence or not in package (rust-lang/cargo#12995)
- try running on windows (rust-lang/cargo#13042)
- refactor(toml): Better abstract inheritance details (rust-lang/cargo#13021)
- cargo-test-support: Add features to the default Cargo.toml file (rust-lang/cargo#12997)
- Migrate rustfix to the cargo repo (rust-lang/cargo#13005)
- typo: rusc -> rustc (rust-lang/cargo#13019)
---
This also removes the check ensuring that the `rustfix` versions used by
* src/tools/cargo
* src/tools/compiletest
are the same,
since `rust-lang/rustfix` has been migrated into `rust-lang/cargo`.
r? ghost
run the provenance-gc=1 test on all targets, but only for the host tests
No need to slow down *all those tests* running on the Linux host... but let's cover each major OS at least once. We've had bugs that only some macOS-specific code in `getrandom` found, after all.
Let's see how much this affects timing on the macOS / Windows runners.
before: only on Linux host, all tests
after: only the test suite itself (not cargo-miri or the mir-opt-level=4 run),
on all hosts for the host target and on Linux for all "full" targets.
Replace `option.map(cond) == Some(true)` with `option.is_some_and(cond)`
Requested by `@fmease` in https://github.com/rust-lang/rust/pull/118226#pullrequestreview-1747432292.
There is also a much larger number of `option.map_or(false, cond)` that can be changed separately if someone wants.
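As a minimal sketch of the rewrite, the three spellings below are equivalent for any predicate. The helper names and the `len`/`limit` parameters are hypothetical and exist only for illustration.

```rust
/// Old spelling: map to a bool, then compare against `Some(true)`.
fn exceeds_old(len: Option<usize>, limit: usize) -> bool {
    len.map(|n| n > limit) == Some(true)
}

/// New spelling: `Option::is_some_and` says the same thing directly.
fn exceeds_new(len: Option<usize>, limit: usize) -> bool {
    len.is_some_and(|n| n > limit)
}

/// The other common spelling mentioned above, which could be migrated separately.
fn exceeds_map_or(len: Option<usize>, limit: usize) -> bool {
    len.map_or(false, |n| n > limit)
}
```

All three return `false` for `None`, which is why the replacement is purely mechanical.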
r? fmease
Fix error count display being different when there's only one error left
Supersedes #114759
### What did I do?
I did the small change in `rustc_errors` by hand. Then I did the other changes in `/compiler` by hand, those were just find replace on `*.rs` in the workspace. The changes in run-make are find replace for `run-make` in the workspace.
All other changes are blessed using `x test TEST --bless`. I blessed the tests that were blessed in #114759.
### how to review this nightmare
ping bors with an `r+`. You should check that my logic is sound and maybe quickly scroll through the diff, but fully verifying it seems fairly hard to impossible. I did my best to do this correctly.
Thank you `@adrianEffe` for bringing this up and your initial implementation.
cc `@flip1995`, you said you want to do a subtree sync asap
cc `@RalfJung` maybe you want to do a quick subtree sync afterwards as well for Miri
r? `@WaffleLapkin`
Update windows-bindgen and define `INVALID_HANDLE_VALUE` ourselves
We generate bindings to the Windows API via the `windows-bindgen` crate, which is ultimately what's also used to generate the `windows-sys` and `windows` crates. However, there currently is some custom sauce just for std which makes it a bit different from the vanilla bindings. I would love for us to reduce and eventually remove the differences entirely so that std is using the exact same bindings as everyone else. Maybe in the future we can even just have a normal dependency on `windows-sys`.
This PR removes one of those special things. Our definition of `INVALID_HANDLE_VALUE` relies on an experimental nightly feature for strict provenance, so let's bring that back in house. It also excludes it from the codegen step, though that isn't strictly necessary as we override it in any case.
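For readers unfamiliar with the constant, here is a rough sketch of what a hand-written definition can look like. It is not the definition used in std: the `HANDLE` alias and the `is_invalid` helper are assumptions for illustration, and it uses a plain integer-to-pointer cast on stable Rust rather than the strict-provenance API mentioned above.

```rust
use std::ffi::c_void;

/// Assumed alias mirroring the Windows `HANDLE` type (a raw pointer).
#[allow(non_camel_case_types)]
type HANDLE = *mut c_void;

/// `INVALID_HANDLE_VALUE` is the all-ones bit pattern, i.e. `-1` viewed as a pointer.
/// Sketch only: a plain cast, not the strict-provenance form.
const INVALID_HANDLE_VALUE: HANDLE = usize::MAX as HANDLE;

/// Hypothetical helper showing how the constant is typically compared against.
fn is_invalid(handle: HANDLE) -> bool {
    handle == INVALID_HANDLE_VALUE
}
```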
This PR also updates windows-bindgen to 0.52.0.
Miri: GC the dead_alloc_map too
dead_alloc_map is the last piece of state in the interpreter I can find that leaks. With this PR, all of the long-term memory growth I can find in Miri with programs that do things like run a big `loop {` or run property tests is attributable to some data structure properties in borrow tracking, and is _extremely_ slow.
My only gripe with the commit in this PR is that I don't have a new test for it. I'd like to have a regression test for this, but it would have to be statistical I think because the peak memory of a process that Linux reports is not exactly the same run-to-run. Which means it would have to not be very sensitive to slow leaks (some guesswork suggests for acceptable CI time we would be checking for like 10% memory growth over a minute or two, which is still pretty fast IMO).
Unless someone has a better idea for how to detect a regression, I think on balance I'm fine with manually keeping an eye on the memory use situation.
r? RalfJung
detect and test for data races between setenv and getenv
But only on Unix; Windows doesn't have such a data race. Also make target_os_is_unix properly check for unix, which then makes our completely empty android files useless.
Expand Miri's BorTag GC to a Provenance GC
As suggested in https://github.com/rust-lang/miri/issues/3080#issuecomment-1732505573
We previously solved memory growth issues associated with the Stacked Borrows and Tree Borrows runtimes with a GC. But of course we also have state accumulation associated with whole allocations elsewhere in the interpreter, and this PR starts tackling those.
To do this, we expand the visitor for the GC so that it can visit a BorTag or an AllocId. Instead of collecting all live AllocIds into a single HashSet, we just collect from the Machine itself then go through an accessor `InterpCx::is_alloc_live` which checks a number of allocation data structures in the core interpreter. This avoids the overhead of all the inserts that collecting their keys would require.
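The shape of that design can be sketched in a few lines. This is a toy model, not Miri's code: every type, field, and function here (`Interp`, `Machine`, `per_alloc_state`, `run_gc`) is hypothetical, and the ids are plain integers.

```rust
use std::collections::{HashMap, HashSet};

type AllocId = u64;
type BorTag = u64;

/// Callback that can receive either kind of provenance.
type VisitWith<'a> = dyn FnMut(Option<AllocId>, Option<BorTag>) + 'a;

trait VisitProvenance {
    fn visit_provenance(&self, visit: &mut VisitWith<'_>);
}

/// Toy pointer carrying both an allocation id and a borrow tag.
struct Pointer {
    alloc_id: AllocId,
    tag: BorTag,
}

impl VisitProvenance for Pointer {
    fn visit_provenance(&self, visit: &mut VisitWith<'_>) {
        visit(Some(self.alloc_id), Some(self.tag));
    }
}

/// Stand-in for the core interpreter, which already tracks live allocations.
struct Interp {
    live_allocs: HashSet<AllocId>,
}

impl Interp {
    fn is_alloc_live(&self, id: AllocId) -> bool {
        self.live_allocs.contains(&id)
    }
}

/// Stand-in for machine-specific state that accumulates per allocation.
struct Machine {
    pointers: Vec<Pointer>,
    per_alloc_state: HashMap<AllocId, &'static str>,
}

fn run_gc(interp: &Interp, machine: &mut Machine) {
    // Collect only the ids reachable from the machine's own state...
    let mut reachable = HashSet::new();
    for p in &machine.pointers {
        p.visit_provenance(&mut |id: Option<AllocId>, _tag: Option<BorTag>| {
            if let Some(id) = id {
                reachable.insert(id);
            }
        });
    }
    // ...and ask the interpreter about everything else instead of copying its keys.
    machine
        .per_alloc_state
        .retain(|id, _| reachable.contains(id) || interp.is_alloc_live(*id));
}
```

The point of the accessor is visible in `run_gc`: the interpreter's allocation tables are queried in place rather than being inserted into the `reachable` set first.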
r? `@RalfJung`
rustdoc-search: add support for traits and associated types
# Summary
Trait associated type queries work in rustdoc's type driven search. The data is included in the search-index.js file, and the queries are designed to "do what I mean" when users type them in, so, for example, `Iterator<Item=T> -> Option<T>` includes `Iterator::next` in the SERP[^SERP], and `Iterator<T> -> Option<T>` also includes `Iterator::next` in the SERP.
[^SERP]: search engine results page
## Sample searches
* [`iterator<Item=T>, fnmut -> T`][iterreduce]
* [`iterator<T>, fnmut -> T`][iterreduceterse]
[iterreduce]: http://notriddle.com/rustdoc-html-demo-5/associated-types/std/index.html?search=iterator%3CItem%3DT%3E%2C%20fnmut%20-%3E%20T&filter-crate=std
[iterreduceterse]: http://notriddle.com/rustdoc-html-demo-5/associated-types/std/index.html?search=iterator%3CT%3E%2C%20fnmut%20-%3E%20T&filter-crate=std
# Motivation
My primary motivation for working on search.js at all is to make it easier to use highly generic APIs, like the Iterator API. The type signature describes these functions pretty well, while the names are almost arbitrary.
Before this PR, type bindings were not consistently included in search-index.js at all (you couldn't find Iterator::next by typing in its function signature) and you couldn't explicitly search for them. This PR fixes both of these problems.
# Guide-level explanation
*Excerpt from [the Rustdoc book](http://notriddle.com/rustdoc-html-demo-5/associated-types/rustdoc/read-documentation/search.html), included in this PR.*
> Function signature searches can query generics, wrapped in angle brackets, and traits will be normalized like types in the search engine if no type parameters match them. For example, a function with the signature `fn my_function<I: Iterator<Item=u32>>(input: I) -> usize` can be matched with the following queries:
>
> * `Iterator<Item=u32> -> usize`
> * `Iterator<u32> -> usize` (you can leave out the `Item=` part)
> * `Iterator -> usize` (you can leave out iterator's generic entirely)
> * `T -> usize` (you can match with a generic parameter)
>
> Each of the above queries is progressively looser, except the last one would not match `dyn Iterator`, since that's not a type parameter.
# Reference-level explanation
Inside the angle brackets, you can choose whether to write a name before the parameter and the equal sign. This syntax is called [`GenericArgsBinding`](https://doc.rust-lang.org/reference/paths.html#paths-in-expressions) in the Rust Reference, and it allows you to constrain a trait's associated type.
As a convenience, you don't actually have to put the name in (Rust requires it, but Rustdoc Search doesn't). This works about the same way unboxing already works in Search: the terse `Iterator<u32>` is a match for `Iterator<Item=u32>`, but the opposite is not true, just like `u32` is a match for `Iterator<u32>`.
When converting a trait method for the search index, the trait is substituted for `Self`, and all associated types are bound to generics. This way, if you have the following trait definition:
```rust
pub trait MyTrait {
    type Output;
    fn method(self) -> Self::Output;
}
```
The following queries will match its method:
* `MyTrait<Output=T> -> T`
* `MyTrait<T> -> T`
* `MyTrait -> T`
But these queries will not match it:
* <i>`MyTrait<Output=u32> -> u32`</i>
* <i>`MyTrait<Output> -> Output`</i>
* <i>`MyTrait -> MyTrait::Output`</i>
# Drawbacks
It's a little bit bigger:
```console
$ du before/search-index1.74.0.js after/search-index1.74.0.js
4020 before/search-index1.74.0.js
4068 after/search-index1.74.0.js
```
# Rationale and alternatives
I don't want to just not do this. On its own, it's not terribly useful, but in addition to searching by normal traits, this is also intended as a desugaring target for closures. That's why it needs to actually distinguish the two: it allows the future desugaring to distinguish function output and input.
The other alternative would be to not allow users to leave out the name, so `iterator<u32>` doesn't work. That would be unfortunate, because mixing up which ones have out params and which ones are plain generics is an easy enough mistake that the Rust compiler itself helps people out with it.
# Prior art
* <http://neilmitchell.blogspot.com/2020/06/hoogle-searching-overview.html>
The current Rustdoc algorithm, both before this PR and after it, has a fairly expensive matching algorithm over a fairly simple file format. Luckily, we aren't trying to scale to all of crates.io, so it's usable, but it's not great when I throw it at docs.servo.org.
# Unresolved questions
Okay, but *how do we want to handle closures?* I know the system will desugar `FnOnce(T) -> U` into `trait:FnOnce<Output=U, primitive:tuple<T>>`, but what if I don't know what trait I'm looking for? This PR can merge with nothing, but it'd be nice to have a plan.
Specifically, what should the special form that handles all varieties of basic callable look like? `primitive:fn` (function pointers), `trait:Fn`, `trait:FnOnce`, and `trait:FnMut` should all be searchable using a single syntax, because I'm always forgetting which one is used in the function I'm looking for.
The essential question is how closely we want to copy Rust's own syntax. The tersest way to express Option::map might be:
    Option<T>, (T -> U) -> Option<U>
That's the approach I would prefer, but nobody's going to attempt it without being told, so maybe this would be better?
    Option<T>, (fn(T) -> U) -> Option<U>
It does require double parens, but at least it's mostly unambiguous. Unfortunately, it looks like the syntax you'd use for function pointers, implying that if you specifically wanted to limit your search to function pointers, you'd need to use `primitive:fn(T) -> U`. Then again, searching is normally case-insensitive, so you'd want that anyway to disambiguate from `trait:Fn(T) -> U`.
# Future possibilities
## This thing really needs a ranking algorithm
That is, this PR increases the number of matches for some type-based queries. They're usually pretty good matches, but there's still more of them, and it's evident that if you have two functions, `foo(MyTrait<u8>)` and `bar(MyTrait<Item=u8>)`, if the user typed `MyTrait<u8>` then `foo` should show up first.
A design choice that these PRs have followed is that adding more stuff to the search query always reduces the number of functions that get matched. The advantage of doing it that way is that you can rank them by just counting how many atoms are in the function's signature (lowest score goes on top). Since it's impossible for a matching function to have fewer atoms than the search query, if there's a function with exactly the same set of atoms in it, then that'll be on top.
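A minimal sketch of that ranking rule, assuming each already-matched function is represented by a hypothetical `FnSig` that lists its type atoms (rustdoc's real data structures differ):

```rust
/// Hypothetical, simplified view of a function in the search index.
struct FnSig {
    name: &'static str,
    atoms: Vec<&'static str>,
}

/// Rank functions that already matched the query: fewest atoms first.
/// Because a match can never have fewer atoms than the query, an exact
/// match necessarily sorts to the top.
fn rank_by_atom_count(mut matches: Vec<FnSig>) -> Vec<FnSig> {
    // `sort_by_key` is stable, so ties keep their original order.
    matches.sort_by_key(|f| f.atoms.len());
    matches
}
```

With this rule, a "God Function" with dozens of argument types lands at the bottom without any extra penalty term.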
More complicated ranking algos tend to penalize long documents anyway, if the [distance metrics](https://www.benfrederickson.com/distance-metrics/?utm_source=flipboard&utm_content=other) I found through [Flipboard](https://flipboard.com/@arnie0426/building-recommender-systems-nvue3iqtgrn10t45) (and postgresql's `ts_rank_cd`) are anything to go by. Real-world data sets tend to have weird outliers, like they have God Functions with zillions of arguments of all sorts of types, and Rustdoc ought to put such a function at the bottom.
The other natural choice would be interleaving with `unifyFunctionTypes` to count the number of unboxings and reorderings. This would compute a distance function, and would do a fine job of ranking the results, as [described here](https://ndmitchell.com/downloads/slides-hoogle_finding_functions_from_types-16_may_2011.pdf) by the Hoogle dev, but is more complicated than it sounds. The current algorithm returns when it finds a result that *exists at all*, but a distance function should find an *optimal solution* to find the smallest sequence of edits.
## This could also use a benchmark suite and some optimization
This approach also lends itself to layering a bloom filter in front of the backtracking unification engine.
* At load time, hash the typeNameIdMap ID for each atom and set the matching entry in a fixed-size byte array for each function to 1. Call it `fnType.bloomFilter`
* At search time, do the same for the atoms in the query (excluding special forms like `[]` that can match more than one thing). Call it `parsedQuery.bloomFilter`
* For each function, `if ((fnType.bloomFilter | ~parsedQuery.bloomFilter) !== ~0) { return false; }`
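The pre-filter described in the steps above can be sketched as follows. This is an illustration in Rust rather than search.js, and it makes several assumptions: the filter is a single 64-bit word instead of a byte array, atoms are plain `u32` ids, and the hash is an arbitrary multiplicative hash chosen for the example.

```rust
/// Map an atom id to one bit of a 64-bit filter (hypothetical hash choice).
fn atom_bit(atom_id: u32) -> u64 {
    // The top 6 bits of the multiplicative hash select one of 64 positions.
    1u64 << (atom_id.wrapping_mul(2_654_435_761) >> 26)
}

/// Build the filter for a set of atoms (a function signature or a query).
fn filter_for(atoms: &[u32]) -> u64 {
    atoms.iter().fold(0, |acc, &a| acc | atom_bit(a))
}

/// Returns `false` only when the function definitely lacks an atom the
/// query requires; `true` means "run the full unification".
fn may_match(fn_filter: u64, query_filter: u64) -> bool {
    (fn_filter | !query_filter) == !0
}
```

The check is the same expression as in the last bullet: every bit set in the query must also be set in the function's filter. False positives are possible when two atoms hash to the same bit, false negatives are not.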
There's also room to optimize the unification engine itself, by using stacks and persistent data structures instead of copying arrays around, or by using hashing instead of linear scans (the current algorithm was rewritten from one that tried to do that, but was too much to fit in my head and had a bunch of bugs). The advantage of Just Backtracking Better over the bloom filter is that it doesn't require the engine to retain any special algebraic properties.
But, first, we need a set of benchmarks to be able to judge if such a thing will actually help.
## Referring to associated types by path
*I don't want to implement this one, but if I did, this is how I'd do it.*
In Rust, this is represented by a structure called a qualified path, or QPath. They look like this:
    <Self as Iterator>::Item
    <F as FnOnce>::Output
They can also, if it's unambiguous, use a plain path and just let the system figure it out:
    Self::Item
    F::Output
In Rustdoc Type-Driven Search, we don't want to force people to be unambiguous. Instead, we should try *all reasonable interpretations*, return results whenever any of them match, and let users make their query more specific if too many results are matches.
To enable associated type path searches in Rustdoc, we need to:
1. When lowering a trait method to a search-index.js function signature, Self should be explicitly represented as a generic argument. It should always be assigned `-1`, so that if the user uses `Self` in their search query, we can ensure it always matches the real Self and not something else. Any functions that don't *have* a Self should drop a `0` into the first position of the where clause, to express that there isn't one and reserve the `-1` position.
* Reminder: generics are negative, concrete types are positive, and zero is a reserved sentinel.
* Right now, `Iterator::next` is lowered as if it were `fn next<T>(self: Iterator<Item=T>) -> Option<T>`.
It should become `fn next<Self, T>(self: Self) -> Option<T> where Self: Iterator<Item=T>` instead.
3. Add another backtracking edge to the unification engine, so that when the user writes something like `some::thing`, the interpretation where `some` is a module and `thing` is a standalone item becomes one possible match candidate, while the interpretation where `some` is a trait and `thing` is an associated type is a separate match candidate. The backtracking engine is basically powerful enough to do this already, since unboxing generic type parameters into their traits already requires the ability to do this kind of thing.
* When interpreting `some::thing` where `some` is a trait and `thing` is an associated type, it should be treated equivalently to `<self as some>::thing`. If you want to bind it to some generic parameter other than `Self`, you need to explicitly say so.
* If no trait called `some` actually exists, treat it as a generic type parameter instead. Track every trait mentioned in the current working function signature, and add a match candidate for each one.
* A user that explicitly wants the trait-associated-type interpretation could write a qpath (like `<self as trait>::type`), and a user that explicitly wants the module-item interpretation should use an item type filter (like `struct:module::type`).
4. To actually do the matching, maintain a `Map<(QueryGenericParamId, TraitId), FnGenericParamId>` alongside the existing `Map<QueryGenericParamId, FnGenericParamId>` that is already used to handle plain generic parameters. This works, because, when a trait function signature is lowered to search-index.js, the `rustdoc` backend always generates an FnGenericParamId for every trait associated type it sees mentioned in the function's signature.
5. Parse QPaths. Specifically,
* QueryElem adds three new fields. `isQPath` is a boolean flag, and `traitNameId` contains an entry for `typeNameIdMap` corresponding to the trait part of a qpath, and `parentId` may contain either a concrete type ID or a negative number referring to a generic type parameter. The actual `id` of the query elem will always be a negative number, because this is essentially a funny way to add a generic type constraint.
* If it's a QPath, then both of those IDs get filled in with the respective parts of the map. The unification engine will check the where clause to ensure the trait actually applies to the generic parameter in question, will check the type parameter constraint, and will add a mapping to `mgens` recording this as a solution.
* If it's just a regular path, then `isQPath` is false, and the parser will fill in both `traitNameId` and `parentId` based on the same path. The unification engine, seeing isQPath is false and that these IDs were filled in, will try all three solutions: the path might be part of a concrete type name, or it might be referring to a trait, or it might be referring to a generic type parameter.
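To make the bookkeeping from the matching step more concrete, here is a small sketch of the two maps side by side. It is written in Rust for illustration only (the real engine lives in search.js), and the `Bindings` struct, its methods, and the integer id types are all hypothetical.

```rust
use std::collections::HashMap;

type QueryGenericParamId = i32; // negative, per the convention above
type FnGenericParamId = i32;
type TraitId = u32;

#[derive(Default)]
struct Bindings {
    /// Existing map for plain generic parameters.
    plain: HashMap<QueryGenericParamId, FnGenericParamId>,
    /// Proposed map for associated types reached through a trait.
    assoc: HashMap<(QueryGenericParamId, TraitId), FnGenericParamId>,
}

impl Bindings {
    /// Bind a plain query generic; fails if it is already bound elsewhere.
    fn bind_plain(&mut self, q: QueryGenericParamId, f: FnGenericParamId) -> bool {
        *self.plain.entry(q).or_insert(f) == f
    }

    /// Bind `<q as trait_id>::Assoc`; fails if it is already bound elsewhere.
    fn bind_assoc(&mut self, q: QueryGenericParamId, trait_id: TraitId, f: FnGenericParamId) -> bool {
        *self.assoc.entry((q, trait_id)).or_insert(f) == f
    }
}
```

A failed bind corresponds to a point where the backtracking engine would abandon the current candidate and try the next interpretation.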
### Why not implement QPath searches?
I'm not sure if anybody really wants to write such complicated queries. You can do a pretty good job of describing the generic functions in the standard library without resorting to FQPs.
These two queries, for example, would both match the Iterator::map function if we added support for higher order function queries and a rule that allows a type to match its *notable traits*.
    // I like this version, because it's identical to how `Option::map` would be written.
    // There's a reason why Iterator::map and Option::map have the same name.
    Iterator<T>, (T -> U) -> Iterator<U>
    // This version explicitly uses the type parameter constraints.
    Iterator<Item=T>, (T -> U) -> Iterator<Item=U>
If I try to write this one using FQP, however, the results seem worse:
    // This one is less expressive than the versions that don't use associated type paths.
    // It matches `Iterator::filter`, while the above two example queries don't.
    Iterator, (Iterator::Item -> Iterator::Item) -> Iterator
    // This doesn't work, because the return type of `Iterator::map` is not a generic
    // parameter with an `Iterator` trait bound. It's a concrete type that
    // implements `Iterator`. Return-Position-Impl-Trait is the same way.
    //
    // There's a difference between something like `map`, whose return value
    // implements Iterator, and something like `collect`, where the caller
    // gets to decide what the concrete type is going to be.
    //Self, (Self::Item -> I::Item) -> I where Self: Iterator, I: Iterator
    // This works, but it seems subjectively ugly, complex, and counterintuitive to me.
    Self, (<Self as Iterator>::Item -> T) -> Iterator<Item=T>
Implement all 16 AVX compare operators for 128-bit SIMD vectors
`_mm_cmp_{ss,ps,sd,pd}` functions are AVX functions that use `llvm.x86.sse{,2}.` prefixed intrinsics, so they were "accidentally" partially implemented when SSE and SSE2 intrinsics were implemented.
The 16 AVX compare operators are now implemented and tested.
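For reference, the 16 predicates can be modelled on a single `f32` lane as below. This is a scalar sketch and not the code added to Miri: the function name is hypothetical, the numbering is assumed to follow Intel's `_CMP_*` constants 0 through 15, and the signaling versus quiet distinction is ignored.

```rust
use std::cmp::Ordering;

/// Scalar model of one lane of `_mm_cmp_ps` for predicates 0..=15.
/// Returns `None` for immediates outside that range.
fn cmp_predicate(imm: u8, a: f32, b: f32) -> Option<bool> {
    let ord = a.partial_cmp(&b); // `None` means unordered (a NaN is involved)
    let lt = ord == Some(Ordering::Less);
    let eq = ord == Some(Ordering::Equal);
    let gt = ord == Some(Ordering::Greater);
    let unord = ord.is_none();
    Some(match imm {
        0 => eq,                // EQ_OQ
        1 => lt,                // LT_OS
        2 => lt || eq,          // LE_OS
        3 => unord,             // UNORD_Q
        4 => !eq,               // NEQ_UQ
        5 => !lt,               // NLT_US
        6 => !(lt || eq),       // NLE_US
        7 => !unord,            // ORD_Q
        8 => eq || unord,       // EQ_UQ
        9 => lt || unord,       // NGE_US
        10 => lt || eq || unord, // NGT_US
        11 => false,            // FALSE_OQ
        12 => lt || gt,         // NEQ_OQ
        13 => gt || eq,         // GE_OS
        14 => gt,               // GT_OS
        15 => true,             // TRUE_UQ
        _ => return None,
    })
}
```

The first eight entries are the predicates already reachable through the SSE and SSE2 compare intrinsics; the remaining eight are the ones that were previously missing.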