The descriptions are, on almost all crates[^1], the majority
of the size of the search index, even though they aren't really
used for searching. This makes it relatively easy to separate
them into their own files.
This commit also bumps us to ES8. Out of the browsers we support,
all of them support async functions according to caniuse.
https://caniuse.com/async-functions
[^1]:
<https://microsoft.github.io/windows-docs-rs/>, a crate with
44MiB of pure names and no descriptions for them, is an outlier
and should not be counted.
Visually mark 👻hidden👻 items with document-hidden-items
Fixes#122485
This adds a 👻 in the item list (much like the 🔒 used for private items), and also shows `#[doc(hidden)]` in the code view, where `pub(crate)` etc gets shown for private items.
This does not do anything for enum variants, if people have ideas. I think we can just show the attribute.
rustdoc-search: depth limit `T<U>` -> `U` unboxing
Profiler output:
https://notriddle.com/rustdoc-html-demo-9/search-unbox-limit/ (the only significant change is that one of the `rust` tests went from 378416ms to 16ms).
This is a performance enhancement aimed at a problem I found while using type-driven search on the Rust compiler. It is caused by [`Interner`], a trait with 41 associated types, many of which recurse back to `Self` again.
This caused search.js to struggle. It eventually terminates, after about 10 minutes of turning my PC into a space header, but it's doing `41!` unifications and that's too slow.
[`Interner`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/trait.Interner.html
rustdoc-search: search types by higher-order functions
This feature extends rustdoc with syntax and search index information for searching function pointers and closures (Higher-Order Functions, or HOF). Part of https://github.com/rust-lang/rust/issues/60485
This PR adds two syntaxes: a high-level one for finding any kind of HOF, and a direct implementation of the parenthesized path syntax that Rust itself uses.
## Preview pages
| Query | Results |
|-------|---------|
| [`option<T>, (fnonce (T) -> bool) -> option<T>`][optionfilter] | `Option::filter` |
| [`option<T>, (T -> bool) -> option<T>`][optionfilter2] | `Option::filter` |
Updated chapter of the book: https://notriddle.com/rustdoc-html-demo-9/search-hof/rustdoc/read-documentation/search.html
[optionfilter]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=option<T>%2C+(fnonce+(T)+->+bool)+->+option<T>&filter-crate=std
[optionfilter2]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=option<T>%2C+(T+->+bool)+->+option<T>&filter-crate=std
## Motivation
When type-based search was first landed, it was directly [described as incomplete][a comment].
[a comment]: https://github.com/rust-lang/rust/pull/23289#issuecomment-79437386
Filling out the missing functionality is going to mean adding support for more of Rust's [type expression] syntax, such as references, raw pointers, function pointers, and closures. This PR adds function pointers and closures.
[type expression]: https://doc.rust-lang.org/reference/types.html#type-expressions
There's been demand for something "like Hoogle, but for Rust" expressed a few times [1](https://www.reddit.com/r/rust/comments/y8sbid/is_there_a_website_like_haskells_hoogle_for_rust/) [2](https://users.rust-lang.org/t/rust-equivalent-of-haskells-hoogle/102280) [3](https://internals.rust-lang.org/t/std-library-inclusion-policy/6852/2) [4](https://discord.com/channels/442252698964721669/448238009733742612/1109502307495858216). Some of them just don't realize what functionality already exists ([`Duration -> u64`](https://doc.rust-lang.org/nightly/std/?search=duration%20-%3E%20u64) already works), but a lot of them specifically want to search for higher-order functions like option combinators.
## Guide-level explanation (from the Rustdoc book)
To search for a function that accepts a function as a parameter, like `Iterator::all`, wrap the nested signature in parenthesis, as in [`Iterator<T>, (T -> bool) -> bool`][iterator-all]. You can also search for a specific closure trait, such as `Iterator<T>, (FnMut(T) -> bool) -> bool`, but you need to know which one you want.
[iterator-all]: https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=Iterator<T>%2C+(T+->+bool)+->+bool&filter-crate=std
## Reference-level description (also from the Rustdoc book)
### Primitives with Special Syntax
<table>
<thead>
<tr>
<th>Shorthand</th>
<th>Explicit names</th>
</tr>
</thead>
<tbody>
<tr><td colspan="2">Before this PR</td></tr>
<tr>
<td><code>[]</code></td>
<td><code>primitive:slice</code> and/or <code>primitive:array</code></td>
</tr>
<tr>
<td><code>[T]</code></td>
<td><code>primitive:slice<T></code> and/or <code>primitive:array<T></code></td>
</tr>
<tr>
<td><code>!</code></td>
<td><code>primitive:never</code></td>
</tr>
<tr>
<td><code>()</code></td>
<td><code>primitive:unit</code> and/or <code>primitive:tuple</code></td>
</tr>
<tr>
<td><code>(T)</code></td>
<td><code>T</code></td>
</tr>
<tr>
<td><code>(T,)</code></td>
<td><code>primitive:tuple<T></code></td>
</tr>
<tr><td colspan="2">After this PR</td></tr>
<tr>
<td><code>(T, U -> V, W)</code></td>
<td><code>fn(T, U) -> (V, W)</code>, Fn, FnMut, and FnOnce</td>
</tr>
</tbody>
</table>
The `->` operator has lower precedence than comma. If it's not wrapped in brackets, it delimits the return value for the function being searched for. To search for functions that take functions as parameters, use parenthesis.
### Search query grammar
```ebnf
ident = *(ALPHA / DIGIT / "_")
path = ident *(DOUBLE-COLON ident) [BANG]
slice-like = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
tuple-like = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN
arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like)
type-sep = COMMA/WS *(COMMA/WS)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep) [ return-args ]
generic-arg-list = *(type-sep) arg [ EQUAL arg ] *(type-sep arg [ EQUAL arg ]) *(type-sep)
normal-generics = OPEN-ANGLE-BRACKET [ generic-arg-list ] *(type-sep)
CLOSE-ANGLE-BRACKET
fn-like-generics = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN [ RETURN-ARROW arg ]
generics = normal-generics / fn-like-generics
return-args = RETURN-ARROW *(type-sep) nonempty-arg-list
exact-search = [type-filter *WS COLON] [ RETURN-ARROW ] *WS QUOTE ident QUOTE [ generics ]
type-search = [ nonempty-arg-list ]
query = *WS (exact-search / type-search) *WS
; unchanged parts of the grammar, like the full list of type filters, are omitted
```
## Future direction
### The remaining type expression grammar
As described in https://github.com/rust-lang/rust/pull/118194, this is another step in the type expression grammar: BareFunction, and the function-like mode of TypePath, are now supported.
* RawPointerType and ReferenceType actually are a priority.
* ImplTraitType and TraitObjectType (and ImplTraitTypeOneBound and TraitObjectTypeOneBound) aren't as much of a priority, since they desugar pretty easily.
### Search subtyping and traits
This is the other major factor that makes it less useful than it should be.
* `iterator<result<t>> -> result<t>` doesn't find `Result::from_iter`. You have to search [`intoiterator<result<t>> -> result<t>`](https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=intoiterator%3Cresult%3Ct%3E%3E%20-%3E%20result%3Ct%3E&filter-crate=std). Nobody's going to search for IntoIterator unless they basically already know about it and don't need the search engine anyway.
* Iterator combinators are usually structs that happen to implement Iterator, like `std::iter::Map`.
To solve these cases, it needs to look at trait implementations, knowing that Iterator is a "subtype of" IntoIterator, and Map is a "subtype of" Iterator, so `iterator -> result` is a subtype of `intoiterator -> result` and `iterator<t>, (t -> u) -> iterator<u>` is a subtype of [`iterator<t>, (t -> u) -> map<t -> u>`](https://notriddle.com/rustdoc-html-demo-9/search-hof/std/vec/struct.Vec.html?search=iterator%3Ct%3E%2C%20(t%20-%3E%20u)%20-%3E%20map%3Ct%20-%3E%20u%3E&filter-crate=std).
This is implemented, in addition to the ML-style one,
because Rust does it. If we don't, we'll never hear the end of it.
This commit also refactors some duplicate parts of the parser
into a dedicated function.
Option::map, for example, looks like this:
option<t>, (t -> u) -> option<u>
This syntax searches all of the HOFs in Rust: traits Fn, FnOnce,
and FnMut, and bare fn primitives.
[rustdoc] Prevent inclusion of whitespace character after macro_rules ident
Discovered this bug randomly when looking at:

We were too eagerly trying to merge tokens that shouldn't be merged together (for example if you have a code comment followed by a code comment, we merge them in one attribute to reduce the DOM size).
r? ``@notriddle``
Diagnostic renaming
Renaming various diagnostic types from `Diagnostic*` to `Diag*`. Part of https://github.com/rust-lang/compiler-team/issues/722. There are more to do but this is enough for one PR.
r? `@davidtwco`
Correctly handle if rustdoc JS script hash changed
It's something that annoyed me for quite some time: I have nightly docs open (for both std and compiler). And often, I don't look at the page for some days. Then when I come back to it, I make a search... except nothing happens. Took me a while to figure out that it was because the hash of one of the JS files we load for the search (either `search.js` or `search-index.js`) was updated in the meantime, preventing the search to be done. To go around it, I added to press `ENTER` to make the form submitted (which would reload the same page but with the correct hashes this time and the search being run).
r? `@notriddle`
It makes the link easier to use in cases in which
the path of the page where it will be embedded is not
known beforehand such as when we generate impls
dynamically from `register_type_impls` method in
`main.js`
Earlier for local primitives we would generate a path
that was relative to the current page depth passed in `cx.current`
. e.g if the current page was `std::simd::prelude::Simd` the
generated path would be `../../primitive.<prim>.html` After this
change the path will first take you to the the wesite root and add
the crate name. e.g. for `std::simd::prelude::Simd` the path now
will be `../../../std/primitive.<prim>.html`
rustdoc: fix and refactor HTML rendering a bit
* refactoring: get rid of a bunch of manual `f.alternate()` branches
* not sure why this wasn't done so already, is this perf-sensitive?
* fix an ICE in debug builds of rustdoc
* rustdoc used to crash on empty outlives-bounds: `where 'a:`
* properly escape const generic defaults
* actually print empty trait and outlives-bounds (doesn't work for cross-crate reexports yet, will fix that at some other point) since they can have semantic significance
* outlives-bounds: forces lifetime params to be early-bound instead of late-bound which is technically speaking part of the public API
* trait-bounds: can affect the well-formedness, consider
* makeshift “const-evaluatable” bounds under `generic_const_exprs`
* bounds to force wf-checking in light of #100041 (quite artificial I know, I couldn't figure out something better), see https://github.com/rust-lang/rust/pull/121160#discussion_r1491563816
rustdoc: trait.impl, type.impl: sort impls to make it not depend on serialization order
Can be tested by running `cargo doc` with different rust versions on some crate and comparing `doc` folders: files in `trait.impl` and `type.impl` will sometimes have different order of impls.
rustdoc: Prevent JS injection from localStorage
It turns out that you can execute arbitrary JavaScript on the rustdocs settings page. Here's how:
1. Open `settings.html` on a rustdocs site.
2. Set "preferred light theme" to "dark" to initialize the corresponding localStorage value.
3. Plant a payload by executing this in your browser's dev console: ``Object.keys(localStorage).forEach(key=>localStorage.setItem(key,`javascript:alert()//*/javascript:javascript:"/*'/*\`/*--></noscript></title></textarea></style></template></noembed></script><html " onmouseover=/*<svg/*/onload=alert()onload=alert()//><svg onload=alert()><svg onload=alert()>*/</style><script>alert()</script><style>`));``
4. Refresh the page -- you should see an alert.
This could be particularly dangerous if rustdocs are deployed on a domain hosting some other application. Malicious code could circumvent `same-origin` policies and do mischievous things with user data.
This change ensures that only defined themes can actually be selected (arbitrary strings from localStorage will not be written to the document), and for good measure sanitizes the theme name.