rustdoc-search: tighter encoding for f index
Depends on https://github.com/rust-lang/rust/pull/119457
Two optimizations for the function signature search:
* Instead of using JSON arrays, like `[1,20]`, it uses VLQ
hex with no commas, like `[aAd]`.
* This also adds backrefs: if you have more than one function
with exactly the same signature, it'll not only store it once,
it'll *decode* it once, and store in the typeIdMap only once.
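To make the first bullet concrete, here is a minimal sketch of a self-terminating ("VLQ") hex encoding. The alphabet is an assumption chosen for illustration (`A`..`P` for non-final hex digits, `a`..`p` for the final one), and the function names are hypothetical; the encoding rustdoc actually ships may use a different alphabet and also has to handle signs and backrefs, which this sketch ignores.

```rust
/// Append `n` to `out` as self-terminating hex.
/// Hypothetical alphabet: every hex digit except the last is written as
/// 'A'..='P'; the last digit is written as 'a'..='p', which marks the end
/// of the number so no comma is needed.
fn encode_vlq_hex(mut n: u32, out: &mut String) {
    // Collect hex digits, least significant first.
    let mut digits = Vec::new();
    loop {
        digits.push((n & 0xF) as u8);
        n >>= 4;
        if n == 0 {
            break;
        }
    }
    // Emit most significant digit first; only the final digit is lowercase.
    let last = digits.len() - 1;
    for (i, d) in digits.iter().rev().enumerate() {
        let base = if i == last { b'a' } else { b'A' };
        out.push((base + d) as char);
    }
}

/// Decode a run of numbers written by `encode_vlq_hex`.
/// Returns `None` on an unexpected byte or a number with no final digit.
fn decode_vlq_hex(input: &str) -> Option<Vec<u32>> {
    let mut values = Vec::new();
    let mut acc: u32 = 0;
    let mut pending = false;
    for b in input.bytes() {
        match b {
            b'A'..=b'P' => {
                acc = (acc << 4) | u32::from(b - b'A');
                pending = true;
            }
            b'a'..=b'p' => {
                values.push((acc << 4) | u32::from(b - b'a'));
                acc = 0;
                pending = false;
            }
            _ => return None,
        }
    }
    if pending { None } else { Some(values) }
}
```

With this alphabet, the JSON array `[1,20]` (six bytes) becomes the three bytes `bBe`: `b` is 1, and `Be` is 0x14 = 20.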
Based partially on discussions on zulip:
https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/search.20index.20size
Performance
-----------
https://notriddle.com/rustdoc-html-demo-8/compression-perf-v2/index.html
### memory/time profiler output (for more details, consult the above link)
<table>
<thead><tr><th>benchmark<th>before<th>after</tr></thead>
<tbody>
<tr><th>arti<td>
```
user: 002.789 s
sys: 000.390 s
wall: 002.096 s
child_RSS_high: 440796 KiB
group_mem_high: 414924 KiB
```
</td><td>
```
user: 002.295 s
sys: 000.278 s
wall: 001.738 s
child_RSS_high: 314588 KiB
group_mem_high: 285220 KiB
```
</td></tr><tr><th>cortex-m<td>
```
user: 000.127 s
sys: 000.030 s
wall: 000.134 s
child_RSS_high: 60264 KiB
group_mem_high: 23824 KiB
```
</td><td>
```
user: 000.136 s
sys: 000.038 s
wall: 000.137 s
child_RSS_high: 59204 KiB
group_mem_high: 22712 KiB
```
</td></tr><tr><th>sqlx<td>
```
user: 000.887 s
sys: 000.118 s
wall: 000.592 s
child_RSS_high: 190408 KiB
group_mem_high: 157804 KiB
```
</td><td>
```
user: 000.798 s
sys: 000.101 s
wall: 000.525 s
child_RSS_high: 159292 KiB
group_mem_high: 126292 KiB
```
</td></tr><tr><th>stm32f4<td>
```
user: 013.884 s
sys: 005.399 s
wall: 013.149 s
child_RSS_high: 1942244 KiB
group_mem_high: 1954916 KiB
```
</td><td>
```
user: 006.128 s
sys: 003.297 s
wall: 007.994 s
child_RSS_high: 1038108 KiB
group_mem_high: 1023900 KiB
```
</td></tr><tr><th>ripgrep<td>
```
user: 000.441 s
sys: 000.063 s
wall: 000.264 s
child_RSS_high: 109180 KiB
group_mem_high: 74272 KiB
```
</td><td>
```
user: 000.408 s
sys: 000.044 s
wall: 000.238 s
child_RSS_high: 101488 KiB
group_mem_high: 66000 KiB
```
</td></tr></tbody></table>
Size change
-----------
standard library without gzip:
```console
$ du -bs search-index-old.js search-index-new.js
4976370 search-index-old.js
4404391 search-index-new.js
```
((4976370-4404391)/4404391)*100% = 13.0%
with gzip:
```console
$ du -hs search-index-old.js.gz search-index-new.js.gz
520K search-index-old.js.gz
504K search-index-new.js.gz
$ du -bs search-index-old.js.gz search-index-new.js.gz
522092 search-index-old.js.gz
507654 search-index-new.js.gz
```
((522092-507654)/507654)*100% = 2.8%
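The two ratios above divide by the new size, so they say how much larger the old index is than the new one. A small sketch that computes both that figure and the saving relative to the old size, using the byte counts from the `du -bs` output (the function names are made up for this example):

```rust
/// How much larger `old` is than `new`, as a percentage of `new`.
/// This is the ratio used in the text.
fn percent_larger(old: u64, new: u64) -> f64 {
    (old - new) as f64 / new as f64 * 100.0
}

/// How much was saved, as a percentage of `old`.
fn percent_saved(old: u64, new: u64) -> f64 {
    (old - new) as f64 / old as f64 * 100.0
}

/// Byte counts for the standard library search index, without gzip.
const STD_OLD: u64 = 4_976_370;
const STD_NEW: u64 = 4_404_391;

/// Byte counts for the standard library search index, with gzip.
const STD_GZ_OLD: u64 = 522_092;
const STD_GZ_NEW: u64 = 507_654;
```

Measured against the old size, the uncompressed standard library index shrinks by about 11.5%.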
Benchmarks are similarly shrunk.
Without gzip:
```console
$ du -bs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js
10555067 tmp/arti/toolchain_old/doc/search-index.js
8921236 tmp/arti/toolchain_new/doc/search-index.js
77018 tmp/cortex-m/toolchain_old/doc/search-index.js
66676 tmp/cortex-m/toolchain_new/doc/search-index.js
2876330 tmp/sqlx/toolchain_old/doc/search-index.js
2436812 tmp/sqlx/toolchain_new/doc/search-index.js
63632890 tmp/stm32f4/toolchain_old/doc/search-index.js
52337438 tmp/stm32f4/toolchain_new/doc/search-index.js
631150 tmp/ripgrep/toolchain_old/doc/search-index.js
541646 tmp/ripgrep/toolchain_new/doc/search-index.js
```
With gzip:
```console
$ du -bs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js.gz
1618852 tmp/arti/toolchain_old/doc/search-index.js.gz
1582007 tmp/arti/toolchain_new/doc/search-index.js.gz
16109 tmp/cortex-m/toolchain_old/doc/search-index.js.gz
15831 tmp/cortex-m/toolchain_new/doc/search-index.js.gz
422257 tmp/sqlx/toolchain_old/doc/search-index.js.gz
411507 tmp/sqlx/toolchain_new/doc/search-index.js.gz
4454761 tmp/stm32f4/toolchain_old/doc/search-index.js.gz
4334924 tmp/stm32f4/toolchain_new/doc/search-index.js.gz
98312 tmp/ripgrep/toolchain_old/doc/search-index.js.gz
96864 tmp/ripgrep/toolchain_new/doc/search-index.js.gz
$ du -hs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js.gz
1.6M tmp/arti/toolchain_old/doc/search-index.js.gz
1.6M tmp/arti/toolchain_new/doc/search-index.js.gz
24K tmp/cortex-m/toolchain_old/doc/search-index.js.gz
24K tmp/cortex-m/toolchain_new/doc/search-index.js.gz
424K tmp/sqlx/toolchain_old/doc/search-index.js.gz
412K tmp/sqlx/toolchain_new/doc/search-index.js.gz
4.3M tmp/stm32f4/toolchain_old/doc/search-index.js.gz
4.2M tmp/stm32f4/toolchain_new/doc/search-index.js.gz
108K tmp/ripgrep/toolchain_old/doc/search-index.js.gz
104K tmp/ripgrep/toolchain_new/doc/search-index.js.gz
```
[rustdoc] Fix invalid handling for static method calls in jump to definition feature
While working on a Clippy lint, I realized that static method calls on `Self` could not give me the method `Res`. Getting it requires `typeck`, so that's what this PR uses.
It fixes the linking to static method calls.
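For reference, this is the kind of call the description is about: a static (associated) function invoked through `Self` rather than through the concrete type name. The type and method names below are made up for illustration.

```rust
struct Counter {
    n: u32,
}

impl Counter {
    /// An associated function with no `self` receiver.
    fn zero() -> Self {
        Counter { n: 0 }
    }

    fn reset(&mut self) {
        // A static method call on `Self`. Which `zero` this refers to
        // depends on what `Self` is in the enclosing impl.
        *self = Self::zero();
    }
}
```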
r? `@notriddle`
Reorder check_item_type diagnostics so they occur next to the corresponding `check_well_formed` diagnostics
The first commit is just a cleanup.
The second commit moves most checks from `check_mod_item_types` into `check_well_formed`, invoking the checks in lockstep per-item instead of iterating over all items twice.
`Diagnostic` has 40 methods that return `&mut Self` and could be
considered setters. Four of them have a `set_` prefix. This doesn't seem
necessary for a type that implements the builder pattern. This commit
removes the `set_` prefixes on those four methods.
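A minimal sketch of the naming convention this describes, using a hypothetical `SketchDiag` type rather than the real `Diagnostic`: methods that return `&mut Self` already read as setters when chained, so a `set_` prefix adds nothing.

```rust
/// Hypothetical stand-in for a builder-style diagnostic type.
#[derive(Debug, Default)]
struct SketchDiag {
    message: String,
    code: Option<String>,
    notes: Vec<String>,
}

impl SketchDiag {
    fn new(message: &str) -> Self {
        SketchDiag { message: message.to_string(), ..Default::default() }
    }

    /// Named `code`, not `set_code`: returning `&mut Self` already
    /// signals that this mutates the builder and can be chained.
    fn code(&mut self, code: &str) -> &mut Self {
        self.code = Some(code.to_string());
        self
    }

    fn note(&mut self, note: &str) -> &mut Self {
        self.notes.push(note.to_string());
        self
    }
}
```

A call site then reads as `diag.code("E0000").note("extra context")`, with every method in the chain following the same naming scheme.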
rustdoc-search: count path edits with separate edit limit
Avoids strange-looking results like this one, where the path component seems to be ignored:

Since the two are counted separately elsewhere, they should get their own limits, too. The biggest problem with combining them is that paths are loosely checked by not requiring every component to match, which means that if they are short and matched loosely, they can easily find "drunk typist" matches that make no sense, like this old result:

    std::collections::btree_map::itermut matching slice::itermut
    maxEditDistance = ("slice::itermut".length) / 3 = 14 / 3 = 4
    editDistance("std", "slice") = 4
    editDistance("itermut", "itermut") = 0
    4 + 0 <= 4 PASS

Of course, `slice::itermut` should not match anything from `btree_map`. `slice` should not match `std`.
The new result counts them separately:

    maxPathEditDistance = "slice".length / 3 = 5 / 3 = 1
    maxEditDistance = "itermut".length / 3 = 7 / 3 = 2
    editDistance("std", "slice") = 4
    4 <= 1 FAIL

Effectively, this makes path queries less "typo-resistant". It's not zero, but it means `vec` won't match the `v1` prelude.
This commit also adds substring matching to paths. It's stricter than the substring matching in the main part, but loose enough that what I expect to match does.
Queries without parent paths are unchanged.
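The arithmetic above can be reproduced with a short sketch. It uses plain Levenshtein distance and compares a single path component against a single candidate component, which is a simplification: the real search is written in JavaScript and checks paths more loosely. The function names are hypothetical.

```rust
/// Plain Levenshtein distance (insert, delete, substitute).
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and first j of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let subst = prev[j] + usize::from(ca != cb);
            let del = prev[j + 1] + 1;
            let ins = cur[j] + 1;
            cur.push(subst.min(del).min(ins));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Old behaviour: one limit derived from the whole `path::name` query,
/// shared by the path and the name.
fn matches_combined(q_path: &str, q_name: &str, c_path: &str, c_name: &str) -> bool {
    let max = (q_path.len() + "::".len() + q_name.len()) / 3;
    edit_distance(c_path, q_path) + edit_distance(c_name, q_name) <= max
}

/// New behaviour: the path and the name each get their own limit.
fn matches_separate(q_path: &str, q_name: &str, c_path: &str, c_name: &str) -> bool {
    let max_path = q_path.len() / 3;
    let max_name = q_name.len() / 3;
    edit_distance(c_path, q_path) <= max_path && edit_distance(c_name, q_name) <= max_name
}
```

For the query `slice::itermut` against a candidate with path component `std` and name `itermut`, `matches_combined` accepts (4 + 0 <= 4) while `matches_separate` rejects (4 > 1).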
Introduce `const Trait` (always-const trait bounds)
Feature `const_trait_impl` currently lacks a way to express “always const” trait bounds. This makes it impossible to define generic items like fns or structs which contain types that depend on const method calls (\*). While the final design and esp. the syntax of effects / keyword generics isn't set in stone, some version of “always const” trait bounds will very likely form a part of it. Further, their implementation is trivial thanks to the `effects` backbone.
Not sure if this needs t-lang sign-off though.
(\*):
```rs
#![feature(const_trait_impl, effects, generic_const_exprs)]
fn compute<T: const Trait>() -> Type<{ T::generate() }> { /*…*/ }

struct Store<T: const Trait>
where
    Type<{ T::generate() }>:,
{
    field: Type<{ T::generate() }>,
}
```
Lastly, “always const” trait bounds are a perfect fit for `generic_const_items`.
```rs
#![feature(const_trait_impl, effects, generic_const_items)]
const DEFAULT<T: const Default>: T = T::default();
```
Previously, we (oli, fee1-dead and I) wanted to reinterpret `~const Trait` as `const Trait` in generic const items which would've been quite surprising and not very generalizable.
Supersedes #117530.
---
cc `@oli-obk`
As discussed
r? fee1-dead (or compiler)
Remove `DiagCtxt` API duplication
`DiagCtxt` defines the internal API for creating and emitting diagnostics: methods like `struct_err`, `struct_span_warn`, `note`, `create_fatal`, `emit_bug`. There are over 50 methods.
Some of these methods are then duplicated across several other types: `Session`, `ParseSess`, `Parser`, `ExtCtxt`, and `MirBorrowckCtxt`. `Session` duplicates the most, though half of its duplicates are unused. Each duplicated method just forwards to the corresponding method in `DiagCtxt`. So this duplication exists to (in the best case) shorten chains like `ecx.tcx.sess.parse_sess.dcx.emit_err()` to `ecx.emit_err()`.
This API duplication is ugly and has been bugging me for a while. And it's inconsistent: there's no real logic about which methods are duplicated, and the use of `#[rustc_lint_diagnostic]` and `#[track_caller]` attributes varies across the duplicates.
This PR removes the duplicated API methods and makes all diagnostic creation and emission go through `DiagCtxt`. It also adds `dcx` getter methods to several types to shorten chains. This approach scales *much* better than API duplication; indeed, the PR adds `dcx()` to numerous types that didn't have API duplication: `TyCtxt`, `LoweringCtxt`, `ConstCx`, `FnCtxt`, `TypeErrCtxt`, `InferCtxt`, `CrateLoader`, `CheckAttrVisitor`, and `Resolver`. These result in a lot of changes from `foo.tcx.sess.emit_err()` to `foo.dcx().emit_err()`. (You could do this with more types, but it gets into diminishing returns territory for types that don't emit many diagnostics.)
After all these changes, some call sites are more verbose, some are less verbose, and many are the same. The total number of lines is reduced, mostly because of the removed API duplication. And consistency is increased, because calls to `emit_err` and friends are always preceded by `.dcx()` or `.dcx`.
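A stripped-down sketch of the getter approach. The type names mirror the ones mentioned above, but the definitions are toy stand-ins invented for this example and look nothing like the real rustc types. Each context type exposes one `dcx()` getter instead of re-declaring every emission method.

```rust
use std::cell::RefCell;

/// Toy stand-in: the single place where emission methods live.
struct DiagCtxt {
    emitted: RefCell<Vec<String>>,
}

impl DiagCtxt {
    fn emit_err(&self, msg: &str) {
        self.emitted.borrow_mut().push(msg.to_string());
    }
}

struct ParseSess {
    dcx: DiagCtxt,
}

struct Session {
    parse_sess: ParseSess,
}

impl Session {
    /// One getter replaces a forwarding copy of every `DiagCtxt` method.
    fn dcx(&self) -> &DiagCtxt {
        &self.parse_sess.dcx
    }
}

struct TyCtxt<'a> {
    sess: &'a Session,
}

impl<'a> TyCtxt<'a> {
    /// Types that never duplicated the API can get a getter just as cheaply.
    fn dcx(&self) -> &'a DiagCtxt {
        self.sess.dcx()
    }
}
```

With this shape, a call site writes `tcx.dcx().emit_err("...")`, and adding a new emission method to `DiagCtxt` requires no changes to any of the wrapper types.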
r? `@compiler-errors`
Lots of vectors of messages called `message` or `msg`. This commit
pluralizes them.
Note that `emit_message_default` and `emit_messages_default` both
already existed, and both process a vector, so I renamed the former
`emit_messages_default_inner` because it's called by the latter.
It doesn't look quite right, because the lines are too far apart,
and it's not going to be announced by screenreaders as a menu button,
since that's not what the symbol means.
This adds a real tooltip and uses a better drawing of the icon.
This is a redesign of the feature, with parts pulled from
https://github.com/rust-lang/rust/pull/119049
but with a button that looks more like a button and matches the
one used on other sidebar pages.