Find a file
Michael Goulet b3f307434e
Rollup merge of #119468 - notriddle:notriddle/compression, r=GuillaumeGomez
rustdoc-search: tighter encoding for f index

Depends on https://github.com/rust-lang/rust/pull/119457

Two optimizations for the function signature search:

* Instead of using JSON arrays, like `[1,20]`, it uses VLQ
  hex with no commas, like `[aAd]`.
* This also adds backrefs: if you have more than one function
  with exactly the same signature, it'll not only store it once,
  it'll *decode* it once, and store in the typeIdMap only once.

Based partially on discussions on zulip:
https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/search.20index.20size

Performance
-----------

https://notriddle.com/rustdoc-html-demo-8/compression-perf-v2/index.html

### memory/time profiler output (for more details, consult the above link)

<table>
<thead><tr><th>benchmark<th>before<th>after</tr></thead>
<tbody>
<tr><th>arti<td>

```
user: 002.789 s
sys:  000.390 s
wall: 002.096 s
child_RSS_high:     440796 KiB
group_mem_high:     414924 KiB
```

</td><td>

```
user: 002.295 s
sys:  000.278 s
wall: 001.738 s
child_RSS_high:     314588 KiB
group_mem_high:     285220 KiB
```

</td></tr><tr><th>cortex-m<td>

```
user: 000.127 s
sys:  000.030 s
wall: 000.134 s
child_RSS_high:      60264 KiB
group_mem_high:      23824 KiB
```

</td><td>

```
user: 000.136 s
sys:  000.038 s
wall: 000.137 s
child_RSS_high:      59204 KiB
group_mem_high:      22712 KiB
```

</td></tr><tr><th>sqlx<td>

```
user: 000.887 s
sys:  000.118 s
wall: 000.592 s
child_RSS_high:     190408 KiB
group_mem_high:     157804 KiB
```

</td><td>

```
user: 000.798 s
sys:  000.101 s
wall: 000.525 s
child_RSS_high:     159292 KiB
group_mem_high:     126292 KiB
```

</td></tr><tr><th>stm32f4<td>

```
user: 013.884 s
sys:  005.399 s
wall: 013.149 s
child_RSS_high:    1942244 KiB
group_mem_high:    1954916 KiB
```

</td><td>

```
user: 006.128 s
sys:  003.297 s
wall: 007.994 s
child_RSS_high:    1038108 KiB
group_mem_high:    1023900 KiB
```

</td></tr><tr><th>ripgrep<td>

```
user: 000.441 s
sys:  000.063 s
wall: 000.264 s
child_RSS_high:     109180 KiB
group_mem_high:      74272 KiB
```

</td><td>

```
user: 000.408 s
sys:  000.044 s
wall: 000.238 s
child_RSS_high:     101488 KiB
group_mem_high:      66000 KiB
```

</td></tr></tbody></table>

Size change
-----------

standard library without gzip:

```console
$ du -bs search-index-old.js search-index-new.js
4976370 search-index-old.js
4404391 search-index-new.js
```

((4976370-4404391)/4404391)*100% = 12.9%

with gzip:

```console
$ du -hs search-index-old.js.gz search-index-new.js.gz
520K    search-index-old.js.gz
504K    search-index-new.js.gz

$ du -bs search-index-old.js.gz search-index-new.js.gz
522092  search-index-old.js.gz
507654  search-index-new.js.gz
```

((522092-507654)/507654)*100% = 2.8%

Benchmarks are similarly shrunk.

Without gzip:

```console
$ du -hs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js
10555067        tmp/arti/toolchain_old/doc/search-index.js
8921236 tmp/arti/toolchain_new/doc/search-index.js
77018   tmp/cortex-m/toolchain_old/doc/search-index.js
66676   tmp/cortex-m/toolchain_new/doc/search-index.js
2876330 tmp/sqlx/toolchain_old/doc/search-index.js
2436812 tmp/sqlx/toolchain_new/doc/search-index.js
63632890        tmp/stm32f4/toolchain_old/doc/search-index.js
52337438        tmp/stm32f4/toolchain_new/doc/search-index.js
631150  tmp/ripgrep/toolchain_old/doc/search-index.js
541646  tmp/ripgrep/toolchain_new/doc/search-index.js
```

With gzip:

```console
$ du -bs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js.gz
1618852 tmp/arti/toolchain_old/doc/search-index.js.gz
1582007 tmp/arti/toolchain_new/doc/search-index.js.gz
16109   tmp/cortex-m/toolchain_old/doc/search-index.js.gz
15831   tmp/cortex-m/toolchain_new/doc/search-index.js.gz
422257  tmp/sqlx/toolchain_old/doc/search-index.js.gz
411507  tmp/sqlx/toolchain_new/doc/search-index.js.gz
4454761 tmp/stm32f4/toolchain_old/doc/search-index.js.gz
4334924 tmp/stm32f4/toolchain_new/doc/search-index.js.gz
98312   tmp/ripgrep/toolchain_old/doc/search-index.js.gz
96864   tmp/ripgrep/toolchain_new/doc/search-index.js.gz

$ du -hs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.j
s.gz
1.6M    tmp/arti/toolchain_old/doc/search-index.js.gz
1.6M    tmp/arti/toolchain_new/doc/search-index.js.gz
24K     tmp/cortex-m/toolchain_old/doc/search-index.js.gz
24K     tmp/cortex-m/toolchain_new/doc/search-index.js.gz
424K    tmp/sqlx/toolchain_old/doc/search-index.js.gz
412K    tmp/sqlx/toolchain_new/doc/search-index.js.gz
4.3M    tmp/stm32f4/toolchain_old/doc/search-index.js.gz
4.2M    tmp/stm32f4/toolchain_new/doc/search-index.js.gz
108K    tmp/ripgrep/toolchain_old/doc/search-index.js.gz
104K    tmp/ripgrep/toolchain_new/doc/search-index.js.gz
```
2024-01-05 23:41:42 -05:00
.github Temporarily disable M1 runners on GitHub Actions 2024-01-03 15:20:03 +01:00
.reuse Reinstate the names of the Ayu dark theme developers. 2023-12-11 16:07:40 +00:00
compiler Rollup merge of #119420 - cjgillot:issue-119295, r=compiler-errors 2024-01-05 23:41:42 -05:00
library Rollup merge of #119216 - weiznich:use_diagnostic_namespace_in_stdlib, r=compiler-errors 2024-01-05 23:41:41 -05:00
LICENSES Add missing CC-BY-SA-4.0. 2023-11-27 11:03:53 +00:00
src Rollup merge of #119468 - notriddle:notriddle/compression, r=GuillaumeGomez 2024-01-05 23:41:42 -05:00
tests Rollup merge of #119420 - cjgillot:issue-119295, r=compiler-errors 2024-01-05 23:41:42 -05:00
.editorconfig Only use max_line_length = 100 for *.rs 2023-07-10 15:18:36 -07:00
.git-blame-ignore-revs Ignore let-chains formatting 2023-10-15 18:30:34 +00:00
.gitattributes Rename config.toml.example to config.example.toml 2023-03-11 14:10:00 -08:00
.gitignore don't globally ignore rustc-ice files 2023-09-16 09:44:44 +02:00
.gitmodules Update to LLVM 17.0.6 2023-12-14 09:54:14 +01:00
.mailmap Add my work email to the mailmap 2023-11-26 18:39:38 +01:00
Cargo.lock Expose whether a channel has been dropped in lsp-server errors 2024-01-01 14:10:46 +01:00
Cargo.toml Auto merge of #16211 - tetsuharuohzeki:update-lint, r=Veykril 2024-01-02 14:53:22 +00:00
CODE_OF_CONDUCT.md Remove the code of conduct; instead link https://www.rust-lang.org/conduct.html 2019-10-05 22:55:19 +02:00
config.example.toml utilize the unused llvm-tools option 2023-12-28 17:27:59 +03:00
configure Enforce Python 3 as much as possible 2020-04-10 09:09:58 -04:00
CONTRIBUTING.md fix: Update CONTRIBUTING.md recommend -> recommended 2023-11-16 23:57:09 +05:30
COPYRIGHT Update COPYRIGHT file 2022-10-30 10:23:14 -04:00
LICENSE-APACHE Remove appendix from LICENCE-APACHE 2019-12-30 14:25:53 +00:00
LICENSE-MIT LICENSE-MIT: Remove inaccurate (misattributed) copyright notice 2017-07-26 16:51:58 -07:00
README.md Capitalize ToC in README.md 2023-11-29 23:03:31 -06:00
RELEASES.md apply last suggestions from code review 2023-12-21 13:26:15 +01:00
rust-bors.toml Add integration for new bors 2023-09-28 10:43:24 +02:00
rustfmt.toml Fix fn_sig_for_fn_abi and the coroutine transform for generators 2023-11-23 20:17:19 +00:00
triagebot.toml Mark myself as back from leave 2024-01-02 10:19:03 +00:00
x Make x capable of resolving symlinks 2023-10-14 17:53:33 +03:00
x.ps1 use & instead of start-process in x.ps1 2023-12-09 09:46:16 -05:00
x.py Fix recent python linting errors 2023-08-02 04:40:28 -04:00

The Rust Programming Language

Rust Community

This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.

Note: this README is for users rather than contributors. If you wish to contribute to the compiler, you should read CONTRIBUTING.md instead.

Table of Contents

Quick Start

Read "Installation" from The Book.

Installing from Source

The Rust build system uses a Python script called x.py to build the compiler, which manages the bootstrapping process. It lives at the root of the project. It also uses a file named config.toml to determine various configuration settings for the build. You can see a full list of options in config.example.toml.

The x.py command can be run directly on most Unix systems in the following format:

./x.py <subcommand> [flags]

This is how the documentation and examples assume you are running x.py. See the rustc dev guide if this does not work on your platform.

More information about x.py can be found by running it with the --help flag or reading the rustc dev guide.

Dependencies

Make sure you have installed the dependencies:

  • python 3 or 2.7
  • git
  • A C compiler (when building for the host, cc is enough; cross-compiling may need additional compilers)
  • curl (not needed on Windows)
  • pkg-config if you are compiling on Linux and targeting Linux
  • libiconv (already included with glibc on Debian-based distros)

To build Cargo, you'll also need OpenSSL (libssl-dev or openssl-devel on most Unix distros).

If building LLVM from source, you'll need additional tools:

  • g++, clang++, or MSVC with versions listed on LLVM's documentation
  • ninja, or GNU make 3.81 or later (Ninja is recommended, especially on Windows)
  • cmake 3.13.4 or later
  • libstdc++-static may be required on some Linux distributions such as Fedora and Ubuntu

On tier 1 or tier 2 with host tools platforms, you can also choose to download LLVM by setting llvm.download-ci-llvm = true. Otherwise, you'll need LLVM installed and llvm-config in your path. See the rustc-dev-guide for more info.

Building on a Unix-like system

Build steps

  1. Clone the source with git:

    git clone https://github.com/rust-lang/rust.git
    cd rust
    
  1. Configure the build settings:

    ./configure
    

    If you plan to use x.py install to create an installation, it is recommended that you set the prefix value in the [install] section to a directory: ./configure --set install.prefix=<path>

  2. Build and install:

    ./x.py build && ./x.py install
    

    When complete, ./x.py install will place several programs into $PREFIX/bin: rustc, the Rust compiler, and rustdoc, the API-documentation tool. By default, it will also include Cargo, Rust's package manager. You can disable this behavior by passing --set build.extended=false to ./configure.

Configure and Make

This project provides a configure script and makefile (the latter of which just invokes x.py). ./configure is the recommended way to programmatically generate a config.toml. make is not recommended (we suggest using x.py directly), but it is supported and we try not to break it unnecessarily.

./configure
make && sudo make install

configure generates a config.toml which can also be used with normal x.py invocations.

Building on Windows

On Windows, we suggest using winget to install dependencies by running the following in a terminal:

winget install -e Python.Python.3
winget install -e Kitware.CMake
winget install -e Git.Git

Then edit your system's PATH variable and add: C:\Program Files\CMake\bin. See this guide on editing the system PATH from the Java documentation.

There are two prominent ABIs in use on Windows: the native (MSVC) ABI used by Visual Studio and the GNU ABI used by the GCC toolchain. Which version of Rust you need depends largely on what C/C++ libraries you want to interoperate with. Use the MSVC build of Rust to interop with software produced by Visual Studio and the GNU build to interop with GNU software built using the MinGW/MSYS2 toolchain.

MinGW

MSYS2 can be used to easily build Rust on Windows:

  1. Download the latest MSYS2 installer and go through the installer.

  2. Run mingw32_shell.bat or mingw64_shell.bat from the MSYS2 installation directory (e.g. C:\msys64), depending on whether you want 32-bit or 64-bit Rust. (As of the latest version of MSYS2 you have to run msys2_shell.cmd -mingw32 or msys2_shell.cmd -mingw64 from the command line instead.)

  3. From this terminal, install the required tools:

    # Update package mirrors (may be needed if you have a fresh install of MSYS2)
    pacman -Sy pacman-mirrors
    
    # Install build tools needed for Rust. If you're building a 32-bit compiler,
    # then replace "x86_64" below with "i686". If you've already got Git, Python,
    # or CMake installed and in PATH you can remove them from this list.
    # Note that it is important that you do **not** use the 'python2', 'cmake',
    # and 'ninja' packages from the 'msys2' subsystem.
    # The build has historically been known to fail with these packages.
    pacman -S git \
                make \
                diffutils \
                tar \
                mingw-w64-x86_64-python \
                mingw-w64-x86_64-cmake \
                mingw-w64-x86_64-gcc \
                mingw-w64-x86_64-ninja
    
  4. Navigate to Rust's source code (or clone it), then build it:

    python x.py setup user && python x.py build && python x.py install
    

MSVC

MSVC builds of Rust additionally require an installation of Visual Studio 2017 (or later) so rustc can use its linker. The simplest way is to get Visual Studio, check the "C++ build tools" and "Windows 10 SDK" workload.

(If you're installing CMake yourself, be careful that "C++ CMake tools for Windows" doesn't get included under "Individual components".)

With these dependencies installed, you can build the compiler in a cmd.exe shell with:

python x.py setup user
python x.py build

Right now, building Rust only works with some known versions of Visual Studio. If you have a more recent version installed and the build system doesn't understand, you may need to force rustbuild to use an older version. This can be done by manually calling the appropriate vcvars file before running the bootstrap.

CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
python x.py build

Specifying an ABI

Each specific ABI can also be used from either environment (for example, using the GNU ABI in PowerShell) by using an explicit build triple. The available Windows build triples are:

  • GNU ABI (using GCC)
    • i686-pc-windows-gnu
    • x86_64-pc-windows-gnu
  • The MSVC ABI
    • i686-pc-windows-msvc
    • x86_64-pc-windows-msvc

The build triple can be specified by either specifying --build=<triple> when invoking x.py commands, or by creating a config.toml file (as described in Building on a Unix-like system), and passing --set build.build=<triple> to ./configure.

Building Documentation

If you'd like to build the documentation, it's almost the same:

./x.py doc

The generated documentation will appear under doc in the build directory for the ABI used. That is, if the ABI was x86_64-pc-windows-msvc, the directory will be build\x86_64-pc-windows-msvc\doc.

Notes

Since the Rust compiler is written in Rust, it must be built by a precompiled "snapshot" version of itself (made in an earlier stage of development). As such, source builds require an Internet connection to fetch snapshots, and an OS that can execute the available snapshot binaries.

See https://doc.rust-lang.org/nightly/rustc/platform-support.html for a list of supported platforms. Only "host tools" platforms have a pre-compiled snapshot binary available; to compile for a platform without host tools you must cross-compile.

You may find that other platforms work, but these are our officially supported build environments that are most likely to work.

Getting Help

See https://www.rust-lang.org/community for a list of chat platforms and forums.

Contributing

See CONTRIBUTING.md.

License

Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.

See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.

Trademark

The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the "Rust Trademarks").

If you want to use these names or brands, please read the media guide.

Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.