3014 commits

Author SHA1 Message Date
Ralf Jung
6ab64620a6 refactor atomic intrinsic handling to actually parse the intrinsic name 2022-07-18 08:55:50 -04:00
Ralf Jung
ad3010c449 make atomic intrinsic impl details private 2022-07-18 08:22:27 -04:00
Ralf Jung
c850ffe01a add support for new RMW orders 2022-07-18 08:20:06 -04:00
Ralf Jung
1174cda4f1 remove ret param from foreign_item hierarchy 2022-07-18 08:05:46 -04:00
Ralf Jung
53ead1b8c9 move simd intrinsics to their own file 2022-07-18 08:03:58 -04:00
Ralf Jung
52a6ac96b0 move atomic intrinsics to their own file 2022-07-18 07:55:11 -04:00
Ralf Jung
896f558f2b with isolation we want to be fully deterministic 2022-07-17 21:50:10 -04:00
bors
8ec3425a8a Auto merge of #2349 - saethlin:isatty, r=RalfJung
Improve isatty support

Per https://github.com/rust-lang/miri/issues/2292#issuecomment-1171858283, this is an attempt at

> do something more clever with Miri's `isatty` shim

Since Unix -> Unix is very simple, I'm starting with a patch that just does that. Happy to augment/rewrite this based on feedback.

The linked file in libtest specifically only supports stdout. If we're doing this to support terminal applications, I think it would be strange to support one but not all 3 of the standard streams.
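
As a rough illustration of "all 3 of the standard streams", the sketch below answers an `isatty`-style query by asking the host about the matching stream. It is not the code from this patch: `host_isatty` is a hypothetical name, and it uses the standard library's `std::io::IsTerminal` trait rather than calling the host's `isatty` directly.

```rust
use std::io::IsTerminal;

/// Hypothetical helper (not Miri's actual shim): answer `isatty(fd)` for the
/// interpreted program by asking the host about the matching standard stream.
fn host_isatty(fd: i32) -> bool {
    match fd {
        0 => std::io::stdin().is_terminal(),
        1 => std::io::stdout().is_terminal(),
        2 => std::io::stderr().is_terminal(),
        // Assumption of this sketch: any other descriptor is "not a terminal".
        _ => false,
    }
}
```

Calling `host_isatty(1)` reports whether the host's stdout is a terminal; descriptors other than 0, 1, and 2 always yield `false` here.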

The `atty` crate contains a bunch of extra logic that libtest does not contain, in order to support MSYS terminals: db8d55f88e so I think if we're going to do Windows support, we should probably access all that logic somehow. I think it's pretty clear that the implementation is not going to change, so I think if we want to, pasting the contents of the `atty` crate into Miri is on the table, instead of taking a dependency.
2022-07-18 01:37:38 +00:00
Ben Kimock
2f84cb34c1 Pass through isatty if the host is also unix 2022-07-17 16:53:14 -04:00
Ralf Jung
39866f817a remove a fastpath that does not seem to actually help 2022-07-17 10:35:19 -04:00
Ralf Jung
68510600a3 use PlaceTy visitor 2022-07-17 10:19:29 -04:00
Ralf Jung
e8ab64e424 make unused flags work like they used to 2022-07-17 08:18:55 -04:00
Ralf Jung
9782b7b039 rustup 2022-07-16 23:40:36 -04:00
bors
86911fd8f6 Auto merge of #2368 - RalfJung:debug, r=oli-obk
Make "./miri {build,run,test}" use debug assertions but "./miri install" not

This makes `./miri run`/`./miri test` use the full set of debug assertions (including the rather expensive ones that check consistency of the Stacked Borrows cache), but `./miri install` installs a Miri *without* those debug assertions.

That's the same behavior as cargo, and helps catch Miri bugs with the test suite while making installed Miri usable for larger runs.
2022-07-15 15:54:47 +00:00
Ralf Jung
98c401977b rustup 2022-07-15 08:09:43 -04:00
Ralf Jung
d6cbe5d014 ensure that RangeMap panics on OOB 2022-07-14 15:09:20 -04:00
Ralf Jung
421f92bee6 make some debug assertions in RangeObjectMap be full assertions 2022-07-14 13:23:35 -04:00
Ralf Jung
5d5999ab13 make cache consistency checks into regular debug assertions 2022-07-14 13:00:35 -04:00
Ralf Jung
eaa7f10cb1 rustup 2022-07-14 09:54:20 -04:00
bors
af2c50fb89 Auto merge of #2328 - RalfJung:perf, r=RalfJung
move checking ptr tracking on item pop into cold helper function
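
The general pattern, as a minimal sketch rather than the actual Miri code: the rarely-taken reporting path goes into a separate function marked `#[cold]` and `#[inline(never)]`, so the hot caller only pays for one cheap comparison. The names `item_popped` and `report_tracked_pop` and the `u64` tag type are assumptions for illustration.

```rust
/// Rare path, kept out of line so the hot caller stays small.
/// (Hypothetical name; the real helper does more than build a string.)
#[cold]
#[inline(never)]
fn report_tracked_pop(tag: u64) -> String {
    format!("popped tracked tag {tag}")
}

/// Hot path: a single comparison in the common case where the popped
/// tag is not the one being tracked.
fn item_popped(tag: u64, tracked: Option<u64>) -> Option<String> {
    if tracked == Some(tag) {
        return Some(report_tracked_pop(tag));
    }
    None
}
```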

Before:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      6.729 s ±  0.050 s    [User: 6.608 s, System: 0.124 s]
  Range (min … max):    6.665 s …  6.799 s    5 runs

Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):     20.923 s ±  0.271 s    [User: 20.386 s, System: 0.537 s]
  Range (min … max):   20.580 s … 21.165 s    5 runs
```
After:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      6.562 s ±  0.023 s    [User: 6.430 s, System: 0.135 s]
  Range (min … max):    6.544 s …  6.594 s    5 runs

Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):     20.375 s ±  0.228 s    [User: 19.964 s, System: 0.413 s]
  Range (min … max):   20.201 s … 20.736 s    5 runs
```
Nothing major, but we'll take it I guess. 🤷

Fixes https://github.com/rust-lang/miri/issues/2132
2022-07-14 00:34:00 +00:00
Ralf Jung
cc42cb1b21 reborrow error: clarify that we are reborrowing *from* that tag 2022-07-13 19:40:53 -04:00
Ralf Jung
83b9172774 move stacked_borrows.rs together with the other files of its module 2022-07-13 19:37:41 -04:00
Ralf Jung
3bd0e8a2ca move checking ptr tracking on item pop into cold helper function 2022-07-13 18:07:33 -04:00
bors
db5a2b9747 Auto merge of #2315 - saethlin:shrink-item, r=saethlin
Optimizing Stacked Borrows (part 2): Shrink Item

This moves protectors out of `Item`, storing them both in a global `HashSet` which contains all currently-protected tags as well as a `Vec<SbTag>` on each `Frame` so that when we return from a function we know which tags to remove from the protected set.
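
A simplified sketch of that bookkeeping, with assumptions stated up front: `Machine`, `Frame`, and the method names are hypothetical, and a plain `u64` stands in for `SbTag`. Each frame remembers the tags it protected, and popping the frame removes exactly those tags from the global set.

```rust
use std::collections::HashSet;

type Tag = u64; // stand-in for `SbTag` in this sketch

/// Per-frame state: the tags this call protected.
#[derive(Default)]
struct Frame {
    protected: Vec<Tag>,
}

/// Global state: every currently-protected tag, plus the call stack.
#[derive(Default)]
struct Machine {
    protected_tags: HashSet<Tag>,
    frames: Vec<Frame>,
}

impl Machine {
    fn push_frame(&mut self) {
        self.frames.push(Frame::default());
    }

    /// Protect `tag` for the duration of the current (topmost) frame.
    fn protect(&mut self, tag: Tag) {
        let frame = self.frames.last_mut().expect("no active frame");
        frame.protected.push(tag);
        self.protected_tags.insert(tag);
    }

    /// On return, drop exactly the protections this frame created.
    fn pop_frame(&mut self) {
        let frame = self.frames.pop().expect("no active frame");
        for tag in frame.protected {
            self.protected_tags.remove(&tag);
        }
    }

    fn is_protected(&self, tag: Tag) -> bool {
        self.protected_tags.contains(&tag)
    }
}
```

With this layout, checking whether a tag is protected is a single set lookup, and an item no longer needs to carry its protector around.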

This also bit-packs the 64-bit tag and the 2-bit permission together when they are stored in memory. This means we theoretically run out of tags sooner, but I doubt that limit will ever be hit.
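
To make the packing concrete, here is a minimal sketch under stated assumptions: `PackedItem` is a hypothetical name, the permission sits in the low 2 bits (the packing order is an arbitrary choice here), and the tag is limited to 62 bits so that both fit in one `u64`.

```rust
/// The four permissions, with explicit 2-bit encodings (illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Permission {
    Unique = 0,
    SharedReadWrite = 1,
    SharedReadOnly = 2,
    Disabled = 3,
}

const PERM_BITS: u32 = 2;
/// Tags must stay below this bound to leave room for the permission.
const TAG_LIMIT: u64 = 1 << (64 - PERM_BITS);

/// Hypothetical packed representation: tag in the high 62 bits,
/// permission in the low 2 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedItem(u64);

impl PackedItem {
    fn new(tag: u64, perm: Permission) -> Self {
        assert!(tag < TAG_LIMIT, "tag does not fit in 62 bits");
        PackedItem((tag << PERM_BITS) | perm as u64)
    }

    fn tag(self) -> u64 {
        self.0 >> PERM_BITS
    }

    fn perm(self) -> Permission {
        match self.0 & 0b11 {
            0 => Permission::Unique,
            1 => Permission::SharedReadWrite,
            2 => Permission::SharedReadOnly,
            _ => Permission::Disabled,
        }
    }
}
```

The `assert!` in `new` is where a 62-bit limit would be enforced in this sketch; a packed item is exactly 8 bytes.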

Together these optimizations reduce the memory footprint of Miri when executing programs which stress Stacked Borrows by ~66%. For example, running a test with isolation off which only panics currently peaks at ~19 GB, with this PR it peaks at ~6.2 GB.

To-do
- [x] Enforce the 62-bit limit
- [x] Decide if there is a better order to pack the tag and permission in
- [x] Wait for `UnsafeCell` to become infectious, or express offsets + tags in the global protector set

Benchmarks before:
```
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/backtraces/Cargo.toml
  Time (mean ± σ):      8.948 s ±  0.253 s    [User: 8.752 s, System: 0.158 s]
  Range (min … max):    8.619 s …  9.279 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/mse/Cargo.toml
  Time (mean ± σ):      2.129 s ±  0.037 s    [User: 1.849 s, System: 0.248 s]
  Range (min … max):    2.086 s …  2.176 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      3.334 s ±  0.017 s    [User: 3.211 s, System: 0.103 s]
  Range (min … max):    3.315 s …  3.352 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde2/Cargo.toml
  Time (mean ± σ):      3.316 s ±  0.038 s    [User: 3.207 s, System: 0.095 s]
  Range (min … max):    3.282 s …  3.375 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):      6.391 s ±  0.323 s    [User: 5.928 s, System: 0.412 s]
  Range (min … max):    6.090 s …  6.917 s    5 runs
```
After:
```
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/backtraces/Cargo.toml
  Time (mean ± σ):      6.955 s ±  0.051 s    [User: 6.807 s, System: 0.132 s]
  Range (min … max):    6.900 s …  7.038 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/mse/Cargo.toml
  Time (mean ± σ):      1.784 s ±  0.012 s    [User: 1.627 s, System: 0.156 s]
  Range (min … max):    1.772 s …  1.797 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
  Time (mean ± σ):      2.505 s ±  0.095 s    [User: 2.311 s, System: 0.096 s]
  Range (min … max):    2.405 s …  2.603 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde2/Cargo.toml
  Time (mean ± σ):      2.449 s ±  0.031 s    [User: 2.306 s, System: 0.100 s]
  Range (min … max):    2.395 s …  2.467 s    5 runs

Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
  Time (mean ± σ):      3.667 s ±  0.110 s    [User: 3.498 s, System: 0.140 s]
  Range (min … max):    3.564 s …  3.814 s    5 runs
```
The decrease in system time is probably due to spending less time in the page fault handler.
2022-07-13 01:44:01 +00:00
Ben Kimock
4eff60ad6e Rearrange and document the new implementation
stacked_borrow now has an item module, and its own FrameExtra. These
serve to protect the implementation of Item (which is a bunch of
bit-packing tricks) from the primary logic of Stacked Borrows, and the
FrameExtra we have separates Stacked Borrows more cleanly from the
interpreter itself.

The new strategy for checking protectors also makes some subtle
performance tradeoffs, so they are now documented in Stack::item_popped
because that function primarily benefits from them, and it also touches
every aspect of them.

Also separating the actual CallId that is protecting a Tag from the Tag
makes it inconvenient to reproduce exactly the same protector errors, so
this also takes the opportunity to use some slightly cleaner English in
those errors. We need to make some change, might as well make it good.
2022-07-12 21:03:54 -04:00
Ben Kimock
afa1dddcf9 Store protectors outside Item, pack Tag and Perm
Previously, Item was a struct of a NonZeroU64, an Option which was
usually unset or irrelevant, and a 4-variant enum. So collectively, the
size of an Item was 24 bytes, but only 8 bytes were used for the most
part.
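
The 24-byte figure can be reproduced with a stand-in struct. This is a sketch, not the old `Item` definition: the field names are made up, the `Option` is assumed to wrap a `NonZeroU64`, and `#[repr(C)]` / `#[repr(u8)]` are added only so the layout is fixed for the demonstration.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

/// A 4-variant enum needs only one byte.
#[allow(dead_code)]
#[repr(u8)]
enum Permission {
    Unique,
    SharedReadWrite,
    SharedReadOnly,
    Disabled,
}

/// Stand-in for the old layout: 8 + 8 + 1 bytes of fields, padded up to
/// the struct's 8-byte alignment.
#[allow(dead_code)]
#[repr(C)]
struct UnpackedItem {
    tag: NonZeroU64,
    protector: Option<NonZeroU64>, // assumed payload type
    perm: Permission,
}

/// Size in bytes of the stand-in struct.
fn unpacked_item_size() -> usize {
    size_of::<UnpackedItem>()
}

/// Size in bytes of the `Option` field alone (no extra discriminant
/// is needed for `Option<NonZeroU64>`).
fn protector_size() -> usize {
    size_of::<Option<NonZeroU64>>()
}
```

Here `unpacked_item_size()` is 24: the fields occupy 17 bytes and the remaining 7 are padding.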

So this takes advantage of the fact that it is probably impossible to
exhaust the total space of SbTags, and steals 3 bits from it to pack the
whole struct into a single u64. This bit-packing means that we reduce
peak memory usage when Miri goes memory-bound by ~3x. We also get CPU
performance improvements of varying size, because not only are we simply
accessing less memory, we can now compare a Vec<Item> using a memcmp
because it does not have any padding.
2022-07-12 21:01:33 -04:00
Ralf Jung
c9b207eba6 extend a comment in readlink 2022-07-09 12:49:37 -04:00
Ralf Jung
23d1f1a5a3 rustup 2022-07-07 20:12:30 -04:00
Ralf Jung
6b3986f44d remove a dead optimization 2022-07-07 07:42:31 -04:00
Ralf Jung
b6602f5d11 rustup 2022-07-06 22:55:12 -04:00
Ralf Jung
5fed3ebc26 adjust code for copy_op changes 2022-07-06 21:40:31 -04:00
Ralf Jung
d5f1c26380 rustup; ptr atomics 2022-07-06 21:38:52 -04:00
Ralf Jung
501a6b4687 rustup 2022-07-06 14:06:15 -04:00
Ralf Jung
907a003f14 tweak format strings 2022-07-06 09:47:48 -04:00
Ralf Jung
6c8ad4abc9 fix comparing wide raw pointers 2022-07-05 21:21:02 -04:00
Ralf Jung
f3f4bafa1b rustup 2022-07-05 18:16:20 -04:00
bors
35399c6a5d Auto merge of #2323 - RalfJung:box-is-special, r=RalfJung
handle Box with allocators

This is the Miri side of https://github.com/rust-lang/rust/pull/98847.

Thanks `@DrMeepster` for doing most of the work of getting this test case to pass in Miri. :)
2022-07-05 12:35:03 +00:00
Ralf Jung
2931e0fd63 handle Box with allocators 2022-07-05 08:34:41 -04:00
Ralf Jung
a07398d441 we don't need HexRange any more 2022-07-05 07:38:42 -04:00
Oli Scherer
afb937ab25 Bump rust version 2022-07-05 10:17:43 +00:00
Ralf Jung
22aa7f98c5 call_function: make the unit-return-type case more convenient 2022-07-04 13:46:11 -04:00
Ralf Jung
a4e7e1e6b5 fix retagging of vtable ptrs 2022-07-03 11:56:29 -04:00
bors
cfad9d12f3 Auto merge of #1935 - saethlin:optimize-sb, r=RalfJung
Optimizing Stacked Borrows (part 1?): Cache locations of Tags in a Borrow Stack

Before this PR, a profile of Miri under almost any workload points quite squarely at these regions of code as being incredibly hot (each being ~40% of cycles):

dadcbebfbd/src/stacked_borrows.rs (L259-L269)

dadcbebfbd/src/stacked_borrows.rs (L362-L369)

This code is one of at least three reasons that stacked borrows analysis is super-linear: These are both linear in the number of borrows in the stack and they are positioned along the most commonly-taken paths.

I'm addressing the first loop (which is in `Stack::find_granting`) by adding a very very simple sort of LRU cache implemented on a `VecDeque`, which maps recently-looked-up tags to their position in the stack. For `Untagged` access we fall back to the same sort of linear search. But as far as I can tell there are never enough `Untagged` items to be significant.
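
A minimal sketch of such a cache, not the code from this PR: `Stack`, `find`, and `CACHE_LEN` are hypothetical names, tags are plain `u64`s, and the stack is assumed not to change between lookups (a real cache has to be updated or cleared whenever items are pushed or popped).

```rust
use std::collections::VecDeque;

/// Hypothetical cache capacity, chosen only for this sketch.
const CACHE_LEN: usize = 4;

struct Stack {
    /// Borrow stack, bottom first; only the tags matter here.
    tags: Vec<u64>,
    /// Most-recently-used `(tag, position)` pairs, newest at the front.
    cache: VecDeque<(u64, usize)>,
}

impl Stack {
    fn new(tags: Vec<u64>) -> Self {
        Stack { tags, cache: VecDeque::with_capacity(CACHE_LEN) }
    }

    /// Find the position of `tag`, consulting the cache first.
    fn find(&mut self, tag: u64) -> Option<usize> {
        if let Some(i) = self.cache.iter().position(|&(t, _)| t == tag) {
            // Hit: move the entry to the front so it is evicted last.
            let entry = self.cache.remove(i)?;
            self.cache.push_front(entry);
            return Some(entry.1);
        }
        // Miss: fall back to a linear search from the top of the stack.
        let pos = self.tags.iter().rposition(|&t| t == tag)?;
        if self.cache.len() == CACHE_LEN {
            // Evict the least-recently-used entry.
            self.cache.pop_back();
        }
        self.cache.push_front((tag, pos));
        Some(pos)
    }
}
```

Repeated lookups of the same few tags then touch only the small cache instead of walking the whole stack.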

I'm addressing the second loop by keeping track of the region of stack where there could be items granting `Permission::Unique`. This optimization is incredibly effective because `Read` access tends to dominate and many trips through this code path now skip the loop entirely.
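
The idea can be sketched as follows, under simplifying assumptions: `Stack`, `Perm`, `unique_range`, and `read_access` are illustrative names, the stack holds only permissions, and a read access is modeled as "disable every `Unique` item above the granting item". The returned count shows how many items the loop had to visit.

```rust
use std::ops::Range;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Perm {
    Unique,
    SharedReadOnly,
    Disabled,
}

struct Stack {
    perms: Vec<Perm>,
    /// Conservative bound: every `Unique` item lies inside this index range.
    unique_range: Range<usize>,
}

impl Stack {
    fn new(perms: Vec<Perm>) -> Self {
        let first = perms.iter().position(|&p| p == Perm::Unique);
        let last = perms.iter().rposition(|&p| p == Perm::Unique);
        let unique_range = match (first, last) {
            (Some(f), Some(l)) => f..l + 1,
            _ => 0..0,
        };
        Stack { perms, unique_range }
    }

    /// Read access granted by the item at index `granting`: disable all
    /// `Unique` items above it. Returns how many items were visited.
    fn read_access(&mut self, granting: usize) -> usize {
        let start = (granting + 1).max(self.unique_range.start);
        let end = self.unique_range.end;
        if start >= end {
            // No `Unique` item can be above the granting item: skip the loop.
            return 0;
        }
        let mut visited = 0;
        for p in &mut self.perms[start..end] {
            visited += 1;
            if *p == Perm::Unique {
                *p = Perm::Disabled;
            }
        }
        // Nothing at or above `start` is `Unique` any more.
        self.unique_range.end = start;
        visited
    }
}
```

After the first read through a given item, later reads through the same item return immediately because the tracked range no longer extends above it.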

These optimizations result in pretty enormous improvements:
Without raw pointer tagging, `mse` 34.5s -> 2.4s, `serde1` 5.6s -> 3.6s
With raw pointer tagging, `mse` 35.3s -> 2.4s, `serde1` 5.7s -> 3.6s

And there is hardly any impact on memory usage:
Memory usage on `mse` 844 MB -> 848 MB, `serde1` 184 MB -> 184 MB (jitter on these is a few MB).
2022-07-03 14:39:22 +00:00
Ben Kimock
b004a03bdb Typo 2022-07-02 20:45:27 -04:00
Ben Kimock
f3b479d556 Explain cache behavior a bit better, clean up diff 2022-07-02 19:44:55 -04:00
Ralf Jung
d9c441c5ab put call to stacked borrows end_call in a more sensible place 2022-07-02 18:38:07 -04:00
Ben Kimock
7cdbf98f57 Explain the behavior of the cache upon clear 2022-07-02 13:22:22 -04:00
Ralf Jung
98254f67af pointer tag tracking: on creation, log the offsets it is created for 2022-07-02 11:33:29 -04:00
bors
428245072e Auto merge of #2306 - RalfJung:unix, r=RalfJung
make some things available for all Unixes
2022-07-02 13:45:27 +00:00
Ralf Jung
5ba2c1e6be posix_fadvise is not Linux-specific 2022-07-02 09:45:00 -04:00