std::collections: Reexport libcollections's range module
This is overdue, even if `range` and `RangeArgument` are still unstable.
The stability attributes are the same as those on the other unstable item
here (`Bound`); they don't seem to matter.
impl Debug for ReadDir
It is good practice to implement Debug for public types, and
indicating what directory you're reading seems useful.
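A minimal sketch of what this enables, assuming the `Debug` output includes the directory being read as described above. `describe_read_dir` is a hypothetical helper, not part of the patch:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: opens `dir` and returns the `{:?}` rendering of the
/// resulting `ReadDir` handle, e.g. for logging which directory is being read.
fn describe_read_dir(dir: &Path) -> io::Result<String> {
    let entries = fs::read_dir(dir)?;
    Ok(format!("{:?}", entries))
}
```

Calling `describe_read_dir(Path::new("."))` returns a string naming the `ReadDir` type; the exact text is platform-dependent and should not be parsed.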
Signed-off-by: David Henningsson <diwic@ubuntu.com>
Implement `read_offset` and `write_offset`
These functions allow reading from and writing to a file from multiple
threads without changing the per-file cursor, avoiding the race between
the seek and the read.
Cache conscious hashmap table
Right now the internal HashMap representation is 3 unzipped arrays, hhhkkkvvv. I propose changing it to hhhkvkvkv (in further iterations, kvkvkvhhh may allow in-place growth). A previous attempt is at #21973.
This layout is generally more cache conscious, as it makes the value immediately accessible after a key matches. Keeping the hash array separate is a _no-brainer_ because of how the Robin Hood (RH) algorithm works, and that is unchanged.
**Lookups**: Upon a successful match in the hash array, the code can check the key and immediately access the value in the same or the next cache line (effectively saving an L[1,2,3] miss compared to the current layout).
**Inserts/Deletes/Resize**: Moving values in the table (robin hooding it) is faster because it touches consecutive cache lines and uses fewer instructions.
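To make the hhhkvkvkv shape concrete, here is a toy sketch. It is not the std implementation: it uses plain linear probing instead of Robin Hood, has a fixed capacity, and assumes a hash of 0 marks an empty bucket. `PairedTable` and its methods are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed marker for an empty bucket in the hash array.
const EMPTY: u64 = 0;

/// Toy fixed-capacity table: one array of hashes (hhh) and one array of
/// key-value pairs (kvkvkv), so a matching key sits right next to its value.
struct PairedTable<K, V> {
    hashes: Vec<u64>,
    pairs: Vec<Option<(K, V)>>,
}

impl<K: Hash + Eq, V> PairedTable<K, V> {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        PairedTable {
            hashes: vec![EMPTY; capacity],
            pairs: (0..capacity).map(|_| None).collect(),
        }
    }

    /// Hashes the key and forces the top bit so the result is never EMPTY.
    fn hash_of(key: &K) -> u64 {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish() | (1 << 63)
    }

    /// Inserts or overwrites; returns false if the table is full.
    fn insert(&mut self, key: K, value: V) -> bool {
        let hash = Self::hash_of(&key);
        let cap = self.hashes.len();
        let start = (hash % cap as u64) as usize;
        for step in 0..cap {
            let i = (start + step) % cap;
            if self.hashes[i] == EMPTY {
                self.hashes[i] = hash;
                self.pairs[i] = Some((key, value));
                return true;
            }
            if self.hashes[i] == hash {
                if let Some((k, v)) = self.pairs[i].as_mut() {
                    if *k == key {
                        *v = value;
                        return true;
                    }
                }
            }
        }
        false
    }

    /// Scans the hash array first; only on a hash match does it touch the
    /// pair array, where the key and the value share a slot.
    fn get(&self, key: &K) -> Option<&V> {
        let hash = Self::hash_of(key);
        let cap = self.hashes.len();
        let start = (hash % cap as u64) as usize;
        for step in 0..cap {
            let i = (start + step) % cap;
            if self.hashes[i] == EMPTY {
                return None;
            }
            if self.hashes[i] == hash {
                if let Some((k, v)) = self.pairs[i].as_ref() {
                    if k == key {
                        return Some(v);
                    }
                }
            }
        }
        None
    }
}
```

The point of the sketch is the access pattern in `get`: the key comparison and the value read hit the same `pairs[i]` slot instead of two separate arrays.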
Some backing benchmarks (besides the ones below) for the benefits of this layout can also be seen at http://www.reedbeta.com/blog/2015/01/12/data-oriented-hash-table/
The obvious drawback is that padding can be wasted between the key and the value. Because of that, keys(), values() and contains() can consume more cache and be slower.
Total wasted padding between items (C being the capacity of the table):
* Old layout: C * (K-K padding) + C * (V-V padding)
* Proposed: C * (K-V padding) + C * (V-K padding)
In practice padding between K-K and V-V *can* be smaller than K-V and V-K. The overhead is capped(ish) at sizeof u64 - 1, so we can actually measure the worst case (a u8 at the end of the key type and a value with an alignment of 1, _hardly the average case in practice_).
Starting from the worst case the memory overhead is:
* `HashMap<u64, u8>` 41% memory overhead. (aka *worst case*)
* `HashMap<u64, u16>` 33% memory overhead.
* `HashMap<u64, u32>` 20% memory overhead.
* `HashMap<T, T>` 0% memory overhead
* Worst case as a function of x = sizeof K + sizeof V:
| x | 16 | 24 | 32 | 64 | 128 |
|----------------|--------|--------|--------|-------|-------|
| (8+x+7)/(8+x) | 1.29 | 1.22 | 1.18 | 1.1 | 1.05 |
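The per-bucket arithmetic behind these figures can be sketched with `std::mem::size_of`, assuming an 8-byte stored hash (the `8` in the formula above) and that the paired layout stores each entry as a `(K, V)` tuple. The function names are hypothetical.

```rust
use std::mem::size_of;

/// Bytes per bucket in the old layout: separate hash, key and value arrays,
/// so no padding is needed between a key and its value.
fn unzipped_bucket_bytes<K, V>() -> usize {
    size_of::<u64>() + size_of::<K>() + size_of::<V>()
}

/// Bytes per bucket in the proposed layout: a hash array plus an array of
/// (K, V) pairs, where each pair is padded up to its alignment.
fn paired_bucket_bytes<K, V>() -> usize {
    size_of::<u64>() + size_of::<(K, V)>()
}

/// Extra memory of the paired layout relative to the old one, in percent.
fn overhead_percent<K, V>() -> f64 {
    let old = unzipped_bucket_bytes::<K, V>() as f64;
    let new = paired_bucket_bytes::<K, V>() as f64;
    (new / old - 1.0) * 100.0
}
```

For example, `overhead_percent::<u64, u16>()` compares 24 bytes per bucket against 18, and `overhead_percent::<u64, u64>()` is zero because the pair needs no padding.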
I have a test repo for running the benchmarks: https://github.com/arthurprs/hashmap2/tree/layout
```
➜ hashmap2 git:(layout) ✗ cargo benchcmp hhkkvv:: hhkvkv:: bench.txt
name                            hhkkvv:: ns/iter  hhkvkv:: ns/iter  diff ns/iter  diff %
grow_10_000                     922,064           783,933           -138,131      -14.98%
grow_big_value_10_000           1,901,909         1,171,862         -730,047      -38.38%
grow_fnv_10_000                 443,544           418,674           -24,870       -5.61%
insert_100                      2,469             2,342             -127          -5.14%
insert_1000                     23,331            21,536            -1,795        -7.69%
insert_100_000                  4,748,048         3,764,305         -983,743      -20.72%
insert_10_000                   321,744           290,126           -31,618       -9.83%
insert_int_bigvalue_10_000      749,764           407,547           -342,217      -45.64%
insert_str_10_000               337,425           334,009           -3,416        -1.01%
insert_string_10_000            788,667           788,262           -405          -0.05%
iter_keys_100_000               394,484           374,161           -20,323       -5.15%
iter_keys_big_value_100_000     402,071           620,810           218,739       54.40%
iter_values_100_000             424,794           373,004           -51,790       -12.19%
iterate_100_000                 424,297           389,950           -34,347       -8.10%
lookup_100_000                  189,997           186,554           -3,443        -1.81%
lookup_100_000_bigvalue         192,509           189,695           -2,814        -1.46%
lookup_10_000                   154,251           145,731           -8,520        -5.52%
lookup_10_000_bigvalue          162,315           146,527           -15,788       -9.73%
lookup_10_000_exist             132,769           128,922           -3,847        -2.90%
lookup_10_000_noexist           146,880           144,504           -2,376        -1.62%
lookup_1_000_000                137,167           132,260           -4,907        -3.58%
lookup_1_000_000_bigvalue       141,130           134,371           -6,759        -4.79%
lookup_1_000_000_bigvalue_unif  567,235           481,272           -85,963       -15.15%
lookup_1_000_000_unif           589,391           453,576           -135,815      -23.04%
merge_shuffle                   1,253,357         1,207,387         -45,970       -3.67%
merge_simple                    40,264,690        37,996,903        -2,267,787    -5.63%
new                             6                 5                 -1            -16.67%
with_capacity_10e5              3,214             3,256             42            1.31%
```
```
➜ hashmap2 git:(layout) ✗ cargo benchcmp hhkkvv:: hhkvkv:: bench.txt
name                            hhkkvv:: ns/iter  hhkvkv:: ns/iter  diff ns/iter  diff %
iter_keys_100_000               391,677           382,839           -8,838        -2.26%
iter_keys_1_000_000             10,797,360        10,209,898        -587,462      -5.44%
iter_keys_big_value_100_000     414,736           662,255           247,519       59.68%
iter_keys_big_value_1_000_000   10,147,837        12,067,938        1,920,101     18.92%
iter_values_100_000             440,445           377,080           -63,365       -14.39%
iter_values_1_000_000           10,931,844        9,979,173         -952,671      -8.71%
iterate_100_000                 428,644           388,509           -40,135       -9.36%
iterate_1_000_000               11,065,419        10,042,427        -1,022,992    -9.24%
```
Add two functions to check type of given address
The is_v4 function returns true if the given IP is v4. The is_v6
function returns true if the IP is v6.
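A short usage sketch. Note that in released versions of the standard library these checks are spelled `IpAddr::is_ipv4` and `IpAddr::is_ipv6`, which is what the code below calls; `family` is a hypothetical helper.

```rust
use std::net::IpAddr;

/// Hypothetical helper: labels an address by family using the boolean checks
/// rather than matching on the enum variants.
fn family(addr: &IpAddr) -> &'static str {
    if addr.is_ipv4() {
        "v4"
    } else {
        // An IpAddr is either V4 or V6, so this branch is the V6 case.
        debug_assert!(addr.is_ipv6());
        "v6"
    }
}
```

For instance, `family(&"127.0.0.1".parse().unwrap())` takes the first branch and `family(&"::1".parse().unwrap())` the second.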
Add ThreadId for comparing threads
This adds the capability to store and compare threads with the current calling thread via a new struct, `std::thread::ThreadId`. Addresses the need outlined in issue #21507.
This avoids the need to add any special checks to the existing thread structs and does not rely on the system to provide an identifier for a thread, since that approach seems unreliable and undesirable. Instead, this simply uses a lazily-created, thread-local `usize` whose value is copied from a global atomic counter. The code is simple enough that it should be as reliable as the `#[thread_local]` attribute it uses (however much that is).
`ThreadId`s can be compared directly for equality and have copy semantics.
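The mechanism can be sketched roughly as follows. This is an illustration, not the patch itself: it uses the stable `thread_local!` macro instead of the `#[thread_local]` attribute, and `MyThreadId`, `current_id` and `id_on_new_thread` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Global counter that hands out the next unused identifier.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    /// Lazily initialized on first access in each thread by copying the
    /// current value of the global counter.
    static THIS_THREAD_ID: usize = NEXT_ID.fetch_add(1, Ordering::Relaxed);
}

/// Hypothetical stand-in for `ThreadId`: copyable and comparable for equality.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct MyThreadId(usize);

/// Returns the identifier of the calling thread.
fn current_id() -> MyThreadId {
    MyThreadId(THIS_THREAD_ID.with(|id| *id))
}

/// Spawns a thread and returns the identifier it observes for itself.
fn id_on_new_thread() -> MyThreadId {
    thread::spawn(current_id).join().expect("thread panicked")
}
```

Two calls to `current_id()` on the same thread compare equal, while `id_on_new_thread()` yields a different value.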
Also see these other attempts:
- rust-lang/rust#29457
- rust-lang/rust#29448
- rust-lang/rust#29447
And this in the RFC repo: rust-lang/rfcs#1435
These functions allow reading from and writing to a file in one atomic
action from multiple threads, avoiding the race between the seek and the
read.
The functions are named `{read,write}_at` on non-Windows (which don't
change the file cursor), and `seek_{read,write}` on Windows (which
change the file cursor).
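A minimal Unix-only sketch of the positional calls, using `write_all_at` and `read_exact_at`, which are convenience wrappers around `write_at` and `read_at` in `std::os::unix::fs::FileExt`. `positional_roundtrip` is a hypothetical helper and the file contents are made up for illustration.

```rust
use std::fs::OpenOptions;
use std::io::{self, Seek, SeekFrom};
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Writes a short message at offset 0, reads five bytes back from offset 6,
/// and reports where the file cursor ended up.
fn positional_roundtrip(path: &Path) -> io::Result<(Vec<u8>, u64)> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // Positional write: the offset is passed explicitly, no seek involved.
    file.write_all_at(b"hello world", 0)?;

    // Positional read: again no seek, so no race with other threads.
    let mut buf = vec![0u8; 5];
    file.read_exact_at(&mut buf, 6)?;

    // On Unix the cursor is untouched by the two calls above.
    let cursor = file.seek(SeekFrom::Current(0))?;
    Ok((buf, cursor))
}
```

On Windows the equivalent extension methods are `seek_read` and `seek_write`, and the final cursor check would not hold, since those calls move the cursor.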
Fixed small typo in `BufRead` comments
The `BufRead` comments in the `Seek` trait implementation mentioned allocating 8 *ebibytes*. This was a typo: the correct unit is *exbibytes*, since *ebibytes* don't exist. The calculation itself is correct.
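For reference, the unit arithmetic can be checked as below, assuming the figure in the comment refers to 2^63 bytes (one past `i64::MAX`). `EXBIBYTE` and `exbibytes` are hypothetical names.

```rust
/// One exbibyte (EiB) is 2^60 bytes.
const EXBIBYTE: u64 = 1 << 60;

/// Converts a byte count to whole exbibytes, rounding down.
fn exbibytes(bytes: u64) -> u64 {
    bytes / EXBIBYTE
}
```

With these definitions, `exbibytes(1 << 63)` is 8.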
Restore `DISCONNECTED` state in `oneshot::Packet::send`
Closes #32114
I'm not sure if this is the best approach, but the current action of swapping `DISCONNECTED` with `DATA` seems wrong. Additionally, it is strange that the `send` method (and others in the `oneshot` module) takes `&mut self` despite performing atomic operations, as this requires extra discipline to avoid data races and lets us use methods like `AtomicUsize::get_mut` instead of methods that require a memory ordering.
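A simplified sketch of the idea, not the actual `oneshot` code: the state constants, the `Packet` fields and `drop_port` are pared-down, hypothetical stand-ins. If `send` finds the state already `DISCONNECTED`, it puts that state back instead of leaving `DATA` behind, and hands the value back to the caller.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical state values for this sketch.
const EMPTY: usize = 0;
const DATA: usize = 1;
const DISCONNECTED: usize = 2;

struct Packet<T> {
    state: AtomicUsize,
    data: Option<T>,
}

impl<T> Packet<T> {
    fn new() -> Self {
        Packet { state: AtomicUsize::new(EMPTY), data: None }
    }

    /// Simulates the receiving side going away.
    fn drop_port(&self) {
        self.state.store(DISCONNECTED, Ordering::SeqCst);
    }

    /// Stores the value and publishes DATA, unless the receiver is gone.
    fn send(&mut self, value: T) -> Result<(), T> {
        self.data = Some(value);
        match self.state.swap(DATA, Ordering::SeqCst) {
            EMPTY => Ok(()),
            DISCONNECTED => {
                // Restore the state that the swap above overwrote.
                self.state.store(DISCONNECTED, Ordering::SeqCst);
                Err(self.data.take().expect("value was stored above"))
            }
            _ => panic!("send called twice in this sketch"),
        }
    }

    /// With `&mut self` the state can be read without a memory ordering.
    fn state_mut(&mut self) -> usize {
        *self.state.get_mut()
    }
}
```

`state_mut` shows the `AtomicUsize::get_mut` point from the description: exclusive access makes a plain read possible, which is only sound if nothing else can touch the packet at the same time.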