* Remove duplicate alignment note that mentioned `AtomicBool` with other
types
* Update safety requirements about when non-atomic operations are
allowed
Optimize `AtomicBool` for target that don't support byte-sized atomics
`AtomicBool` is defined to have the same layout as `bool`, which means that we guarantee that it has a size of 1 byte. However on certain architectures such as RISC-V, LLVM will emulate byte atomics using a masked CAS loop on an aligned word.
We can take advantage of the fact that `bool` only ever has a value of 0 or 1 to replace `swap` operations with `and`/`or` operations that LLVM can lower to word-sized atomic `and`/`or` operations. This takes advantage of the fact that the incoming value to a `swap` or `compare_exchange` for `AtomicBool` is often a compile-time constant.
### Example
```rust
pub fn swap_true(atomic: &AtomicBool) -> bool {
atomic.swap(true, Ordering::Relaxed)
}
```
### Old
```asm
andi a1, a0, -4
slli a0, a0, 3
li a2, 255
sllw a2, a2, a0
li a3, 1
sllw a3, a3, a0
slli a3, a3, 32
srli a3, a3, 32
.LBB1_1:
lr.w a4, (a1)
mv a5, a3
xor a5, a5, a4
and a5, a5, a2
xor a5, a5, a4
sc.w a5, a5, (a1)
bnez a5, .LBB1_1
srlw a0, a4, a0
andi a0, a0, 255
snez a0, a0
ret
```
### New
```asm
andi a1, a0, -4
slli a0, a0, 3
li a2, 1
sllw a2, a2, a0
amoor.w a1, a2, (a1)
srlw a0, a1, a0
andi a0, a0, 255
snez a0, a0
ret
```
`AtomicBool` is defined to have the same layout as `bool`, which means
that we guarantee that it has a size of 1 byte. However on certain
architectures such as RISC-V, LLVM will emulate byte atomics using a
masked CAS loop on an aligned word.
We can take advantage of the fact that `bool` only ever has a value of 0
or 1 to replace `swap` operations with `and`/`or` operations that LLVM
can lower to word-sized atomic `and`/`or` operations. This takes
advantage of the fact that the incoming value to a `swap` or
`compare_exchange` for `AtomicBool` is often a compile-time constant.
Stabilize `atomic_as_ptr`
Fixes#66893
This stabilizes the `as_ptr` methods for atomics. The stabilization feature gate used here is `atomic_as_ptr` which supersedes `atomic_mut_ptr` to match the change in https://github.com/rust-lang/rust/pull/107736.
This needs FCP.
New stable API:
```rust
impl AtomicBool {
pub const fn as_ptr(&self) -> *mut bool;
}
impl AtomicI32 {
pub const fn as_ptr(&self) -> *mut i32;
}
// Includes all other atomic types
impl<T> AtomicPtr<T> {
pub const fn as_ptr(&self) -> *mut *mut T;
}
```
r? libs-api
``@rustbot`` label +needs-fcp
Before this change, the following functions and macros were annotated
with `#[cfg(target_has_atomic = "8")]` or
`#[cfg(target_has_atomic_load_store = "8")]`:
* `atomic_int`
* `strongest_failure_ordering`
* `atomic_swap`
* `atomic_add`
* `atomic_sub`
* `atomic_compare_exchange`
* `atomic_compare_exchange_weak`
* `atomic_and`
* `atomic_nand`
* `atomic_or`
* `atomic_xor`
* `atomic_max`
* `atomic_min`
* `atomic_umax`
* `atomic_umin`
However, none of those functions and macros actually depend on 8-bit
width and they are needed for all atomic widths (16-bit, 32-bit, 64-bit
etc.). Some targets might not support 8-bit atomics (i.e. BPF, if we
would enable atomic CAS for it).
This change fixes that by removing the `"8"` argument from annotations,
which results in accepting the whole variety of widths.
Fixes#106845Fixes#106795
Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
Warn about safety of `fetch_update`
Specifically as it relates to the ABA problem.
`fetch_update` is a useful function, and one that isn't provided by, say, C++. However, this does not mean the function is magic. It is implemented in terms of `compare_exchange_weak`, and in particular, suffers from the ABA problem. See the following code, which is a naive implementation of `pop` in a lock-free queue:
```rust
fn pop(&self) -> Option<i32> {
self.front.fetch_update(Ordering::Relaxed, Ordering::Acquire, |front| {
if front == ptr::null_mut() {
None
}
else {
Some(unsafe { (*front).next })
}
}.ok()
}
```
This code is unsound if called from multiple threads because of the ABA problem. Specifically, suppose nodes are allocated with `Box`. Suppose the following sequence happens:
```
Initial: Queue is X -> Y.
Thread A: Starts popping, is pre-empted.
Thread B: Pops successfully, twice, leaving the queue empty.
Thread C: Pushes, and `Box` returns X (very common for allocators)
Thread A: Wakes up, sees the head is still X, and stores Y as the new head.
```
But `Y` is deallocated. This is undefined behaviour.
Adding a note about this problem to `fetch_update` should hopefully prevent users from being misled, and also, a link to this common problem is, in my opinion, an improvement to our docs on atomics.