Use the 64b inner:monotonize() implementation not the 128b one for aarch64 aarch64 prior to v8.4 (FEAT_LSE2) doesn't have an instruction that guarantees untorn 128b reads except for completing a 128b load/store exclusive pair (ldxp/stxp) or compare-and-swap (casp) successfully. The requirement to complete a 128b read+write atomic is actually more expensive and more unfair than the previous implementation of monotonize() which used a Mutex on aarch64, especially at large core counts. For aarch64 switch to the 64b atomic implementation which is about 13x faster for a benchmark that involves many calls to Instant::now(). |
||
|---|---|---|
| .. | ||
| alloc | ||
| backtrace@cc89bb66f9 | ||
| core | ||
| panic_abort | ||
| panic_unwind | ||
| proc_macro | ||
| profiler_builtins | ||
| rtstartup | ||
| rustc-std-workspace-alloc | ||
| rustc-std-workspace-core | ||
| rustc-std-workspace-std | ||
| std | ||
| stdarch@5fdbc476af | ||
| test | ||
| unwind | ||