rust/library
Sebastian Pop c064b6560b [Arm64] use isb instruction instead of yield in spin loops
On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop.  The yield instruction is a
nop.  The isb instruction puts the processor to sleep for some short time.  isb
is a good equivalent to the pause instruction on x86.

Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors.  The micro-benchmarks use
https://github.com/google/benchmark.git

$ cat a.cc
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);
BENCHMARK_MAIN();

$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run

--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------

AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns        0.485 ns   1000000000
BM_yield                 0.400 ns        0.400 ns   1000000000
BM_isb                    13.2 ns         13.2 ns     52993304

AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns        0.874 ns    801558633
BM_yield                 0.877 ns        0.875 ns    800002377
BM_isb                    13.0 ns         12.7 ns     55169412

Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns        0.315 ns   1000000000
BM_yield                 0.313 ns        0.313 ns   1000000000
BM_isb                    9.06 ns         9.06 ns     77259282

static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);

Intel Skylake processor:
BM_scalar_increment      0.295 ns        0.295 ns   1000000000
BM_pause                  41.7 ns         41.7 ns     16780553

Tested on Graviton2 aarch64-linux with `./x.py test`.
2021-04-29 23:05:40 +00:00
..
alloc Auto merge of #83530 - Mark-Simulacrum:bootstrap-bump, r=Mark-Simulacrum 2021-04-04 22:45:56 +00:00
backtrace@710fc18ddc Update backtrace to 0.3.56 2021-03-25 22:25:34 -06:00
core [Arm64] use isb instruction instead of yield in spin loops 2021-04-29 23:05:40 +00:00
panic_abort Add license metadata for std dependencies 2021-02-21 13:36:18 -05:00
panic_unwind Revert "Revert stabilizing integer::BITS." 2021-03-24 22:34:36 +01:00
proc_macro impl PartialEq<Punct> for char; symmetry for #78636 2021-02-19 17:28:19 -08:00
profiler_builtins Update the minimum external LLVM to 10 2021-03-22 11:33:43 -07:00
rtstartup Bump bootstrap compiler to 1.50 beta 2020-12-30 09:27:19 -05:00
rustc-std-workspace-alloc mv std libs to library/ 2020-07-27 19:51:13 -05:00
rustc-std-workspace-core Fix rustc-std-workspace-core documentation 2020-12-20 15:23:21 +08:00
rustc-std-workspace-std mv std libs to library/ 2020-07-27 19:51:13 -05:00
std Rollup merge of #83831 - AngelicosPhosphoros:issue-77583-inline-for-ip, r=m-ou-se 2021-04-05 13:03:42 +02:00
stdarch@9c732a56f6 Update stdarch submodule 2020-12-21 00:00:00 +00:00
term Allow/fix non_fmt_panic in tests. 2021-02-03 23:15:45 +01:00
test lazily calls some fns 2021-03-27 10:20:32 +03:00
unwind Rollup merge of #82374 - clehner:licenses, r=joshtriplett 2021-03-22 15:21:23 +01:00