Optimized vec::IntoIter::next_chunk impl

```
x86_64v1, default
test vec::bench_next_chunk ... bench: 696 ns/iter (+/- 22)

x86_64v1, pr
test vec::bench_next_chunk ... bench: 309 ns/iter (+/- 4)

znver2, default
test vec::bench_next_chunk ... bench: 17,272 ns/iter (+/- 117)

znver2, pr
test vec::bench_next_chunk ... bench: 211 ns/iter (+/- 3)
```

On znver2 the default impl seems to be slow due to different inlining decisions. It goes through `core::array::iter_next_chunk` which has a deep call tree.
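To illustrate the idea behind a specialized `next_chunk`, here is a minimal sketch in stable Rust. It is not the code from this commit: `take_chunk` is a hypothetical helper that works on a borrowed slice of `Copy` elements, whereas the real `vec::IntoIter` owns its buffer and has to handle non-`Copy` types. The sketch only shows the general shape, where one length check and one bulk copy of `N` elements replace a per-element loop that the optimizer may or may not inline.

```rust
use std::convert::TryInto;

/// Hypothetical helper, not the standard library's implementation.
/// Takes the first `N` elements off the front of `rest` as one array.
/// Returns `None` and leaves `rest` untouched when fewer than `N` remain.
fn take_chunk<T: Copy, const N: usize>(rest: &mut &[T]) -> Option<[T; N]> {
    // Copy the inner shared reference out so the split borrows from the
    // underlying data, not from `rest` itself.
    let slice: &[T] = *rest;
    if slice.len() < N {
        return None;
    }
    let (head, tail) = slice.split_at(N);
    // `head` has exactly `N` elements, so this conversion cannot fail;
    // it copies the whole chunk in one step.
    let chunk: [T; N] = head.try_into().ok()?;
    // Advance the cursor past the elements that were just taken.
    *rest = tail;
    Some(chunk)
}
```

Calling `take_chunk::<u32, 2>` repeatedly on a five-element slice yields two full chunks and then `None`, with the last element still available in `rest`.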
| File |
|---|
| cow.rs |
| drain.rs |
| drain_filter.rs |
| in_place_collect.rs |
| in_place_drop.rs |
| into_iter.rs |
| is_zero.rs |
| mod.rs |
| partial_eq.rs |
| set_len_on_drop.rs |
| spec_extend.rs |
| spec_from_elem.rs |
| spec_from_iter.rs |
| spec_from_iter_nested.rs |
| splice.rs |