rust/tests/codegen
bors 3b022d8cee Auto merge of #133852 - x17jiri:cold_path, r=saethlin
improve cold_path()

#120370 added a new intrinsic, `cold_path()`, and used it to fix `likely` and `unlikely`.

However, to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but it limits the usefulness of `cold_path` in idiomatic Rust. For example, code like this:

```
if let Some(x) = y { ... }
```

may generate a 3-target switch:

```
switch y.discriminant:
0 => true branch
1 => false branch
_ => unreachable
```

and therefore marking a branch as cold will have no effect.

This PR improves `cold_path()` to work with arbitrary switch instructions.
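
As a minimal sketch of the pattern this targets: the real `cold_path()` intrinsic is unstable, so the example below uses a hypothetical stable stand-in, `cold_path_stand_in`, an empty `#[cold]` function. The names `cold_path_stand_in` and `double_or_zero` are assumptions made for illustration and are not part of this PR.

```rust
/// Hypothetical stand-in for the unstable `cold_path()` intrinsic.
/// Calling a `#[cold]` function hints that the calling path is unlikely.
#[cold]
fn cold_path_stand_in() {}

/// Doubles the payload of `y`; the `None` branch is marked as cold.
fn double_or_zero(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        // Expected (hot) branch.
        x.wrapping_mul(2)
    } else {
        // Rarely taken branch: mark it cold before doing the fallback work.
        cold_path_stand_in();
        0
    }
}
```

Calling `double_or_zero(Some(21))` returns 42 and `double_or_zero(None)` returns 0; the hint only affects how the branches are laid out, not the result.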

Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than the one in this PR, which I believe is mainly because a `switch` in C/C++ can have multiple cases going to the same target.
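
To illustrate what "manually emit branch weights" could look like, here is a rough sketch that assigns one weight per switch target. The function name `switch_weights` and the constants `COLD_WEIGHT` and `HOT_WEIGHT` are hypothetical and chosen only for illustration; they are not taken from the compiler's code.

```rust
/// Illustrative weights only (assumed values, not the compiler's).
const COLD_WEIGHT: u32 = 1;
const HOT_WEIGHT: u32 = 2000;

/// Returns one weight per switch target, or `None` when weights would carry
/// no information (no target is cold, or every target is cold).
fn switch_weights(target_is_cold: &[bool]) -> Option<Vec<u32>> {
    let cold_count = target_is_cold.iter().filter(|&&cold| cold).count();
    if cold_count == 0 || cold_count == target_is_cold.len() {
        return None;
    }
    Some(
        target_is_cold
            .iter()
            .map(|&cold| if cold { COLD_WEIGHT } else { HOT_WEIGHT })
            .collect(),
    )
}
```

For a 3-target switch where only the first target is hot, `switch_weights(&[false, true, true])` yields `Some(vec![2000, 1, 1])`.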
2025-02-18 07:49:09 +00:00
asm
auxiliary
avr
bounds-checking
cffi
compiletest-self-test
cross-crate-inlining
debug-accessibility
debuginfo-proc-macro
dllimports
enum
float
hint
instrument-coverage
instrument-xray
intrinsics Auto merge of #133852 - x17jiri:cold_path, r=saethlin 2025-02-18 07:49:09 +00:00
issues
lib-optimizations
loongarch-abi
macos
meta-filecheck
naked-fn
non-terminate
patchable-function-entry
remap_path_prefix
repr
riscv-abi
sanitizer
simd
simd-intrinsic
src-hash-algorithm
unwind-abis
aarch64-softfloat.rs
aarch64-struct-align-128.rs
abi-efiapi.rs
abi-main-signature-16bit-c-int.rs
abi-main-signature-32bit-c-int.rs
abi-repr-ext.rs
abi-sysv64.rs
abi-win64-zst.rs use add-core-stubs / minicore for a few more tests 2025-02-16 18:37:50 +01:00
abi-x86-interrupt.rs
abi-x86_64_sysv.rs
addr-of-mutate.rs
adjustments.rs
align-byval-alignment-mismatch.rs use add-core-stubs / minicore for a few more tests 2025-02-16 18:37:50 +01:00
align-byval-vector.rs use add-core-stubs / minicore for a few more tests 2025-02-16 18:37:50 +01:00
align-byval.rs use add-core-stubs / minicore for a few more tests 2025-02-16 18:37:50 +01:00
align-enum.rs
align-fn.rs
align-offset.rs
align-struct.rs
alloc-optimisation.rs
amdgpu-addrspacecast.rs
array-clone.rs
array-cmp.rs
array-codegen.rs
array-equality.rs
array-from_fn.rs
array-map.rs
array-optimized.rs
array-repeat.rs
ascii-char.rs
async-closure-debug.rs
async-fn-debug-awaitee-field.rs
async-fn-debug-msvc.rs
async-fn-debug.rs
atomic-operations.rs
atomicptr.rs
autodiff.rs
autovectorize-f32x4.rs
bigint-helpers.rs
binary-heap-peek-mut-pop-no-panic.rs
binary-search-index-no-bound-check.rs
bool-cmp.rs
box-uninit-bytes.rs
bpf-alu32.rs
branch-protection-old-llvm.rs
branch-protection.rs
call-llvm-intrinsics.rs
call-metadata.rs
cast-optimized.rs
cast-target-abi.rs
catch-unwind.rs
cdylib-external-inline-fns.rs
cf-protection.rs
cfguard-checks.rs
cfguard-disabled.rs
cfguard-nochecks.rs
cfguard-non-msvc.rs
char-ascii-branchless.rs
checked_ilog.rs
checked_math.rs
clone-shims.rs
clone_as_copy.rs
codemodels.rs
coercions.rs
cold-call-declare-and-call.rs
common_prim_int_ptr.rs llvm: Tolerate captures in tests 2025-02-14 18:55:50 +00:00
comparison-operators-2-tuple.rs
comparison-operators-newtype.rs
const-array.rs
const-vector.rs
const_scalar_pair.rs
constant-branch.rs
consts.rs
coroutine-debug-msvc.rs
coroutine-debug.rs
dealloc-no-unwind.rs
debug-alignment.rs
debug-column-msvc.rs
debug-column.rs
debug-compile-unit-path.rs
debug-fndef-size.rs
debug-limited.rs
debug-line-directives-only.rs
debug-line-tables-only.rs
debug-linkage-name.rs
debug-vtable.rs
debuginfo-constant-locals.rs
debuginfo-generic-closure-env-names.rs
debuginfo-inline-callsite-location.rs
deduced-param-attrs.rs
default-requires-uwtable.rs
default-visibility.rs
direct-access-external-data.rs
dont-shuffle-bswaps.rs
dont_codegen_private_const_fn_only_used_in_const_eval.rs
drop-in-place-noalias.rs
drop.rs
dst-offset.rs
dst-vtable-align-nonzero.rs
dst-vtable-size-range.rs
ehcontguard_disabled.rs
ehcontguard_enabled.rs
emscripten-catch-unwind-js-eh.rs
emscripten-catch-unwind-wasm-eh.rs
enable-lto-unit-splitting.rs
error-provide.rs
export-no-mangle.rs
external-no-mangle-fns.rs
external-no-mangle-statics.rs
f128-wasm32-callconv.rs
fastcall-inreg.rs
fatptr.rs
fewer-names.rs
fixed-x18.rs
float_math.rs
fn-impl-trait-self.rs
force-frame-pointers.rs
force-no-unwind-tables.rs
force-unwind-tables.rs
frame-pointer.rs
function-arguments-noopt.rs
function-arguments.rs
function-return.rs
gdb_debug_script_load.rs
generic-debug.rs
gep-index.rs
gpu-kernel-abi.rs
i128-wasm32-callconv.rs
i128-x86-align.rs
i128-x86-callconv.rs
infallible-unwrap-in-opt-z.rs
inherit_overflow.rs
inline-always-works-always.rs
inline-debuginfo.rs
inline-function-args-debug-info.rs
inline-hint.rs
instrument-mcount.rs
integer-cmp.rs
integer-overflow.rs
internalize-closures.rs
intrinsic-no-unnamed-attr.rs
is_val_statically_known.rs
issue-97217.rs
iter-repeat-n-trivial-drop.rs
layout-size-checks.rs
lifetime_start_end.rs
link-dead-code.rs
link_section.rs
llvm-ident.rs
llvm_module_flags.rs
loads.rs
local-generics-in-exe-internalized.rs
lto-removes-invokes.rs
mainsubprogram.rs
match-optimized.rs
match-optimizes-away.rs
match-unoptimized.rs
maybeuninit-rvo.rs
mem-replace-big-type.rs
mem-replace-simple-type.rs
merge-functions.rs
method-declaration.rs
min-function-alignment.rs
mir-aggregate-no-alloca.rs
mir-inlined-line-numbers.rs
mir_zst_stores.rs
move-before-nocapture-ref-arg.rs
move-operands.rs
naked-asan.rs
no-alloca-inside-if-false.rs
no-assumes-on-casts.rs
no-dllimport-w-cross-lang-lto.rs
no-jump-tables.rs
no-plt.rs
no-redundant-item-monomorphization.rs
no_builtins-at-crate.rs
noalias-box-off.rs
noalias-box.rs
noalias-flag.rs
noalias-freeze.rs
noalias-refcell.rs
noalias-rwlockreadguard.rs
noalias-unpin.rs
noreturn-uninhabited.rs
noreturnflag.rs
nounwind.rs
nrvo.rs
optimize-attr-1.rs
option-as-slice.rs
option-niche-eq.rs
overaligned-constant.rs
packed.rs
panic-abort-windows.rs
panic-in-drop-abort.rs
panic-unwind-default-uwtable.rs
pattern_type_symbols.rs
personality_lifetimes.rs
pgo-counter-bias.rs
pgo-instrumentation.rs
pic-relocation-model.rs
pie-relocation-model.rs
placement-new.rs
powerpc64le-struct-align-128.rs
precondition-checks.rs
ptr-arithmetic.rs
ptr-read-metadata.rs
range-attribute.rs
range_to_inclusive.rs
README.md
refs.rs
reg-struct-return.rs
regparm-inreg.rs
repeat-trusted-len.rs
riscv-target-abi.rs
rust-abi-arch-specific-adjustment.rs
s390x-simd.rs
scalar-pair-bool.rs
set-discriminant-invalid.rs
skip-mono-inside-if-false.rs
slice-as_chunks.rs
slice-indexing.rs
slice-init.rs
slice-is-ascii.rs
slice-iter-fold.rs
slice-iter-len-eq-zero.rs
slice-iter-nonnull.rs
slice-pointer-nonnull-unwrap.rs
slice-position-bounds-check.rs
slice-ref-equality.rs
slice-reverse.rs
slice-windows-no-bounds-check.rs
slice_as_from_ptr_range.rs
some-abis-do-extend-params-to-32-bits.rs
some-global-nonnull.rs
sparc-struct-abi.rs
split-lto-unit.rs
sroa-fragment-debuginfo.rs
sse42-implies-crc32.rs
stack-probes-inline.rs
stack-protector.rs
static-relocation-model-msvc.rs
staticlib-external-inline-fns.rs
step_by-overflow-checks.rs
stores.rs
swap-large-types.rs
swap-small-types.rs
target-cpu-on-functions.rs
target-feature-inline-closure.rs
target-feature-overrides.rs
terminating-catchpad.rs
thread-local.rs
tied-features-strength.rs
to_vec.rs
trailing_zeros.rs
transmute-optimized.rs
transmute-scalar.rs
try_question_mark_nop.rs
tune-cpu-on-functions.rs
tuple-layout-opt.rs
ub-checks.rs
unchecked-float-casts.rs
unchecked_shifts.rs
uninit-consts.rs
union-abi.rs
unwind-and-panic-abort.rs
unwind-extern-exports.rs
unwind-extern-imports.rs
unwind-landingpad-cold.rs
unwind-landingpad-inline.rs
used_with_arg.rs
var-names.rs
vec-as-ptr.rs
vec-calloc.rs
vec-in-place.rs
vec-iter-collect-len.rs
vec-iter.rs
vec-len-invariant.rs
vec-optimizes-away.rs
vec-reserve-extend.rs
vec-shrink-panik.rs
vec-with-capacity.rs
vec_pop_push_noop.rs
vecdeque-drain.rs
vecdeque-nonempty-get-no-panic.rs
vecdeque_no_panic.rs
vecdeque_pop_push.rs
virtual-function-elimination-32bit.rs
virtual-function-elimination.rs
vtable-loads.rs
vtable-upcast.rs
wasm_casts_trapping.rs
wasm_exceptions.rs
zip.rs
zst-offset.rs

The files here use the LLVM FileCheck framework, documented at https://llvm.org/docs/CommandGuide/FileCheck.html.

One extension worth noting is the use of revisions as custom prefixes for FileCheck. If your codegen test has different behavior based on the chosen target or different compiler flags that you want to exercise, you can use a revisions annotation, like so:

// revisions: aaa bbb
// [bbb] compile-flags: --flags-for-bbb

After specifying those variations, you can write different expected, or explicitly unexpected, output by using `<prefix>-SAME:` and `<prefix>-NOT:`, like so:

// CHECK: expected code
// aaa-SAME: emitted-only-for-aaa
// aaa-NOT:                        emitted-only-for-bbb
// bbb-NOT:  emitted-only-for-aaa
// bbb-SAME:                       emitted-only-for-bbb