rust/tests/mir-opt/building/match
bors f8463896a9 Auto merge of #150681 - meithecatte:always-discriminate, r=JonathanBrouwer,Nadrieril
Make operational semantics of pattern matching independent of crate and module

The question of "when does matching an enum against a pattern of one of its variants read its discriminant" is currently an underspecified part of the language, which leads to surprising behavior around borrowck, drop order, and UB.

Of course, in the common cases, the discriminant must be read to distinguish the variant of the enum, but currently the following exceptions are implemented:

1. If the enum has only one variant, we currently skip the discriminant read.
     - This has the advantage that single-variant enums behave the same way as structs in this regard.
     - However, it means that if the discriminant exists in the layout, we can't say that an invalid discriminant is UB. This makes me particularly uneasy in its interactions with niches. Consider the following example ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=5904a6155cbdd39af4a2e7b1d32a9b1a)), where Miri currently doesn't detect any UB (because the semantics don't specify any):

        <details><summary>Example 1</summary>

        ```rust
        #![allow(dead_code)]
        use core::mem::{size_of, transmute};
        
        #[repr(u8)]
        enum Inner {
            X(u8),
        }
        
        enum Outer {
            A(Inner),
            B(u8),
        }
        
        fn f(x: &Inner) {
            match x {
                Inner::X(v) => {
                    println!("{v}");
                }
            }
        }
        
        fn main() {
            assert_eq!(size_of::<Inner>(), 2);
            assert_eq!(size_of::<Outer>(), 2);
            let x = Outer::B(42);
            let y = &x;
            f(unsafe { transmute(y) });
        }
        ```

        </details>

2. For the purpose of the above, enums marked with `#[non_exhaustive]` are always considered to have multiple variants when observed from foreign crates, but the actual number of variants is considered in the current crate.
    - This means that whether code has UB can depend on which crate it is in: https://github.com/rust-lang/rust/issues/147722
    - In another case of `#[non_exhaustive]` affecting the runtime semantics, its presence or absence can change what gets captured by a closure, and by extension, the drop order: https://github.com/rust-lang/rust/issues/147722#issuecomment-3674554872
    - Also at the above link, there is an example where removing `#[non_exhaustive]` can cause borrowck to suddenly start failing in another crate.
3. Moreover, we currently make a more specific check: we only read the discriminant if there is more than one *inhabited* variant in the enum.
    - This means that the semantics can differ between `foo<!>` and a copy of `foo` in which `T` was manually replaced with `!`: rust-lang/rust#146803
    - Moreover, due to the privacy rules for inhabitedness, it means that the semantics of code can depend on the *module* in which it is located.
    - Additionally, this inhabitedness rule is made even uglier by the fact that closure capture analysis has to run before we can determine whether types are uninhabited, so the question of whether the discriminant read happens gets a different answer specifically for capture analysis.
    - For the two above points, see the following example ([playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2024&gist=a07d8a3ec0b31953942e96e2130476d9)):

        <details><summary>Example 2</summary>

        ```rust
        #![allow(unused)]
        
        mod foo {
            enum Never {}
            struct PrivatelyUninhabited(Never);
            pub enum A {
                V(String, String),
                Y(PrivatelyUninhabited),
            }
            
            fn works(mut x: A) {
                let a = match x {
                    A::V(ref mut a, _) => a,
                    _ => unreachable!(),
                };
                
                let b = match x {
                    A::V(_, ref mut b) => b,
                    _ => unreachable!(),
                };
            
                a.len(); b.len();
            }
            
            fn fails(mut x: A) {
                let mut f = || match x {
                    A::V(ref mut a, _) => (),
                    _ => unreachable!(),
                };
                
                let mut g = || match x {
                    A::V(_, ref mut b) => (),
                    _ => unreachable!(),
                };
            
                f(); g();
            }
        }
        
        use foo::A;
        
        fn fails(mut x: A) {
            let a = match x {
                A::V(ref mut a, _) => a,
                _ => unreachable!(),
            };
            
            let b = match x {
                A::V(_, ref mut b) => b,
                _ => unreachable!(),
            };
        
            a.len(); b.len();
        }
        
        
        fn fails2(mut x: A) {
            let mut f = || match x {
                A::V(ref mut a, _) => (),
                _ => unreachable!(),
            };
            
            let mut g = || match x {
                A::V(_, ref mut b) => (),
                _ => unreachable!(),
            };
        
            f(); g();
        }
        ```

        </details>

In light of the above, and following the discussion at rust-lang/rust#138961 and rust-lang/rust#147722, this PR ~~makes it so that, operationally, matching on an enum *always* reads its discriminant.~~ introduces the following changes to this behavior:

 - matching on a `#[non_exhaustive]` enum will always introduce a discriminant read, regardless of whether the enum is from an external crate
 - uninhabited variants now count just like normal ones, and don't get skipped in the checks
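
To make the two rules above concrete, here is a minimal sketch. The types `Event`, `Never`, and `Outcome`, and both functions, are hypothetical names chosen for illustration; they are not taken from this PR or its tests. In safe code like this the observable output should not change. What changes, per the description above, is that each `match` is now specified to read the discriminant, even though `Event` has a single variant in its defining crate and `Outcome::Impossible` is uninhabited.

```rust
// Hypothetical single-variant enum. With `#[non_exhaustive]`, matching on it
// is specified to read the discriminant, including inside the defining crate.
#[allow(dead_code)]
#[non_exhaustive]
pub enum Event {
    Message(String),
}

// An empty enum: it has no values, so any variant holding it is uninhabited.
#[allow(dead_code)]
pub enum Never {}

// Hypothetical enum with one inhabited and one uninhabited variant. Under the
// rule above, the uninhabited variant counts like a normal one, so this enum
// is treated as having two variants and matching reads the discriminant.
#[allow(dead_code)]
pub enum Outcome {
    Done(u32),
    Impossible(Never),
}

fn message_len(e: &Event) -> usize {
    // No wildcard arm is needed here because we are in the defining crate.
    match e {
        Event::Message(s) => s.len(),
    }
}

fn done_value(o: &Outcome) -> u32 {
    match o {
        Outcome::Done(v) => *v,
        // `n` has type `&Never`; an empty match on `*n` proves this arm
        // can never actually run.
        Outcome::Impossible(n) => match *n {},
    }
}

fn main() {
    let e = Event::Message(String::from("hello"));
    println!("{}", message_len(&e));

    let o = Outcome::Done(7);
    println!("{}", done_value(&o));
}
```

The differences described in this PR only become visible in the cases it lists: borrow checking of overlapping borrows, closure captures (and therefore drop order), and UB detection under Miri when the discriminant is invalid.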

As per the discussion below, the resolution for point (1) above is that it should land as part of a separate PR, so that the subtler decision can be more carefully considered.

Note that this is a breaking change, due to the aforementioned changes in borrow checking behavior, new UB (or at least UB newly detected by miri), as well as drop order around closure captures. However, it seems to me that the combination of this PR with rust-lang/rust#138961 should have smaller real-world impact than rust-lang/rust#138961 by itself.

Fixes rust-lang/rust#142394 
Fixes rust-lang/rust#146590
Fixes rust-lang/rust#146803 (though already marked as duplicate)
Fixes parts of rust-lang/rust#147722
Fixes rust-lang/miri#4778

r? @Nadrieril @RalfJung 

@rustbot label +A-closures +A-patterns +T-opsem +T-lang
2026-02-14 12:53:09 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| deref-patterns | Remove feature(string_deref_patterns) | 2025-12-31 14:21:38 +11:00 |
| array_len.const_array_len.built.after.panic-abort.mir | Add test. | 2025-09-16 22:44:35 +00:00 |
| array_len.const_array_len.built.after.panic-unwind.mir | Add test. | 2025-09-16 22:44:35 +00:00 |
| array_len.rs | Add test. | 2025-09-16 22:44:35 +00:00 |
| array_len.slice_len.built.after.panic-abort.mir | Add test. | 2025-09-16 22:44:35 +00:00 |
| array_len.slice_len.built.after.panic-unwind.mir | Add test. | 2025-09-16 22:44:35 +00:00 |
| exponential_or.match_tuple.SimplifyCfg-initial.after.mir | Do not optimize out SwitchInt before borrowck, or if Zmir-preserve-ub | 2025-04-08 21:05:20 +00:00 |
| exponential_or.rs | Regroup mir-opt tests of match building | 2024-03-30 17:37:15 +01:00 |
| match_false_edges.full_tested_match.built.after.mir | Bless *all* the mir-opt tests | 2024-08-18 16:07:33 -07:00 |
| match_false_edges.full_tested_match2.built.after.mir | Bless *all* the mir-opt tests | 2024-08-18 16:07:33 -07:00 |
| match_false_edges.main.built.after.mir | Bless *all* the mir-opt tests | 2024-08-18 16:07:33 -07:00 |
| match_false_edges.rs | Regroup mir-opt tests of match building | 2024-03-30 17:37:15 +01:00 |
| never_patterns.opt1.SimplifyCfg-initial.after.mir | discriminant reads: make semantics independent of module/crate | 2026-01-15 19:12:13 +01:00 |
| never_patterns.opt2.SimplifyCfg-initial.after.mir | discriminant reads: make semantics independent of module/crate | 2026-01-15 19:12:13 +01:00 |
| never_patterns.opt3.SimplifyCfg-initial.after.mir | discriminant reads: make semantics independent of module/crate | 2026-01-15 19:12:13 +01:00 |
| never_patterns.rs | Lower never patterns to Unreachable in mir | 2024-05-04 16:30:01 +02:00 |
| simple_match.match_bool.built.after.mir | Bless *all* the mir-opt tests | 2024-08-18 16:07:33 -07:00 |
| simple_match.match_enum.built.after.mir | Return the otherwise_block instead of passing it as argument | 2024-07-09 22:47:35 +02:00 |
| simple_match.rs | Add test | 2024-06-27 11:26:34 +02:00 |
| sort_candidates.constant_eq.SimplifyCfg-initial.after.mir | THIR patterns: Always use type str for string-constant-value nodes | 2026-01-16 12:17:48 +11:00 |
| sort_candidates.disjoint_ranges.SimplifyCfg-initial.after.mir | Bless *all* the mir-opt tests | 2024-08-18 16:07:33 -07:00 |
| sort_candidates.rs | Update mir-opt filechecks | 2024-08-18 15:52:23 -07:00 |