Commit graph

867 commits

Author SHA1 Message Date
Ralf Jung
73f9571d4f add codegen smoke test 2022-04-15 15:04:00 -04:00
Dylan DPC
3f606ceaec
Rollup merge of #95864 - luqmana:inline-asm-unwind-store-miscompile, r=Amanieu
Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.

We ran into this bug where rustc would segfault while trying to compile certain uses of inline assembly.

Here is a simple repro that demonstrates the issue:
```rust
#![feature(asm_unwind)]

fn main() {
    let _x = String::from("string here just cause we need something with a non-trivial drop");
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
            options(may_unwind)
        );
    }
    println!("{}", foo);
}
```
([playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=7d6641e83370d2536a07234aca2498ff))

But crucially `feature(asm_unwind)` is not actually needed and this can be triggered on stable as a result of the way async functions/generators are handled in the compiler. e.g.:

```rust
extern crate futures; // 0.3.21

async fn bar() {
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
        );
    }
    println!("{}", foo);
}

fn main() {
    futures::executor::block_on(bar());
}
```
([playground link](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1c7781c34dd4a3e80ae4bd936a0c82fc))

An example of the incorrect LLVM generated:
```llvm
bb1:                                              ; preds = %start
  %1 = invoke i64 asm sideeffect alignstack inteldialect unwind "mov ${0:q}, 1", "=&r,~{dirflag},~{fpsr},~{flags},~{memory}"()
          to label %bb2 unwind label %cleanup, !srcloc !9
  store i64 %1, i64* %foo, align 8

bb2:
[...snip...]
```

The store should not be placed after the asm invoke but rather should be in the normal control flow basic block (`bb2` in this case).

[Here](https://gist.github.com/luqmana/be1af5b64d2cda5a533e3e23a7830b44) is a writeup of the investigation that lead to finding this.
2022-04-11 20:00:42 +02:00
Jakob Degen
2a040284a5 Fix tests broken by deaggregation change 2022-04-11 09:26:26 -04:00
Luqman Aden
0b2f3604fd Update asm-may_unwind test to handle use of asm with outputs. 2022-04-09 15:16:38 -07:00
Dylan DPC
1c3657b20d
Rollup merge of #95011 - michaelwoerister:awaitee_field, r=tmandry
async: Give predictable name to binding generated from .await expressions.

This name makes it to debuginfo and allows debuggers to identify such bindings and their captured versions in suspended async fns.

This will be useful for async stack traces, as discussed in https://internals.rust-lang.org/t/async-debugging-logical-stack-traces-setting-goals-collecting-examples/15547.

I don't know if this needs some discussion by ````@rust-lang/compiler,```` e.g. about the name of the binding (`__awaitee`) or about the fact that this PR introduces a (soft) guarantee about a compiler generated name. Although, regarding the later, I think the same reasoning applies here as it does for debuginfo in general.

r? ````@tmandry````
2022-03-31 00:26:30 +02:00
Michael Woerister
78e27e2c7a async: Give predictable, reserved name to binding generated from .await expressions.
This name makes it to debuginfo and allows debuggers to identify such bindings and
their captured versions in suspended async fns.
2022-03-30 11:12:45 +02:00
Oli Scherer
c7efad044c Update allocation id 2022-03-23 16:50:42 +00:00
codehorseman
01dbfb3eb2 resolve the conflict in compiler/rustc_session/src/parse.rs
Signed-off-by: codehorseman <cricis@yeah.net>
2022-03-16 20:12:30 +08:00
Michael Woerister
3ad299aa67 debuginfo: change cpp-like naming for generator environments so that NatVis works for them 2022-03-14 16:52:47 +01:00
Michael Woerister
07ebc13d87 debuginfo: Refactor debuginfo generation for types
This commit
- changes names to use di_node instead of metadata
- uniformly names all functions that build new debuginfo nodes build_xyz_di_node
- renames CrateDebugContext to CodegenUnitDebugContext (which is more accurate)
- moves TypeMap and functions that work directly work with it to a new type_map module
- moves and reimplements enum related builder functions to a new enums module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses SmallVec instead of Vec in many places
- removes the old infrastructure for dealing with recursion cycles (create_and_register_recursive_type_forward_declaration(), RecursiveTypeDescription, set_members_of_composite_type(), MemberDescription, MemberDescriptionFactory, prepare_xyz_metadata(), etc)
- adds type_map::build_type_with_children() as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-Style enums so they don't get a enum$<...> name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums where classified as C-style even though they have fields
- changes the position of discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it
2022-03-14 16:49:06 +01:00
Scott McMurray
54408f0963 short-circuit the easy cases in is_copy_modulo_regions
This change is somewhat extensive, since it affects MIR -- since this is called to determine Copy vs Move -- so any test that's `no_core` needs to actually have the normal `impl`s it uses.
2022-03-10 01:19:02 -08:00
Scott McMurray
0d4a3f11e2 mir-opt: Replace clone on primitives with copy
We can't do it for everything, but it would be nice to at least stop making calls to clone methods in debug from things like derived-clones.
2022-03-10 01:19:02 -08:00
Tomasz Miąsko
095d818e0c Always include global target features in function attributes
This ensures that information about target features configured with
`-C target-feature=...` or detected with `-C target-cpu=native` is
retained for subsequent consumers of LLVM bitcode.

This is crucial for linker plugin LTO, since this information is not
conveyed to the plugin otherwise.
2022-03-04 16:57:34 +01:00
bors
b4bf56cd66 Auto merge of #94570 - shampoofactory:reopen-91719, r=workingjubilee
Reopen 91719

Reopened #91719, which was closed inadvertently due to technical difficulties.
2022-03-04 13:06:14 +00:00
bors
62ff2bcf94 Auto merge of #94159 - erikdesjardins:align-load, r=nikic
Add !align metadata on loads of &/&mut/Box

Note that this refers to the alignment of what the loaded value points
to, _not_ the alignment of the loaded value itself.

r? `@ghost` (blocked on #94158)
2022-03-04 08:14:31 +00:00
scottmcm
0c131861c9
Add a missing #[no_mangle] 2022-03-03 19:52:45 +00:00
Jubilee Young
934345be2a
Add autovectorization codegen test
Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>
2022-03-03 18:50:20 +00:00
Scott McMurray
8e985376d4
Redo the array-equality codegen tests for the different threshold 2022-03-03 18:30:34 +00:00
Vin Singh
9d45e0e0b4
Revert #26494 regression 2022-03-03 18:30:27 +00:00
Erik Desjardins
a381aefebb make test work on noopt builder 2022-03-02 22:46:23 -05:00
Dylan DPC
3e6abf0c35
Rollup merge of #94505 - cuviper:mono-item-sort-local, r=michaelwoerister,davidtwco
Restore the local filter on mono item sorting

In `CodegenUnit::items_in_deterministic_order`, there's a comment that
only local HirIds should be taken into account, but #90408 removed the
`as_local` call that sets others to None. Restoring that check fixes the
s390x hangs seen in [RHBZ 2058803].

[RHBZ 2058803]: https://bugzilla.redhat.com/show_bug.cgi?id=2058803
2022-03-03 01:09:14 +01:00
Erik Desjardins
f147303434 fix tests on platforms where Align16 is represented as i128 2022-03-02 17:30:58 -05:00
Josh Stone
6a838e41bb Use CHECK-DAG in codegen/debuginfo-generic-closure-env-names.rs 2022-03-01 15:53:22 -08:00
Augie Fackler
6fbef7f12c tests: avoid problems on 32 bit machines 2022-03-01 16:07:46 -05:00
Augie Fackler
26c5d2155e tests: accept llvm intrinsic in align-checking test
This changed in upstream change https://reviews.llvm.org/D98152 (aka
a266af7211)
wherein LLVM got smarter about using intrinsics. As best I can tell the
change I've made here preserves the intent of the test on LLVM 14 and
before while also passing on LLVM 15 and later.
2022-03-01 15:57:30 -05:00
bors
4a56cbec59 Auto merge of #94402 - erikdesjardins:revert-coldland, r=nagisa
Revert "Auto merge of #92419 - erikdesjardins:coldland, r=nagisa"

Should fix (untested) #94390

Reopens #46515, #87055

r? `@ehuss`
2022-03-01 08:57:46 +00:00
Erik Desjardins
69ae4233cf Add !align metadata on loads of &/&mut/Box
Note that this refers to the alignment of what the loaded value points
to, _not_ the alignment of the loaded value itself.
2022-02-28 20:04:36 -05:00
bors
edda7e959d Auto merge of #94216 - psumbera:sparc64-abi-fix2, r=nagisa
more complete sparc64 ABI fix for aggregates with floating point members

Previous fix didn't handle nested structures at all.
2022-02-28 11:54:17 +00:00
bors
427cf81206 Auto merge of #94158 - erikdesjardins:more-more-noundef, r=nikic
Apply noundef metadata to loads of types that do not permit raw init

This matches the noundef attributes we apply on arguments/return types.

Fixes (partially) #74378.
2022-02-28 06:11:20 +00:00
Erik Desjardins
0c78433749 update vec-shrink-panik test to allow panic_no_unwind in landingpads 2022-02-27 23:15:49 -05:00
Erik Desjardins
851fcc7a54 Revert "Auto merge of #92419 - erikdesjardins:coldland, r=nagisa"
This reverts commit 4f49627c6f, reversing
changes made to 028c6f1454.
2022-02-27 23:11:03 -05:00
bors
9fbff89354 Auto merge of #94157 - erikdesjardins:more-noundef, r=nikic
Apply noundef attribute to all scalar types which do not permit raw init

Beyond `&`/`&mut`/`Box`, this covers `char`, enum discriminants, `NonZero*`, etc.
All such types currently cause a Miri error if left uninitialized,
and an `invalid_value` lint in cases like `mem::uninitialized::<char>()`.

Note that this _does not_ change whether or not it is UB for `u64` (or
other integer types with no invalid values) to be undef.

Fixes (partially) #74378.

r? `@ghost` (blocked on #94127)

`@rustbot` label S-blocked
2022-02-27 21:41:06 +00:00
bors
6a70556616 Auto merge of #94412 - scottmcm:cfg-out-miri-from-swap, r=oli-obk
For MIRI, cfg out the swap vectorization logic from 94212

Because of #69488 the swap logic from #94212 doesn't currently work in MIRI.

Copying in smaller pieces is probably much worse for its performance anyway, so it'd probably rather just use the simple path regardless.

Part of #94371, though another PR will be needed for the CTFE aspect.

r? `@oli-obk`
cc `@RalfJung`
2022-02-27 17:42:48 +00:00
Erik Desjardins
fec4335407 Apply noundef metadata to loads of types that do not permit raw init
This matches the noundef attributes we apply on arguments/return types.
2022-02-27 12:16:16 -05:00
Scott McMurray
b582bd388f For MIRI, cfg out the swap logic from 94212 2022-02-26 18:57:15 -08:00
bors
761e888485 Auto merge of #93516 - nagisa:branch-protection, r=cjgillot
No branch protection metadata unless enabled

Even if we emit metadata disabling branch protection, this metadata may
conflict with other modules (e.g. during LTO) that have different branch
protection metadata set.

This is an unstable flag and feature, so ideally the flag not being
specified should act as if the feature wasn't implemented in the first
place.

Additionally this PR also ensures we emit an error if
`-Zbranch-protection` is set on targets other than the supported
aarch64. For now the error is being output from codegen, but ideally it
should be moved to earlier in the pipeline before stabilization.
2022-02-26 21:53:03 +00:00
Erik Desjardins
5979b681e6 Apply noundef attribute to all scalar types which do not permit raw init
Beyond `&`/`&mut`/`Box`, this covers `char`, discriminants, `NonZero*`, etc.
All such types currently cause a Miri error if left uninitialized,
and an `invalid_value` lint in cases like `mem::uninitialized::<char>()`

Note that this _does not_ change whether or not it is UB for `u64` (or
other integer types with no invalid values) to be undef.
2022-02-26 16:42:33 -05:00
bors
8128e910c0 Auto merge of #94127 - erikdesjardins:debugattr, r=nikic
At opt-level=0, apply only ABI-affecting attributes to functions

This should provide a small perf improvement for debug builds,
and should more than cancel out the perf regression from adding noundef (https://github.com/rust-lang/rust/pull/93670#issuecomment-1038347581, #94106).

r? `@nikic`
2022-02-26 09:41:19 +00:00
Erik Desjardins
945276c920 avoid test failure on targets where all functions are dso_local (e.g. wasm) 2022-02-25 19:24:59 -05:00
Erik Desjardins
b0921f8a0d make tests work on noopt builder 2022-02-25 14:17:45 -05:00
bors
9b2a46591a Auto merge of #93644 - michaelwoerister:simpler-debuginfo-typemap, r=wesleywiser
debuginfo: Simplify TypeMap used during LLVM debuginfo generation.

This PR simplifies the TypeMap that is used in `rustc_codegen_llvm::debuginfo::metadata`. It was unnecessarily complicated because it was originally implemented when types were not yet normalized before codegen. So it did it's own normalization and kept track of multiple unnormalized types being mapped to a single unique id.

This PR is based on https://github.com/rust-lang/rust/pull/93503, which is not merged yet.

The PR also removes the arena used for allocating string ids and instead uses `InlinableString` from the [inlinable_string](https://crates.io/crates/inlinable_string) crate. That might not be the best choice, since that crate does not seem to be very actively maintained. The [flexible-string](https://crates.io/crates/flexible-string) crate would be an alternative.

r? `@ghost`
2022-02-25 11:00:32 +00:00
bors
ece55d416e Auto merge of #94130 - erikdesjardins:partially, r=nikic
Use undef for (some) partially-uninit constants

There needs to be some limit to avoid perf regressions on large arrays
with undef in each element (see comment in the code).

Fixes: #84565
Original PR: #83698

Depends on LLVM 14: #93577
2022-02-25 05:44:33 +00:00
Dylan DPC
7fb55b4c3a
Rollup merge of #94212 - scottmcm:swapper, r=dtolnay
Stop manually SIMDing in `swap_nonoverlapping`

Like I previously did for `reverse` (#90821), this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have.

A variety of codegen tests are included to confirm that the various cases are still being vectorized.

It does still need logic to type-erase in some cases, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.

As a bonus, this change also means one no longer gets the spurious `memcpy`(s?) at the end up swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>

<details>

<summary>ASM for this example</summary>

## Before (from godbolt)

note the `push`/`pop`s and `memcpy`

```x86
swap_m256_slice:
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        sub     rsp, 32
        cmp     rsi, rcx
        jne     .LBB0_6
        mov     r14, rsi
        shl     r14, 5
        je      .LBB0_6
        mov     r15, rdx
        mov     rbx, rdi
        xor     eax, eax
.LBB0_3:
        mov     rcx, rax
        vmovaps ymm0, ymmword ptr [rbx + rax]
        vmovaps ymm1, ymmword ptr [r15 + rax]
        vmovaps ymmword ptr [rbx + rax], ymm1
        vmovaps ymmword ptr [r15 + rax], ymm0
        add     rax, 32
        add     rcx, 64
        cmp     rcx, r14
        jbe     .LBB0_3
        sub     r14, rax
        jbe     .LBB0_6
        add     rbx, rax
        add     r15, rax
        mov     r12, rsp
        mov     r13, qword ptr [rip + memcpy@GOTPCREL]
        mov     rdi, r12
        mov     rsi, rbx
        mov     rdx, r14
        vzeroupper
        call    r13
        mov     rdi, rbx
        mov     rsi, r15
        mov     rdx, r14
        call    r13
        mov     rdi, r15
        mov     rsi, r12
        mov     rdx, r14
        call    r13
.LBB0_6:
        add     rsp, 32
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        vzeroupper
        ret
```

## After (from my machine)

Note no `rsp` manipulation, sorry for different ASM syntax

```x86
swap_m256_slice:
	cmpq	%r9, %rdx
	jne	.LBB1_6
	testq	%rdx, %rdx
	je	.LBB1_6
	cmpq	$1, %rdx
	jne	.LBB1_7
	xorl	%r10d, %r10d
	jmp	.LBB1_4
.LBB1_7:
	movq	%rdx, %r9
	andq	$-2, %r9
	movl	$32, %eax
	xorl	%r10d, %r10d
	.p2align	4, 0x90
.LBB1_8:
	vmovaps	-32(%rcx,%rax), %ymm0
	vmovaps	-32(%r8,%rax), %ymm1
	vmovaps	%ymm1, -32(%rcx,%rax)
	vmovaps	%ymm0, -32(%r8,%rax)
	vmovaps	(%rcx,%rax), %ymm0
	vmovaps	(%r8,%rax), %ymm1
	vmovaps	%ymm1, (%rcx,%rax)
	vmovaps	%ymm0, (%r8,%rax)
	addq	$2, %r10
	addq	$64, %rax
	cmpq	%r10, %r9
	jne	.LBB1_8
.LBB1_4:
	testb	$1, %dl
	je	.LBB1_6
	shlq	$5, %r10
	vmovaps	(%rcx,%r10), %ymm0
	vmovaps	(%r8,%r10), %ymm1
	vmovaps	%ymm1, (%rcx,%r10)
	vmovaps	%ymm0, (%r8,%r10)
.LBB1_6:
	vzeroupper
	retq
```

</details>

This does all its copying operations as either the original type or as `MaybeUninit`s, so as far as I know there should be no potential abstract machine issues with reading padding bytes as integers.

<details>

<summary>Perf is essentially unchanged</summary>

Though perhaps with more target features this would help more, if it could pick bigger chunks

## Before

```
running 10 tests
test slice::swap_with_slice_4x_usize_30                            ... bench:         894 ns/iter (+/- 11)
test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,476 ns/iter (+/- 2,784)
test slice::swap_with_slice_5x_usize_30                            ... bench:       1,257 ns/iter (+/- 7)
test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,922 ns/iter (+/- 959)
test slice::swap_with_slice_rgb_30                                 ... bench:         328 ns/iter (+/- 27)
test slice::swap_with_slice_rgb_3000                               ... bench:      16,215 ns/iter (+/- 176)
test slice::swap_with_slice_u8_30                                  ... bench:         312 ns/iter (+/- 9)
test slice::swap_with_slice_u8_3000                                ... bench:       5,401 ns/iter (+/- 123)
test slice::swap_with_slice_usize_30                               ... bench:         368 ns/iter (+/- 3)
test slice::swap_with_slice_usize_3000                             ... bench:      28,472 ns/iter (+/- 3,913)
```

## After

```
running 10 tests
test slice::swap_with_slice_4x_usize_30                            ... bench:         868 ns/iter (+/- 36)
test slice::swap_with_slice_4x_usize_3000                          ... bench:      99,642 ns/iter (+/- 1,507)
test slice::swap_with_slice_5x_usize_30                            ... bench:       1,194 ns/iter (+/- 11)
test slice::swap_with_slice_5x_usize_3000                          ... bench:     139,761 ns/iter (+/- 5,018)
test slice::swap_with_slice_rgb_30                                 ... bench:         324 ns/iter (+/- 6)
test slice::swap_with_slice_rgb_3000                               ... bench:      15,962 ns/iter (+/- 287)
test slice::swap_with_slice_u8_30                                  ... bench:         281 ns/iter (+/- 5)
test slice::swap_with_slice_u8_3000                                ... bench:       5,324 ns/iter (+/- 40)
test slice::swap_with_slice_usize_30                               ... bench:         275 ns/iter (+/- 5)
test slice::swap_with_slice_usize_3000                             ... bench:      28,277 ns/iter (+/- 277)
```

</detail>
2022-02-24 21:42:14 +01:00
Petr Sumbera
992c27c601 Add test for nested structures. 2022-02-24 13:46:34 +01:00
Michael Woerister
e72e6399b1 debuginfo: Simplify TypeMap used during LLVM debuginfo generation.
The previous implementation was written before types were properly
normalized for code generation and had to assume a more complicated
relationship between types and their debuginfo -- generating separate
identifiers for debuginfo nodes that were based on normalized types.

Since types are now already normalized, we can use them as identifiers
for debuginfo nodes.
2022-02-21 13:03:36 +01:00
Scott McMurray
8ca47d7ae4 Stop manually SIMDing in swap_nonoverlapping
Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have.

It does still need logic to type-erase where appropriate, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`.

As a bonus, this also means one no longer gets the spurious `memcpy`(s?) at the end up swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>
2022-02-21 00:54:02 -08:00
Erik Desjardins
5bf8303bbc limit tests to llvm 14+ 2022-02-20 12:57:22 -05:00
bors
2690468727 Auto merge of #92911 - nbdd0121:unwind, r=Amanieu
Guard against unwinding in cleanup code

Currently the only safe guard we have against double unwind is the panic count (which is local to Rust). When double unwinds indeed happen (e.g. C++ exception + Rust panic, or two C++ exceptions), then the second unwind actually goes through and the first unwind is leaked. This can cause UB. cc rust-lang/project-ffi-unwind#6

E.g. given the following C++ code:
```c++
extern "C" void foo() {
    throw "A";
}

extern "C" void execute(void (*fn)()) {
    try {
        fn();
    } catch(...) {
    }
}
```

This program is well-defined to terminate:
```c++
struct dtor {
    ~dtor() noexcept(false) {
        foo();
    }
};

void a() {
    dtor a;
    dtor b;
}

int main() {
    execute(a);
    return 0;
}
```

But this Rust code doesn't catch the double unwind:
```rust
extern "C-unwind" {
    fn foo();
    fn execute(f: unsafe extern "C-unwind" fn());
}

struct Dtor;

impl Drop for Dtor {
    fn drop(&mut self) {
        unsafe { foo(); }
    }
}

extern "C-unwind" fn a() {
    let _a = Dtor;
    let _b = Dtor;
}

fn main() {
    unsafe { execute(a) };
}
```

To address this issue, this PR adds an unwind edge to an abort block, so that the Rust example aborts. This is similar to how clang guards against double unwind (except clang calls terminate per C++ spec and we abort).

The cost should be very small; it's an additional trap instruction (well, two for now, since we use TrapUnreachable, but that's a different issue) for each function with landing pads; if LLVM gains support to encode "abort/terminate" info directly in LSDA like GCC does, then it'll be free. It's an additional basic block though so compile time may be worse, so I'd like a perf run.

r? `@ghost`
`@rustbot` label: F-c_unwind
2022-02-19 23:25:06 +00:00
Gary Guo
7d683f525a Fix codegen test for MSVC 2022-02-19 17:29:56 +00:00
Erik Desjardins
c2e84fa5fc reduce default uninit_const_chunk_threshold to 16 (from 256) 2022-02-19 10:36:29 -05:00