Commit graph

1040 commits

Author SHA1 Message Date
bors
d137783642 Auto merge of #102900 - abrachet:master, r=bjorn3
Don't internalize __llvm_profile_counter_bias

Currently, LLVM profiling runtime counter relocation cannot be used by rust during LTO because symbols are being internalized before all symbol information is known.

This mode makes LLVM emit a __llvm_profile_counter_bias symbol which is referenced by the profiling initialization, which itself is pulled in by the rust driver here [1].

It is enabled with -Cllvm-args=-runtime-counter-relocation for platforms which are opt-in to this mode like Linux. On these platforms there will be no link error, rather just surprising behavior for a user which request runtime counter relocation. The profiling runtime will not see that symbol go on as if it were never there. On Fuchsia, the profiling runtime must have this symbol which will cause a hard link error.

As an aside, I don't have enough context as to why rust's LTO model is how it is. AFAICT, the internalize pass is only safe to run at link time when all symbol information is actually known, this being an example as to why. I think special casing this symbol as a known one that LLVM can emit which should not have it's visbility de-escalated should be fine given how seldom this pattern of defining an undefined symbol to get initilization code pulled in is. From a quick grep, __llvm_profile_runtime is the only symbol that rustc does this for.

[1] 0265a3e93b/compiler/rustc_codegen_ssa/src/back/linker.rs (L598)
2022-12-11 14:42:45 +00:00
bors
32da230588 Auto merge of #105531 - matthiaskrgr:rollup-7y7zbgl, r=matthiaskrgr
Rollup of 6 pull requests

Successful merges:

 - #104460 (Migrate parts of `rustc_expand` to session diagnostics)
 - #105192 (Point at LHS on binop type err if relevant)
 - #105234 (Remove unneeded field from `SwitchTargets`)
 - #105239 (Avoid heap allocation when truncating thread names)
 - #105410 (Consider `parent_count` for const param defaults)
 - #105482 (Fix invalid codegen during debuginfo lowering)

Failed merges:

 - #105411 (Introduce `with_forced_trimmed_paths`)

r? `@ghost`
`@rustbot` modify labels: rollup
2022-12-10 16:37:36 +00:00
Matthias Krüger
ab505298ea
Rollup merge of #105482 - wesleywiser:fix_debuginfo_ub, r=tmiasko
Fix invalid codegen during debuginfo lowering

In order for LLVM to correctly generate debuginfo for msvc, we sometimes need to spill arguments to the stack and perform some direct & indirect offsets into the value. Previously, this code always performed those actions, even when not required as LLVM would clean it up during optimization.

However, when MIR inlining is enabled, this can cause problems as the operations occur prior to the spilled value being initialized. To solve this, we first calculate the necessary offsets using just the type which is side-effect free and does not alter the LLVM IR. Then, if we are in a situation which requires us to generate the LLVM IR (and this situation only occurs for arguments, not local variables) then we perform the same calculation again, this time generating the appropriate LLVM IR as we go.

r? `@tmiasko` but feel free to reassign if you want 🙂

Fixes #105386
2022-12-10 15:01:45 +01:00
bors
a161a7b654 Auto merge of #105384 - uweigand:s390x-test-codegen, r=Mark-Simulacrum
Fix failing codegen tests on s390x

Several codegen tests are currently failing due to making assumptions that are not valid for the s390x architecture:

- catch-unwind.rs: fails due to inlining differences. Already ignored on another platform for the same reason. Solution: Ignore on s390x.

- remap_path_prefix/main.rs: fails due to different alignment requirement for string constants. Solution: Do not test for the alignment requirement.

- repr-transparent-aggregates-1.rs: many ABI assumptions. Already ignored on many platforms for the same reason. Solution: Ignore on s390x.

- repr-transparent.rs: no vector ABI by default on s390x. Already ignored on another platform for a similar reason. Solution: Ignore on s390x.

- uninit-consts.rs: hard-coded little-endian constant. Solution: Match both little- and big-endian versions.

Fixes part of https://github.com/rust-lang/rust/issues/105383.
2022-12-10 13:56:58 +00:00
Matthias Krüger
947fe7e341
Rollup merge of #105109 - rcvalle:rust-kcfi, r=bjorn3
Add LLVM KCFI support to the Rust compiler

This PR adds LLVM Kernel Control Flow Integrity (KCFI) support to the Rust compiler. It initially provides forward-edge control flow protection for operating systems kernels for Rust-compiled code only by aggregating function pointers in groups identified by their return and parameter types. (See llvm/llvm-project@cff5bef.)

Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by identifying C char and integer type uses at the time types are encoded (see Type metadata in the design document in the tracking issue #89653).

LLVM KCFI can be enabled with -Zsanitizer=kcfi.

Thank you again, `@bjorn3,` `@eddyb,` `@nagisa,` and `@ojeda,` for all the help!
2022-12-10 09:24:43 +01:00
Wesley Wiser
7253057887 Don't generate pointer loads to spills unless necessary
In order for LLVM to correctly generate debuginfo for msvc, we sometimes
need to spill arguments to the stack and perform some direct & indirect
offsets into the value. Previously, this code always performed those
actions, even when not required as LLVM would clean it up during
optimization.

However, when MIR inlining is enabled, this can cause problems as the
operations occur prior to the spilled value being initialized. To solve
this, we first calculate the necessary offsets using just the type which
is side-effect free and does not alter the LLVM IR. Then, if we are in a
situation which requires us to generate the LLVM IR (and this situation
only occurs for arguments, not local variables) then we perform the same
calculation again, this time generating the appropriate LLVM IR as we
go.
2022-12-08 20:38:23 -05:00
Wesley Wiser
34246f89e2 Add test case for ub in generated LLVM IR 2022-12-08 20:38:18 -05:00
Ramon de C Valle
65698ae9f3 Add LLVM KCFI support to the Rust compiler
This commit adds LLVM Kernel Control Flow Integrity (KCFI) support to
the Rust compiler. It initially provides forward-edge control flow
protection for operating systems kernels for Rust-compiled code only by
aggregating function pointers in groups identified by their return and
parameter types. (See llvm/llvm-project@cff5bef.)

Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by identifying C char and integer type uses at the
time types are encoded (see Type metadata in the design document in the
tracking issue #89653).

LLVM KCFI can be enabled with -Zsanitizer=kcfi.

Co-authored-by: bjorn3 <17426603+bjorn3@users.noreply.github.com>
2022-12-08 17:24:39 -08:00
Alex Brachet
5d88d36053 Don't internalize __llvm_profile_counter_bias
Currently, LLVM profiling runtime counter relocation cannot be
used by rust during LTO because symbols are being internalized
before all symbol information is known.

This mode makes LLVM emit a __llvm_profile_counter_bias symbol
which is referenced by the profiling initialization, which itself
is pulled in by the rust driver here [1].

It is enabled with -Cllvm-args=-runtime-counter-relocation for
platforms which are opt-in to this mode like Linux. On these
platforms there will be no link error, rather just surprising
behavior for a user which request runtime counter relocation.
The profiling runtime will not see that symbol go on as if it
were never there. On Fuchsia, the profiling runtime must have
this symbol which will cause a hard link error.

As an aside, I don't have enough context as to why rust's LTO
model is how it is. AFAICT, the internalize pass is only safe
to run at link time when all symbol information is actually
known, this being an example as to why. I think special casing
this symbol as a known one that LLVM can emit which should not
have it's visbility de-escalated should be fine given how
seldom this pattern of defining an undefined symbol to get
initilization code pulled in is. From a quick grep,
__llvm_profile_runtime is the only symbol that rustc does this
for.

[1] 0265a3e93b/compiler/rustc_codegen_ssa/src/back/linker.rs (L598)
2022-12-07 16:32:59 +00:00
Ulrich Weigand
3d33af38b3 Fix failing codegen tests on s390x
Several codegen tests are currently failing due to making
assumptions that are not valid for the s390x architecture:

- catch-unwind.rs: fails due to inlining differences.
  Already ignored on another platform for the same reason.
  Solution: Ignore on s390x.

- remap_path_prefix/main.rs: fails due to different alignment
  requirement for string constants.
  Solution: Do not test for the alignment requirement.

- repr-transparent-aggregates-1.rs: many ABI assumptions.
  Already ignored on many platforms for the same reason.
  Solution: Ignore on s390x.

- repr-transparent.rs: no vector ABI by default on s390x.
  Already ignored on another platform for a similar reason.
  Solution: Ignore on s390x.

- uninit-consts.rs: hard-coded little-endian constant.
  Solution: Match both little- and big-endian versions.

Fixes part of https://github.com/rust-lang/rust/issues/105383.
2022-12-06 18:12:46 +01:00
bors
53e4b9dd74 Auto merge of #104535 - mikebenfield:discr-fix, r=pnkfelix
rustc_codegen_ssa: Fix for codegen_get_discr

When doing the optimized implementation of getting the discriminant, the arithmetic needs to be done in the tag type so wrapping behavior works correctly.

Fixes #104519
2022-12-04 20:05:32 +00:00
Tomasz Miąsko
c955add18c Disable coverage instrumentation for naked functions 2022-12-03 01:03:28 +01:00
bors
367ecffe52 Auto merge of #105003 - flba-eb:only_windows, r=Mark-Simulacrum
Run Windows-only tests only on Windows

This removes the need to maintain an ignore-list of all other OSs.

See https://github.com/rust-lang/rust/pull/102305 for a similar change.
2022-12-01 14:27:34 +00:00
Florian Bartels
c04d67444f Ignore gnu systems (windows-msvc only) 2022-12-01 14:40:10 +01:00
David Rheinsberg
7b9e1a95f0 test/codegen: test inter-crate linkage with static relocation
Add a codegen-test that verifies inter-crate linkage with the static
relocation model. We expect all symbols that are part of a rust
compilation to end up in the same DSO, thus we expect `dso_local`
annotations.
2022-11-29 11:21:16 +01:00
Florian Bartels
3e72a01566 Run Windows-only tests only on Windows
This removes the need to maintain a list of all other
OSs which ignore the tests.
2022-11-28 08:11:19 +01:00
bors
faf1891deb Auto merge of #104818 - scottmcm:refactor-extend-func, r=the8472
Stop peeling the last iteration of the loop in `Vec::resize_with`

`resize_with` uses the `ExtendWith` code that peels the last iteration:
341d8b8a2c/library/alloc/src/vec/mod.rs (L2525-L2529)

But that's kinda weird for `ExtendFunc` because it does the same thing on the last iteration anyway:
341d8b8a2c/library/alloc/src/vec/mod.rs (L2494-L2502)

So this just has it use the normal `extend`-from-`TrustedLen` code instead.

r? `@ghost`
2022-11-27 00:58:50 +00:00
bors
8841bee954 Auto merge of #103556 - clubby789:specialize-option-partial-eq, r=scottmcm
Manually implement PartialEq for Option<T> and specialize non-nullable types

This PR manually implements `PartialEq` and `StructuralPartialEq` for `Option`, which seems to produce slightly better codegen than the automatically derived implementation.

It also allows specializing on the `core::num::NonZero*` and `core::ptr::NonNull` types, taking advantage of the niche optimization by transmuting the `Option<T>` to `T` to be compared directly, which can be done in just two instructions.

A comparison of the original, new and specialized code generation is available [here](https://godbolt.org/z/dE4jxdYsa).
2022-11-26 08:56:20 +00:00
Scott McMurray
9d68a1a74c Tune RepeatWith::try_fold and Take::for_each and Vec::extend_trusted 2022-11-24 19:14:19 -08:00
Arpad Borsos
9f36f988ad
Avoid GenFuture shim when compiling async constructs
Previously, async constructs would be lowered to "normal" generators,
with an additional `from_generator` / `GenFuture` shim in between to
convert from `Generator` to `Future`.

The compiler will now special-case these generators internally so that
async constructs will *directly* implement `Future` without the need
to go through the `from_generator` / `GenFuture` shim.

The primary motivation for this change was hiding this implementation
detail in stack traces and debuginfo, but it can in theory also help
the optimizer as there is less abstractions to see through.
2022-11-24 10:04:27 +01:00
The 8472
a9128d8927 fix tests, update size asserts 2022-11-22 23:12:26 +01:00
Yuki Okushi
785237d392
Rollup merge of #104435 - scottmcm:iter-repeat-n, r=thomcc
`VecDeque::resize` should re-use the buffer in the passed-in element

Today it always copies it for *every* appended element, but one of those clones is avoidable.

This adds `iter::repeat_n` (https://github.com/rust-lang/rust/issues/104434) as the primitive needed to do this.  If this PR is acceptable, I'll also use this in `Vec` rather than its custom `ExtendElement` type & infrastructure that is harder to share between multiple different containers:

101e1822c3/library/alloc/src/vec/mod.rs (L2479-L2492)
2022-11-20 13:15:59 +09:00
Manish Goregaokar
e2301154e3
Rollup merge of #103456 - scottmcm:fix-unchecked-shifts, r=scottmcm
`unchecked_{shl|shr}` should use `u32` as the RHS

The other shift methods, such as https://doc.rust-lang.org/nightly/std/primitive.u64.html#method.checked_shr and https://doc.rust-lang.org/nightly/std/primitive.i16.html#method.wrapping_shl, use `u32` for the shift amount.  That's consistent with other things, like `count_ones`, which also always use `u32` for a bit count, regardless of the size of the type.

This PR changes `unchecked_shl` and `unchecked_shr` to also use `u32` for the shift amount (rather than Self).

cc #85122, the `unchecked_math` tracking issue
2022-11-18 17:48:17 -05:00
Michael Benfield
31c0645b9d rustc_codegen_ssa: Fix for codegen_get_discr
When doing the optimized implementation of getting the discriminant, the
arithmetic needs to be done in the tag type so wrapping behavior works
correctly.

Fixes #104519
2022-11-18 21:16:12 +00:00
Scott McMurray
9d4b1f98e6 Ignore the unchecked-shifts codegen test in debug-assertions builds 2022-11-16 15:58:43 -08:00
Scott McMurray
d62b903892 VecDeque::resize should re-use the buffer in the passed-in element
Today it always copies it for *every* appended element, but one of those clones is avoidable.
2022-11-15 00:53:26 -08:00
Michael Benfield
51918dcc51 rustc_codegen_ssa: Better code generation for niche discriminants.
In some cases we can avoid arithmetic before checking whether a niche
represents an untagged variant.

This is relevant to #101872
2022-11-11 05:54:30 +00:00
Manish Goregaokar
8f2c1f8469
Rollup merge of #104077 - nicholasbishop:bishop-uefi-aapcs, r=nagisa
Use aapcs for efiapi calling convention on arm

On arm, [llvm treats the C calling convention as `aapcs` on soft-float targets and `aapcs-vfp` on hard-float targets](https://github.com/rust-lang/compiler-builtins/issues/116#issuecomment-261057422). UEFI specifies in the arm calling convention that [floating point extensions aren't used](https://uefi.org/specs/UEFI/2.10/02_Overview.html#detailed-calling-convention), so always translate `efiapi` to `aapcs` on arm.

https://github.com/rust-lang/rust/issues/65815
2022-11-10 10:47:39 -05:00
Manish Goregaokar
7521a974d3
Rollup merge of #103353 - wesleywiser:fix_lld_thinlto_msvc, r=michaelwoerister
Fix Access Violation when using lld & ThinLTO on windows-msvc

Users report an AV at runtime of the compiled binary when using lld and ThinLTO on windows-msvc. The AV occurs when accessing a static value which is defined in one crate but used in another. Based on the disassembly of the cross-crate use, it appears that the use is not correctly linked with the definition and is instead assigned a garbage pointer value.

If we look at the symbol tables for each crates' obj file, we can see what is happening:

*lib.obj*:

```
COFF SYMBOL TABLE
...
00E 00000000 SECT2  notype       External     | _ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```

*bin.obj*:

```
COFF SYMBOL TABLE
...
010 00000000 UNDEF  notype       External     | __imp__ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```

The use of the symbol has the "import" style symbol name but the declaration doesn't generate any symbol with the same name. As a result, linking the files generates a warning from lld:

> rust-lld: warning: bin.obj: locally defined symbol imported: reproducer::memrchr::FN::h612b61ca0e168901 (defined in lib.obj) [LNK4217]

and the symbol reference remains undefined at runtime leading to the AV.

To fix this, we just need to detect that we are performing ThinLTO (and thus, static linking) and omit the `dllimport` attribute on the extern item in LLVM IR.

Fixes #81408
2022-11-08 21:03:52 -05:00
Nicholas Bishop
42cbb40157 Use aapcs for efiapi calling convention on arm
On arm, llvm treats the C calling convention as `aapcs` on soft-float
targets and `aapcs-vfp` on hard-float targets [1]. UEFI specifies in the
arm calling convention that floating point extensions aren't used [2],
so always translate `efiapi` to `aapcs` on arm.

[1]: https://github.com/rust-lang/compiler-builtins/issues/116#issuecomment-261057422
[2]: https://uefi.org/specs/UEFI/2.10/02_Overview.html#detailed-calling-convention

https://github.com/rust-lang/rust/issues/65815
2022-11-06 18:05:24 -05:00
Michael Goulet
ff8f84ccf6 Bless more tests 2022-11-05 18:05:45 +00:00
Tim Neumann
c15cfc91c4 LLVM 16: Switch to using MemoryEffects 2022-11-04 17:58:16 +00:00
Wesley Wiser
296489c892 Fix Access Violation when using lld & ThinLTO on windows-msvc
Users report an AV at runtime of the compiled binary when using lld and
ThinLTO on windows-msvc. The AV occurs when accessing a static value
which is defined in one crate but used in another. Based on the
disassembly of the cross-crate use, it appears that the use is not
correctly linked with the definition and is instead assigned a garbage
pointer value.

If we look at the symbol tables for each crates' obj file, we can see
what is happening:

*lib.obj*:

```
COFF SYMBOL TABLE
...
00E 00000000 SECT2  notype       External     | _ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```

*bin.obj*:

```
COFF SYMBOL TABLE
...
010 00000000 UNDEF  notype       External     | __imp__ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```

The use of the symbol has the "import" style symbol name but the
declaration doesn't generate any symbol with the same name. As a result,
linking the files generates a warning from lld:

> rust-lld: warning: bin.obj: locally defined symbol imported: reproducer::memrchr::FN::h612b61ca0e168901 (defined in lib.obj) [LNK4217]

and the symbol reference remains undefined at runtime leading to the AV.

To fix this, we just need to detect that we are performing ThinLTO (and
thus, static linking) and omit the `dllimport` attribute on the extern
item in LLVM IR.
2022-11-03 11:17:42 -04:00
Wesley Wiser
cf6efe8137 Add test case 2022-11-03 11:17:42 -04:00
clubby789
8e8fd02b27 Specialize PartialEq for Option<num::NonZero*> and Option<ptr::NonNull> 2022-10-31 16:43:31 +00:00
ouz-a
a1672ad5b8 Remove bounds check with enum cast 2022-10-31 14:10:37 +03:00
Nicholas Nethercote
003a3f8cd3 Use br instead of switch in more cases.
`codegen_switchint_terminator` already uses `br` instead of `switch`
when there is one normal target plus the `otherwise` target. But there's
another common case with two normal targets and an `otherwise` target
that points to an empty unreachable BB. This comes up a lot when
switching on the tags of enums that use niches.

The pattern looks like this:
```
bb1:                                              ; preds = %bb6
  %3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
  %4 = sub i8 %3, 2
  %5 = icmp eq i8 %4, 0
  %_6 = select i1 %5, i64 0, i64 1
  switch i64 %_6, label %bb3 [
    i64 0, label %bb4
    i64 1, label %bb2
  ]

bb3:                                              ; preds = %bb1
  unreachable
```
This commit adds code to convert the `switch` to a `br`:
```
bb1:                                              ; preds = %bb6
  %3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
  %4 = sub i8 %3, 2
  %5 = icmp eq i8 %4, 0
  %_6 = select i1 %5, i64 0, i64 1
  %6 = icmp eq i64 %_6, 0
  br i1 %6, label %bb4, label %bb2

bb3:                                              ; No predecessors!
  unreachable
```
This has a surprisingly large effect on compile times, with reductions
of 5% on debug builds of some crates. The reduction is all due to LLVM
taking less time. Maybe LLVM is just much better at handling `br` than
`switch`.

The resulting code is still suboptimal.
- The `icmp`, `select`, `icmp` sequence is silly, converting an `i1` to an `i64`
  and back to an `i1`. But with the current code structure it's hard to avoid,
  and LLVM will easily clean it up, in opt builds at least.
- `bb3` is usually now truly dead code (though not always, so it can't
  be removed universally).
2022-10-31 10:16:39 +11:00
bors
f42b6fa7ca Auto merge of #103299 - nikic:usub-overflow, r=wesleywiser
Don't use usub.with.overflow intrinsic

The canonical form of a usub.with.overflow check in LLVM are separate sub + icmp instructions, rather than a usub.with.overflow intrinsic. Using usub.with.overflow will generally result in worse optimization potential.

The backend will attempt to form usub.with.overflow when it comes to actual instruction selection. This is not fully reliable, but I believe this is a better tradeoff than using the intrinsic in IR.

Fixes #103285.
2022-10-30 17:45:04 +00:00
bors
77e7b74ad5 Auto merge of #103071 - wesleywiser:fix_inlined_line_numbers, r=davidtwco
Fix line numbers for MIR inlined code

`should_collapse_debuginfo` detects if the specified span is part of a
macro expansion however it does this by checking if the span is anything
other than a normal (non-expanded) kind, then the span sequence is
walked backwards to the root span.

This doesn't work when the MIR inliner inlines code as it creates spans
with expansion information set to `ExprKind::Inlined` and results in the
line number being attributed to the inline callsite rather than the
normal line number of the inlined code.

Fixes #103068
2022-10-28 16:27:56 +00:00
Scott McMurray
3b16c04676 unchecked_{shl|shr} should use u32 as the RHS 2022-10-23 17:32:36 -07:00
Patrick Walton
da630ac79d Introduce deduced parameter attributes, and use them for deducing readonly on
indirect immutable freeze by-value function parameters.

Right now, `rustc` only examines function signatures and the platform ABI when
determining the LLVM attributes to apply to parameters. This results in missed
optimizations, because there are some attributes that can be determined via
analysis of the MIR making up the function body. In particular, `readonly`
could be applied to most indirectly-passed by-value function arguments
(specifically, those that are freeze and are observed not to be mutated), but
it currently is not.

This patch introduces the machinery that allows `rustc` to determine those
attributes. It consists of a query, `deduced_param_attrs`, that, when
evaluated, analyzes the MIR of the function to determine supplementary
attributes. The results of this query for each function are written into the
crate metadata so that the deduced parameter attributes can be applied to
cross-crate functions. In this patch, we simply check the parameter for
mutations to determine whether the `readonly` attribute should be applied to
parameters that are indirect immutable freeze by-value.  More attributes could
conceivably be deduced in the future: `nocapture` and `noalias` come to mind.

Adding `readonly` to indirect function parameters where applicable enables some
potential optimizations in LLVM that are discussed in [issue 103103] and [PR
103070] around avoiding stack-to-stack memory copies that appear in functions
like `core::fmt::Write::write_fmt` and `core::panicking::assert_failed`. These
functions pass a large structure unchanged by value to a subfunction that also
doesn't mutate it. Since the structure in this case is passed as an indirect
parameter, it's a pointer from LLVM's perspective. As a result, the
intermediate copy of the structure that our codegen emits could be optimized
away by LLVM's MemCpyOptimizer if it knew that the pointer is `readonly
nocapture noalias` in both the caller and callee. We already pass `nocapture
noalias`, but we're missing `readonly`, as we can't determine whether a
by-value parameter is mutated by examining the signature in Rust. I didn't have
much success with having LLVM infer the `readonly` attribute, even with fat
LTO; it seems that deducing it at the MIR level is necessary.

No large benefits should be expected from this optimization *now*; LLVM needs
some changes (discussed in [PR 103070]) to more aggressively use the `noalias
nocapture readonly` combination in its alias analysis. I have some LLVM patches
for these optimizations and have had them looked over. With all the patches
applied locally, I enabled LLVM to remove all the `memcpy`s from the following
code:

```rust
fn main() {
    println!("Hello {}", 3);
}
```

which is a significant codegen improvement over the status quo. I expect that
if this optimization kicks in in multiple places even for such a simple
program, then it will apply to Rust code all over the place.

[issue 103103]: https://github.com/rust-lang/rust/issues/103103

[PR 103070]: https://github.com/rust-lang/rust/pull/103070
2022-10-21 02:33:15 -07:00
Nikita Popov
783301298f Don't use usub.with.overflow intrinsic
The canonical form of a usub.with.overflow check in LLVM are
separate sub + icmp instructions, rather than a usub.with.overflow
intrinsic. Using usub.with.overflow will generally result in worse
optimization potential.

The backend will attempt to form usub.with.overflow when it comes
to actual instruction selection. This is not fully reliable, but
I believe this is a better tradeoff than using the intrinsic in
IR.

Fixes #103285.
2022-10-20 12:47:17 +02:00
Wesley Wiser
34d90a46da Fix line numbers for MIR inlined code
`should_collapse_debuginfo` detects if the specified span is part of a
macro expansion however it does this by checking if the span is anything
other than a normal (non-expanded) kind, then the span sequence is
walked backwards to the root span.

This doesn't work when the MIR inliner inlines code as it creates spans
with expansion information set to `ExprKind::Inlined` and results in the
line number being attributed to the inline callsite rather than the
normal line number of the inlined code.
2022-10-14 18:44:30 -04:00
Wesley Wiser
9363a1401e Add test case for MIR inlining debuginfo line numbers 2022-10-14 14:09:30 -04:00
Rageking8
7122abaddf more dupe word typos 2022-10-14 12:57:56 +08:00
bors
365578445c Auto merge of #102724 - pcc:scs-fix-test, r=Mark-Simulacrum
Fix the sanitizer_scs_attr_check.rs test

The test is failing when targeting aarch64 Android. The intent appears to have been to look for a function attributes comment (or the absence of one) on the line preceding the function declaration. But this isn't quite possible with FileCheck and the test as written was looking for a line with `no_scs` after a line with `scs`, which doesn't appear in the output. Instead, match on the function attributes comment on the line following the demangled function name comment.
2022-10-11 04:27:13 +00:00
bors
a6b7274a46 Auto merge of #102596 - scottmcm:option-bool-calloc, r=Mark-Simulacrum
Do the `calloc` optimization for `Option<bool>`

Inspired by <https://old.reddit.com/r/rust/comments/xtiqj8/why_is_this_functional_version_faster_than_my_for/iqqy37b/>.
2022-10-10 18:42:40 +00:00
Ralf Jung
6f6433428f add a few more assert_unsafe_precondition 2022-10-07 14:35:12 +02:00
Peter Collingbourne
5f3a4240c5 Fix the sanitizer_scs_attr_check.rs test
The test is failing when targeting aarch64 Android. The intent appears
to have been to look for a function attributes comment (or the absence
of one) on the line preceding the function declaration. But this isn't
quite possible with FileCheck and the test as written was looking for a
line with `no_scs` after a line with `scs`, which doesn't appear in the
output. Instead, match on the function attributes comment on the line
following the demangled function name comment.
2022-10-05 21:55:24 -07:00
bors
607b8296e0 Auto merge of #102503 - cuviper:x86-stack-probes, r=nagisa
Enable inline stack probes on X86 with LLVM 16

The known problems with x86 inline-asm stack probes have been solved on LLVM main (16), so this flips the switch. Anyone using bleeding-edge LLVM with rustc can start testing this, as I have done locally. We'll get more direct rust-ci when LLVM 16 branches and we start our upgrade, and we can always patch or disable it then if we find new problems.

The previous attempt was #77885, reverted in #84708.
2022-10-03 02:09:05 +00:00