Commit graph

306 commits

Author SHA1 Message Date
Matthias Krüger
555df301f8
Rollup merge of #134232 - bjorn3:naked_asm_improvements, r=wesleywiser
Share the naked asm impl between cg_ssa and cg_clif

This was introduced in https://github.com/rust-lang/rust/pull/128004.
2025-04-30 17:27:57 +02:00
Trevor Gross
e3458dcf19 Update documentation for fn target_config
This was missed as part of [1].

[1]: https://github.com/rust-lang/rust/pull/140323
2025-04-29 06:13:02 +00:00
Trevor Gross
6ceeb0849e Implement the internal feature cfg_target_has_reliable_f16_f128
Support for `f16` and `f128` is varied across targets, backends, and
backend versions. Eventually we would like to reach a point where all
backends support these approximately equally, but until then we have to
work around some of these nuances of support being observable.

Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which
provides the following new configuration gates:

* `cfg(target_has_reliable_f16)`
* `cfg(target_has_reliable_f16_math)`
* `cfg(target_has_reliable_f128)`
* `cfg(target_has_reliable_f128_math)`

`reliable_f16` and `reliable_f128` indicate that basic arithmetic for
the type works correctly. The `_math` versions indicate that anything
relying on `libm` works correctly, since sometimes this hits a separate
class of codegen bugs.

These options match configuration set by the build script at [1]. The
logic for LLVM support is duplicated as-is from the same script. There
are a few possible updates that will come as a follow up.

The config introduced here is not planned to ever become stable, it is
only intended to replace the build scripts for `std` tests and
`compiler-builtins` that don't have any way to configure based on the
codegen backend.

MCP: https://github.com/rust-lang/compiler-team/issues/866
Closes: https://github.com/rust-lang/compiler-team/issues/866

[1]: 555e1d0386/library/std/build.rs (L84-L186)
2025-04-27 19:58:44 +00:00
bjorn3
421f22e8bf Pass &mut self to codegen_global_asm 2025-04-14 09:38:04 +00:00
bors
1df5affaca Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics

Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.

These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.

I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.

r? `@scottmcm`
2025-03-24 22:53:12 +00:00
bors
ebf0cf75d3 Auto merge of #137586 - nnethercote:SetImpliedBits, r=bjorn3
Speed up target feature computation

The LLVM backend calls `LLVMRustHasFeature` twice for every feature. In short-running rustc invocations, this accounts for a surprising amount of work.

r? `@bjorn3`
2025-03-11 12:05:16 +00:00
Matthias Krüger
63c548d82c
Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco
Clean up various LLVM FFI things in codegen_llvm

cc ```@ZuseZ4``` I touched some autodiff parts

The major change of this PR is [bfd88ce](https://github.com/rust-lang/rust/pull/137549/commits/bfd88cead0dd79717f123ad7e9a26ecad88653cb) which makes `CodegenCx` generic just like `GenericBuilder`

The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks.

best reviewed commit-by-commit
2025-03-07 19:15:34 +01:00
DaniPopes
58c10c66c1
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics
Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding
LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.
2025-03-06 22:29:05 +08:00
Nicholas Nethercote
936a8232df Change signature of target_features_cfg.
Currently it is called twice, once with `allow_unstable` set to true and
once with it set to false. This results in some duplicated work. Most
notably, for the LLVM backend, `LLVMRustHasFeature` is called twice for
every feature, and it's moderately slow. For very short running
compilations on platforms with many features (e.g. a `check` build of
hello-world on x86) this is a significant fraction of runtime.

This commit changes `target_features_cfg` so it is only called once, and
it now returns a pair of feature sets. This halves the number of
`LLVMRustHasFeature` calls.
2025-03-05 09:49:17 +11:00
Michael Goulet
a59a8f9e75 Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin"
This reverts commit a7a6c64a65, reversing
changes made to ebbe63891f.
2025-03-02 18:52:48 +00:00
bors
0c72c0d11a Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic
The embedded bitcode should always be prepared for LTO/ThinLTO

Fixes #115344. Fixes #117220.

There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.

When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.

This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.

r? nikic
2025-03-01 08:22:18 +00:00
Oli Scherer
29440b84a9 Remove an unused lifetime param 2025-02-24 15:11:29 +00:00
Oli Scherer
840e31b29f Generalize BaseTypeCodegenMethods 2025-02-24 15:11:29 +00:00
Oli Scherer
d4379d2afd Remove an unnecessary lifetime 2025-02-24 15:05:56 +00:00
David Wood
5afa6a111b
ssa/mono: deduplicate type_has_metadata
The implementation of the `type_has_metadata` function is duplicated in
`rustc_codegen_ssa` and `rustc_monomorphize`, so move this to
`rustc_middle`.
2025-02-24 08:08:23 +00:00
bors
e0be1a0262 Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm
Emit getelementptr inbounds nuw for pointer::add()

Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.

Fixes https://github.com/rust-lang/rust/issues/137217.
2025-02-24 03:06:16 +00:00
DianQK
da50297a6e
Save pre-link bitcode to ModuleCodegen 2025-02-23 21:23:38 +08:00
Scott McMurray
6f9cfd694d Rework OperandRef::extract_field to stop calling to_immediate_scalar on things which are already immediates
That means it stops trying to truncate things that are already `i1`s.
2025-02-19 12:03:40 -08:00
Scott McMurray
511bf307f0 Emit trunc nuw for unchecked shifts and to_immediate_scalar
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
2025-02-19 11:36:52 -08:00
Nikita Popov
31cc4c074d Emit getelementptr inbounds nuw for pointer::add() 2025-02-19 11:32:32 +01:00
bors
3b022d8cee Auto merge of #133852 - x17jiri:cold_path, r=saethlin
improve cold_path()

#120370 added a new instrinsic `cold_path()` and used it to fix `likely` and `unlikely`

However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this:

```
if let Some(x) = y { ... }
```

may generate 3-target switch:

```
switch y.discriminator:
0 => true branch
1 = > false branch
_ => unreachable
```

and therefore marking a branch as cold will have no effect.

This PR improves `cold_path()` to work with arbitrary switch instructions.

Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. The Clang's weight calculation is more complex that this PR, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
2025-02-18 07:49:09 +00:00
Jiri Bobek
7bb5f4dd78 improve cold_path() 2025-02-17 06:39:58 +01:00
bors
bdc97d1046 Auto merge of #136575 - scottmcm:nsuw-math, r=nikic
Set both `nuw` and `nsw` in slice size calculation

There's an old note in the code to do this, and now that [LLVM-C has an API for it](f0b8ff1251/llvm/include/llvm-c/Core.h (L4403-L4408)), we might as well.  And it's been there since what looks like LLVM 17 de9b6aa341 so doesn't even need to be conditional.

(There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
2025-02-14 14:21:29 +00:00
Scott McMurray
9ad6839f7a Set both nuw and nsw in slice size calculation
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
2025-02-13 21:26:48 -08:00
Scott McMurray
0cc14b688d transmute should also assume non-null pointers
Previously it only did integer-ABI things, but this way it does data pointers too.  That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
2025-02-12 23:01:27 -08:00
Jubilee Young
eddfe8f503 compiler: remove reexports from rustc_target::callconv 2025-02-07 11:25:18 -08:00
Jubilee Young
1f37b9a643 compiler: remove rustc_target::abi entirely 2025-02-07 11:23:12 -08:00
Scott McMurray
4ee1602eab Override disjoint_or in the LLVM backend 2025-01-31 22:29:08 -08:00
Michael Goulet
9dc41a048d Use ExistentialTraitRef throughout codegen 2025-01-30 15:34:00 +00:00
Matthias Krüger
1e454fe725
Rollup merge of #135581 - EnzymeAD:refactor-codgencx, r=oli-obk
Separate Builder methods from tcx

As part of the autodiff upstreaming we noticed, that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx to be passed around between threads.
We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them.
The previous CodegenCx now implements deref and forwards access to the llvm module or context to it's SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync).

This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make  implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code.

`call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all.

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-01-24 23:25:42 +01:00
Manuel Drehwald
386c233858 Make CodegenCx and Builder generic
Co-authored-by: Oli Scherer <github35764891676564198441@oli-obk.de>
2025-01-24 16:05:26 -05:00
bors
b2728d5426 Auto merge of #135674 - scottmcm:assume-better, r=estebank
Update our range `assume`s to the format that LLVM prefers

I found out in https://github.com/llvm/llvm-project/issues/123278#issuecomment-2597440158 that the way I started emitting the `assume`s in #109993 was suboptimal, and as seen in that LLVM issue the way we're doing it -- with two `assume`s sometimes -- can at times lead to CVP/SCCP not realize what's happening because one of them turns into a `ne` instead of conveying a range.

So this updates how it's emitted from
```
assume( x >= LOW );
assume( x <= HIGH );
```
or
```
// (for ranges that wrap the range)
assume( (x <= LOW) | (x >= HIGH) );
```
to
```
assume( (x - LOW) <= (HIGH - LOW) );
```
so that we don't need multiple `icmp`s nor multiple `assume`s for a single value, and both wrappping and non-wrapping ranges emit the same shape.

(And we don't bother emitting the subtraction if `LOW` is zero, since that's trivial for us to check too.)
2025-01-22 04:18:30 +00:00
Oli Scherer
dfa4c01b2e Treat undef bytes as equal to any other byte 2025-01-21 08:27:21 +00:00
Scott McMurray
6fe82006a4 Update our range assumes to the format that LLVM prefers 2025-01-17 20:39:38 -08:00
Manuel Drehwald
d753cbf779 upstream rustc_codegen_llvm changes for enzyme/autodiff 2025-01-01 21:42:45 +01:00
Ralf Jung
7291b1eaf7 rename typed_swap → typed_swap_nonoverlapping 2024-12-25 10:53:03 +01:00
Nicholas Nethercote
2620eb42d7 Re-export more rustc_span::symbol things from rustc_span.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.

This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
2024-12-18 13:38:53 +11:00
bors
327c7ee436 Auto merge of #133099 - RalfJung:forbidden-hardfloat-features, r=workingjubilee
forbid toggling x87 and fpregs on hard-float targets

Part of https://github.com/rust-lang/rust/issues/116344, follow-up to https://github.com/rust-lang/rust/pull/129884:

The `x87`  target feature on x86 and the `fpregs` target feature on ARM must not be disabled on a hardfloat target, as that would change the float ABI. However, *enabling* `fpregs` on ARM is [explicitly requested](https://github.com/rust-lang/rust/issues/130988) as it seems to be useful. Therefore, we need to refine the distinction of "forbidden" target features and "allowed" target features: all (un)stable target features can determine on a per-target basis whether they should be allowed to be toggled or not. `fpregs` then checks whether the current target has the `soft-float` feature, and if yes, `fpregs` is permitted -- otherwise, it is not. (Same for `x87` on x86).

Also fixes https://github.com/rust-lang/rust/issues/132351. Since `fpregs` and `x87` can be enabled on some builds and disabled on others, it would make sense that one can query it via `cfg`. Therefore, I made them behave in `cfg` like any other unstable target feature.

The first commit prepares the infrastructure, but does not change behavior. The second commit then wires up `fpregs` and `x87` with that new infrastructure.

r? `@workingjubilee`
2024-12-13 19:43:00 +00:00
bors
1daec069fb Auto merge of #128004 - folkertdev:naked-fn-asm, r=Amanieu
codegen `#[naked]` functions using global asm

tracking issue: https://github.com/rust-lang/rust/issues/90957

Fixes #124375

This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).

I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.

Combined with https://github.com/rust-lang/rust/pull/127853, if both accepted, I think that resolves all steps from the tracking issue.

r? `@Amanieu`
2024-12-11 21:51:07 +00:00
Ralf Jung
2d887a5c5c generalize 'forbidden feature' concept so that even (un)stable feature can be invalid to toggle
Also rename some things for extra clarity
2024-12-11 22:11:15 +01:00
Folkert
bd8f8e0631
codegen #[naked] functions using global_asm! 2024-12-10 21:41:03 +01:00
bjorn3
401dd840ff Remove all threading through of ErrorGuaranteed from the driver
It was inconsistently done (sometimes even within a single function) and
most of the rest of the compiler uses fatal errors instead, which need
to be caught using catch_with_exit_code anyway. Using fatal errors
instead of ErrorGuaranteed everywhere in the driver simplifies things a
bit.
2024-12-06 18:42:31 +00:00
Monadic Cat
ca55eeeaf3
use intra-doc links for CodegenBackend::link 2024-11-27 18:42:14 -06:00
Monadic Cat
52684a4c52
update comment (codegen_backend -> codegen_crate)
use intra-doc links so there'll be a doc gen fail next time this becomes
wrong
2024-11-27 18:26:08 -06:00
lcnr
948cec0fad move fn is_item_raw to TypingEnv 2024-11-19 18:06:20 +01:00
lcnr
9cba14b95b use TypingEnv when no infcx is available
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
2024-11-18 10:38:56 +01:00
Jiri Bobek
777003ae9f Likely unlikely fix 2024-11-17 21:49:10 +01:00
Matthias Krüger
bd79fe7a94
Rollup merge of #132702 - 1c3t3a:issue-132615, r=rcvalle
CFI: Append debug location to CFI blocks

Currently we're not appending debug locations to the inserted CFI blocks. This shows up in #132615 and #100783. This change fixes that by passing down the debug location to the CFI type-test generation and appending it to the blocks.

Credits also belong to `@jakos-sec` who worked with me on this.
2024-11-12 23:26:41 +01:00
Bastian Kersting
c2102259a0 CFI: Append debug location to CFI blocks 2024-11-11 09:17:43 +00:00
bjorn3
0a619dbc5d Pass owned CodegenResults to link_binary
After link_binary the temporary files referenced by CodegenResults are
deleted, so calling link_binary again with the same CodegenResults
should not be allowed.
2024-11-09 21:22:00 +00:00