Commit graph

581 commits

Author SHA1 Message Date
Manuel Drehwald
c89a89bb14 Fix multi-cgu+debug builds using autodiff by delaying autodiff till lto 2026-02-11 14:08:56 -05:00
Jonathan Brouwer
dec8d6ebcf
Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
2026-01-23 11:07:55 +01:00
Matthew Maurer
b639b0a4d8 llvm: Tolerate dead_on_return attribute changes
The attribute now has a size parameter and sorts differently:
* Explicitly omit size parameter during construction on 23+
* Tolerate alternate sorting in tests

https://github.com/llvm/llvm-project/pull/171712
2026-01-21 23:39:03 +00:00
Nikita Popov
0be66603ac Avoid passing addrspacecast to lifetime intrinsics
Since LLVM 22 the alloca must be passed directly. Do this by
stripping the addrspacecast if it exists.
2026-01-20 14:47:04 +01:00
Marcelo Domínguez
307a4fcdf8 Add scalar support for both host and device 2026-01-19 22:28:42 +01:00
Farid Zakaria
93f2e80f4a Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data
in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses
RIP-relative addressing (32-bit offsets), while larger data uses
absolute 64-bit addressing. This allows the compiler to generate more
efficient code for smaller data while still supporting data larger than
2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang.
The default threshold is 65536 bytes (64KB) if not specified, matching
LLVM's default behavior.
2026-01-07 11:57:48 -08:00
Jonathan Brouwer
d898dccc21
Rollup merge of #150511 - Sa4dUs:offload-inline, r=ZuseZ4
Allow inline calls to offload intrinsic

Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block.

r? `@ZuseZ4`

fixes: https://github.com/rust-lang/rust/issues/150413
2025-12-31 14:30:48 +01:00
Marcelo Domínguez
9d8b4cc70d Restore builder at the end of saved bb 2025-12-31 13:10:29 +01:00
Jonathan Brouwer
122f02ad02
Rollup merge of #150394 - DKLoehr:passplugin, r=nikic
Accommodate LLVM PassPlugin rename

LLVM [recently moved](https://github.com/llvm/llvm-project/pull/173279) their `PassPlugin` files to a new folder. This PR updates our `PassWrapper` to point to the new location.
2025-12-29 17:17:56 +01:00
dianqk
fe075ad212
Removes the serde dependency in rustc_codegen_llvm 2025-12-28 15:52:20 +08:00
Devon Loehr
634251cba8 Accommodate upstream PassPlugin rename 2025-12-26 15:40:40 +00:00
Manuel Drehwald
dfef2e96fe Remove the need to call clang for std::offload usages 2025-12-23 05:20:07 -08:00
sgasho
ddd5aad8a3 feat: dlopen Enzyme 2025-12-16 00:31:32 +09:00
Alina Sbirlea
ad73972e99 Fix for LLVM22 making lowering decisions dependent on RuntimeLibraryInfo.
LLVM reference commit:
04c81a9973.
2025-12-04 20:23:00 +00:00
Stuart Cook
2b150f2c65
Rollup merge of #147936 - Sa4dUs:offload-intrinsic, r=ZuseZ4
Offload intrinsic

This PR implements the minimal mechanisms required to run a small subset of arbitrary offload kernels without relying on hardcoded names or metadata.

- `offload(kernel, (..args))`: an intrinsic that generates the necessary host-side LLVM-IR code.
- `rustc_offload_kernel`: a builtin attribute that marks device kernels to be handled appropriately.

Example usage (pseudocode):
```rust
fn kernel(x: *mut [f64; 128]) {
    core::intrinsics::offload(kernel_1, (x,))
}

#[cfg(target_os = "linux")]
extern "C" {
    pub fn kernel_1(array_b: *mut [f64; 128]);
}

#[cfg(not(target_os = "linux"))]
#[rustc_offload_kernel]
extern "gpu-kernel" fn kernel_1(x: *mut [f64; 128]) {
    unsafe { (*x)[0] = 21.0 };
}
```
2025-11-26 23:32:03 +11:00
Marcelo Domínguez
5128ce10a0 Implement offload intrinsic 2025-11-25 20:04:27 +01:00
Manuel Drehwald
5fbe5dae42 Only try to link against offload functions if llvm.enzyme is enabled 2025-11-23 00:19:53 -08:00
Manuel Drehwald
89d50591c0 Replace the first of 4 binary invocations for offload 2025-11-21 02:41:17 -08:00
Quinn Okabayashi
c7e50d0f37 Remove unused LLVMModuleRef argument 2025-11-12 15:46:08 +00:00
bors
87f9dcd5e2 Auto merge of #147935 - luca3s:add-rtsan, r=petrochenkov
Add LLVM realtime sanitizer

This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to rust.

Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766.

Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681 and it is a lot more flexible than the old `no_santize` attribute. This allows adding real-time sanitizer without the need for a new attribute, like it was proposed in the RFC. Because i only add a new value to a existing command line flag and to a attribute i don't think an MCP is necessary.

Currently real-time santizer is usable in rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the rust binary.

The first commit adds support for more detailed sanitizer information.
The second commit then actually adds real-time sanitizer.
The third adds a warning against using real-time sanitizer with async functions, cloures and blocks because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so i kept it in a seperate commit.
The fourth commit adds the documentation for real-time sanitizer.
2025-11-08 12:24:15 +00:00
Lucas Baumann
d198633b95 add realtime sanitizer 2025-11-06 13:20:12 +01:00
Manuel Drehwald
360b38cceb Fix device code generation, to account for an implicit dyn_ptr argument. 2025-11-06 03:34:38 -05:00
Matthias Krüger
3d671c0d54
Rollup merge of #148103 - Zalathar:compression, r=wesleywiser
cg_llvm: Pass `debuginfo_compression` through FFI as an enum

There are only three possible values, making an enum more appropriate.

This avoids string allocation on the Rust side, and avoids ad-hoc `!strcmp` to convert back to an enum on the C++ side.
2025-10-31 18:41:51 +01:00
Tomasz Miąsko
2a03a948b9 Deduce captures(none) for a return place and parameters
Extend attribute deduction to determine whether parameters using
indirect pass mode might have their address captured. Similarly to
the deduction of `readonly` attribute this information facilitates
memcpy optimizations.
2025-10-25 22:53:52 +02:00
Zalathar
73b734bf63 Pass debuginfo_compression through FFI as an enum 2025-10-25 23:58:19 +11:00
Matthias Krüger
e132d2d8a5
Rollup merge of #147948 - aeubanks:summarylist, r=durin42
PassWrapper: Access GlobalValueSummaryInfo::SummaryList via getter for LLVM 22+

https://github.com/llvm/llvm-project/pull/164355 makes SummaryList private and provides a getter method.

`@rustbot` label llvm-main
2025-10-22 07:12:12 +02:00
Arthur Eubanks
1e8054669c format 2025-10-21 22:29:48 +00:00
Arthur Eubanks
875cc36b17 actually compile 2025-10-21 22:21:50 +00:00
Arthur Eubanks
f38839a874 format 2025-10-21 21:35:23 +00:00
Arthur Eubanks
fc7c4be59a PassWrapper: Access GlobalValueSummaryInfo::SummaryList via getter for LLVM 22+
https://github.com/llvm/llvm-project/pull/164355 makes SummaryList private and provides a getter method.

@rustbot label llvm-main
2025-10-21 20:58:27 +00:00
Zalathar
98c95c966b Remove current code for embedding command-line args in PDB 2025-10-18 12:24:40 +11:00
Guillaume Gomez
3938f42bb1
Rollup merge of #147608 - Zalathar:debuginfo, r=nnethercote
cg_llvm: Use `LLVMDIBuilderCreateGlobalVariableExpression`

- Part of rust-lang/rust#134001
- Follow-up to rust-lang/rust#146763

---

This PR dismantles the somewhat complicated `LLVMRustDIBuilderCreateStaticVariable` function, and replaces it with equivalent calls to `LLVMDIBuilderCreateGlobalVariableExpression` and `LLVMGlobalSetMetadata`.

A key difference is that the new code does not replicate the attempted downcast of `InitVal`. As far as I can tell, those downcasts were actually dead, because `llvm::ConstantInt` and `llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`. I tried replacing those code paths with fatal errors, and was unable to induce failure in any of the relevant test suites I ran.

I have also confirmed that if the calls to `create_static_variable` are commented out, debuginfo tests will fail, demonstrating some amount of relevant test coverage.

The new `DIBuilder` methods have been added via an extension trait, not as inherent methods, to avoid impeding rust-lang/rust#142897.
2025-10-13 11:25:23 +02:00
Zalathar
1081d98551 Use LLVMDIBuilderCreateGlobalVariableExpression
Note that the code in `LLVMRustDIBuilderCreateStaticVariable` that tried to
downcast `InitVal` appears to have been dead, because `llvm::ConstantInt` and
`llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`.
2025-10-12 23:36:26 +11:00
AMS21
0abecda9ed
Replace LLVMRustContextCreate with normal LLVM-C API calls
Since `LLVMRustContextCreate` can easily be replaced with a call to
`LLVMContextCreate` and `LLVMContextSetDiscardValueNames`.
2025-10-10 15:45:40 +02:00
bors
4b57d8154a Auto merge of #147519 - Zalathar:rollup-o5f16uo, r=Zalathar
Rollup of 3 pull requests

Successful merges:

 - rust-lang/rust#147446 (PassWrapper: use non-deprecated lookupTarget method)
 - rust-lang/rust#147473 (Do `x check` on various bootstrap tools in CI)
 - rust-lang/rust#147509 (remove intrinsic wrapper functions from LLVM bindings)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-10-09 10:54:43 +00:00
Stuart Cook
828dd0cdd4
Rollup merge of #147509 - AMS21:remove_llvm_rust_intrinsics, r=Zalathar
remove intrinsic wrapper functions from LLVM bindings

As discussed on https://github.com/llvm/llvm-project/pull/162500 there is no good reason to implement these intrinsic function via the LLVM wrapper instead we now just implement them via `call_intrinsic` like all the other intrinsic functions.

Work towards https://github.com/rust-lang/rust/issues/46437
2025-10-09 21:29:05 +11:00
Stuart Cook
c33f0b373b
Rollup merge of #147446 - durin42:llvm-22-lookupTarget-deprecation, r=Zalathar
PassWrapper: use non-deprecated lookupTarget method

This avoids an extra trip through a triple string by directly passing the Triple, and has been available since LLVM 21. The string overload was deprecated today and throws an error on our CI for HEAD due to -Werror paranoia, so we may as well clean this up now and also skip the conversion on LLVM 21 since we can.

`@rustbot` label llvm-main
2025-10-09 21:29:04 +11:00
Stuart Cook
4dfd977c8b
Rollup merge of #147488 - AMS21:remove_llvm_rust_insert_private_global, r=nikic
refactor: Remove `LLVMRustInsertPrivateGlobal` and `define_private_global`

Since it can easily be implemented using the existing LLVM C API in
terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global`
was only used in one place.

Work towards https://github.com/rust-lang/rust/issues/46437
2025-10-09 18:43:26 +11:00
AMS21
064e3b8212
remove intrinsic wrapper functions from LLVM bindings 2025-10-09 09:26:44 +02:00
AMS21
036ab3a925
refactor: Remove LLVMRustInsertPrivateGlobal and define_private_global
Since it can easily be implemented using the existing LLVM C API in
terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global`
was only used in one place.
2025-10-08 21:59:48 +02:00
AMS21
1aed495ed7
refactor: replace LLVMRustAtomicLoad/Store with LLVM built-in functions 2025-10-08 13:53:09 +02:00
Augie Fackler
eb20c6abd6 PassWrapper: use non-deprecated lookupTarget method
This avoids an extra trip through a triple string by directly passing
the Triple, and has been available since LLVM 21. The string overload
was deprecated today and throws an error on our CI for HEAD due to
-Werror paranoia, so we may as well clean this up now and also skip the
conversion on LLVM 21 since we can.

@rustbot label llvm-main
2025-10-07 11:02:13 -04:00
dianqk
1bd89bd42e
codegen: Generate dbg_value for the ref statement 2025-10-02 14:55:51 +08:00
Zalathar
906bf49ade Declare all "fixed" metadata kinds as MetadataKindId 2025-09-30 20:10:10 +10:00
Matthias Krüger
c29fb2e57e
Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4
TypeTree support in autodiff

# TypeTrees for Autodiff

## What are TypeTrees?
Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.

## Structure
```rust
TypeTree(Vec<Type>)

Type {
    offset: isize,  // byte offset (-1 = everywhere)
    size: usize,    // size in bytes
    kind: Kind,     // Float, Integer, Pointer, etc.
    child: TypeTree // nested structure
}
```

## Example: `fn compute(x: &f32, data: &[f32]) -> f32`

**Input 0: `x: &f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,
        child: TypeTree::new()
    }])
}])
```

**Input 1: `data: &[f32]`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,  // -1 = all elements
        child: TypeTree::new()
    }])
}])
```

**Output: `f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 4, kind: Float,
    child: TypeTree::new()
}])
```

## Why Needed?
- Enzyme can't deduce complex type layouts from LLVM IR
- Prevents slow memory pattern analysis
- Enables correct derivative computation for nested structures
- Tells Enzyme which bytes are differentiable vs metadata

## What Enzyme Does With This Information:

Without TypeTrees (current state):
```llvm
; Enzyme sees generic LLVM IR:
define float ``@distance(ptr*`` %p1, ptr* %p2) {
; Has to guess what these pointers point to
; Slow analysis of all memory operations
; May miss optimization opportunities
}
```

With TypeTrees (our implementation):
```llvm
define "enzyme_type"="{[]:Float@float}" float ``@distance(``
    ptr "enzyme_type"="{[]:Pointer}" %p1,
    ptr "enzyme_type"="{[]:Pointer}" %p2
) {
; Enzyme knows exact type layout
; Can generate efficient derivative code directly
}
```

# TypeTrees - Offset and -1 Explained

## Type Structure

```rust
Type {
    offset: isize, // WHERE this type starts
    size: usize,   // HOW BIG this type is
    kind: Kind,    // WHAT KIND of data (Float, Int, Pointer)
    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
}
```

## Offset Values

### Regular Offset (0, 4, 8, etc.)
**Specific byte position within a structure**

```rust
struct Point {
    x: f32, // offset 0, size 4
    y: f32, // offset 4, size 4
    id: i32, // offset 8, size 4
}
```

TypeTree for `&Point` (internal representation):
```rust
TypeTree(vec![
    Type { offset: 0, size: 4, kind: Float },   // x at byte 0
    Type { offset: 4, size: 4, kind: Float },   // y at byte 4
    Type { offset: 8, size: 4, kind: Integer }  // id at byte 8
])
```

Generates LLVM:
```llvm
"enzyme_type"="{[]:Float@float}"
```

### Offset -1 (Special: "Everywhere")
**Means "this pattern repeats for ALL elements"**

#### Example 1: Array `[f32; 100]`
```rust
TypeTree(vec![Type {
    offset: -1, // ALL positions
    size: 4,    // each f32 is 4 bytes
    kind: Float, // every element is float
}])
```

Instead of listing 100 separate Types with offsets `0,4,8,12...396`

#### Example 2: Slice `&[i32]`
```rust
// Pointer to slice data
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, // ALL slice elements
        size: 4,    // each i32 is 4 bytes
        kind: Integer
    }])
}])
```

#### Example 3: Mixed Structure
```rust
struct Container {
    header: i64,        // offset 0
    data: [f32; 1000],  // offset 8, but elements use -1
}
```

```rust
TypeTree(vec![
    Type { offset: 0, size: 8, kind: Integer }, // header
    Type { offset: 8, size: 4000, kind: Pointer,
        child: TypeTree(vec![Type {
            offset: -1, size: 4, kind: Float // ALL array elements
        }])
    }
])
```
2025-09-28 18:13:11 +02:00
Matthias Krüger
e8578c8808
Rollup merge of #146763 - Zalathar:di-builder, r=jdonszelmann
cg_llvm: Replace some DIBuilder wrappers with LLVM-C API bindings (part 5)

- Part of rust-lang/rust#134001
- Follow-up to rust-lang/rust#146673

---

This is another batch of LLVMDIBuilder binding migrations, replacing some our own LLVMRust bindings with bindings to upstream LLVM-C APIs.

Some of these are a little more complex than most of the previous migrations, because they split one LLVMRust binding into multiple LLVM bindings, but nothing too fancy.

This appears to be the last of the low-hanging fruit. As noted in https://github.com/rust-lang/rust/issues/134001#issuecomment-2524979268, the remaining bindings are difficult or impossible to migrate at present.
2025-09-28 09:15:23 +02:00
Augie Fackler
bd860bdf26
Fix typo in LLVM_VERSION_ macro use
Co-authored-by: Nikita Popov <github@npopov.com>
2025-09-26 15:38:30 -04:00
Augie Fackler
4c7292aba3 PassWrapper: drop unused variable for LLVM 22+ 2025-09-26 13:27:34 -04:00
Augie Fackler
7a9b6d94d4 PassWrapper: update for new PGOOptions args in LLVM 22
This changed in upstream change a5569b4bd7f8.

@rustbot label llvm-main
2025-09-25 18:24:16 -04:00
Stuart Cook
59866ef305
Rollup merge of #147015 - Zalathar:dispose-tm, r=lqd
Use `LLVMDisposeTargetMachine`

After bumping the minimum LLVM version to 20 (rust-lang/rust#145071), we no longer need to run any custom code when disposing of a TargetMachine, so we can just use the upstream LLVM-C function.
2025-09-25 20:32:00 +10:00