llvm 19 support #227

brandonros · 2025-06-08T22:27:27Z

potentially addresses all of:

- Update rustc_llvm_wrapper to optionally support LLVM v19 #226 (Updates rustc_llvm_wrapper)
- GitHub Codespaces/VSCode Devcontainer support #224 (Adds Codespaces, optional, we could remove)
- sha2 crate = runtime error #207 (if LLVM v19 will compile and not have same issues as LLVM v7)
- CUDA 12.8.1 and LLVM 18.1.8 #197 (we can put this behind a feature flag to optionally support Blackwell+)

brandonros · 2025-06-08T22:30:49Z

i'm on the fence about renaming llvm to llvm7 and llvm19

i think it might not actually be needed and i might put it back

brandonros · 2025-06-08T22:33:38Z

@LegNeato my measuring stick here is does vecadd example build with LLVM v19

could you tell me if I'm close or if I'm actually missing something huge like a mountain of work I'm not seeing?

brandonros · 2025-06-10T04:30:35Z

 DEBUG: About to call LLVMRunPasses - THIS IS THE CRITICAL POINT
  DEBUG: Parameters:
  DEBUG:   llmod: 0x770f3d33a580
  DEBUG:   pipeline: "default<O0>"
  DEBUG:   tm: 0x770f27dafb00
  DEBUG:   pass_options: 0x770f27e00060
  DEBUG: LLVMRunPasses returned: 0
  DEBUG: LLVMRunPasses completed successfully
  DEBUG: Cleaning up pass builder options
  DEBUG: Pass builder options disposed
  DEBUG: optimize function completed successfully
  DEBUG: About to verify module before prepare_thin
  DEBUG: Module verification result: 0
  DEBUG: LLVMRustThinLTOBufferCreate called with is_thin=1, emit_summary=1
  DEBUG: Taking ThinLTO path
  DEBUG: About to run ThinLTO pass
  error: rustc interrupted by SIGSEGV, printing backtrace

(gdb) bt
#0  0x0000772679b89e10 in llvm::ValueEnumerator::EnumerateType(llvm::Type*) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#1  0x0000772679b8d240 in llvm::ValueEnumerator::incorporateFunction(llvm::Function const&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#2  0x0000772679b615b2 in (anonymous namespace)::ModuleBitcodeWriter::write() ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#3  0x0000772679b5b341 in llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#4  0x0000772679b66802 in llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#5  0x000077267944549c in llvm::ThinLTOBitcodeWriterPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#6  0x00007726786b34a7 in llvm::detail::PassModel<llvm::Module, llvm::ThinLTOBitcodeWriterPass, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#7  0x000077267a2e0dc8 in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#8  0x000077267867e178 in LLVMRustThinLTOBufferCreate ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#9  0x000077267859d7be in rustc_codegen_nvvm::lto::ThinBuffer::new ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#10 0x000077267858512a in <rustc_codegen_nvvm::NvvmCodegenBackend as rustc_codegen_ssa::traits::write::WriteBackendMethods>::prepare_thin () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#11 0x00007726785ec605 in rustc_codegen_ssa::back::write::execute_optimize_work_item ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#12 0x00007726785e3cd5 in rustc_codegen_ssa::back::write::spawn_work::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#13 0x0000772678404d97 in std::sys::backtrace::__rust_begin_short_backtrace ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#14 0x000077267847357f in std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#15 0x0000772678520144 in <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
    () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#16 0x0000772678466e24 in std::panicking::try::do_call ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#17 0x000077267847381b in __rust_try () from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#18 0x00007726784730f1 in std::thread::Builder::spawn_unchecked_::{{closure}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#19 0x000077267848f987 in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
   from /workspaces/Rust-CUDA/target/debug/deps/librustc_codegen_nvvm.so
#20 0x000077268c55b8ab in std::sys::pal::unix::thread::Thread::new::thread_start ()
   from /root/.rustup/toolchains/nightly-2025-03-02-x86_64-unknown-linux-gnu/bin/../lib/librustc_driver-e3b06f91230294e6.so
#21 0x000077268668aaa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#22 0x0000772686717a34 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

pain

LegNeato · 2025-06-10T14:53:08Z

I don't know, llvm is an area of the project I have not touched.

LegNeato

This should stay the same, right? The newer support should be optional / only enabled when

LegNeato · 2025-06-10T14:57:35Z

crates/rustc_codegen_nvvm/build.rs

    "https://github.com/rust-gpu/rustc_codegen_nvvm-llvm/releases/download/LLVM-7.1.0/";

-static REQUIRED_MAJOR_LLVM_VERSION: u8 = 7;
+static REQUIRED_MAJOR_LLVM_VERSION: u8 = 19;


The requirement doesn't bump unless targeting a higher arch? So the logic is:

7 if targeting arch supported by 7 and 19

19 if targeting arch not supported by 7
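The selection rule above can be sketched as a small pure function. This is only an illustration: the function names are hypothetical, and the cutoff of compute capability 100 is an assumption drawn from the "Blackwell+" remarks in this thread, not a value taken from the PR.

```rust
/// Parses a target arch string such as "sm_75" or "compute_120" into its
/// numeric compute capability. Returns None for anything else.
fn compute_capability(arch: &str) -> Option<u32> {
    let digits = arch
        .strip_prefix("sm_")
        .or_else(|| arch.strip_prefix("compute_"))?;
    digits.parse().ok()
}

/// ASSUMPTION: first compute capability that LLVM 7 cannot target.
/// 100 is a placeholder for "Blackwell and newer".
const FIRST_ARCH_NEEDING_LLVM_19: u32 = 100;

/// Returns 7 when the arch is supported by both versions, 19 otherwise.
fn required_llvm_major(arch: &str) -> Option<u8> {
    let cc = compute_capability(arch)?;
    Some(if cc >= FIRST_ARCH_NEEDING_LLVM_19 { 19 } else { 7 })
}
```

With this shape, `required_llvm_major("sm_75")` would pick 7 and `required_llvm_major("sm_120")` would pick 19, so the requirement only bumps for the newer architectures.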

LegNeato · 2025-06-10T15:02:28Z

We should probably break this down, seeing as neither of us understand this space.

First, we should probably add various values for arch and stuff to enums on the rust side. Some of these might need to be gated if on 7 vs 19.

Then, we should get switching between 7 and a stubbed out / non working 19 via target arch.

Then, we should systematically fix each issue and refactor common code on the way.

brandonros · 2025-06-10T15:49:12Z

> seeing as neither of us understand this space.

ok... kind of a strange remark...

i'm just going to work on my local branch and make kernels compile with 19.1

then i was going to work backwards and "make it upstream worthy"...

brandonros · 2025-06-14T17:33:02Z

@LegNeato

i rented some powerful big aws spot VM in the cloud and built llvm-19 debug with assertions enabled and am using it in devcontainer (it's a 15gb .tar.xz and 67gb uncompressed)

it's helping find issues more than a typical llvm release build fyi, just a little tip i'd share

i have the "shaved yaks" opinionated way to get that instance if you want, cost me about $3 total

https://github.com/brandonros/cloud-llvm-build

brandonros · 2025-06-15T01:41:31Z

@LegNeato i'm on the fence about having two rustc_codegen_nvvm_v{{version}} crates 85% copy and pasted but.... i got this to work. vecadd is working at least, going to see if ed25519_vanity_rs compiles later

proof:
vecadd_kernel.ptx.txt

brandonros · 2025-06-20T14:41:21Z

@LegNeato i've read multiple conflicting things from multiple "official nvidia sources/documentation" that the new cuda 12.9 toolkit is based on/adds support for either llvm 18, llvm 19, or llvm 20.

i can see in their cicc binary references of llvm 20, so....

i also question if we need this. this might sound dumb but... what if we used a simple official rust target like riscv64gc-unknown-none-elf, use official rust compiler (no custom nightly, no custom codegen llvm integration) to spit out llvm ir, and then patch it to work with cuda...

i totally agree with you/the project's view on "use nvvm to compile llvm ir to ptx"

https://github.com/brandonros/vanity-miner-rs/pull/8/files i haven't had a chance to test it yet (working on it) but in terms of thinking outside the box, i was for sure able to get nvvm to accept llvm ir and make what seems to be a "valid ptx"

brandonros · 2025-06-22T20:30:41Z

i got this working for blackwell a different way. this branch/pr/LLVM v19 integration might work just fine but it's kind of a lot to maintain if there's an "easier" (albeit hackier) way to solve this

https://github.com/brandonros/vanity-miner-rs/actions/runs/15809309968/job/44558442212

Build Pipeline

1. compile no_std Rust logic + kernels libraries (specifically with Rust 1.86.0 because it was built against LLVM 19) targeting riscv64gc-unknown-none-elf due to the simplicity of its instruction set
2. make it emit LLVM IR instead of an actual binary
3. adapt the RISC-V LLVM IR to NVPTX64 LLVM IR
4. assemble the NVPTX64 LLVM IR to NVPTX64 LLVM bitcode
5. feed the NVPTX64 LLVM bitcode to the new CUDA toolkit 12.9 libNVVM, which adds support for LLVM 19 for Blackwell (previous architectures only support LLVM v7, which is very old), to get NVIDIA's PTX (Parallel Thread Execution)
6. feed the PTX to ptxas to get CUBIN SASS (Streaming ASSembler)
7. run the CUBIN on device with gpu_runner
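As a rough sketch of the "adapt the RISC-V LLVM IR to NVPTX64 LLVM IR" step, the snippet below rewrites only the `target triple` and `target datalayout` lines of textual IR. The function name is hypothetical and the actual adaptation script is not shown in this thread; a real adaptation would likely need more than this (for example kernel annotations and attribute cleanup), so treat it as one small piece under that assumption.

```rust
/// Rewrites the `target triple` and `target datalayout` lines of textual
/// LLVM IR and leaves every other line untouched.
/// The triple and datalayout are supplied by the caller; nothing here
/// validates that they are correct for the chosen libNVVM.
fn retarget_ir(ir: &str, triple: &str, datalayout: &str) -> String {
    let mut out = String::with_capacity(ir.len());
    for line in ir.lines() {
        if line.starts_with("target triple") {
            out.push_str(&format!("target triple = \"{triple}\""));
        } else if line.starts_with("target datalayout") {
            out.push_str(&format!("target datalayout = \"{datalayout}\""));
        } else {
            out.push_str(line);
        }
        out.push('\n');
    }
    out
}
```

A caller would pass something like `"nvptx64-nvidia-cuda"` as the triple and whatever datalayout string the target libNVVM expects.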

let me know if you actually want this/to put time into it, otherwise blackwell+ might be able to avoid cuda_builder or i'd need to make a cuda_builder that makes this super opinionated rust -> cubin pipeline i made

LegNeato · 2025-08-01T19:38:12Z

I'm re-opening because I think we want to go this route. Totally understand if you go a different route for your project!

LegNeato · 2025-08-01T19:38:53Z

I plan to poke at it this week. Apologies for the previous response saying we don't know the space, I don't really know the LLVM side of the house and I misspoke.

brandonros · 2025-08-01T23:33:08Z

> Totally understand if you go a different route for your project!

Nope! Here to help, let's land this!

#216

Let's land that and then I'll rebase?

LegNeato · 2025-08-03T09:52:15Z

I don't have time to jam on this with you until later in the week, but I think this is a great start! Ideally both are compiled in statically or as dylibs and runtime chooses based on arch selected. But distribution with dylibs is annoying, and compiling 2 llvm versions in the same process will be annoying. So I think the first step is what this is doing, manually switching, but we should be aware where we would like it to go.

tyler274 · 2025-08-16T03:17:01Z

I'm willing to put some time in here if I can get some pointers.

devillove084 · 2025-08-22T05:52:14Z

@tyler274 I'm also spending time on this and have followed the same approach as @brandonros. Writing extensive conditional compilation is truly a pain. Perhaps we should focus our efforts on this branch? Maybe we could explore how to improve this together?

devillove084 · 2025-08-22T05:54:36Z

@tyler274 #229 (comment) Here are some previous hints from @LegNeato.

devillove084 · 2025-08-22T06:01:00Z

@LegNeato Additionally, based on my previous open-source contributions and discussions, I've learned that both Graphite and Turso are exploring GPU acceleration. I believe this represents a significant opportunity for us. I'm highly motivated to drive this initiative forward and position our project as the leading GPU-accelerated solution within the Rust ecosystem.

LegNeato · 2025-08-22T06:04:33Z

@Firestar99 is working on Graphite's support via rust-gpu!

devillove084 · 2025-08-22T06:09:19Z

> @Firestar99 is working on Graphite's support via rust-gpu!

@LegNeato That's awesome! The good news is, after a month of learning and hands-on practice, I've basically figured out the workflow of LLVM backend generation. The bad news is debugging conditional compilation remains quite painful. What do you think – would it be better to write conditional compilation, or maintain two separate branches?

LegNeato · 2025-08-22T06:10:34Z

I think conditional is better, as I think upgrading drops support for a bunch of devices? Or is that not the case?

brandonros · 2025-08-22T06:11:44Z

what would it take to land this?

devillove084 · 2025-08-22T06:15:44Z

@LegNeato Based on my research, here are the facts: Taking PassManager as an example, this component orchestrates LLVM optimizations and analyses during backend code generation. Typically, multiple passes need to collaborate (e.g., constant propagation followed by dead code elimination). The PassManager schedules passes in a predefined order or based on dependencies to ensure logical execution sequence.

Prior to LLVM 9/10, implementations required inheriting from specific PassManager virtual base classes. However, starting from LLVM 14, everything has been unified into a CRTP (Curiously Recurring Template Pattern) code structure. Additionally, header files may have been relocated or modified across different LLVM versions.

Therefore, if we choose conditional compilation, I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.

brandonros · 2025-08-22T06:17:01Z

> I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.

I think this is a misunderstanding. NVIDIA cards either run LLVM 7 or LLVM 19, nothing in between. Please correct me if I am wrong.

devillove084 · 2025-08-22T06:19:52Z

> > I would need to create CI workflows for every single version from LLVM 7 to LLVM 19 to ensure compatibility.
>
> I think this is a misunderstanding. NVIDIA cards either run LLVM 7 or LLVM 19, nothing in between. Please correct me if I am wrong.

@brandonros I'm referring specifically to modifications in the LLVM backend generation logic within wrapper files like rustc_llvm_wrapper. This is not targeting NVVM specifically.

devillove084 · 2025-08-22T06:25:34Z

@brandonros @LegNeato https://gist.github.com/ax3l/9489132 This source perfectly highlights that while NVIDIA officially certifies specific LLVM versions per CUDA release (as shown in the version matrix), the reality is more nuanced:

- The crt/host_config.h hack (as noted) allows unofficial flexibility for newer LLVM versions
- Production environments often use LLVM 11-15 (especially with CUDA 11.x/12.x)
- NVIDIA's Enhanced Compatibility (since CUDA 11.1) intentionally supports cross-version compatibility

devillove084 · 2025-08-22T06:32:54Z

@brandonros I think we could start by refining the branches for LLVM 7 and LLVM 19 based on your existing work, then progressively extend support to other components like NVVM (CUDA) and additional LLVM versions.

devillove084 · 2025-08-22T06:40:02Z

@LegNeato I strongly agree that prioritizing support for the newer Rust toolchain, CUDA, and LLVM versions is critical. Our project's future hinges on optimizing for cutting-edge hardware like the H100, A100, and even the GH200 (which I currently have access to). This strategic focus will enable major enterprise customers to integrate our solution into their infrastructure—the key to maximizing our long-term growth and impact.

brandonros · 2025-08-22T06:43:41Z

> I think we could start by refining the branches for LLVM 7 and LLVM 19 based on your existing work,

I will have rebased this massive thing twice now and it continues to go stale at almost 2+ months old. Are we serious about upstreaming this? Otherwise I'm hesitant to keep doing this same song and dance of "get it ready for merge, put it on the shelf".

devillove084 · 2025-08-22T06:50:28Z

@LegNeato Based on the current modifications, what are the primary remaining challenges? Let's explore what additional efforts and adjustments we can make.

devillove084 · 2025-08-22T12:14:24Z

@LegNeato After reviewing his code changes, my understanding is that you don't want to keep code for different LLVM versions in multiple folders. Instead, you may prefer to balance NVVM and LLVM support through conditional compilation.

I sincerely request that you take the time to delve into this part.

brandonros · 2025-09-11T20:52:09Z

@devillove084

> I can roughly understand that it seems you don't want to create code for different LLVM versions in multiple folders.

I understood... the opposite? There was hesitancy in the beginning and then I thought we settled on we would do it this way?

This analysis examines the effort required to add CUDA 13.0 support to rust-cuda with dynamic LLVM version detection and switching.

Key findings:
- CUDA 13 introduces NVVM IR 2.0 (breaking change from 1.x)
- Requires dual LLVM support: 7.0.1 (legacy) + 20.1.0 (Blackwell+)
- Estimated effort: 6-8 weeks for experienced developer
- Recommended approach: runtime backend selection by architecture
- Maintains backward compatibility with CUDA 11.2+

Analysis includes:
- Current state assessment of codebase
- CUDA 13.0 breaking changes documentation
- Detailed 5-phase implementation plan
- Risk assessment and mitigation strategies
- Comparison to PR Rust-GPU#227 (LLVM 19 effort)
- Test strategy and validation metrics

Related to: Rust-GPU#299, Rust-GPU#227

brandonros · 2025-12-03T03:07:32Z

rustc_codegen_nvvm: V7 vs V19 File-by-File Comparison

This document describes each file in rustc_codegen_nvvm_v7 and whether it differs from rustc_codegen_nvvm_v19.

Summary

Status	Count
IDENTICAL	7 files
DIFFERENT	26 files
V7 ONLY	1 file (`ptx_filter.rs`)
V19 ONLY	3 files (new headers and `README.md`)

Root Files

`build.rs` — DIFFERENT

Build script that downloads/configures LLVM and compiles the C++ wrapper.

Aspect	V7	V19
LLVM Version	7.1.0	19.1.7
Env var for config	`LLVM_CONFIG`	`LLVM_CONFIG_19`
Env var for prebuilt	`USE_PREBUILT_LLVM`	`USE_PREBUILT_LLVM_19`
llvm-as command	`llvm-as-7`	`llvm-as-19`
llvm-config fallback	`llvm-config`	`llvm-config-19`
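A build script could resolve the `llvm-config` binary per version along the lines below. The environment variable names and fallback binary names come from the table above; the `LlvmFlavor` type and the function are hypothetical, and the environment lookup is passed in as a closure only so the logic can be exercised without touching the real environment.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum LlvmFlavor {
    V7,
    V19,
}

impl LlvmFlavor {
    /// Environment variable that overrides the llvm-config location.
    fn config_env_var(self) -> &'static str {
        match self {
            LlvmFlavor::V7 => "LLVM_CONFIG",
            LlvmFlavor::V19 => "LLVM_CONFIG_19",
        }
    }

    /// Binary name tried when the override is not set.
    fn fallback_binary(self) -> &'static str {
        match self {
            LlvmFlavor::V7 => "llvm-config",
            LlvmFlavor::V19 => "llvm-config-19",
        }
    }
}

/// Picks the override if present, otherwise the per-version fallback.
fn llvm_config_path(flavor: LlvmFlavor, env: impl Fn(&str) -> Option<String>) -> String {
    env(flavor.config_env_var()).unwrap_or_else(|| flavor.fallback_binary().to_string())
}
```

In a real `build.rs` the closure would typically be `|key| std::env::var(key).ok()`.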

`Cargo.toml` — DIFFERENT

Only differs in crate name (rustc_codegen_nvvm_v7 vs rustc_codegen_nvvm_v19). Dependencies are identical.

`CHANGELOG.md` — (not compared)

Documentation file.

`libintrinsics.bc` / `libintrinsics.ll` — (not compared)

Precompiled LLVM bitcode for intrinsics. Likely version-specific.

rustc_llvm_wrapper/ (C++ FFI Layer)

V7 Files

File	Description
`rustllvm.h`	Header with LLVM version compatibility macros (134 lines)
`RustWrapper.cpp`	Main FFI wrapper (62KB, 1927 lines)
`PassWrapper.cpp`	Pass manager wrapper using legacy PassManager (43KB, 1509 lines)

V19 Files

File	Description
`LLVMWrapper.h`	Minimal header, hard-coded for LLVM 19 (48 lines)
`SuppressLLVMWarnings.h`	Warning suppression header (NEW)
`RustWrapper.cpp`	Main FFI wrapper (95KB, 2616 lines) — significantly larger
`PassWrapper.cpp`	Pass manager wrapper using new PassBuilder (64KB, 1805 lines)
`README.md`	Documentation (NEW)

Key C++ Differences

Pass Manager: V7 uses legacy PassManagerBuilder, V19 uses new PassBuilder
Type System: V19 adds opaque pointer support
Size: V19 RustWrapper.cpp is ~50% larger due to new APIs
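To illustrate the pass manager difference: with the new pass manager, a pipeline is described by a textual string (the debug log earlier in this thread shows `"default<O0>"` being passed to `LLVMRunPasses`). A minimal sketch of choosing that string on the Rust side might look like this; the `OptLevel` enum and function name are hypothetical and are not the crate's actual types.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum OptLevel {
    None,
    Less,
    Default,
    Aggressive,
}

/// Maps an optimization level to a new-pass-manager pipeline string.
/// The `default<O*>` names are standard LLVM pipeline aliases.
fn pipeline_string(level: OptLevel) -> &'static str {
    match level {
        OptLevel::None => "default<O0>",
        OptLevel::Less => "default<O1>",
        OptLevel::Default => "default<O2>",
        OptLevel::Aggressive => "default<O3>",
    }
}
```

The legacy approach instead populates a pass manager object through builder calls, which is why the two `PassWrapper.cpp` files share so little code.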

src/ Files

IDENTICAL FILES (can be shared)

File	Lines	Description
`common.rs`	~30	`AsCCharPtr` trait for C string conversion
`ptxgen.rs`	~50	PTX generation utilities

debug_info/ IDENTICAL FILES

File	Lines	Description
`create_scope_map.rs`	~100	Debug scope mapping
`dwarf_const.rs`	~50	DWARF constant definitions
`namespace.rs`	~80	Namespace handling for debug info
`util.rs`	~60	Debug info utilities
`metadata/type_map.rs`	~100	Type mapping for metadata

src/ DIFFERENT FILES

Core Infrastructure

`lib.rs` — DIFFERENT (114 diff lines)

Main crate entry point, implements CodegenBackend trait.

Difference	V7	V19
Modules	includes `ptx_filter`	no `ptx_filter`
Feature flags	different set	adds `hash_raw_entry`
`global_backend_features`	custom impl parsing CUDA arch	empty impl
Trait signatures	older rustc_codegen_ssa	newer signatures

`llvm.rs` — DIFFERENT (337 diff lines) ⚠️ MAJOR

FFI bindings to LLVM C API. Completely different between versions.

Difference	V7	V19
Attribute handling	simple `LLVMRustAddFunctionAttribute`	type-aware for `StructRet`
`TypeKind` enum	18 variants	21 variants (+X86_FP80, FP128, PPC_FP128, X86_AMX, TargetExt)
`AsmDialect`	`{Other, Att, Intel}`	`{Att, Intel}` (removed Other)
`CodeGenOptLevel`	`{Other, None, Less, Default, Aggressive}`	removed Other
Pointer types	`LLVMPointerType`	`LLVMPointerTypeInContext` (opaque ptrs)
Build calls	`LLVMRustBuildCall(B, Fn, Args, N, Bundle)`	`LLVMRustBuildCall(B, Ty, Fn, Args, N, Bundles, NBundles)`
New types	—	`PassBuilderOptions`, `FloatABIType`, `LLVMRustDISPFlags`

`context.rs` — DIFFERENT (~80 diff lines)

Codegen context and command-line argument parsing.

Difference	V7	V19
`CodegenArgs`	includes `DisassembleMode`, disassemble options	removed disassembly options
`parse()` signature	`parse(args, sess)`	`parse(args)`
Methods	`add_used_global()`, `add_compiler_used_global()`	removed
New method	—	`codegen_unit()`

`builder.rs` — DIFFERENT (221 diff lines)

LLVM IR builder implementation.

Difference	V7	V19
Load instruction	`LLVMBuildLoad(builder, ptr, name)`	`LLVMBuildLoad2(builder, ty, ptr, name)`
`AtomicOrdering` import	`rustc_middle::ty`	`rustc_codegen_ssa::common`
`atomic_load` impl	different stub	returns `LLVMGetUndef(ty)`
Let-chain syntax	uses `if let ... &&`	nested `if let` blocks
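The `LLVMBuildLoad` to `LLVMBuildLoad2` change follows from opaque pointers: once a pointer type no longer records its pointee, the caller has to state the loaded type. The toy model below is not LLVM's API; every type and function in it is hypothetical and exists only to show why the extra `ty` argument appears.

```rust
/// Typed-pointer world: the pointer type records what it points to.
#[derive(Clone, Debug, PartialEq)]
enum TypedTy {
    I32,
    F64,
    Ptr(Box<TypedTy>),
}

/// The result type of a load can be recovered from the pointer type alone.
fn load_typed(ptr: &TypedTy) -> Option<TypedTy> {
    match ptr {
        TypedTy::Ptr(pointee) => Some((**pointee).clone()),
        _ => None,
    }
}

/// Opaque-pointer world: `Ptr` carries no pointee information.
#[derive(Clone, Copy, Debug, PartialEq)]
enum OpaqueTy {
    I32,
    F64,
    Ptr,
}

/// The caller must supply the result type explicitly.
fn load_opaque(result: OpaqueTy, ptr: OpaqueTy) -> Option<OpaqueTy> {
    (ptr == OpaqueTy::Ptr).then_some(result)
}
```

This is also why so many call sites in the V19 crate need a type threaded through that the V7 crate could derive from the pointer.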

Code Generation

`abi.rs` — DIFFERENT (79 diff lines)

ABI and calling convention handling.

Difference	V7	V19
Conv type	`CanonAbi::Rust`	`Conv::Rust`
Float types	f32, f64	adds f16, f128
Pointer creation	`LLVMPointerType(ty, ...)`	`LLVMPointerTypeInContext(cx.llcx, ...)`
New method	—	`arg_memory_ty()`

`intrinsic.rs` — DIFFERENT (124 diff lines)

Intrinsic function codegen.

Difference	V7	V19
`codegen_intrinsic_call` signature	`(instance, args, result: PlaceRef, span)`	`(instance, fn_abi, args, llresult: &Value, span)`
`volatile_load` impl	simple type cast	extracts pointee type properly
Format strings	inline captured argument `format!("{name}")`	positional argument `format!("{}", name)`
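The format-string row is purely stylistic. As a small illustration (the function names and the `llvm.` prefix are made up for the example), both spellings produce the same string:

```rust
/// Inline captured argument, as in the V7 column.
fn name_inline(name: &str) -> String {
    format!("llvm.{name}")
}

/// Positional argument, as in the V19 column.
fn name_positional(name: &str) -> String {
    format!("llvm.{}", name)
}
```

Differences of this kind can be unified without touching any LLVM-specific code.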

`back.rs` — DIFFERENT (316 diff lines) ⚠️ MAJOR

Backend code generation and target machine creation.

Difference	V7	V19
`LLVMRustCreateTargetMachine`	~10 params	~20+ params
ABI parameter	none	`abi_cstr`
Float ABI	bool `use_softfp`	`FloatABIType` enum
String passing	raw ptr + len	CString
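The "String passing" row can be illustrated with the standard library alone. The helper names below are hypothetical; the sketch only contrasts the two conventions and does not reproduce the real `LLVMRustCreateTargetMachine` call.

```rust
use std::ffi::{c_char, CString};

/// Pointer-plus-length style: no NUL terminator is needed, but the callee
/// must be given the length and must not read past it.
fn as_ptr_len(s: &str) -> (*const c_char, usize) {
    (s.as_ptr().cast::<c_char>(), s.len())
}

/// CString style: an owned, NUL-terminated copy. Construction fails if the
/// input contains an interior NUL byte.
fn as_cstring(s: &str) -> Option<CString> {
    CString::new(s).ok()
}
```

With the `CString` form, the owned value has to stay alive for as long as the pointer obtained from `as_ptr()` is in use by the C++ side.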

`target.rs` — DIFFERENT (~20 diff lines)

Target specification.

Difference	V7	V19
`DATA_LAYOUT`	no i128	adds `i128:128:128`
Default CPU	`sm_30`	`sm_120`

`ty.rs` — DIFFERENT (~60 diff lines)

Type handling.

Difference	V7	V19
Pointer creation	`LLVMPointerType`	`LLVMPointerTypeInContext`

`consts.rs` — DIFFERENT (~40 diff lines)

Constant value handling. Minor API adjustments.

`const_ty.rs` — DIFFERENT (~15 diff lines)

Constant type utilities. Minor differences.

`attributes.rs` — DIFFERENT (~30 diff lines)

LLVM attribute handling. Minor API adjustments.

Linking & LTO

`link.rs` — DIFFERENT (~100 diff lines)

Linking and output generation.

Difference	V7	V19
PTX filtering	uses `PtxFilter`	removed
Metadata	`metadata` param in `link_crate`	uses `codegen_results.metadata`
`file_for_writing`	4 params	3 params

`lto.rs` — DIFFERENT (~50 diff lines)

Link-time optimization. Minor API adjustments.

Other Files

`nvvm.rs` — DIFFERENT (~150 diff lines)

NVVM (NVIDIA's LLVM fork) interface.

Difference	V7	V19
IR version check	`major <= 1 && minor < 6`	`ir_major != 2 || ir_minor != 0`
Pass manager	`LLVMCreatePassManager` + `LLVMAddGlobalDCEPass`	`LLVMRunPasses` with string
`LLVMRustParseBitcodeForLTO`	5 params	4 params (removed one)
Import style	`use crate::llvm::*`	explicit imports
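The two IR version conditions from the table can be written side by side as plain predicates. The conditions are copied from the table as written; the function names are hypothetical, and treating each condition as the "reject this IR version" branch is an assumption, since the table shows only the expressions.

```rust
/// V7 column: `major <= 1 && minor < 6`.
/// ASSUMPTION: true means the reported NVVM IR version is rejected.
fn v7_check_fires(major: i32, minor: i32) -> bool {
    major <= 1 && minor < 6
}

/// V19 column: `ir_major != 2 || ir_minor != 0`.
/// ASSUMPTION: true means the reported NVVM IR version is rejected.
fn v19_check_fires(ir_major: i32, ir_minor: i32) -> bool {
    ir_major != 2 || ir_minor != 0
}
```

Under that reading, the V19 check accepts exactly IR version 2.0, while the V7 check only rejects 1.x versions below 1.6.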

`init.rs` — DIFFERENT (~10 diff lines)

LLVM initialization.

Difference	V7	V19
`LLVMInitializePasses()`	called	removed (TODO comment)

`allocator.rs` — DIFFERENT (~50 diff lines)

Allocator shim generation.

Difference	V7	V19
Pointer type	`LLVMPointerType(i8, 0)`	`LLVMPointerTypeInContext(llcx, 0)`
`LLVMRustBuildCall`	5 params	7 params (added ty, bundles)

`mono_item.rs` — DIFFERENT (~20 diff lines)

Monomorphization item handling.

Difference	V7	V19
`predefine_static`	`&mut self`	`&self`
`predefine_fn`	`&mut self`	`&self`
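A `&mut self` to `&self` change like this usually means the mutable state moved behind interior mutability. The sketch below shows that general pattern with `RefCell`; the `Cx` struct, its field, and the returned index are hypothetical and do not mirror the real `CodegenCx`.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Stand-in for a codegen context that records predefined statics.
struct Cx {
    statics: RefCell<HashMap<String, u32>>,
}

impl Cx {
    fn new() -> Self {
        Cx { statics: RefCell::new(HashMap::new()) }
    }

    /// Takes `&self`: the map is mutated through the RefCell, so callers
    /// only need a shared reference to the context.
    fn predefine_static(&self, name: &str) -> u32 {
        let mut statics = self.statics.borrow_mut();
        let next = statics.len() as u32;
        *statics.entry(name.to_string()).or_insert(next)
    }
}
```

Calling `predefine_static` twice with the same name returns the same index, and no caller needs exclusive access to the context.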

`override_fns.rs` — DIFFERENT (~40 diff lines)

Function override logic for libm.

Difference	V7	V19
`define_or_override_fn`	`cx: &mut CodegenCx`	`cx: &CodegenCx`
`MonoItem::define`	4 params with `MonoItemData`	2 params
Closure check	none	skips closures

`asm.rs` — DIFFERENT (~15 diff lines)

Inline assembly handling. Minor differences.

`ctx_intrinsics.rs` — DIFFERENT (~10 diff lines)

Context intrinsics. Minor import differences.

`int_replace.rs` — DIFFERENT (~5 diff lines)

Integer type replacement. Very minor differences (likely just formatting).

V7-Only Files

`ptx_filter.rs` — V7 ONLY (~14,000 lines!)

PTX disassembly and filtering functionality. Completely removed in V19.

Features:

- PTX parsing and filtering
- Function/global disassembly
- Entry point filtering
- Pretty printing

debug_info/ DIFFERENT FILES

`mod.rs` — DIFFERENT (~30 diff lines)

Debug info module root. Minor API adjustments.

`metadata.rs` — DIFFERENT (~50 diff lines)

Debug metadata handling. LLVM metadata API differences.

`metadata/enums.rs` — DIFFERENT (~20 diff lines)

Enum debug info. Minor differences.

Conclusion

Why These Files Differ

LLVM API Changes (7→19)
- Opaque pointers (LLVMPointerType → LLVMPointerTypeInContext)
- Typed instructions (LLVMBuildLoad → LLVMBuildLoad2)
- New pass manager (PassManagerBuilder → PassBuilder)
- Extended type system (f16, f128, new TypeKind variants)

rustc_codegen_ssa Evolution
- Trait signatures changed (&mut self → &self)
- Method signatures changed (extra parameters)
- Import path changes

Feature Changes
- V7 has PTX disassembly (ptx_filter.rs)
- V19 removed disassembly features
- V19 targets newer CUDA architectures (sm_120 vs sm_30)

Files That Could Theoretically Be Shared

Only these 7 files are truly identical:

- common.rs
- ptxgen.rs
- debug_info/create_scope_map.rs
- debug_info/dwarf_const.rs
- debug_info/namespace.rs
- debug_info/util.rs
- debug_info/metadata/type_map.rs

All other files have embedded LLVM version-specific calls or rustc API differences.

LegNeato reviewed Jun 10, 2025

brandonros closed this Jun 10, 2025

brandonros reopened this Jun 15, 2025

brandonros force-pushed the llvm-19 branch 4 times, most recently from 3cac1eb to f50708b Compare June 15, 2025 01:14

brandonros closed this Jun 22, 2025

LegNeato reopened this Aug 1, 2025

brandonros force-pushed the llvm-19 branch from 52791ef to 80d972b Compare August 2, 2025 14:53

brandonros added 2 commits August 2, 2025 10:54

Fix read_volatile intrinsic

f4cfaaa

support sm_100 and llvm v19

c2a4471

brandonros force-pushed the llvm-19 branch from 80d972b to c2a4471 Compare August 2, 2025 14:58

LegNeato mentioned this pull request Nov 7, 2025

Support CUDA 13 #299

Closed
