LLVM and SPIRV-LLVM-Translator pulldown (WW20 2026) #22020
Draft
iclsrc wants to merge 2098 commits into
Summary: Right now it's a little difficult to use the multilibs support because the user must manually provide a multilib configuration. I believe that when the user configures multilibs with the LLVM CMake arguments, we should at a minimum provide one that forwards `-fmultilib-flag=<multilib>` to the created runtime. This PR makes CMake emit this by manually writing a flag. Because users could provide their own, this adds some extra complexity to prevent theirs from being overwritten. The goal of this change is to ship this support more easily in CMake configuration files without needing to write files manually (for the typical case).
### Summary

Part of llvm/llvm-project#185382. This is a follow-up to llvm/llvm-project#193658.

Lower the zip1 and zip2 intrinsics in https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements

All the intrinsics are handled inline in `llvm-project/build/lib/clang/23/include/arm_neon.h` like:

```
#ifdef __LITTLE_ENDIAN__
__ai __attribute__((target("neon"))) int8x8_t vzip1_s8(int8x8_t __p0, int8x8_t __p1) {
  int8x8_t __ret;
  __ret = __builtin_shufflevector(__p0, __p1, 0, 8, 1, 9, 2, 10, 3, 11);
  return __ret;
}
#else
__ai __attribute__((target("neon"))) int8x8_t vzip1_s8(int8x8_t __p0, int8x8_t __p1) {
  int8x8_t __ret;
  int8x8_t __rev0; __rev0 = __builtin_shufflevector(__p0, __p0, __lane_reverse_64_8);
  int8x8_t __rev1; __rev1 = __builtin_shufflevector(__p1, __p1, __lane_reverse_64_8);
  __ret = __builtin_shufflevector(__rev0, __rev1, 0, 8, 1, 9, 2, 10, 3, 11);
  __ret = __builtin_shufflevector(__ret, __ret, __lane_reverse_64_8);
  return __ret;
}
#endif
```

So no additional special lowering logic is needed.
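As a rough illustration of the index pattern quoted above, the following Python sketch models a two-input shuffle on plain lists. The helper names (`shufflevector`, `zip_indices`, `zip1`, `zip2`) are hypothetical and only mirror the little-endian path; the upper-half (`zip2`) indices are derived by analogy and are not quoted in the commit message.

```python
def shufflevector(a, b, *indices):
    """Model of a two-input shuffle: index i selects from the concatenation a + b."""
    joined = list(a) + list(b)
    return [joined[i] for i in indices]


def zip_indices(lanes, upper):
    """Interleave indices for the lower (zip1) or upper (zip2) half of each input."""
    start = lanes // 2 if upper else 0
    out = []
    for i in range(start, start + lanes // 2):
        out += [i, lanes + i]
    return out


def zip1(a, b):
    # Interleaves the lower halves of a and b.
    return shufflevector(a, b, *zip_indices(len(a), upper=False))


def zip2(a, b):
    # Interleaves the upper halves of a and b (assumed counterpart of zip1).
    return shufflevector(a, b, *zip_indices(len(a), upper=True))


a = [0, 1, 2, 3, 4, 5, 6, 7]
b = [10, 11, 12, 13, 14, 15, 16, 17]
print(zip1(a, b))
print(zip2(a, b))
```

For eight lanes, `zip_indices(8, upper=False)` reproduces the `0, 8, 1, 9, 2, 10, 3, 11` list used by `vzip1_s8`.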
- Normalize the header syntax for ReleaseNotes (current `.md` file and `ReleaseNotesTemplate.txt`) to use `#`-based headings
- Normalize indents to distinguish doc title from page headers

Fixes navigation indents for Furo theme update (see llvm/llvm-project#184440).
The SetVector already ensures that there are no cycles in the collection.
…4790)

- Remove UNSUPPORTED: intelgpu from 12 passing tests:
  * mapping/data_member_ref.cpp
  * offloading/bug50022.cpp, info.c
  * offloading/target_critical_region.cpp, target_depend_nowait.cpp, target_nowait_target.cpp
  * offloading/strided_update/* (6 tests)
  * unified_shared_memory/close_member.c
- Change CUDA tests from XFAIL to UNSUPPORTED for Intel GPU:
  * offloading/CUDA/basic_launch.cu
  * offloading/CUDA/basic_launch_blocks_and_threads.cu
  * offloading/CUDA/basic_launch_multi_arg.cu
  * offloading/CUDA/launch_tu.cu
- Add Intel GPU configuration section to lit.cfg to disable USM tests by default
…#194610) This PR implements the refactorings discussed with @localspook in #193838 --------- Co-authored-by: Victor Chernyakin <chernyakin.victor.j@outlook.com>
z/OS has a table of mapped names in the IR. Counting the hits for just the name leads to one more hit than expected. Search for the name with the @ char to make sure the right occurrences are being counted.
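The difference between counting the bare name and counting the `@`-prefixed symbol can be sketched in Python. The IR-like text below is made up for illustration (it is not the actual z/OS mapped-names table), and `count_symbol` is a hypothetical helper.

```python
import re


def count_symbol(ir_text, name):
    """Count occurrences of the IR global symbol `@name`, not the bare name."""
    # (?![\w.$]) keeps `@foo` from matching a longer symbol such as `@foobar`.
    pattern = re.compile(r"@" + re.escape(name) + r"(?![\w.$])")
    return len(pattern.findall(ir_text))


# Hypothetical IR-like text: the bare name also appears once in a name table.
ir = '''
!0 = !{!"foo", !"FOO#C"}
define void @foo() {
  call void @foo()
  ret void
}
'''
print(ir.count("foo"), count_symbol(ir, "foo"))
```

Here the bare-name count includes the table entry, while the `@`-anchored count only sees the definition and the call.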
…#194648) Fixes llvm/llvm-project#194596. When the function result symbol is encountered while the compiler is already completing the function result type, flang could recursively re-enter _CompleteFunctionResultType()_ and crash on invalid code. Instead of crashing on conflicting declarations, flang now reports an “already declared” error and stops further recursion.
Handle AVX-512 VGF2P8AFFINEQB rmbi instructions in X86MCInstLower. Unlike the existing rmi forms, rmbi uses a 64-bit broadcast memory operand, so the constant pool entry may only contain the broadcast source instead of a full-width vector constant. Print that constant repeated across the destination vector width when forming the asm comment. Related: llvm/llvm-project#194572
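A minimal sketch of the repetition step, assuming the constant pool entry holds only the 64-bit broadcast source. The function names and the bracketed comment format are illustrative and do not reproduce the actual X86MCInstLower output.

```python
def repeat_broadcast(element: bytes, vector_bits: int) -> bytes:
    """Repeat a broadcast source element until it fills a vector of vector_bits."""
    element_bits = len(element) * 8
    if element_bits == 0 or vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return element * (vector_bits // element_bits)


def format_comment(element: bytes, vector_bits: int) -> str:
    """Render the repeated bytes as a bracketed, comma-separated list."""
    repeated = repeat_broadcast(element, vector_bits)
    return "[" + ",".join(str(b) for b in repeated) + "]"


qword = bytes([1, 2, 4, 8, 16, 32, 64, 128])  # one 64-bit broadcast source
print(format_comment(qword, 128))
```

A 512-bit destination would repeat the same eight bytes eight times.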
…tributes (#194726) Replace `getAsInteger()` parsing of the `patchable-function-entry` and `patchable-function-prefix` function attributes with the existing `Function::getFnAttributeAsParsedInteger()` helper across AsmPrinter and all backend targets. The IR verifier already validates these attributes as unsigned base-10 integers via `checkUnsignedBaseTenFuncAttr`, so parse failure at point of use indicates a verifier bypass or IR corruption. `getFnAttributeAsParsedInteger()` returns a default of 0 on failure (matching the implicit behavior of the old code) and emits a diagnostic rather than silently continuing.
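The "default on failure plus diagnostic" behavior described above can be modeled in a few lines of Python. This is a sketch of the described semantics, not the LLVM helper itself; `parse_unsigned_attr`, the dictionary of attributes, and the diagnostic wording are all hypothetical.

```python
import sys


def parse_unsigned_attr(attrs, name, default=0):
    """Return attrs[name] as an unsigned base-10 integer, or default on failure."""
    if name not in attrs:
        return default
    text = attrs[name]
    # Only ASCII decimal digits are accepted: no sign, whitespace, or 0x prefix.
    if text.isascii() and text.isdigit():
        return int(text)
    # Report the problem instead of silently continuing, then fall back.
    print(f"cannot parse integer attribute {name}: {text!r}", file=sys.stderr)
    return default


attrs = {"patchable-function-entry": "2", "patchable-function-prefix": "-1"}
print(parse_unsigned_attr(attrs, "patchable-function-entry"))
print(parse_unsigned_attr(attrs, "patchable-function-prefix"))
```

The second call falls back to 0 and writes a message to stderr, which is the case the commit message attributes to a verifier bypass or IR corruption.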
Add operations that follow `float op(float, int)` pattern, mirroring the existing `spirv.GL.Ldexp` op
The constexpr functions in question take a scoped enum as an argument and a switch statement returns a value for each value of the enum. These are all legal statements in a constexpr function in C++14. Under constexpr rules, the evaluation of a constexpr function cannot lead to an evaluation of any prohibited forms of expressions. An evaluation of the functions being discussed with a valid argument will terminate at the switch, and any code that follows will not be evaluated. Using "llvm_unreachable" after the switch should be ok as long as the expansion of the llvm_unreachable macro does not contain any statements not allowed to appear in a constexpr function. At the same time, GCC before v9 did not tolerate any unguarded calls to non-constexpr functions after the switch. To avoid using "llvm_unreachable", which can have multiple expansions, use an assert with an explicit condition that the underlying value of the argument lies between the minimum and maximum values of the enum.
Pulled out of #194473 - update combineMinMaxReduction to fold to an ISD::VECREDUCE_SMAX/SMIN/UMAX/UMIN node and then perform the lowering later on. combineMinMaxReduction will go away once we can use shouldExpandReduction, rely on the middle-end to recognise reductions and not have to recreate them from the expanded patterns. I've added pre-SSE41 handling using vector unrolling - hopefully this will go away once #194672 is in place.
PR #194368 changed how line breaks are handled on Windows and it broke several libcxx tests on Windows, including libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp. This patch addresses this issue.
### Summary

Part of llvm/llvm-project#185382.

Lower part of the intrinsics in https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements

Lower NEON::BI__builtin_neon_vzip_v and NEON::BI__builtin_neon_vzipq_v in CIRGenBuiltinAArch64.cpp by porting the existing incubator logic (`clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp`) onto ClangIR: two bitcasts on the input vectors, two rounds of cir.vec.shuffle generating the low/high interleave patterns, each stored through a ptr_stride of the sret base pointer.

### Test

- test_vzip_mf8
- test_vzipq_mf8

I found that these two intrinsics are defined in `llvm-project/clang/test/CodeGen/AArch64/fp8-intrinsics/acle_neon_fp8_untyped.c`, but this file seems to be a test suite specifically for the `mfloat8` type, so I did not remove their original test cases.

Some of the new CHECK lines additionally match a pair of bitcasts before the shuffle; this shape comes from arm_neon.h's inline wrappers, which re-cast typed vectors (e.g. <4 x i16>) through <8 x i8> before calling __builtin_neon_vzip_v. Variants whose element type is already i8 (s8/u8/p8/mf8) skip that round-trip and therefore have no bitcasts in the check lines.
Cache root entry and SLPCostThreshold queries once, group !ForReduction-only checks under two blocks, extract a shared benign-node predicate from the two duplicated lambdas, and skip HasSingleLoad and allConstant work when results are dead. Reviewers: Pull Request: llvm/llvm-project#194895
…els (#194754) As it turns out, even if a `ProcResGroup` consists of in-order pipes, as long as its (the group's) BufferSize is not zero, Machine Scheduler will not use in-order scheduling on instructions that consume it. Since BufferSize also defaults to -1 for `ProcResGroup`, we have been scheduling the resource consumption of SiFive7's `PipeAB` (scalar pipes) and `VA1OrVA2` (vector pipes) in an out-of-order fashion! Co-authored-by: Min Hsu <min.hsu@sifive.com>
… scalar remainder (#190258)

Add two new loop metadata attributes, `llvm.loop.vectorize.body` and `llvm.loop.vectorize.epilogue`, that the loop vectorizer sets on the generated vector loop and epilogue loop respectively. The metadata is only emitted when optimization remarks are enabled (`ORE->enabled()`), so it has zero cost in normal compilation.

These enable downstream passes (LoopUnroll, WarnMissedTransforms) to produce more precise optimization remarks. Instead of the generic "loop not unrolled" warning on a source line that was vectorized, the unroller can now report:

- **"vectorized loop"** for the main vector body
- **"epilogue loop"** for the scalar epilogue/remainder
- **"epilogue vectorized loop"** for an epilogue that was itself vectorized during epilogue vectorization (carries both attributes)

A shared `getLoopVectorizeKindPrefix()` helper in `LoopUtils.h`/`LoopUtils.cpp` reads the metadata and returns the appropriate prefix string, used by both `LoopUnroll.cpp` and `WarnMissedTransforms.cpp`. The metadata emission in `VPlan.cpp` uses `Loop::addIntLoopAttribute` from the NFC PR #194676.

Two end-to-end tests exercise the full `loop-vectorize → loop-unroll` pipeline with forced epilogue vectorization (`-enable-epilogue-vectorization -epilogue-vectorization-force-VF=4`) to produce all four loop categories from a single vectorizable function. Each test also includes a plain (non-vectorized) function to cover the baseline "loop" case. Both tests verify stderr diagnostic output and YAML structured remarks.

- **`LoopUnroll/vectorizer-loop-kind-remarks.ll`** checks for successful-unroll remarks.
- **`LoopTransformWarning/vectorizer-loop-kind-unroll-warning.ll`** checks for failed-unroll warnings.

AI Disclaimer: this patch was generated with assistance of GitHub Copilot/Claude Opus and reviewed by a human.
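The mapping from the two attributes to a remark prefix can be sketched as below. This is a Python model of the described behavior, assuming a loop's attributes are available as a set of names; `loop_kind_prefix` is a hypothetical stand-in and not the `getLoopVectorizeKindPrefix()` implementation.

```python
BODY = "llvm.loop.vectorize.body"
EPILOGUE = "llvm.loop.vectorize.epilogue"


def loop_kind_prefix(loop_attrs):
    """Pick the remark prefix from the set of attribute names on a loop."""
    has_body = BODY in loop_attrs
    has_epilogue = EPILOGUE in loop_attrs
    if has_body and has_epilogue:
        # An epilogue that was itself vectorized carries both attributes.
        return "epilogue vectorized loop"
    if has_epilogue:
        return "epilogue loop"
    if has_body:
        return "vectorized loop"
    # Baseline case: a loop the vectorizer did not touch.
    return "loop"


for attrs in (set(), {BODY}, {EPILOGUE}, {BODY, EPILOGUE}):
    print(sorted(attrs), "->", loop_kind_prefix(attrs))
```

Checking the combined case first matters, since a loop with both attributes would otherwise be reported as a plain epilogue or vector body.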
…669) Set the debug location on non-target constant nodes so that the resulting machine instructions inherit the correct source location.
Prevent combinePExtTruncate from forming RISCVISD nodes with illegal type. Remove unnecessary call to getSimpleVT(). Legalize shift amount when custom legalizing i64 shifts. SelectionDAGBuilder usually pre-legalizes shift amounts. If we scalarize a vXi64 vector shift the shift amount will be i64.
This allows us to strip DerivedIVRecipe::execute, and remove the dependency on emitTransformedIndex. It allows us to benefit from existing simplifications in VPlan.
Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off vp_truncate from #179622.
Reviewers: Pull Request: llvm/llvm-project#194903
…ody from scalar remainder (#190258)" (#194901) Reverts llvm/llvm-project#190258 This commit is causing crashes on the `intel-sycl-gpu` buildbot: https://lab.llvm.org/buildbot/#/builders/225/builds/7157 The crash is a SEGFAULT in `LoopVectorizationPlanner::updateLoopMetadataAndProfileInfo` when optimization remarks are enabled (`-pass-remarks-analysis=loop-vectorize`). Reverting while investigating the root cause.
Support Ternary expression for VectorType with vector condition Issue #192311
* Add the usual Windows static library name "libxml2s" * Windows build with static libxml2 requires compiler define
Introduce a dedicated cleanup flag for SEH __finally blocks and use it to separate SEH try cleanup emission from C++ object cleanup emission. This prevents __finally cleanups from emitting seh.scope.begin/end and keeps destructor/delete cleanups paired with seh.scope markers. Fix #109576
CONFLICT (content): Merge conflict in llvm/lib/SYCLLowerIR/CMakeLists.txt
…95131) When debugging PExpect tests, the 60 second timeout can make that process rather tedious. For TestStatusline, I used a class variable to easily override it while iterating but the idea is applicable more generally.
…nts" (#195135) Reverts llvm/llvm-project#190607 Causes crashes, e.g. https://lab.llvm.org/buildbot/#/builders/10/builds/27641
CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
CONFLICT (content): Merge conflict in clang/include/clang/Options/Options.td
CONFLICT (content): Merge conflict in clang/lib/CodeGen/TargetInfo.cpp CONFLICT (content): Merge conflict in clang/lib/Sema/SemaSYCL.cpp
All DebugInfo specs (core SPIR-V, OpenCL.DebugInfo.100, NonSemantic.Shader.DebugInfo.*) require the `ReturnType` operand of `DebugTypeFunction` to reference a valid type. For void-returning functions the translator was emitting `DebugInfoNone` instead of `VoidTy`. Also fix the same issue in existing reverse-translation test inputs that were generated by the Writer with the bug in place. `spirv-val` on the NonSemantic path cannot be enabled yet due to other spec-compliance issues to be fixed in subsequent changes. AI-assisted: Claude Sonnet 4.6 (commercial SaaS) Original commit: KhronosGroup/SPIRV-LLVM-Translator@a94d7338ab07198
The headers for this extension were published so we should use them instead: KhronosGroup/SPIRV-Headers@b8a3296 Co-Authored-By: Claude Original commit: KhronosGroup/SPIRV-LLVM-Translator@5a8baf6cd2b573f
The SPV_KHR_abort extension introduces the `OpAbortKHR` instruction, allowing shaders to terminate execution early. Assisted-by: Claude Opus 4.7 <noreply@anthropic.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@c5dbe5662a56e57
Add `spirv.nontemporal` metadata for image operations that have a `Nontemporal` image operand, such that the nontemporal information is preserved in the LLVM IR. Original commit: KhronosGroup/SPIRV-LLVM-Translator@8324c559351d1da
The current approach with `not` causes the spirv-val step to fail on recent enough SPIRV-Tools versions. Avoid running spirv-val at all until a new SPIRV-Tools release is more widely available. Original commit: KhronosGroup/SPIRV-LLVM-Translator@c1c1f797546f38a
The IR Verifier started rejecting some IR after llvm-project commit 81518d0 ("[DebugInfo] Verify DISubprogram has a type (#194556)", 2026-05-04). Update some .ll tests in the same way the llvm-project commit did. Make sure that a `DISubprogram` always has a non-null type by handling the case where the SPIR-V has a `DebugInfoNone` for the `DebugTypeFunction`. Original commit: KhronosGroup/SPIRV-LLVM-Translator@d62c927753b580f
…sts (#3717) Enable tests that pass on the latest LLVM revision Original commit: KhronosGroup/SPIRV-LLVM-Translator@8f759502ce0d6fa
This document describes the already existing mechanism to preserve some LLVM constructs in SPIR-V during translation without a formal extension. It references some public LLVM APIs (which is unusual for a formal SPIR-V specification), but this is quite useful to capture what llvm-spirv does. On top of the existing functionality, this adds a way to preserve available_externally after the round trip. The current implementation notes were captured with help from Claude Code Opus 4.7. Original commit: KhronosGroup/SPIRV-LLVM-Translator@bb8538937edc78e
Update floating point literal FileCheck lines after llvm/llvm-project@41c214f0b115 ("[AsmWriter] Change the output syntax of floating-point literals. (#190649)", 2026-05-07). Original commit: KhronosGroup/SPIRV-LLVM-Translator@e6a669945c3ee87
The NonSemantic.Shader.DebugInfo spec requires the Flags operand of DebugTypeBasic to be a 32-bit integer constant. The translator was emitting `DebugInfoNone` instead of `OpConstant 0`. `spirv-val` on the NonSemantic path cannot be enabled yet due to other spec-compliance issues to be fixed in subsequent changes. Original commit: KhronosGroup/SPIRV-LLVM-Translator@28cfd130468a072
This follows the doc added in KhronosGroup/SPIRV-LLVM-Translator#3734. Now only AvailableExternally is translated. Original commit: KhronosGroup/SPIRV-LLVM-Translator@c2ba98f7c95142b
Do nothing instead of crashing/erroring out. Original commit: KhronosGroup/SPIRV-LLVM-Translator@1406f97973f1868
NonSemantic.Shader.DebugInfo.100/.200 require the Source operand of DebugLine to reference a DebugSource instruction, not a raw OpString ID (DIFile in LLVM IR). Use getSource() to emit a proper DebugSource reference. Enable spirv-val on the NonSemantic paths of DebugInfoTypeBasic.ll and InlineNamespace.ll now that all previous spec-compliance fixes are in place. AI-assisted: Claude Sonnet 4.6 (commercial SaaS) Original commit: KhronosGroup/SPIRV-LLVM-Translator@38a315ed3cc0d91
…tion (#3661) Align the translator with the LLVM SPIR-V backend behavior: undef/poison initializers are dropped unless the variable is constant with an aggregate type. This avoids emitting unnecessary OpUndef initializers in the generated SPIR-V. Original commit: KhronosGroup/SPIRV-LLVM-Translator@f48615b9415768e
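The rule stated above reduces to a small predicate. The sketch below is a Python model under the assumption that the initializer kind and the variable's properties are already known; `keeps_initializer` and its parameters are hypothetical names, not translator APIs.

```python
def keeps_initializer(init_kind, is_constant, is_aggregate):
    """Decide whether a global variable's initializer is emitted.

    init_kind is a string such as "undef", "poison", or "value".
    """
    if init_kind in ("undef", "poison"):
        # Dropped unless the variable is constant with an aggregate type.
        return is_constant and is_aggregate
    # Any other initializer is kept as-is.
    return True


print(keeps_initializer("undef", is_constant=False, is_aggregate=True))
print(keeps_initializer("poison", is_constant=True, is_aggregate=True))
```

Only the constant-aggregate case retains an undef or poison initializer; every other undef or poison initializer is omitted.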
Spec is available here: KhronosGroup/SPIRV-Registry#401 Original commit: KhronosGroup/SPIRV-LLVM-Translator@ba34cbb62a9cbec
LLVM: llvm/llvm-project@bc325ec
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@ba34cbb62a9cbec