
LLVM and SPIRV-LLVM-Translator pulldown (WW20 2026)#22020

Draft
iclsrc wants to merge 2098 commits into sycl from llvmspirv_pulldown

@iclsrc iclsrc commented May 14, 2026

jhuber6 and others added 30 commits April 29, 2026 07:54
Summary:
Right now it's a little difficult to use the multilibs support because
the user must manually provide one. I believe that when the user
configures multilibs with the LLVM CMake arguments, we should at a
minimum provide one that forwards `-fmultilib-flag=<multilib>` to the
created runtime.

This PR makes CMake emit this by manually writing a flag. Because users
could provide their own, this adds some extra complexity to prevent this
from being overwritten.

The desire for this change is to more easily ship this support in CMake
configuration files without needing to write files manually (for the
typical case).
### Summary

Part of: llvm/llvm-project#185382

This is a follow-up: llvm/llvm-project#193658

Lower zip1 and zip2 intrinsics in
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements

All the intrinsics are handled inline in
`llvm-project/build/lib/clang/23/include/arm_neon.h` like:
```
#ifdef __LITTLE_ENDIAN__
__ai __attribute__((target("neon"))) int8x8_t vzip1_s8(int8x8_t __p0, int8x8_t __p1) {
  int8x8_t __ret;
  __ret = __builtin_shufflevector(__p0, __p1, 0, 8, 1, 9, 2, 10, 3, 11);
  return __ret;
}
#else
__ai __attribute__((target("neon"))) int8x8_t vzip1_s8(int8x8_t __p0, int8x8_t __p1) {
  int8x8_t __ret;
  int8x8_t __rev0;  __rev0 = __builtin_shufflevector(__p0, __p0, __lane_reverse_64_8);
  int8x8_t __rev1;  __rev1 = __builtin_shufflevector(__p1, __p1, __lane_reverse_64_8);
  __ret = __builtin_shufflevector(__rev0, __rev1, 0, 8, 1, 9, 2, 10, 3, 11);
  __ret = __builtin_shufflevector(__ret, __ret, __lane_reverse_64_8);
  return __ret;
}
#endif
```
So no additional special lowering logic is needed.
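To make the little-endian shuffle mask above concrete, here is a minimal scalar sketch of what `(0, 8, 1, 9, 2, 10, 3, 11)` selects. The function name `zip1_model` and the use of `std::array` are illustrative assumptions, not part of `arm_neon.h`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of the shuffle mask (0, 8, 1, 9, 2, 10, 3, 11):
// mask indices 0-7 select lanes of `a`, indices 8-15 select lanes of `b`.
// `zip1_model` is a hypothetical name used only for illustration.
std::array<std::int8_t, 8> zip1_model(const std::array<std::int8_t, 8> &a,
                                      const std::array<std::int8_t, 8> &b) {
  std::array<std::int8_t, 8> out{};
  for (std::size_t i = 0; i < 4; ++i) {
    out[2 * i] = a[i];     // even result lanes come from the low half of a
    out[2 * i + 1] = b[i]; // odd result lanes come from the low half of b
  }
  return out;
}
```

The result interleaves the low four lanes of each input, which is why a plain `__builtin_shufflevector` is enough to express the intrinsic.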
- Normalize the header syntax for ReleaseNotes (current `.md` file and
`ReleaseNotesTemplate.txt`) to use `#`-based headings
- Normalize indents to distinguish doc title from page headers

Fixes navigation indents for Furo theme update (see
llvm/llvm-project#184440).
The SetVector already ensures that there are no cycles in the
collection.
…4790)

- Remove UNSUPPORTED: intelgpu from 12 passing tests:
  * mapping/data_member_ref.cpp
  * offloading/bug50022.cpp, info.c
  * offloading/target_critical_region.cpp, target_depend_nowait.cpp,
    target_nowait_target.cpp
  * offloading/strided_update/* (6 tests)
  * unified_shared_memory/close_member.c

- Change CUDA tests from XFAIL to UNSUPPORTED for Intel GPU:
  * offloading/CUDA/basic_launch.cu
  * offloading/CUDA/basic_launch_blocks_and_threads.cu
  * offloading/CUDA/basic_launch_multi_arg.cu
  * offloading/CUDA/launch_tu.cu

- Add Intel GPU configuration section to lit.cfg to disable USM tests by
default
…#194610)

This PR implements the refactorings discussed with @localspook in
#193838

---------

Co-authored-by: Victor Chernyakin <chernyakin.victor.j@outlook.com>
z/OS has a table of mapped names in the IR. Counting the hits for just
the name leads to one more hit than expected. Search for the name with
the @ char to make sure the right occurrences are being counted.
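The off-by-one described above can be illustrated with a small substring counter. This is only a sketch: `count_occurrences` is a hypothetical helper, and the actual test presumably expresses the check through its own matching patterns rather than C++.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `needle` in `text`.
// Hypothetical helper, used only to show why searching for "@name"
// excludes a bare "name" entry in a table of mapped names.
std::size_t count_occurrences(const std::string &text,
                              const std::string &needle) {
  if (needle.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size()))
    ++count;
  return count;
}
```

With an input that contains two `@foo` references plus one bare `foo` table entry, searching for `foo` yields three hits, while searching for `@foo` yields the expected two.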
…#194648)

Fixes llvm/llvm-project#194596.

When the function result symbol is encountered while the compiler is
already completing the function result type, flang could recursively
re-enter _CompleteFunctionResultType()_ and crash on invalid code.

Instead of crashing on conflicting declarations, flang now reports an
“already declared” error and stops further recursion.
Handle AVX-512 VGF2P8AFFINEQB rmbi instructions in X86MCInstLower.

Unlike the existing rmi forms, rmbi uses a 64-bit broadcast memory
operand, so the constant pool entry may only contain the broadcast
source instead of a full-width vector constant. Print that constant
repeated across the destination vector width when forming the asm
comment.

Related: llvm/llvm-project#194572
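As a rough illustration of the comment-printing idea, the sketch below repeats a single 64-bit broadcast element across a destination vector width. The name `splat_broadcast` and its signature are assumptions made for this example and do not correspond to code in X86MCInstLower.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Repeat one 64-bit broadcast element until it fills a vector that is
// `vector_bits` wide. Assumes `vector_bits` is a multiple of 64.
// `splat_broadcast` is a hypothetical name used only for illustration.
std::vector<std::uint64_t> splat_broadcast(std::uint64_t element,
                                           std::size_t vector_bits) {
  const std::size_t lanes = vector_bits / 64; // number of 64-bit lanes
  return std::vector<std::uint64_t>(lanes, element);
}
```

For a 512-bit destination this produces eight copies of the element, which is the shape the asm comment needs even though the constant pool entry holds only one.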
…tributes (#194726)

Replace `getAsInteger()` parsing of the `patchable-function-entry`
and `patchable-function-prefix` function attributes with the existing
`Function::getFnAttributeAsParsedInteger()` helper across AsmPrinter
and all backend targets.

The IR verifier already validates these attributes as unsigned base-10
integers via `checkUnsignedBaseTenFuncAttr`, so parse failure at point
of use indicates a verifier bypass or IR corruption.
`getFnAttributeAsParsedInteger()` returns a default of 0 on failure
(matching the implicit behavior of the old code) and emits a diagnostic
rather than silently continuing.
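The parse-or-default behavior described above can be sketched in standalone C++17 as follows. `parse_unsigned_or_default` is a hypothetical stand-in, not the LLVM helper itself; it reports failure through an optional flag where the real helper is described as emitting a diagnostic.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Parse `text` as an unsigned base-10 integer. On any failure (empty
// input, non-digit characters, overflow) return `fallback` and, if
// `failed` is non-null, set it to true so the caller can diagnose.
// Hypothetical helper for illustration only.
std::uint64_t parse_unsigned_or_default(std::string_view text,
                                        std::uint64_t fallback = 0,
                                        bool *failed = nullptr) {
  bool ok = false;
  std::uint64_t value = 0;
  if (!text.empty()) {
    const char *first = text.data();
    const char *last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value, 10);
    ok = (ec == std::errc()) && (ptr == last); // whole string consumed
  }
  if (failed)
    *failed = !ok;
  return ok ? value : fallback;
}
```

Returning the default keeps the old implicit behavior for callers, while the failure flag makes the otherwise silent case observable.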
Add operations that follow `float op(float, int)` pattern, mirroring the
existing `spirv.GL.Ldexp` op
The constexpr functions in question take a scoped enum as an argument
and a switch statement returns a value for each value of the enum. These
are all legal statements in a constexpr function in C++14.

Under constexpr rules, the evaluation of a constexpr function cannot
lead to an evaluation of any prohibited forms of expressions. An
evaluation of the functions being discussed with a valid argument will
terminate at the switch, and any code that follows will not be evaluated.

Using "llvm_unreachable" after the switch should be ok as long as the
expansion of the llvm_unreachable macro does not contain any statements
not allowed to appear in a constexpr function. At the same time, GCC
before v9 did not tolerate any unguarded calls to non-constexpr
functions after the switch.

To avoid using "llvm_unreachable", which can have multiple expansions,
use an assert with an explicit condition that the underlying value of
the argument lies between the minimum and maximum values of the enum.
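A minimal sketch of the pattern described above, assuming C++14 or later. The enum `Color` and the function `channelIndex` are invented for illustration and do not appear in the patch.

```cpp
#include <cassert>

// Hypothetical scoped enum with contiguous underlying values.
enum class Color : int { Red = 0, Green = 1, Blue = 2 };

// Every valid argument returns from inside the switch, so the assert
// below is never evaluated during constant evaluation of a valid call.
constexpr int channelIndex(Color c) {
  switch (c) {
  case Color::Red:
    return 0;
  case Color::Green:
    return 1;
  case Color::Blue:
    return 2;
  }
  // Explicit range check on the underlying value instead of an
  // unreachable marker after the switch.
  assert(static_cast<int>(c) >= static_cast<int>(Color::Red) &&
         static_cast<int>(c) <= static_cast<int>(Color::Blue) &&
         "invalid Color value");
  return -1;
}

static_assert(channelIndex(Color::Green) == 1,
              "usable in constant expressions");
```

Because the call that reports a failed assertion sits behind the asserted condition, there is no unguarded call to a non-constexpr function after the switch.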
Pulled out of #194473 - update combineMinMaxReduction to fold to a
ISD::VECREDUCE_SMAX/SMIN/UMAX/UMIN node and then perform the lowering
later on.

combineMinMaxReduction will go away once we can use
shouldExpandReduction, rely on the middle-end to recognise reductions
and not have to recreate them from the expanded patterns.

I've added pre-SSE41 handling using vector unrolling - hopefully this
will go away once #194672 is in place.
PR#194368 changed how line breaks are handled on Windows and it broke
several libcxx tests on Windows, including
libcxx/test/std/localization/locale.categories/facet.numpunct/
locale.numpunct.byname/thousands_sep.pass.cpp
This patch addresses this issue.
### Summary

part of llvm/llvm-project#185382

Lower part of the intrinsics in:
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements

Lower NEON::BI__builtin_neon_vzip_v and NEON::BI__builtin_neon_vzipq_v
in CIRGenBuiltinAArch64.cpp by porting the existing incubator logic
(`clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp`) onto ClangIR:
two bitcasts on the input vectors, two rounds of cir.vec.shuffle
generating the low/high interleave patterns, each stored through a
ptr_stride of the sret base pointer.

### Test
- test_vzip_mf8
- test_vzipq_mf8

I found that these two intrinsics are defined in
`llvm-project/clang/test/CodeGen/AArch64/fp8-intrinsics/acle_neon_fp8_untyped.c`,
but this file seems to be a test suite specifically for the `mfloat8`
type, so I did not remove their original test cases.

Some of the new CHECK lines additionally match a pair of bitcasts before
the shuffle; this shape comes from arm_neon.h's inline wrappers, which
re-cast typed vectors (e.g. <4 x i16>) through <8 x i8> before calling
__builtin_neon_vzip_v. Variants whose element type is already i8
(s8/u8/p8/mf8) skip that round-trip and therefore have no bitcasts in
the check lines.
Cache root entry and SLPCostThreshold queries once, group
!ForReduction-only checks under two blocks, extract a shared benign-node
predicate from the two duplicated lambdas, and skip HasSingleLoad and
allConstant work when results are dead.

Reviewers: 

Pull Request: llvm/llvm-project#194895
…els (#194754)

As it turns out, even if a `ProcResGroup` consists of in-order pipes, as
long as its (the group's) BufferSize is not zero, Machine Scheduler will
not use in-order scheduling on instructions that consume it. Since
BufferSize also defaults to -1 for `ProcResGroup`, we have been
scheduling the resource consumption of SiFive7's `PipeAB` (scalar pipes)
and `VA1OrVA2` (vector pipes) in an out-of-order fashion!

Co-authored-by: Min Hsu <min.hsu@sifive.com>
… scalar remainder (#190258)

Add two new loop metadata attributes — `llvm.loop.vectorize.body` and
`llvm.loop.vectorize.epilogue` — that the loop vectorizer sets on the
generated vector loop and epilogue loop respectively. The metadata is
only emitted when optimization remarks are enabled (`ORE->enabled()`),
so it has zero cost in normal compilation.

These enable downstream passes (LoopUnroll, WarnMissedTransforms) to
produce more precise optimization remarks. Instead of the generic "loop
not unrolled" warning on a source line that was vectorized, the unroller
can now report:
- **"vectorized loop"** for the main vector body
- **"epilogue loop"** for the scalar epilogue/remainder
- **"epilogue vectorized loop"** for an epilogue that was itself
vectorized during epilogue vectorization (carries both attributes)

A shared `getLoopVectorizeKindPrefix()` helper in
`LoopUtils.h`/`LoopUtils.cpp` reads the metadata and returns the
appropriate prefix string, used by both `LoopUnroll.cpp` and
`WarnMissedTransforms.cpp`. The metadata emission in `VPlan.cpp` uses
`Loop::addIntLoopAttribute` from the NFC PR #194676.
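The prefix selection described above can be modeled roughly as follows. This is a sketch under stated assumptions: the two booleans stand in for the presence of `llvm.loop.vectorize.body` and `llvm.loop.vectorize.epilogue` on a loop, and `loop_kind_prefix` is a hypothetical name, not the signature of `getLoopVectorizeKindPrefix()`.

```cpp
#include <string>

// Map the two metadata attributes to the remark prefix listed in the
// commit message. A loop with neither attribute is the plain baseline.
// Hypothetical model for illustration only.
std::string loop_kind_prefix(bool has_body_attr, bool has_epilogue_attr) {
  if (has_body_attr && has_epilogue_attr)
    return "epilogue vectorized loop"; // epilogue that was itself vectorized
  if (has_body_attr)
    return "vectorized loop"; // main vector body
  if (has_epilogue_attr)
    return "epilogue loop"; // scalar epilogue/remainder
  return "loop";            // not produced by the vectorizer
}
```

Checking the combined case first matters, since an epilogue-vectorized loop carries both attributes.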

Two end-to-end tests exercise the full `loop-vectorize → loop-unroll`
pipeline with forced epilogue vectorization
(`-enable-epilogue-vectorization -epilogue-vectorization-force-VF=4`) to
produce all four loop categories from a single vectorizable function.
Each test also includes a plain (non-vectorized) function to cover the
baseline "loop" case. Both tests verify stderr diagnostic output and
YAML structured remarks.
**`LoopUnroll/vectorizer-loop-kind-remarks.ll`** checks for
successful-unroll remarks.
**`LoopTransformWarning/vectorizer-loop-kind-unroll-warning.ll`** checks
for failed-unroll warnings.

AI Disclaimer: this patch was generated with assistance of GitHub
Copilot/Claude Opus and reviewed by a human.
…669)

Set the debug location on non-target constant nodes so that the
resulting machine instructions inherit the correct source location.
Prevent combinePExtTruncate from forming RISCVISD nodes with illegal
type. Remove unnecessary call to getSimpleVT().

Legalize shift amount when custom legalizing i64 shifts.
SelectionDAGBuilder usually pre-legalizes shift amounts. If we scalarize
a vXi64 vector shift the shift amount will be i64.
This allows us to strip DerivedIVRecipe::execute, and remove the
dependency on emitTransformedIndex. It allows us to benefit from
existing simplifications in VPlan.
Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off vp_truncate from #179622.
…ody from scalar remainder (#190258)" (#194901)

Reverts llvm/llvm-project#190258

This commit is causing crashes on the `intel-sycl-gpu` buildbot:
https://lab.llvm.org/buildbot/#/builders/225/builds/7157

The crash is a SEGFAULT in
`LoopVectorizationPlanner::updateLoopMetadataAndProfileInfo` when
optimization remarks are enabled
(`-pass-remarks-analysis=loop-vectorize`). Reverting while investigating
the root cause.
Support Ternary expression for VectorType with vector condition

Issue #192311
* Add the usual Windows static library name "libxml2s"
* Windows build with static libxml2 requires compiler define
Introduce a dedicated cleanup flag for SEH __finally blocks and use it
to separate SEH try cleanup emission from C++ object cleanup emission

This prevents __finally cleanups from emitting seh.scope.begin/end and
keeps destructor/delete cleanups paired with seh.scope markers

Fix #109576
iclsrc and others added 28 commits April 30, 2026 11:05
  CONFLICT (content): Merge conflict in llvm/lib/SYCLLowerIR/CMakeLists.txt
…95131)

When debugging PExpect tests, the 60 second timeout can make that
process rather tedious. For TestStatusline, I used a class variable to
easily override it while iterating but the idea is applicable more
generally.
Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
  CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
  CONFLICT (content): Merge conflict in clang/include/clang/Options/Options.td
Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
CONFLICT (content): Merge conflict in clang/lib/CodeGen/TargetInfo.cpp
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaSYCL.cpp
All DebugInfo specs (core SPIR-V, OpenCL.DebugInfo.100, NonSemantic.Shader.DebugInfo.*) require the `ReturnType` operand of `DebugTypeFunction` to reference a valid type.
For void-returning functions the translator was emitting `DebugInfoNone` instead of `VoidTy`.

Also fix the same issue in existing reverse-translation test inputs that
were generated by the Writer with the bug in place.

`spirv-val` on the NonSemantic path cannot be enabled yet due to other
spec-compliance issues to be fixed in subsequent changes.

AI-assisted: Claude Sonnet 4.6 (commercial SaaS)

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@a94d7338ab07198
The headers for this extension were published so we should use them
instead:

KhronosGroup/SPIRV-Headers@b8a3296

Co-Authored-By: Claude

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@5a8baf6cd2b573f
The SPV_KHR_abort extension introduces the `OpAbortKHR` instruction,
allowing shaders to terminate execution early.

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@c5dbe5662a56e57
Add `spirv.nontemporal` metadata for image operations that have a
`Nontemporal` image operand, such that the nontemporal information is
preserved in the LLVM IR.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@8324c559351d1da
The current approach with `not` causes the spirv-val step to fail on
recent enough SPIRV-Tools versions. Avoid running spirv-val at all until
a new SPIRV-Tools release is more widely available.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@c1c1f797546f38a
The IR Verifier started rejecting some IR after llvm-project commit
81518d0 ("[DebugInfo] Verify DISubprogram has a type (#194556)",
2026-05-04).

Update some .ll tests in the same way the llvm-project commit did.

Make sure that a `DISubprogram` always has a non-null type by handling
the case where the SPIR-V has a `DebugInfoNone` for the
`DebugTypeFunction`.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@d62c927753b580f
This document describes the existing mechanism for preserving some
LLVM constructs in SPIR-V during translation without a formal extension.
It references some public LLVM APIs (which is unusual for a formal SPIR-V
specification), but this is quite useful for capturing what llvm-spirv does.

On top of the existing functionality, this adds a way to preserve
available_externally after the round trip.

The current implementation notes were captured with help from Claude
Code Opus 4.7.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@bb8538937edc78e
Update floating point literal FileCheck lines after
llvm/llvm-project@41c214f0b115 ("[AsmWriter] Change the output syntax of
floating-point literals. (#190649)", 2026-05-07).

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@e6a669945c3ee87
The NonSemantic.Shader.DebugInfo spec requires the Flags operand of
DebugTypeBasic to be a 32-bit integer constant.

The translator was emitting `DebugInfoNone` instead of `OpConstant 0`.

`spirv-val` on the NonSemantic path cannot be enabled yet due to other
spec-compliance issues to be fixed in subsequent changes.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@28cfd130468a072
NonSemantic.Shader.DebugInfo.100/.200 require the Source operand of
DebugLine to reference a DebugSource instruction, not a raw OpString ID
(DIFile in LLVM IR).
Use getSource() to emit a proper DebugSource reference.

Enable spirv-val on the NonSemantic paths of DebugInfoTypeBasic.ll and
InlineNamespace.ll now that all previous spec-compliance fixes are in
place.

AI-assisted: Claude Sonnet 4.6 (commercial SaaS)

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@38a315ed3cc0d91
…tion (#3661)

Align the translator with the LLVM SPIR-V backend behavior:
undef/poison initializers are dropped unless the variable is constant
with an aggregate type. This avoids emitting unnecessary OpUndef
initializers in the generated SPIR-V.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@f48615b9415768e
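The rule for undef/poison initializers stated in the commit above can be written as a small predicate. The struct `GlobalInfo` and the function `keeps_initializer` are hypothetical and only model the stated rule; they are not the translator's actual data structures.

```cpp
// Hypothetical summary of the properties the rule depends on.
struct GlobalInfo {
  bool initializer_is_undef_or_poison;
  bool is_constant;
  bool has_aggregate_type;
};

// An undef/poison initializer is dropped unless the variable is
// constant and has an aggregate type. Other initializers are kept.
bool keeps_initializer(const GlobalInfo &g) {
  if (!g.initializer_is_undef_or_poison)
    return true;
  return g.is_constant && g.has_aggregate_type;
}
```

Under this model, only the constant aggregate case would still produce an `OpUndef` initializer.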
@iclsrc added the `disable-lint` label (Skip linter check step and proceed with build jobs) on May 14, 2026