fix: try Vulkan before CUDA, iterate on failure instead of falling back to CPU #246
Open
johnmcmullan wants to merge 1 commit into tobi:main from
Conversation
fix: try Vulkan before CUDA, iterate on failure instead of falling back to CPU

On AMD systems with ROCm installed, `getLlamaGpuTypes()` reports "cuda" as available because ROCm ships a HIP/CUDA compatibility layer. The previous code picked CUDA first, failed to initialise (no NVIDIA GPU), and fell straight to CPU, skipping Vulkan entirely.

This change:

1. Reorders the preference to Vulkan > Metal > CUDA, since Vulkan is cross-vendor and works on AMD/Intel without driver workarounds.
2. Iterates through all available candidates on failure rather than short-circuiting to CPU on the first error.

Tested on an AMD RX 9070 XT (RDNA 4) with ROCm installed on Pop!_OS 22.04.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
On AMD systems with ROCm installed, `getLlamaGpuTypes()` reports `"cuda"` as available because ROCm ships a HIP/CUDA compatibility layer. The previous code preferred `["cuda", "metal", "vulkan"]`, so CUDA was tried first, failed to initialise (no NVIDIA GPU present), and execution fell straight to CPU.

This affects any AMD GPU user who has ROCm installed, even partially. The symptom is `qmd status` reporting `GPU: none (running on CPU)` despite a capable GPU with working Vulkan drivers.

Fix
Two changes:
1. Reorder preference to Vulkan > Metal > CUDA
Vulkan is cross-vendor (AMD, Intel, NVIDIA) and doesn't require proprietary drivers. CUDA only works on NVIDIA. Putting Vulkan first means it gets used on AMD/Intel without any driver workarounds, while NVIDIA systems still get CUDA via the fallback.
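The reorder amounts to changing the candidate order; a minimal sketch (the constant name is illustrative, not necessarily what the patch uses):

```typescript
// Preference order after this change: Vulkan first (cross-vendor),
// Metal next (macOS), CUDA last (NVIDIA-only).
const GPU_PREFERENCE = ["vulkan", "metal", "cuda"];
```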
2. Iterate through candidates on failure rather than short-circuiting to CPU
The old code gave up on GPU entirely if the first candidate failed. The new code continues trying remaining candidates, so CUDA failing doesn't prevent Vulkan from being attempted.
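The two changes together can be sketched as follows. This is a minimal illustration, not the actual patch: `pickGpu`, `listGpuTypes`, and `tryInit` are assumed names standing in for `getLlamaGpuTypes()` and the backend initialisation call.

```typescript
type GpuType = "vulkan" | "metal" | "cuda";

// Try each available backend in preference order; only fall back to CPU
// after every candidate has failed to initialise.
async function pickGpu(
  listGpuTypes: () => Promise<string[]>,
  tryInit: (gpu: GpuType) => Promise<boolean>,
): Promise<GpuType | "cpu"> {
  const order: GpuType[] = ["vulkan", "metal", "cuda"];
  const available = await listGpuTypes();
  for (const gpu of order.filter((g) => available.includes(g))) {
    // Keep iterating on failure: CUDA failing to initialise (e.g. ROCm's
    // HIP layer advertising "cuda" with no NVIDIA GPU present) no longer
    // blocks the Vulkan attempt.
    if (await tryInit(gpu)) return gpu;
  }
  return "cpu"; // last resort, only after all GPU candidates failed
}
```

With this shape, the ROCm case from the problem description resolves to Vulkan: even if `"cuda"` is advertised and its initialisation throws, the loop simply moves on to the next candidate.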
Tested on an AMD Radeon RX 9070 XT (RADV GFX1201). `qmd status` before: `GPU: none (running on CPU)`. After: `GPU: vulkan (offloading: yes)`.