fix: try Vulkan before CUDA, iterate on failure instead of falling back to CPU #246

Open
johnmcmullan wants to merge 1 commit into tobi:main from johnmcmullan:fix/amd-vulkan-gpu-fallback

Problem

On AMD systems with ROCm installed, getLlamaGpuTypes() reports "cuda" as available because ROCm ships a HIP/CUDA compatibility layer. The previous code:

  1. Picked CUDA first (preference order: ["cuda", "metal", "vulkan"])
  2. Failed to initialise at runtime (no NVIDIA GPU)
  3. Fell straight to CPU — skipping Vulkan entirely

This affects any AMD GPU user who has ROCm installed, even partially. The symptom is qmd status reporting GPU: none (running on CPU) despite having a capable GPU with working Vulkan drivers.
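
The sequence above can be reproduced in a small simulation. This is only a sketch of the behaviour as described, not the project's actual code: `GpuBackend`, `OLD_PREFERENCE`, `selectFirstOnly` and the simulated `amdInit` are hypothetical names, and initialisation is modelled as a synchronous callback that throws on failure.

```typescript
// Hypothetical sketch of the old behaviour described above; not the real code.
type GpuBackend = "cuda" | "metal" | "vulkan";

// Old preference order quoted in the description.
const OLD_PREFERENCE: readonly GpuBackend[] = ["cuda", "metal", "vulkan"];

// Take the first preferred backend that is reported as available and fall
// back to CPU if it fails to initialise. No other candidate is tried.
function selectFirstOnly(
  reported: readonly GpuBackend[],
  init: (backend: GpuBackend) => void,
): GpuBackend | "cpu" {
  const first = OLD_PREFERENCE.find((b) => reported.includes(b));
  if (first === undefined) return "cpu";
  try {
    init(first);
    return first;
  } catch {
    return "cpu"; // Vulkan is never attempted
  }
}

// Simulated AMD machine with ROCm: "cuda" is reported, but initialising it fails.
const reportedOnAmd: GpuBackend[] = ["cuda", "vulkan"];
const amdInit = (backend: GpuBackend): void => {
  if (backend === "cuda") throw new Error("simulated: no NVIDIA device");
};

const oldResult = selectFirstOnly(reportedOnAmd, amdInit);
console.log(oldResult); // "cpu", even though Vulkan would have initialised
```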

Fix

Two changes:

1. Reorder preference to Vulkan > Metal > CUDA

Vulkan is cross-vendor (AMD, Intel, NVIDIA) and doesn't require proprietary drivers. CUDA only works on NVIDIA. Putting Vulkan first means it gets used on AMD/Intel without any driver workarounds, while NVIDIA systems can still reach CUDA through the fallback when Vulkan is unavailable or fails to initialise.

2. Iterate through candidates on failure rather than short-circuiting to CPU

The old code gave up on GPU entirely if the first candidate failed. The new code continues trying remaining candidates, so CUDA failing doesn't prevent Vulkan from being attempted.
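
A minimal sketch of the two changes combined, assuming a preference list filtered by what the system reports and an initialiser that throws on failure. `Backend`, `PREFERENCE`, `Selection` and `selectBackend` are hypothetical names rather than identifiers from the diff, and the initialiser is injected and synchronous to keep the example self-contained; the real initialisation is likely asynchronous.

```typescript
// Hypothetical names throughout; this illustrates the approach, not the diff.
type Backend = "vulkan" | "metal" | "cuda";

// New preference order from this PR: Vulkan > Metal > CUDA.
const PREFERENCE: readonly Backend[] = ["vulkan", "metal", "cuda"];

interface Selection<T> {
  backend: Backend | "cpu";
  handle: T | null;
  failures: { backend: Backend; reason: string }[];
}

// Try every reported backend in preference order; use CPU only when all fail.
function selectBackend<T>(
  available: readonly Backend[],
  init: (backend: Backend) => T,
): Selection<T> {
  const failures: { backend: Backend; reason: string }[] = [];
  const candidates = PREFERENCE.filter((b) => available.includes(b));
  for (const backend of candidates) {
    try {
      return { backend, handle: init(backend), failures };
    } catch (err) {
      // Record the failure and continue with the next candidate.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push({ backend, reason });
    }
  }
  return { backend: "cpu", handle: null, failures };
}

// Simulated AMD machine with ROCm: "cuda" is reported but cannot initialise.
const amdSelection = selectBackend(["cuda", "vulkan"], (backend) => {
  if (backend === "cuda") throw new Error("simulated: no NVIDIA device");
  return `${backend}-context`;
});
console.log(amdSelection.backend); // "vulkan"

// Simulated machine where Vulkan fails: the loop moves on to CUDA.
const fallbackSelection = selectBackend(["cuda", "vulkan"], (backend) => {
  if (backend === "vulkan") throw new Error("simulated: Vulkan init failure");
  return `${backend}-context`;
});
console.log(fallbackSelection.backend); // "cuda"
```

Collecting the per-backend failure reasons is an addition for illustration; it makes it possible to report why each candidate was skipped when the result ends up on CPU.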

Tested on

  • AMD RX 9070 XT (RDNA 4), Pop!_OS 22.04, ROCm installed
  • Before: GPU: none (running on CPU)
  • After: GPU: vulkan (offloading: yes) AMD Radeon RX 9070 XT (RADV GFX1201)

…ck to CPU

On AMD systems with ROCm installed, getLlamaGpuTypes() reports "cuda" as
available because ROCm ships a HIP/CUDA compatibility layer. The previous
code picked CUDA first, failed to initialise (no NVIDIA GPU), and fell
straight to CPU — skipping Vulkan entirely.

This change:
1. Reorders preference to Vulkan > Metal > CUDA, since Vulkan is
   cross-vendor and works on AMD/Intel without driver workarounds.
2. Iterates through all available candidates on failure rather than
   short-circuiting to CPU on the first error.

Tested on AMD RX 9070 XT (RDNA 4) with ROCm installed on Pop!_OS 22.04.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>