fix(cuda): add Pascal GPU architecture support to llama.cpp build #986
lohitkolluri wants to merge 1 commit into
Hey - I've left some high level feedback:
- The hard-coded `CMAKE_CUDA_ARCHITECTURES` list includes `120;121`, which may not be supported by the CUDA 12.x toolchain used in this Dockerfile; consider aligning the list with the actual `nvcc`/CUDA version capabilities or deriving it from a single source (e.g., an ARG or upstream default) to avoid future incompatibilities.
- Since the comment explicitly assumes CUDA 12.x, it would be safer to enforce that constraint in the base image/tag (or a build-time check) so this Dockerfile doesn't silently break if the base image is updated to CUDA 13+ later.
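One way to align the list with what the toolchain actually supports is to compare it against the compiler's own report before building. The sketch below is only an illustration: `unsupported_archs` is a hypothetical helper, and it assumes the supported list arrives as one `sm_NN` entry per line, which is the shape `nvcc --list-gpu-code` is expected to print (worth verifying against the `nvcc` in the image).

```shell
#!/bin/sh
# Hypothetical helper: print every requested architecture that is missing
# from the supported list.
#   $1: semicolon-separated architectures, e.g. "61;62;70"
#   $2: supported targets, one "sm_NN" per line (assumed nvcc output shape)
unsupported_archs() {
    requested=$1
    supported=$2
    old_ifs=$IFS
    IFS=';'                      # split the requested list on semicolons
    for arch in $requested; do
        # -x forces a whole-line match so sm_12 does not match sm_120
        if ! printf '%s\n' "$supported" | grep -qx "sm_${arch}"; then
            printf '%s\n' "$arch"
        fi
    done
    IFS=$old_ifs
}
```

In a Dockerfile `RUN` step this could be called as `unsupported_archs "$ARCHS" "$(nvcc --list-gpu-code)"`, failing the build when the output is non-empty; `ARCHS` is likewise a placeholder name.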
Code Review
This pull request configures explicit CUDA architectures in the Dockerfile to support Pascal GPUs. However, the architecture list contains invalid values (120 and 121) which will cause compilation failures, and omits Hopper (90). The review feedback correctly identifies this issue and provides a corrected list of architectures.
Explicitly list CUDA architectures in the llama.cpp cmake build to include Pascal (sm_61, sm_62) support.

CMake's CUDA_ARCHITECTURES defaults on CUDA 12.9+ omit pre-Turing architectures, causing "no compatible GPU found" on GTX 10-series and P40 GPUs. CUDA 12.x fully supports offline compilation of sm_61/sm_62; CUDA 13+ drops pre-sm_75 support, so the CUDA 12.x base image must be retained for Pascal compatibility.

Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>
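The CUDA 12.x constraint stated in the commit message could be enforced at build time rather than only documented. The following is a minimal sketch, not part of the PR: the function names are hypothetical, and it assumes the version text contains a `release MAJOR.MINOR` token, as in the banner printed by `nvcc --version`.

```shell
#!/bin/sh
# Hypothetical helper: extract the CUDA major version from version text
# that contains "release MAJOR.MINOR" (assumed nvcc banner format).
cuda_major_from_text() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p'
}

# Return non-zero unless the toolkit is CUDA 12.x, so a base-image bump
# to CUDA 13+ fails loudly instead of silently dropping Pascal targets.
require_cuda_12() {
    major=$(cuda_major_from_text "$1")
    if [ "$major" != "12" ]; then
        echo "error: expected CUDA 12.x, found major version '${major:-unknown}'" >&2
        return 1
    fi
}
```

A `RUN` step could invoke it as `require_cuda_12 "$(nvcc --version)"` before the cmake configure step.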
d651d23 to dcffe81
Description
Add explicit `CMAKE_CUDA_ARCHITECTURES` to the llama.cpp CUDA build to include Pascal (sm_61, sm_62) GPU support.

Without this flag, CMake's CUDA architecture defaults on CUDA 12.9+ omit pre-Turing architectures, causing `no compatible GPU found` errors on GTX 10-series (1080, 1080 Ti, 1070, 1060), Tesla P40, and other Pascal GPUs.

CUDA 12.x fully supports offline compilation of these architectures; the limitation is only in CMake's default target list, not in nvcc capability.
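As a rough illustration of passing the explicit list from a single source, the sketch below builds the flag from one variable. The default value is the list given in this PR's Changes entry; `CUDA_ARCHS` and `cuda_arch_flag` are hypothetical names, not something the Dockerfile is known to define.

```shell
#!/bin/sh
# Single source of truth for the target list. CUDA_ARCHS is a hypothetical
# override; the default is the list from this PR.
: "${CUDA_ARCHS:=61;62;70;75;80;86;89}"

# Print the CMake flag. The value must stay quoted wherever it is used,
# because an unquoted ';' would end the shell command early.
cuda_arch_flag() {
    printf '%s\n' "-DCMAKE_CUDA_ARCHITECTURES=${CUDA_ARCHS}"
}

# Example configure call (other project flags omitted, not executed here):
#   cmake -B build "$(cuda_arch_flag)"
```

Keeping the list in one variable means a later change to the supported architectures touches a single line.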
Changes
`llamacpp/native/cuda.Dockerfile`: Added `-DCMAKE_CUDA_ARCHITECTURES=61;62;70;75;80;86;89` to the cmake flags, with a comment documenting why explicit architectures are needed and the CUDA version constraint for Pascal support.
Background
Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>