fix(cuda): add Pascal GPU architecture support to llama.cpp build #986

Open
lohitkolluri wants to merge 1 commit into docker:main from lohitkolluri:feat/cuda-pascal-archs

lohitkolluri commented Jun 24, 2026

Description

Add explicit CMAKE_CUDA_ARCHITECTURES to the llama.cpp CUDA build to include Pascal (sm_61, sm_62) GPU support.

Without this flag, CMake's default CUDA architectures on CUDA 12.9+ omit pre-Turing architectures, causing "no compatible GPU found" errors on GTX 10-series (1080, 1080 Ti, 1070, 1060), Tesla P40, and other Pascal GPUs.

CUDA 12.x fully supports offline compilation of these architectures; the limitation is only in CMake's default target list, not in nvcc's capability.

Changes

  • llamacpp/native/cuda.Dockerfile: Added -DCMAKE_CUDA_ARCHITECTURES=61;62;70;75;80;86;89 to the cmake flags, with a comment documenting why explicit architectures are needed and the CUDA version constraint for Pascal support.
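
As a rough illustration of the change described above, the flag could be assembled in shell as follows. The helper name `cuda_arch_flag` is hypothetical, and the `-DGGML_CUDA=ON` option in the commented-out invocation is an assumption about how the build is configured, not a line taken from the actual Dockerfile. The architecture list is the one quoted in the description.

```shell
# Architecture list as quoted in the PR description.
CUDA_ARCHS="61;62;70;75;80;86;89"

# Hypothetical helper: turn a semicolon-separated list into the cmake flag.
cuda_arch_flag() {
  printf '%s%s\n' '-DCMAKE_CUDA_ARCHITECTURES=' "$1"
}

# Illustrative invocation only (assumed flags, not the real Dockerfile line).
# The quotes matter: an unquoted ';' would end the shell command early.
# cmake -B build -DGGML_CUDA=ON "$(cuda_arch_flag "$CUDA_ARCHS")"
```

Keeping the list in one variable means a later change to the supported architectures touches a single line.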

Background

Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>

sourcery-ai (bot) left a comment

Hey - I've left some high-level feedback:

  • The hard-coded CMAKE_CUDA_ARCHITECTURES list includes 120;121, which may not be supported by the CUDA 12.x toolchain used in this Dockerfile; consider aligning the list with the actual nvcc/CUDA version capabilities or deriving it from a single source (e.g., an ARG or upstream default) to avoid future incompatibilities.
  • Since the comment explicitly assumes CUDA 12.x, it would be safer to enforce that constraint in the base image/tag (or a build-time check) so this Dockerfile doesn’t silently break if the base image is updated to CUDA 13+ later.
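
One way the suggested build-time check might look is sketched below. It assumes the usual `nvcc --version` output, which contains a line of the form `Cuda compilation tools, release 12.9, V12.9.86`. The function names `cuda_major` and `require_cuda_major` are hypothetical and do not come from the PR.

```shell
# Extract the CUDA major version from `nvcc --version` text on stdin.
cuda_major() {
  sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Return non-zero unless the detected major version matches the expected one.
# $1: expected major version, $2: detected major version
require_cuda_major() {
  expected="$1"
  actual="$2"
  if [ "$actual" != "$expected" ]; then
    echo "expected CUDA ${expected}.x, found major version '${actual}'" >&2
    return 1
  fi
}

# Possible use in a build step (requires nvcc on PATH):
# require_cuda_major 12 "$(nvcc --version | cuda_major)" || exit 1
```

Separating the parsing from the comparison keeps the check testable on machines without a CUDA toolkit installed.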

gemini-code-assist (bot) left a comment

Code Review

This pull request configures explicit CUDA architectures in the Dockerfile to support Pascal GPUs. However, the architecture list contains invalid values (120 and 121) which will cause compilation failures, and omits Hopper (90). The review feedback correctly identifies this issue and provides a corrected list of architectures.
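
Whether a given value compiles depends on the installed toolkit, so one option is to compare the requested list against what the compiler reports. The sketch below assumes the supported targets arrive on stdin, one `sm_XX` entry per line, which is the format printed by `nvcc --list-gpu-code`. The function name `filter_supported_archs` is hypothetical.

```shell
# Keep only the requested architectures that appear in a supported list.
# $1:    semicolon-separated requested list, e.g. "61;62;120"
# stdin: one supported target per line, e.g. sm_61
filter_supported_archs() {
  supported="$(cat)"
  result=""
  old_ifs="$IFS"
  IFS=';'
  for arch in $1; do
    # -x requires the whole line to match, so sm_6 does not match sm_61.
    if printf '%s\n' "$supported" | grep -qx "sm_${arch}"; then
      result="${result:+${result};}${arch}"
    fi
  done
  IFS="$old_ifs"
  printf '%s\n' "$result"
}

# Possible use (requires nvcc on PATH):
# nvcc --list-gpu-code | filter_supported_archs "61;62;70;75;80;86;89"
```

Silently dropping unsupported entries is a design choice; a stricter variant could fail the build when the filtered list differs from the requested one.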

Comment thread llamacpp/native/cuda.Dockerfile Outdated
Explicitly list CUDA architectures in the llama.cpp cmake build to
include Pascal (sm_61, sm_62) support. CMake's CUDA_ARCHITECTURES
defaults on CUDA 12.9+ omit pre-Turing architectures, causing
"no compatible GPU found" on GTX 10-series and P40 GPUs.

CUDA 12.x fully supports offline compilation of sm_61/sm_62;
CUDA 13+ drops pre-sm_75 support, so the CUDA 12.x base image
must be retained for Pascal compatibility.

Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>

lohitkolluri force-pushed the feat/cuda-pascal-archs branch from d651d23 to dcffe81 on June 24, 2026, 18:16

Development

Successfully merging this pull request may close these issues.

Add Pascal GPU (sm_61/sm_62) support to llama.cpp CUDA build
