fix(cuda): add Pascal GPU architecture support to llama.cpp build by lohitkolluri · Pull Request #986 · docker/model-runner

lohitkolluri · 2026-06-24T18:11:04Z

Description

Add explicit CMAKE_CUDA_ARCHITECTURES to the llama.cpp CUDA build to include Pascal (sm_61, sm_62) GPU support.

Without this flag, CMake's CUDA architecture defaults on CUDA 12.9+ omit pre-Turing architectures, causing "no compatible GPU found" errors on GTX 10-series (1080, 1080 Ti, 1070, 1060), Tesla P40, and other Pascal GPUs.

CUDA 12.x fully supports offline compilation of these architectures; the limitation is only in CMake's default target list, not in what nvcc can compile.

Changes

llamacpp/native/cuda.Dockerfile: Added -DCMAKE_CUDA_ARCHITECTURES=61;62;70;75;80;86;89 to the cmake flags, with a comment documenting why explicit architectures are needed and the CUDA version constraint for Pascal support.
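As a rough sketch of what the added flag looks like on a configure line, the snippet below prints a cmake command carrying the architecture list from this description. The `-B build` and `-DGGML_CUDA=ON` arguments and the helper name `cuda_configure_cmd` are assumptions for illustration; the actual flags in cuda.Dockerfile may differ.

```shell
# Architecture list taken from the PR description; 61 and 62 are the Pascal targets.
CUDA_ARCHS="61;62;70;75;80;86;89"

# Hypothetical helper: print the configure command instead of running it,
# so the sketch works on a machine without a CUDA toolchain.
# The flag is quoted because an unquoted ';' would end the shell command.
cuda_configure_cmd() {
    printf 'cmake -B build -DGGML_CUDA=ON "-DCMAKE_CUDA_ARCHITECTURES=%s"\n' "$1"
}

cuda_configure_cmd "$CUDA_ARCHS"
```

The same quoting concern applies inside a Dockerfile `RUN` step, since the semicolon-separated list passes through a shell there as well.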

Background

Closes #929 (Add Pascal GPU (sm_61/sm_62) support to llama.cpp CUDA build).
Builds on the approach from #970 (fix: add Pascal CUDA architectures), with added documentation explaining the non-obvious architecture constraints.
CUDA 13+ drops offline compilation for pre-sm_75 targets, so the CUDA 12.x base image must be retained for Pascal compatibility.

Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>

sourcery-ai

Hey - I've left some high-level feedback:

The hard-coded CMAKE_CUDA_ARCHITECTURES list includes 120;121, which may not be supported by the CUDA 12.x toolchain used in this Dockerfile; consider aligning the list with the actual nvcc/CUDA version capabilities or deriving it from a single source (e.g., an ARG or upstream default) to avoid future incompatibilities.
Since the comment explicitly assumes CUDA 12.x, it would be safer to enforce that constraint in the base image/tag (or a build-time check) so this Dockerfile doesn’t silently break if the base image is updated to CUDA 13+ later.
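One possible shape for the build-time check suggested above is a small shell guard that fails the build when the toolchain is not CUDA 12.x. This is only a sketch: the function name `require_cuda_12` is hypothetical, and it assumes `nvcc --version` prints a line of the form `Cuda compilation tools, release 12.9, V12.9.86`.

```shell
# Hypothetical guard: succeed only when the reported nvcc release is 12.x.
# It takes the text of `nvcc --version` as an argument so it can be tried
# without a CUDA toolchain installed.
require_cuda_12() {
    # Pull the major version out of "... release <major>.<minor>, ...".
    major=$(printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p' | head -n 1)
    if [ "$major" != "12" ]; then
        echo "expected CUDA 12.x for Pascal (sm_61/sm_62), found release '${major:-unknown}'" >&2
        return 1
    fi
}
```

In a Dockerfile this could run before the cmake step, for example as `require_cuda_12 "$(nvcc --version)"`, so that a base image bump to CUDA 13+ stops the build with an explicit message.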

gemini-code-assist

Code Review

This pull request configures explicit CUDA architectures in the Dockerfile to support Pascal GPUs. However, the architecture list contains invalid values (120 and 121) which will cause compilation failures, and omits Hopper (90). The review feedback correctly identifies this issue and provides a corrected list of architectures.

Explicitly list CUDA architectures in the llama.cpp cmake build to include Pascal (sm_61, sm_62) support. CMake's CUDA_ARCHITECTURES defaults on CUDA 12.9+ omit pre-Turing architectures, causing "no compatible GPU found" on GTX 10-series and P40 GPUs. CUDA 12.x fully supports offline compilation of sm_61/sm_62; CUDA 13+ drops pre-sm_75 support, so the CUDA 12.x base image must be retained for Pascal compatibility.

Signed-off-by: Lohit Kolluri <lohitkolluri@gmail.com>

sourcery-ai Bot reviewed Jun 24, 2026

gemini-code-assist Bot reviewed Jun 24, 2026

Comment thread llamacpp/native/cuda.Dockerfile Outdated

lohitkolluri force-pushed the feat/cuda-pascal-archs branch from d651d23 to dcffe81 Compare June 24, 2026 18:16

lohitkolluri wants to merge 1 commit into docker:main from lohitkolluri:feat/cuda-pascal-archs
