
[codex] fix Gemma 4 thinking generation prompt #1

Draft

romgenie wants to merge 1 commit into master from codex/gemma4-thinking-template-latest

Conversation

@romgenie

What changed

Fixes the generation prompt in both shipped Gemma 4 templates so that thinking mode opens the thought channel for generation instead of emitting a closed, empty thought block. Non-thinking mode now leaves the model turn open without injecting `<|channel>thought\n<channel|>`.

Why

The previous guard was inverted: it emitted a closed, empty thought channel when `enable_thinking` was false and never opened the thought channel when it was true. In practice this prevented Gemma 4 from producing visible reasoning when thinking was enabled and leaked the wrong control tokens into non-thinking prompts.
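
For reference, here is a minimal sketch of the corrected guard. The `add_generation_prompt` and `enable_thinking` variable names and the `<start_of_turn>model` turn marker are assumptions for illustration; only the thought-channel tokens come from the description above, and the shipped templates may differ.

```jinja
{#- Sketch of the corrected generation-prompt guard (hypothetical names). -#}
{%- if add_generation_prompt -%}
    {{- '<start_of_turn>model\n' -}}
    {%- if enable_thinking -%}
        {#- thinking mode: open the thought channel and leave it open for generation -#}
        {{- '<|channel>thought\n' -}}
    {%- endif -%}
    {#- non-thinking mode: leave the model turn open, no thought tokens injected -#}
{%- endif -%}
```

In the old guard the emission sat in the opposite branch and was immediately closed with `<channel|>`, which accounts for both symptoms described above.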

Validation

  • Built `test-chat` in an Ubuntu 24.04 Podman container and ran `./build-test/bin/test-chat --template google-gemma-4`.
  • Ran the full `./build-test/bin/test-chat` suite.
  • Built the Vulkan server image from `.devops/vulkan.Dockerfile` with `UBUNTU_VERSION=24.04`, tagged `localhost/llama.cpp:server-vulkan-latest-gemma4-test`.
  • Smoke-tested the image with `llama-server --version`; it loaded the Vulkan backend and reported `version: 8981 (d77599234)`.

