
feat: support reasoning_content in OpenAI-compatible (completions) providers#295

Merged
cpsievert merged 5 commits into main from feature/completions-reasoning-content
May 7, 2026
Conversation

@cpsievert
Collaborator

Summary

OpenAI-compatible providers using the Completions API (ChatOpenAICompletions, ChatDeepSeek, ChatOpenRouter, etc.) now extract reasoning_content from model responses and produce ContentThinking objects — matching the behavior already present in the Responses API provider (ChatOpenAI).

A new preserve_thinking parameter controls whether reasoning content is sent back to the API in multi-turn conversations. This is necessary because providers disagree on whether reasoning traces belong in conversation history:

| Provider | Requirement |
| --- | --- |
| DeepSeek V4 (with tool calls) | Must include — omitting causes 400 |
| DeepSeek V4 (without tool calls) | Ignored if included, safe either way |
| DeepSeek legacy deepseek-reasoner | Must exclude — including causes 400 |
| OpenRouter | Should include — recommended for quality |
| Others (Groq, Cloudflare, etc.) | Don't return reasoning_content (N/A) |
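The extraction step can be sketched as follows. This is an illustrative stand-in, not the actual chatlas implementation: ContentThinking here is a minimal local dataclass, and the dict shapes mirror OpenAI-style Completions message and streaming-delta payloads.

```python
from dataclasses import dataclass


@dataclass
class ContentThinking:
    """Minimal stand-in for chatlas's ContentThinking content type."""
    thinking: str


def extract_reasoning(message: dict) -> "ContentThinking | None":
    # OpenAI-compatible providers put reasoning in a non-standard
    # `reasoning_content` field on the message (or streaming delta).
    reasoning = message.get("reasoning_content")
    if reasoning:
        return ContentThinking(thinking=reasoning)
    return None


# Works for a non-streaming choice message...
msg = {"role": "assistant", "content": "42", "reasoning_content": "Let me think..."}
# ...and for a streaming delta chunk:
delta = {"reasoning_content": "step 1"}
```

Providers that never emit reasoning_content (Groq, Cloudflare, etc.) simply yield None here, so no ContentThinking object is produced.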

Changes

  • OpenAICompletionsProvider: Extract reasoning_content from both streaming deltas and non-streaming responses. Handle ContentThinking in turn serialization — drop by default, preserve when preserve_thinking=True.
  • ChatOpenAICompletions: Expose preserve_thinking parameter for users of custom OpenAI-compatible endpoints.
  • ChatOpenRouter: Set preserve_thinking=True (OpenRouter recommends including reasoning traces).
  • ChatDeepSeek: Set preserve_thinking=True (required for V4 thinking models with tool calls; harmlessly ignored for non-thinking responses). Also update the default model from the deprecated deepseek-chat to deepseek-v4-flash.
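The drop-by-default/preserve-on-request behavior described above can be sketched like this. It is a simplified stand-in for the provider's turn serialization: the preserve_thinking parameter and reasoning_content field follow the PR description, while the content classes and helper are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ContentThinking:
    thinking: str


@dataclass
class ContentText:
    text: str


def turn_to_message(contents: list, preserve_thinking: bool = False) -> dict:
    """Serialize one assistant turn into an OpenAI-style message dict."""
    msg = {"role": "assistant", "content": ""}
    for part in contents:
        if isinstance(part, ContentThinking):
            # Dropped by default; sent back only for providers that
            # want it (e.g. DeepSeek V4 with tool calls, OpenRouter).
            if preserve_thinking:
                msg["reasoning_content"] = part.thinking
        elif isinstance(part, ContentText):
            msg["content"] += part.text
    return msg
```

With this shape, each provider wrapper only has to pick the right default for preserve_thinking; the serialization code itself stays identical across providers.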

Motivation

This is the Python equivalent of tidyverse/ellmer#972. The ellmer PR defaults to preserve_thinking=False for DeepSeek based on the old deepseek-reasoner docs, but DeepSeek's current V4 models (which replace deepseek-reasoner and deepseek-chat as of 2026-07-24) actually require reasoning_content back when tool calls are present. We default to True for DeepSeek since it's a no-op for non-thinking responses and required for the tool-call case.

Relationship to #288

This PR overlaps significantly with #288, which also adds reasoning_content support. The key difference is that #288 preserves thinking unconditionally, while this PR adds the preserve_thinking toggle so each provider wrapper can choose the correct behavior. This PR also updates the DeepSeek default model and re-records the VCR cassettes.

One thing #288 includes that this PR does not: reordering tool result messages to precede user content in _turns_as_inputs. That may be worth investigating separately if DeepSeek requires that ordering.

Test plan

  • New unit tests for streaming extraction, non-streaming extraction, thinking drop (default), and thinking preserve
  • All existing provider tests pass (79 tests across 12 providers)
  • DeepSeek VCR cassettes re-recorded against live API with new default model
  • pyright passes with 0 errors across all modified files
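The streaming-extraction tests can be sketched like this (pytest style). The accumulator is a simplified stand-in for how reasoning deltas might be concatenated, not the actual chatlas code.

```python
from dataclasses import dataclass


@dataclass
class ContentThinking:
    thinking: str


def accumulate_thinking(deltas: list) -> "ContentThinking | None":
    """Concatenate reasoning_content across streaming delta chunks."""
    parts = [d["reasoning_content"] for d in deltas if d.get("reasoning_content")]
    return ContentThinking(thinking="".join(parts)) if parts else None


def test_streaming_extraction():
    deltas = [
        {"reasoning_content": "First, "},
        {"reasoning_content": "reason."},
        {"content": "Answer"},  # ordinary content chunks are ignored here
    ]
    assert accumulate_thinking(deltas) == ContentThinking(thinking="First, reason.")


def test_no_reasoning_content():
    assert accumulate_thinking([{"content": "hi"}]) is None
```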

cpsievert added 2 commits May 6, 2026 17:44
Extract `reasoning_content` from streaming deltas and non-streaming
responses in OpenAICompletionsProvider, producing ContentThinking
objects. Add `preserve_thinking` parameter (default False) to control
whether reasoning content is sent back in multi-turn conversations.

Set preserve_thinking=True for OpenRouter (which recommends including
reasoning traces). DeepSeek's default (False) prevents 400 errors
when reasoning_content is included in input messages.

Equivalent of tidyverse/ellmer#972.
…serve_thinking

deepseek-chat is deprecated (2026-07-24) and maps to v4-flash anyway.
V4 thinking models require reasoning_content back for tool-call turns,
so preserve_thinking=True is the correct default.


@cpsievert
Collaborator Author

@copilot resolve the merge conflicts in this pull request

Co-authored-by: cpsievert <1365941+cpsievert@users.noreply.github.com>
Contributor

Copilot AI commented May 7, 2026

@copilot resolve the merge conflicts in this pull request

Resolved by merging the latest origin/main into this branch and fixing the CHANGELOG.md conflict in commit f050c7a. The branch now has no merge conflicts.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.cloudflare.com
    • Triggering command: /home/REDACTED/work/chatlas/chatlas/.venv/bin/pytest pytest (dns block)
  • api.portkey.ai
    • Triggering command: /home/REDACTED/work/chatlas/chatlas/.venv/bin/pytest pytest (dns block)
  • dummy-databricks-host
    • Triggering command: /home/REDACTED/work/chatlas/chatlas/.venv/bin/pytest pytest (dns block)
  • openaipublic.blob.core.windows.net
    • Triggering command: /home/REDACTED/work/chatlas/chatlas/.venv/bin/pytest pytest (dns block)

If you need me to access, download, or install something from one of these locations, additional firewall configuration will be needed.

