fix(deepseek): support reasoning_content in tools and streaming #288

Damonsoul wants to merge 4 commits into posit-dev:main
Conversation
Force-pushed from 9f2feef to 9fb4304
Thanks @Damonsoul. I'd love to get this fixed, but this also feels like more code than we really need to support it. Did you by chance consider the angle of adding support inside of
Force-pushed from c0f19de to e6b6d3a
@cpsievert That is a fantastic point, and you're absolutely right! Moving the logic to the base class is a much cleaner architecture. I have force-pushed a single, clean commit to this PR that does exactly what you suggested: all extraction, streaming, and serialization of
| """ | ||
| Chat with an OpenAI-compatible model (via the Completions API). | ||
|
|
||
| This function exists mainly for historical reasons; new code should | ||
| prefer `ChatOpenAI()`, which uses the newer Responses API. | ||
|
|
||
| This function may also be useful for using an "OpenAI-compatible model" | ||
| hosted by another provider (e.g., vLLM, Ollama, etc.) that supports the | ||
| OpenAI Completions API. | ||
| """ |
There are a number of unnecessary deletions here, notably this docstring and numerous other comments. Please try to keep the changes minimal and focused on the bug fix.
```python
reasoning_content = getattr(message, "reasoning_content", None)
if (
    reasoning_content is None
    and hasattr(message, "model_extra")
    and message.model_extra
):
    reasoning_content = message.model_extra.get("reasoning_content")

if reasoning_content:
    contents.append(ContentThinking(thinking=reasoning_content))
```
It'd be great to have a test covering this logic.
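A minimal sketch of such a test, assuming the extraction logic shown in the diff above (the helper function and the `SimpleNamespace` mocks are hypothetical; `ContentThinking` and the real message types from chatlas are not used here):

```python
from types import SimpleNamespace


def extract_reasoning(message):
    # Mirrors the extraction in the diff: prefer the direct attribute,
    # then fall back to model_extra (pydantic's bucket for extra fields).
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning is None and getattr(message, "model_extra", None):
        reasoning = message.model_extra.get("reasoning_content")
    return reasoning


# Direct attribute wins when present.
direct = SimpleNamespace(reasoning_content="step 1...", model_extra=None)
assert extract_reasoning(direct) == "step 1..."

# Falls back to model_extra when the attribute is absent.
extra = SimpleNamespace(model_extra={"reasoning_content": "step 2..."})
assert extract_reasoning(extra) == "step 2..."

# Returns None when neither source has reasoning content.
empty = SimpleNamespace(model_extra={})
assert extract_reasoning(empty) is None
```

A real test would instead construct the provider's message objects and assert that a `ContentThinking` entry lands in the resulting turn, but the branch coverage is the same.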
Force-pushed from e6b6d3a to ed88e6a

Force-pushed from ed88e6a to 8c527ac
I completely understand! I've updated the PR to strictly minimize the diff footprint. I kept all of your original comments, docstrings, and structure untouched.
Thanks again for this contribution @Damonsoul. It helped me realize that we need more general controls over preserving thinking content or not. See #295 for more on that, which renders most of this PR irrelevant. Although, as you'll see in the PR description for #295, the one thing here that the other PR did not address is this:

If you're willing to, I'd really appreciate it if you would look deeper into this topic. A reproducible example using DeepSeek would be welcome in a new issue. If you want to go above and beyond, a more general investigation into whether other providers also want this format would be awesome.
Description

DeepSeek's thinking models (like deepseek-v4-flash) require `reasoning_content` to be passed back in subsequent messages during tool calls; otherwise the API returns a 400 error.

This PR:

- Updates `_turns_as_inputs` to preserve and send `reasoning_content`.
- Updates `_response_as_turn` and `stream_content` to correctly capture reasoning from DeepSeek's API response.

Tested with `deepseek-v4-flash` and tool calls: the 400 error is resolved and the thinking process is visible.
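To illustrate the round trip the description refers to, here is a hedged sketch of the message payload shape. The field names follow the OpenAI-compatible chat format; the tool name, arguments, and `reasoning_content` text are made-up examples, and the claim that omitting the field yields a 400 comes from the PR description, not from this sketch:

```python
# The assistant turn from a thinking model carries reasoning_content
# alongside its tool calls. Per this PR, it must be echoed back
# verbatim in the follow-up request or DeepSeek rejects it with a 400.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "reasoning_content": "The user wants the weather, so call get_weather.",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}

# Messages for the follow-up request, after running the tool locally.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    assistant_turn,  # includes reasoning_content, preserved by _turns_as_inputs
    {"role": "tool", "tool_call_id": "call_1", "content": "18°C, sunny"},
]

assert messages[1]["reasoning_content"] is not None
```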