
Fix/structured output bugs#49

Merged
kylexqian merged 9 commits into main from fix/structured-output-bugs
Apr 11, 2026
Conversation

@kylexqian
Collaborator

@kylexqian kylexqian commented Apr 3, 2026

Fix some of the bugs found from testing SDK changes w/ all providers

Mainly:

  1. OpenAI enforces the word "JSON" to appear somewhere in the messages when using json_object (why? I have no idea)
  2. Anthropic json_schema messages don't return a usage dict -- which is needed for calculating token usage

…usage metadata

Two bugs causing 500s for structured output requests:

1. OpenAI json_object 500 (400 from OpenAI upstream): OpenAI requires the
   word 'json' to appear somewhere in the messages when using
   response_format.type='json_object'. Inject a system message
   'Respond in JSON format.' when no message already contains the word,
   in both the streaming and non-streaming paths. The injection happens
   after request_bytes is computed so the TEE hash covers the original
   user request.

2. Anthropic non-streaming json_schema 500: _invoke_anthropic_structured
   was constructing AIMessage(content=...) from scratch, discarding the
   usage_metadata from the underlying Anthropic response. The response
   body therefore had no 'usage' dict, causing the x402 cost resolver to
   raise ValueError. Fix by passing include_raw=True to
   with_structured_output, extracting usage_metadata from the raw
   AIMessage, and copying it onto the synthesized AIMessage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kylexqian kylexqian force-pushed the fix/structured-output-bugs branch from 89abdc3 to ac9f267 on April 3, 2026 09:21
kylexqian and others added 6 commits April 3, 2026 02:33
…usage metadata

Covers the two fixes from the previous commit:

1. TestJsonObjectKeywordInjection — verifies that a SystemMessage containing
   'json' is prepended to langchain_messages when response_format is
   json_object and no existing message already contains the word (both
   non-streaming and streaming paths). Also verifies the injection is
   case-insensitive and does not fire for json_schema mode.

2. TestAnthropicUsageMetadataPreservation — verifies that usage_metadata
   from the raw Anthropic AIMessage is copied onto the synthesized return
   value of _invoke_anthropic_structured, and that the resulting
   non-streaming response dict contains a correctly populated 'usage' field
   for the x402 cost calculator.

Also updates three existing test mocks that were returning plain dicts but
now need to match the include_raw=True format: {"raw": AIMessage, "parsed":
dict, "parsing_error": None}.
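The keyword-injection behavior those tests cover can be sketched as follows. This is illustrative, not the controller code: `SystemMessage` is a stand-in dataclass and `inject_json_keyword` is an assumed name.

```python
from dataclasses import dataclass


@dataclass
class SystemMessage:  # stand-in for langchain_core.messages.SystemMessage
    content: str


def inject_json_keyword(messages: list) -> list:
    """Prepend a system message containing 'json' unless one already has the word."""
    # Case-insensitive check across all existing message contents.
    if any("json" in str(getattr(m, "content", "") or "").lower() for m in messages):
        return messages  # nothing to inject
    return [SystemMessage(content="Respond in JSON format.")] + messages


# No message mentions json -> a system message is prepended.
injected = inject_json_keyword([SystemMessage(content="You are helpful.")])
# The check is case-insensitive -> no injection here.
untouched = inject_json_keyword([SystemMessage(content="Reply as JSON.")])
```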

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…red output

_invoke_anthropic_structured is called synchronously before generate() runs,
so the streaming loop (chunks_iter=[]) never executes and final_usage stays
None. The final SSE chunk therefore has no 'usage' field, causing the x402
middleware to raise ValueError when trying to compute the session cost after
the response is sent — billing silently fails even though the client receives
a valid 200.

Fix: extract usage_metadata from the AIMessage returned by
_invoke_anthropic_structured (anthropic_structured_usage) and seed
final_usage from it at the top of the Anthropic branch inside generate().

Also adds a unit test that asserts the final SSE chunk contains prompt_tokens,
completion_tokens, and total_tokens when Anthropic structured output is used
in streaming mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t lost

Gemini returns cumulative usageMetadata on every SSE chunk; LangChain's
subtract_usage() converts these to deltas, meaning input_tokens only
appears non-zero in the *first* chunk carrying usage data and is 0 in
all subsequent ones.  The previous code replaced final_usage on every
chunk, so the last chunk's input_tokens=0 silently wiped the correct
prompt token count.

Fix: accumulate numeric delta fields across chunks instead of replacing.
Adds TestGeminiStreamingUsageAccumulation (3 tests) covering the
preservation of prompt tokens, the two-chunk delta pattern, and the
no-usage-chunks case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor

Copilot AI left a comment


Pull request overview

Fixes structured output edge cases across providers (OpenAI json_object keyword requirement; Anthropic structured output usage; Gemini streaming usage accumulation) and expands test coverage for these scenarios.

Changes:

  • Preserve Anthropic structured-output usage_metadata and ensure it is surfaced in final streaming/non-streaming responses.
  • Inject a minimal system instruction containing “json” when using response_format.type="json_object" and no message already includes the keyword.
  • Accumulate streaming usage deltas across chunks (notably for Gemini) and add/adjust tests accordingly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
tee_gateway/controllers/chat_controller.py | Adds OpenAI json_object keyword injection, preserves Anthropic usage metadata, and accumulates streaming usage across chunks.
tests/test_structured_outputs.py | Updates Anthropic structured-output mocks to the include_raw shape; adds tests for usage propagation, keyword injection, and Gemini usage accumulation.
pyproject.toml | Expands Ruff exclude patterns (venvs, site-packages, etc.).


Comment thread tee_gateway/controllers/chat_controller.py Outdated
Comment thread tee_gateway/controllers/chat_controller.py
Comment thread tee_gateway/controllers/chat_controller.py Outdated

langchain_messages = convert_messages(chat_request.messages)

# OpenAI (and compatible providers) require the word "json" to appear
Contributor


i'm not sure we really need this. this class works and does not need to inject anything https://github.com/OpenGradient/memsync/blob/main/memsync/llms/openai.py#L99

Collaborator Author


This is only the case when we use json_object

E.g. we get this error message

openai.BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}

mem0ai/mem0#4248 -- seems like a known requirement.

usage_metadata is a plain dict, so getattr() always returned the default
0 instead of the actual token counts, causing Anthropic streaming structured
output to report zero usage in the final SSE chunk.
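The failure mode is easy to reproduce: dict keys are not attributes, so `getattr()` always falls through to its default. A minimal repro, with illustrative values:

```python
# usage_metadata is a plain dict, not an object with attributes.
usage_metadata = {"input_tokens": 12, "output_tokens": 7}

# Attribute lookup on a dict misses its keys -> default 0 every time.
wrong = getattr(usage_metadata, "input_tokens", 0)

# Key lookup returns the actual token count.
right = usage_metadata.get("input_tokens", 0)
```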

Also bumps all langchain packages to their latest 1.x releases:
langchain 1.2.15, langchain-core 1.2.26, langchain-openai 1.1.12,
langchain-anthropic 1.4.0, langchain-google-genai 4.2.1, langchain-xai 1.2.2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@socket-security

socket-security bot commented Apr 4, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Package: anthropic 0.84.0 ⏵ 0.89.0 (updated)
Scores: Supply Chain Security 96 (+1), Vulnerability 100, Quality 100, Maintenance 100, License 100


…ng errors

Two robustness fixes:

1. Replace inline `(getattr(m, "content", "") or "").lower()` with a
   dedicated `_messages_contain_json_word()` helper that handles both
   plain-string and list-of-parts (multimodal) message content.  The old
   one-liner called `.lower()` on a list when a message contained image
   parts, causing an AttributeError and a 500 on any json_object request
   that included multimodal input.

2. Check `parsing_error` in `_invoke_anthropic_structured` after calling
   `with_structured_output(include_raw=True)`.  Previously a schema
   mismatch silently serialised `None` as the string "None" and returned a
   signed 200; now it raises a ValueError that propagates to the outer
   exception handler and returns a 500 with a logged message.

Also consolidates the duplicate `_normalize_response_format` call in the
streaming json_object injection path to reuse `rf` already computed above.
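A hedged sketch of what a helper like `_messages_contain_json_word()` might look like, handling both plain-string and list-of-parts content. The multimodal part shapes (`{"type": "text", "text": ...}`) follow the common LangChain content-part convention and are an assumption here.

```python
def messages_contain_json_word(messages) -> bool:
    """Return True if any message's text content contains 'json' (any case)."""
    for m in messages:
        content = getattr(m, "content", "") or ""
        if isinstance(content, list):
            # Multimodal content: inspect only text parts. Image parts are
            # dicts with no usable string, which is what made the old
            # one-liner's .lower() call blow up with an AttributeError.
            text = " ".join(
                part.get("text", "") for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
        else:
            text = content
        if "json" in str(text).lower():
            return True
    return False


from types import SimpleNamespace as Msg  # stand-in message objects

plain = messages_contain_json_word([Msg(content="Respond in JSON format.")])
multimodal = messages_contain_json_word(
    [Msg(content=[{"type": "image_url", "image_url": {"url": "https://x"}}])]
)
```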

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kylexqian kylexqian merged commit 3a437a9 into main Apr 11, 2026
8 checks passed
3 participants