[release-0.2][backport][api][plan][integrations] Record built-in chat token metrics outside the async call boundary (#712) #725
Merged
xintongsong merged 1 commit on Jun 2, 2026
…the async call boundary (apache#712)

Backport of apache#712 to release-0.2 with scope narrowed to the chat-model connections that exist on this branch. The Bedrock, AzureOpenAI and OpenAIResponses connection variants are not present on release-0.2 and are intentionally excluded.

Move token-metric recording from the durable async callable (where it crossed the operator/mailbox thread boundary) to the action thread:
- BaseChatModelSetup gains public recordTokenMetrics(String, long, long).
- BaseChatModelConnection.recordTokenMetrics(...) and the connection.setMetricGroup(...) forwarding in BaseChatModelSetup are removed.
- Each connection's chat() stashes model_name / promptTokens / completionTokens into ChatMessage.extraArgs (Ollama, Anthropic, AzureAI, OpenAI on release-0.2).
- ChatModelAction records via the setup after durableExecute(Async) returns, before structured-output reassignment.
- RunnerContext.getAgentMetricGroup/getActionMetricGroup javadoc notes that the returned group must only be accessed from the operator thread, not inside a durable callable.

Emitted metric paths and counter names are unchanged. Records are gated identically to Python: non-empty model name and both token counts greater than zero; Integer/Long token values are accepted via Number#longValue().

Tests:
- BaseChatModelConnectionTokenMetricsTest renamed and rewritten to BaseChatModelSetupTokenMetricsTest (target moved from connection to setup).
- New ChatModelActionTest covers recordChatTokenMetrics: records when all keys present and positive; Integer-typed values still recorded; skips on missing key, non-numeric value, zero token, or empty model name.
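The gating rule in the commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name, the TokenRecorder interface, the Map-based extraArgs parameter, and the boolean return value are all assumptions made for the example. Only the three key names and the Number#longValue() conversion come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the gating helper; not the project's ChatModelAction. */
final class TokenMetricsGateSketch {

    /** Assumed minimal recorder, loosely modelled on recordTokenMetrics(String, long, long). */
    interface TokenRecorder {
        void recordTokenMetrics(String modelName, long promptTokens, long completionTokens);
    }

    /**
     * Records only when the model name is a non-empty String and both token
     * counts are numeric and greater than zero. Returns true if a record was
     * emitted (the return value is an addition for this sketch).
     */
    static boolean recordChatTokenMetrics(Map<String, Object> extraArgs, TokenRecorder recorder) {
        Object model = extraArgs.get("model_name");
        Object prompt = extraArgs.get("promptTokens");
        Object completion = extraArgs.get("completionTokens");

        // Missing key (null) or wrong type fails the instanceof checks.
        if (!(model instanceof String) || ((String) model).isEmpty()) {
            return false;
        }
        if (!(prompt instanceof Number) || !(completion instanceof Number)) {
            return false;
        }

        // Number#longValue() accepts both Integer and Long values.
        long promptTokens = ((Number) prompt).longValue();
        long completionTokens = ((Number) completion).longValue();
        if (promptTokens <= 0 || completionTokens <= 0) {
            return false;
        }

        recorder.recordTokenMetrics((String) model, promptTokens, completionTokens);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> extraArgs = new HashMap<>();
        extraArgs.put("model_name", "demo-model");
        extraArgs.put("promptTokens", 12);       // Integer
        extraArgs.put("completionTokens", 34L);  // Long

        boolean recorded = recordChatTokenMetrics(
                extraArgs,
                (name, p, c) -> System.out.println(name + " " + p + " " + c));
        System.out.println(recorded);
    }
}
```

Reading the values as Number rather than casting to Long is what lets a value that comes back as an Integer still pass the gate.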
Backport of #712 to release-0.2. The change keeps token-metric recording on the operator/mailbox (action) thread instead of the durable-execution pool thread, mirroring the Python side of the framework.

Scope on release-0.2
The original PR touched 7 connections; release-0.2 ships only 4 of them. The Bedrock, AzureOpenAI and OpenAIResponses connections are not present on this branch and are intentionally excluded from the backport. Covered connections:
- Ollama
- Anthropic
- AzureAI
- OpenAI (on release-0.2; on main it has since been split into Completions / Responses / AzureOpenAI)

What changed
- BaseChatModelSetup gains public recordTokenMetrics(String, long, long), the Python-parity record site. The setup's bound metric group is the action metric group, so the emitted metric path and counter names are unchanged.
- BaseChatModelConnection.recordTokenMetrics(...) and the now-dead connection.setMetricGroup(...) forwarding in BaseChatModelSetup are removed.
- Each connection's chat() now stashes model_name / promptTokens / completionTokens into the response ChatMessage.extraArgs instead of recording inside chat().
- ChatModelAction records after the durable call returns (before structured-output reassignment, which would drop the keys) via a new static recordChatTokenMetrics(...) helper.
- The RunnerContext metric-group getter javadoc now documents that the returned group must only be accessed from the operator/mailbox thread, not inside a durable callable.

Recording is gated identically to Python: non-empty model name and both token counts greater than zero; values are read as Number and converted with longValue() to tolerate Integer/Long across durable recovery.

This also fixes the same latent gap as on main: Python-backed chat models invoked from the Java action previously recorded no token metrics (the path bypasses the Java connection's recording); they are now captured once via the setup.

Tests
- The connection token-metrics test now targets the setup (BaseChatModelConnectionTokenMetricsTest renamed to BaseChatModelSetupTokenMetricsTest), mirroring the rename on main.
- ChatModelActionTest covers recordChatTokenMetrics: records when all keys are present and positive; Integer-typed token values are still recorded via Number#longValue(); skips when a key is missing or non-numeric; skips when a token is 0 or the model name is empty (Python parity).
- ./tools/build.sh -j and module-level mvn test on api / plan / 4 covered connections all pass locally.

API
Adds public BaseChatModelSetup.recordTokenMetrics(String, long, long) and removes the previously protected BaseChatModelConnection.recordTokenMetrics(...). No change to public configuration, event, or resource APIs. Emitted metric names and paths are unchanged.

Documentation
- doc-needed
- doc-not-needed
- doc-included
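For readers unfamiliar with the thread-boundary concern behind this backport, the pattern can be pictured with a small standalone sketch. Everything here is hypothetical: OwnerThreadCounter is an illustrative stand-in that enforces single-thread access (the description does not say the real metric group enforces this), and a plain ExecutorService stands in for the durable-execution pool. The point is only that the pooled call returns data, and the recording happens on the calling thread once the call has returned.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of "return data from the pool, record on the caller thread". */
final class ActionThreadRecordingSketch {

    /** Illustrative counter that rejects access from any thread but its creator. */
    static final class OwnerThreadCounter {
        private final Thread owner = Thread.currentThread();
        private long value;

        void inc(long n) {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("counter accessed off the owner thread");
            }
            value += n;
        }

        long get() {
            return value;
        }
    }

    /** Simulated chat call: runs on a pool thread and only returns data. */
    static Map<String, Object> chat() {
        Map<String, Object> extraArgs = new HashMap<>();
        extraArgs.put("model_name", "demo-model");
        extraArgs.put("promptTokens", 12L);
        extraArgs.put("completionTokens", 34L);
        return extraArgs;
    }

    /** Runs chat() on the pool, then records on the calling thread. */
    static long recordAfterCallReturns() {
        OwnerThreadCounter promptTokens = new OwnerThreadCounter();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Map<String, Object>> call = ActionThreadRecordingSketch::chat;
            Map<String, Object> extraArgs = pool.submit(call).get();
            // Back on the calling thread: safe to touch the counter here.
            promptTokens.inc(((Number) extraArgs.get("promptTokens")).longValue());
            return promptTokens.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Shows that recording inside the pooled task is rejected by the sketch counter. */
    static boolean offThreadAccessIsRejected() {
        OwnerThreadCounter counter = new OwnerThreadCounter();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> counter.inc(1)).get();
            return false;
        } catch (ExecutionException e) {
            return e.getCause() instanceof IllegalStateException;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(recordAfterCallReturns());    // prints 12
        System.out.println(offThreadAccessIsRejected()); // prints true
    }
}
```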