
[api][plan][runtime] Separate prompt arguments from message extra_args in BaseChatModelSetup.chat()#698

Open
weiqingy wants to merge 2 commits into apache:main from weiqingy:220-impl

Conversation

@weiqingy
Collaborator

Linked issue: #220

Purpose of change

BaseChatModelSetup.chat() (Java + Python) previously filled prompt templates by flattening every input message's extra_args into a single map. This conflated chat-message metadata with prompt-template variables and forced callers to stuff template values into a generic metadata bag.
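
To make the conflation concrete, here is a minimal sketch of the old behavior. It uses a stand-in `ChatMessage` dataclass and a hypothetical `flatten_extra_args` helper rather than the real classes, and the "later message wins" merge order is an assumption for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ChatMessage:
    """Stand-in for the real ChatMessage; only the fields needed here."""
    role: str
    content: str
    extra_args: Dict[str, Any] = field(default_factory=dict)


def flatten_extra_args(messages: List[ChatMessage]) -> Dict[str, Any]:
    """Hypothetical model of the pre-PR behavior: merge every message's extra_args."""
    merged: Dict[str, Any] = {}
    for message in messages:
        # Assumed merge order: keys from later messages overwrite earlier ones.
        merged.update(message.extra_args)
    return merged


messages = [
    ChatMessage("user", "hi", extra_args={"topic": "flink", "externalId": "abc"}),
    ChatMessage("user", "more", extra_args={"topic": "agents"}),
]

# Metadata (externalId) and template variables (topic) end up in one map.
template_vars = flatten_extra_args(messages)
prompt = "Answer questions about {topic}.".format(**template_vars)
```

In this sketch `externalId` leaks into the template variable map, and the value of `topic` depends on message order, which is the kind of coupling the PR removes.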

This PR adds an explicit arguments parameter on chat() and carries the same field on ChatRequestEvent. ChatModelAction extracts it from the event and forwards to the setup on both round 1 and tool-response continuations, so multi-turn flows keep re-filling the template correctly.
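
The flow can be sketched roughly as follows. All three classes are simplified stand-ins, the handler names `on_chat_request` and `on_tool_response` are hypothetical, and the `{name}` placeholder syntax is assumed; only the idea of persisting `arguments` on round 1 and reusing it on tool-response rounds comes from this PR.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Mapping, Optional


@dataclass
class ChatRequestEvent:
    """Stand-in event carrying the new `arguments` field."""
    model: str
    messages: List[str]
    arguments: Optional[Dict[str, Any]] = None


class ChatModelSetup:
    """Stand-in setup: the template is filled from `arguments` only."""

    def __init__(self, template: str) -> None:
        self.template = template
        self.rendered: List[str] = []  # one entry per chat() call

    def chat(self, messages: List[str],
             arguments: Optional[Mapping[str, Any]] = None, **kwargs: Any) -> str:
        system_prompt = self.template.format(**dict(arguments or {}))
        self.rendered.append(system_prompt)
        return system_prompt


class ChatModelAction:
    """Hypothetical action layer; names do not mirror the real code."""

    def __init__(self, setup: ChatModelSetup) -> None:
        self.setup = setup
        self.saved: Dict[str, Dict[str, Any]] = {}  # request id -> arguments

    def on_chat_request(self, request_id: str, event: ChatRequestEvent) -> str:
        # Round 1: persist the arguments so later rounds can reuse them.
        self.saved[request_id] = dict(event.arguments or {})
        return self.setup.chat(event.messages, arguments=self.saved[request_id])

    def on_tool_response(self, request_id: str, tool_output: str) -> str:
        # Round 2+: re-fill the template from the persisted arguments.
        return self.setup.chat([tool_output], arguments=self.saved[request_id])


setup = ChatModelSetup("You are a {role} assistant.")
action = ChatModelAction(setup)
event = ChatRequestEvent("my-model", ["What is 2 + 2?"], arguments={"role": "math"})
action.on_chat_request("req-1", event)
action.on_tool_response("req-1", "calculator returned 4")
```

Both calls render the same system prompt because the second one reads the arguments saved during the first.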

ChatMessage.extra_args is unchanged and still carries externalId, STRUCTURED_OUTPUT, OpenAI refusal, Ollama reasoning, and other provider-specific metadata used by chat-model connections.

Tests

  • New positive test (Java + Python): chat(messages, arguments, parameters) formats the Prompt template using values from arguments.
  • New negative test (Java + Python): a ChatMessage with extra_args set no longer feeds the prompt template — proves the cutover.
  • New multi-turn test (Java + Python): two consecutive chat() invocations with the same arguments re-fill the template each time.
  • New action-layer regression test (Java + Python): processChatRequestOrToolResponse with a ToolResponseEvent extracts the persisted arguments from the saved tool-request context and forwards to chat_model.chat(...) on round 2 — locks the multi-turn contract.
  • Bridge test updated: PythonChatModelSetupTest.testChat asserts "arguments" flows into Pemja kwargs marshalled to Python.
  • Existing migrated tests in test_built_in_actions.py, built_in_action_async_execution_test.py, and e2e_tests_mcp/mcp_test.py preserve their outcome assertions post-cutover — proves behavior parity.
  • Full Java + Python test sweeps green (./tools/ut.sh -j; uv run pytest: 509 passed / 13 skipped).
  • ./tools/lint.sh -c clean.
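
For reviewers who want the shape of the negative and multi-turn tests without opening the diff, a rough sketch follows. `render_prompt` and the `ChatMessage` dataclass are stand-ins written for this illustration, not the code under test, and the actual tests in the PR may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping, Optional


@dataclass
class ChatMessage:
    """Stand-in message with the metadata bag only."""
    content: str
    extra_args: Dict[str, Any] = field(default_factory=dict)


def render_prompt(template: str, messages: List[ChatMessage],
                  arguments: Optional[Mapping[str, Any]] = None) -> str:
    """Stand-in for post-PR template filling: message extra_args are never read."""
    return template.format(**dict(arguments or {}))


def test_extra_args_do_not_feed_template() -> None:
    # The same key appears in both places; only `arguments` should be used.
    message = ChatMessage("hello", extra_args={"topic": "from-extra-args"})
    rendered = render_prompt("Topic: {topic}", [message],
                             arguments={"topic": "from-arguments"})
    assert rendered == "Topic: from-arguments"


def test_repeated_calls_refill_template() -> None:
    # Two consecutive calls with the same arguments render identically.
    arguments = {"topic": "flink"}
    first = render_prompt("Topic: {topic}", [ChatMessage("q1")], arguments)
    second = render_prompt("Topic: {topic}", [ChatMessage("q2")], arguments)
    assert first == second == "Topic: flink"
```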

API

Yes — this is a breaking API change:

  • Java BaseChatModelSetup: the existing 2-arg chat(List<ChatMessage>, Map<String, Object> parameters) overload is removed (it would erase to the same signature as a hypothetical chat(messages, arguments)). The new primary form is chat(List<ChatMessage>, Map<String, Object> arguments, Map<String, Object> parameters). The 1-arg chat(List<ChatMessage>) convenience overload is retained.
  • Python BaseChatModelSetup.chat: arguments: Mapping[str, Any] | None = None is added between messages and **kwargs.
  • Java ChatRequestEvent: new 4-arg constructor (String model, List<ChatMessage> messages, @Nullable Map<String, Object> arguments, @Nullable Object outputSchema). Existing 2-arg and 3-arg legacy constructors continue to work (delegating with empty arguments).
  • Python ChatRequestEvent.__init__: arguments: Dict[str, Any] | None = None added between messages and output_schema.

Migration: callers that previously set template variables via ChatMessage.extra_args should move them to ChatRequestEvent.arguments. All in-repo callers (3 Java examples, 3 Python examples, 2 e2e tests, 1 runtime test) are migrated in this PR.
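
For out-of-repo callers, one possible way to split an existing `extra_args` map during migration is sketched below. `split_extra_args` is a hypothetical helper, not part of this PR, and it assumes the prompt template uses plain `{name}` placeholders in `str.format` style.

```python
from string import Formatter
from typing import Any, Dict, Tuple


def split_extra_args(template: str,
                     extra_args: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical migration helper: separate template variables from metadata."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    wanted = {name for _, name, _, _ in Formatter().parse(template) if name}
    arguments = {k: v for k, v in extra_args.items() if k in wanted}
    metadata = {k: v for k, v in extra_args.items() if k not in wanted}
    return arguments, metadata


template = "Summarize {document} in a {tone} tone."
legacy_extra_args = {
    "document": "release notes",
    "tone": "neutral",
    "externalId": "req-42",
}
arguments, metadata = split_extra_args(template, legacy_extra_args)
```

Under this scheme `arguments` would be passed as `ChatRequestEvent.arguments`, while `metadata` stays on `ChatMessage.extra_args`.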

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

…s in BaseChatModelSetup.chat()

`BaseChatModelSetup.chat()` previously filled prompt templates by flattening
every input message's `extra_args` into a single map. This conflated chat
metadata with template variables. Introduce an explicit `arguments`
parameter on `chat()` (Java + Python) and carry the same field on
`ChatRequestEvent`, then thread it through `ChatModelAction` to the setup
on both round 1 and tool-response continuations so multi-turn flows keep
re-filling the template correctly.

`ChatMessage.extra_args` is unchanged and still carries `externalId`,
`STRUCTURED_OUTPUT`, OpenAI `refusal`, Ollama `reasoning`, and other
provider-specific metadata used by chat-model connections.

Closes apache#220
@github-actions bot added labels on May 22, 2026: doc-not-needed (Your PR changes do not impact docs), fixVersion/0.3.0 (The feature or bug should be implemented/fixed in the 0.3.0 version), priority/major (Default priority of the PR or issue).
@weiqingy
Collaborator Author

Looks like the 1 failing check is unrelated to this PR:

  • it-python [ubuntu-latest] [java-17] [python-3.12] [flink-2.1]: LLM-output nondeterminism in a real-Ollama e2e test. test_react_agent_on_local_runner fails at _generate_structured_output (chat_model_action.py:236) because the local qwen3:1.7b returned a stray tool-call JSON instead of {"result": N}; all 3 retries produced different malformed responses. The same test passes on 3 sibling matrix slots in the same CI run (Python 3.11 + Flink 1.20, Python 3.11 + Flink 2.0, Python 3.12 + Flink 2.2), and this PR doesn't modify _generate_structured_output or the schema-validation path. Evidence: failing job log.
