[api] Specify model in ChatModelSetup (Java + Python)#685
Merged
Conversation
OllamaChatModelSetup declared its own `private final String model`
that duplicated `BaseChatModelSetup.model`. Both fields were populated
from the same `"model"` descriptor argument and held identical values,
but `getParameters()` happened to read the subclass shadow while the
inherited `getModel()` read the base field — an asymmetry that diverged
from every other Java chat-model setup and from the parallel
`BaseEmbeddingModelSetup` contract.
Delete the shadow and the redundant constructor assignment. The
identifier `model` in `getParameters()` now resolves to the inherited
`protected String model` field declared at `BaseChatModelSetup:43` and
populated in the base constructor at line 57. Behavior is bit-for-bit
identical; the redundant field surface is gone.
Add `OllamaChatModelSetupTest` locking the invariant that `getModel()`
and `getParameters().get("model")` both return the descriptor-supplied
value, so any future regression in the inherited-field linkage surfaces
in CI.
Part of apache#159.
Python BaseChatModelSetup did not declare a `model` field at all — every chat-model subclass had to redeclare `model: str = Field(...)` locally, in contrast to BaseEmbeddingModelSetup which has owned the field at the base since day one. Closes the Python side of issue apache#159: the ChatModelSetup abstraction now specifies `model` consistently, aligned with EmbeddingModelSetup. Declare `model: str = Field(description="Name of the chat model to use.")` on BaseChatModelSetup right after `connection`, matching the field order in BaseEmbeddingModelSetup. The field is required (no default) — per-integration defaults stay in subclass `__init__` parameters where they already live. Remove the now-redundant `model: str = Field(...)` redeclarations from the five Python chat-model subclasses (Ollama, Anthropic, OpenAI, Tongyi, AzureOpenAI). `__init__` signatures, defaults, and `super().__init__(model=model, ...)` call sites are untouched, so external callers see the same surface. The bridge wrapper `runtime/java/java_chat_model.py` now pops `model` from kwargs and forwards it to `super().__init__()`, symmetric with the existing `connection` handling — required because the base field is now required. Lock the new invariant with `test_chat_model_base.py`: a minimal subclass that omits the local `model` field inherits it; missing `model` raises ValidationError. Add round-trip tests per integration and default-honoring tests where `__init__` provides a default (Anthropic, OpenAI, Tongyi). Update test mocks (`MockChatModelImpl`, `MockChatModel`, `SlowMockChatModel`, BaseChatModelSetup mocks in `test_token_metrics`, `test_built_in_actions`, `test_get_resource_in_action`) to pass `model="..."`. Sync the cross-language compatibility fixture `agent_plan.json` with a `"model"` argument. Update the ReActAgent docstring example so copy-paste recipes stay valid. Closes apache#159.
The Python compat test agent (python_agent_plan_compatibility_test_agent.py)
now passes `connection="mock_connection"` and `model="mock-model"` when
constructing its MockChatModel ResourceDescriptor — the latter required
after this PR's base-class lift of `model` to BaseChatModelSetup.
The Java compat test (CreateJavaAgentPlanFromJson.main) hand-builds the
EXPECTED PythonResourceProvider and compares it against what's
deserialized from the Python-generated JSON. The expected kwargs map
was still `{name, prompt, tools}` — missing `connection` and `model` —
so the equality check at line 208 failed in the cross-language CI job:
expected: <{... CHAT_MODEL={chat_model=...@ec4dc5ff}}>
but was: <{... CHAT_MODEL={chat_model=...@443ca67b}}>
Add the two missing kwargs entries so the Java expectation mirrors what
the Python side now generates.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Linked issue: #159
Purpose of change
Different
ChatModelSetupimplementations diverged in how they exposed themodelparameter. On the Java side, every chat-model subclass exceptOllamaChatModelSetupalready used the inheritedBaseChatModelSetup.modelfield — Ollama re-declared its ownprivate final String modelshadow. On the Python side,BaseChatModelSetuphad nomodelfield at all, so every chat-model subclass redeclaredmodel: str = Field(...)locally. This diverged fromBaseEmbeddingModelSetup, which has owned the field at the base on both languages from the start.This PR aligns
ChatModelSetupwithEmbeddingModelSetup:modelfield shadow inOllamaChatModelSetup.getParameters()now reads from the inheritedBaseChatModelSetup.modelfield that was already populated by the same descriptor argument. Behavior is bit-for-bit identical; the redundant surface is gone.model: str = Field(description="Name of the chat model to use.")toBaseChatModelSetup(required, no default — matchingBaseEmbeddingModelSetup). Removes the now-redundant per-subclassFieldredeclarations from Ollama, Anthropic, OpenAI, Tongyi, and AzureOpenAI.__init__signatures, per-integration defaults, andsuper().__init__(model=model, ...)call sites are all preserved.The cross-language bridge (
runtime/java/java_chat_model.py) popsmodelfrom kwargs and forwards it tosuper().__init__(), symmetric with the existingconnectionhandling. Test mocks and the cross-language compatibility fixture (agent_plan.json) are updated to satisfy the new required field.Tests
OllamaChatModelSetupTest(Java) locks the invariant thatgetModel()andgetParameters().get("model")both return the descriptor-supplied value after the shadow removal.test_chat_model_base.py(Python) — a minimal subclass that omits the localmodelfield inherits it from the base; missingmodelraisesValidationError.test_model_field_roundtrip(5 files); Anthropic / OpenAI / Tongyi also gaintest_default_model_when_omittedto confirm__init__-level defaults still apply post-refactor../tools/ut.sh -j(Java),cd python && uv run pytest flink_agents -k "not e2e_tests"(Python)../tools/lint.sh -cand./tools/check-license.shpass.API
BaseChatModelSetup(Python) now declaresmodel: stras a required base-class field, matchingBaseEmbeddingModelSetup. Public__init__signatures of every chat-model subclass are unchanged. Descriptor argument key remains"model".One behavior tightening, intentional and aligned with the embedding contract: calling
<Subclass>.model_validate({"connection": "x"})without a"model"key now raisespydantic.ValidationErrorinstead of falling back to a subclass field-level default. Construction through__init__(the normal path) is unaffected — per-integration defaults still apply.Documentation
doc-neededdoc-not-neededdoc-included