[api] Specify model in ChatModelSetup (Java + Python) by weiqingy · Pull Request #685 · apache/flink-agents

weiqingy · 2026-05-17T06:00:21Z

Linked issue: #159

Purpose of change

Different ChatModelSetup implementations diverged in how they exposed the model parameter. On the Java side, every chat-model subclass except OllamaChatModelSetup already used the inherited BaseChatModelSetup.model field — Ollama re-declared its own private final String model shadow. On the Python side, BaseChatModelSetup had no model field at all, so every chat-model subclass redeclared model: str = Field(...) locally. This diverged from BaseEmbeddingModelSetup, which has owned the field at the base on both languages from the start.

This PR aligns ChatModelSetup with EmbeddingModelSetup:

Commit 1 (Java) — removes the redundant model field shadow in OllamaChatModelSetup. getParameters() now reads from the inherited BaseChatModelSetup.model field that was already populated by the same descriptor argument. Behavior is bit-for-bit identical; the redundant surface is gone.
Commit 2 (Python) — lifts model: str = Field(description="Name of the chat model to use.") to BaseChatModelSetup (required, no default — matching BaseEmbeddingModelSetup). Removes the now-redundant per-subclass Field redeclarations from Ollama, Anthropic, OpenAI, Tongyi, and AzureOpenAI. __init__ signatures, per-integration defaults, and super().__init__(model=model, ...) call sites are all preserved.

The cross-language bridge (runtime/java/java_chat_model.py) pops model from kwargs and forwards it to super().__init__(), symmetric with the existing connection handling. Test mocks and the cross-language compatibility fixture (agent_plan.json) are updated to satisfy the new required field.

Tests

New: OllamaChatModelSetupTest (Java) locks the invariant that getModel() and getParameters().get("model") both return the descriptor-supplied value after the shadow removal.
New: test_chat_model_base.py (Python) — a minimal subclass that omits the local model field inherits it from the base; missing model raises ValidationError.
Each Python integration test file gains test_model_field_roundtrip (5 files); Anthropic / OpenAI / Tongyi also gain test_default_model_when_omitted to confirm __init__-level defaults still apply post-refactor.
All existing tests stay green: ./tools/ut.sh -j (Java), cd python && uv run pytest flink_agents -k "not e2e_tests" (Python).
./tools/lint.sh -c and ./tools/check-license.sh pass.

API

BaseChatModelSetup (Python) now declares model: str as a required base-class field, matching BaseEmbeddingModelSetup. Public __init__ signatures of every chat-model subclass are unchanged. Descriptor argument key remains "model".

One behavior tightening, intentional and aligned with the embedding contract: calling <Subclass>.model_validate({"connection": "x"}) without a "model" key now raises pydantic.ValidationError instead of falling back to a subclass field-level default. Construction through __init__ (the normal path) is unaffected — per-integration defaults still apply.

Documentation

doc-needed
doc-not-needed
doc-included

OllamaChatModelSetup declared its own `private final String model` that duplicated `BaseChatModelSetup.model`. Both fields were populated from the same `"model"` descriptor argument and held identical values, but `getParameters()` happened to read the subclass shadow while the inherited `getModel()` read the base field — an asymmetry that diverged from every other Java chat-model setup and from the parallel `BaseEmbeddingModelSetup` contract. Delete the shadow and the redundant constructor assignment. The identifier `model` in `getParameters()` now resolves to the inherited `protected String model` field declared at `BaseChatModelSetup:43` and populated in the base constructor at line 57. Behavior is bit-for-bit identical; the redundant field surface is gone. Add `OllamaChatModelSetupTest` locking the invariant that `getModel()` and `getParameters().get("model")` both return the descriptor-supplied value, so any future regression in the inherited-field linkage surfaces in CI. Part of apache#159.

Python BaseChatModelSetup did not declare a `model` field at all — every chat-model subclass had to redeclare `model: str = Field(...)` locally, in contrast to BaseEmbeddingModelSetup which has owned the field at the base since day one. Closes the Python side of issue apache#159: the ChatModelSetup abstraction now specifies `model` consistently, aligned with EmbeddingModelSetup. Declare `model: str = Field(description="Name of the chat model to use.")` on BaseChatModelSetup right after `connection`, matching the field order in BaseEmbeddingModelSetup. The field is required (no default) — per-integration defaults stay in subclass `__init__` parameters where they already live. Remove the now-redundant `model: str = Field(...)` redeclarations from the five Python chat-model subclasses (Ollama, Anthropic, OpenAI, Tongyi, AzureOpenAI). `__init__` signatures, defaults, and `super().__init__(model=model, ...)` call sites are untouched, so external callers see the same surface. The bridge wrapper `runtime/java/java_chat_model.py` now pops `model` from kwargs and forwards it to `super().__init__()`, symmetric with the existing `connection` handling — required because the base field is now required. Lock the new invariant with `test_chat_model_base.py`: a minimal subclass that omits the local `model` field inherits it; missing `model` raises ValidationError. Add round-trip tests per integration and default-honoring tests where `__init__` provides a default (Anthropic, OpenAI, Tongyi). Update test mocks (`MockChatModelImpl`, `MockChatModel`, `SlowMockChatModel`, BaseChatModelSetup mocks in `test_token_metrics`, `test_built_in_actions`, `test_get_resource_in_action`) to pass `model="..."`. Sync the cross-language compatibility fixture `agent_plan.json` with a `"model"` argument. Update the ReActAgent docstring example so copy-paste recipes stay valid. Closes apache#159.

The Python compat test agent (python_agent_plan_compatibility_test_agent.py) now passes `connection="mock_connection"` and `model="mock-model"` when constructing its MockChatModel ResourceDescriptor — the latter required after this PR's base-class lift of `model` to BaseChatModelSetup. The Java compat test (CreateJavaAgentPlanFromJson.main) hand-builds the EXPECTED PythonResourceProvider and compares it against what's deserialized from the Python-generated JSON. The expected kwargs map was still `{name, prompt, tools}` — missing `connection` and `model` — so the equality check at line 208 failed in the cross-language CI job: expected: <{... CHAT_MODEL={chat_model=...@ec4dc5ff}}> but was: <{... CHAT_MODEL={chat_model=...@443ca67b}}> Add the two missing kwargs entries so the Java expectation mirrors what the Python side now generates.

wenjin272

LGTM

weiqingy added 2 commits May 16, 2026 22:05

github-actions Bot added doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. labels May 17, 2026

wenjin272 approved these changes May 18, 2026

View reviewed changes

wenjin272 merged commit 66225cd into apache:main May 18, 2026
45 of 46 checks passed

weiqingy deleted the 159-impl branch May 18, 2026 16:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[api] Specify model in ChatModelSetup (Java + Python)#685

[api] Specify model in ChatModelSetup (Java + Python)#685
wenjin272 merged 3 commits into
apache:mainfrom
weiqingy:159-impl

weiqingy commented May 17, 2026

Uh oh!

wenjin272 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

weiqingy commented May 17, 2026

Purpose of change

Tests

API

Documentation

Uh oh!

wenjin272 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants