Skip to content

[Feature][Integration][Java] Add built-in support for Azure OpenAI Chat Model#695

Open
alnzng wants to merge 8 commits into
apache:mainfrom
alnzng:azure-openai-java
Open

[Feature][Integration][Java] Add built-in support for Azure OpenAI Chat Model#695
alnzng wants to merge 8 commits into
apache:mainfrom
alnzng:azure-openai-java

Conversation

@alnzng
Copy link
Copy Markdown
Collaborator

@alnzng alnzng commented May 20, 2026

Linked issue: #694

Purpose of change

Add Azure OpenAI as a first-class chat model provider for Java agents.

Tests

  • Run end-to-end test on top of a real Azure OpenAI model and verified the tests passed
  • Added new unit tests: AzureOpenAIChatModelConnectionTest and AzureOpenAIChatModelSetupTest.

API

  • Adds two new public resource classes (AzureOpenAIChatModelConnection, AzureOpenAIChatModelSetup) and two new constants on ResourceName.ChatModel`.

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

alnzng added 3 commits May 19, 2026 22:30
Adds AzureOpenAIChatModelConnection and AzureOpenAIChatModelSetup
built on the openai-java SDK's Azure client, registers the new
resource names, and wires the provider into the e2e ChatModel
integration test.

Extracts ChatMessage <-> OpenAI Chat Completions message conversion
from OpenAICompletionsConnection into a shared
OpenAIChatCompletionsUtils helper so both the OpenAI and Azure
OpenAI connections reuse the same conversion logic.
@github-actions github-actions Bot added doc-label-missing The Bot applies this label either because none or multiple labels were provided. fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. doc-not-needed Your PR changes do not impact docs and removed doc-label-missing The Bot applies this label either because none or multiple labels were provided. labels May 20, 2026
@alnzng
Copy link
Copy Markdown
Collaborator Author

alnzng commented May 20, 2026

@wenjin272 @xintongsong @weiqingy please help take a look this PR, thanks.

Copy link
Copy Markdown
Collaborator

@wenjin272 wenjin272 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @alnzng, Ty for contributing this integration. I left server comments.

In addition, I believe we need to update the documentation by removing references to Python-only implementations and adding Java parameters and examples.

throw new IllegalArgumentException("azure_endpoint should not be null or empty.");
}

Integer timeoutSeconds = descriptor.getArgument("timeout");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OpenAiCompletionsConnection validates the boundaries for timeout and maxRetries, so we can align with that here.

params.put("max_tokens", maxTokens);
}
if (additionalKwargs != null && !additionalKwargs.isEmpty()) {
params.putAll(additionalKwargs);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

putAll(additionalKwargs) runs after the first-class fields, so a colliding key (temperature, model, max_tokens, ...) silently overwrites the validated value. For example, additional_kwargs {"temperature": 5.0} bypasses the constructor's 0.0–2.0 range check.

It appears that the Python side has the same issue.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated and use a consistent logic from OpenAI / Anthropic model to handle additionalKwargs .

if (modelOfAzureDeployment != null
&& !modelOfAzureDeployment.isBlank()
&& completion.usage().isPresent()) {
recordTokenMetrics(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user has not configured modelOfAzureDeployment, we will not record token metrics at all, which may result in a poor user experience. I am wondering whether we can use the model as a fallback.

The same issue exists on the Python side as well.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Azure OpenAI APIs have different setup on the model, the model means a Azure deployment name, not a real LLM name. We have discussed this in the past: #478 (comment)

Fallback to model will confuse the user, I would prefer to keep current behavior. One thing we can improve is that we can call out the in the doc that modelOfAzureDeployment is required if the users want to see token metrics.

WDYT?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot our previous discussion. I think it’s reasonable to clarify this matter in the documentation.

@wenjin272 wenjin272 added doc-needed Your PR changes impact docs. and removed doc-not-needed Your PR changes do not impact docs labels May 21, 2026
@alnzng
Copy link
Copy Markdown
Collaborator Author

alnzng commented May 21, 2026

@wenjin272 PR is updated based on your comments.

(Map<String, Object>) mutableArgs.remove("additional_kwargs");
if (additionalKwargs != null) {
for (Map.Entry<String, Object> entry : additionalKwargs.entrySet()) {
builder.putAdditionalBodyProperty(
Copy link
Copy Markdown
Collaborator

@weiqingy weiqingy May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Setup-side fix (nesting additional_kwargs under its own key) closes the obvious case, but here in the connection we then forward those kwargs via builder.putAdditionalBodyProperty(...). The openai-java SDK serializes typed fields and additional body properties into the same JSON request body, so a caller can still pass e.g. additional_kwargs={"temperature": 5.0} alongside the typed temperature=0.3 and end up with both keys in the body (or one silently winning depending on SDK serialization order), bypassing the [0.0, 2.0] range check.

Should we detect collisions against the typed fields we already pop here (model, temperature, max_tokens, logprobs) and throw an IllegalArgumentException? That would keep the validation contract intact regardless of how the SDK merges the body.

The Python side has the same shape with **kwargs, **additional_kwargs (last one wins via dict expansion order) — should the fix apply on both sides?

* {@code message.refusal()} is written as {@code extraArgs["refusal"]} when present, preserving
* prior Java behavior.
*/
public static ChatMessage convertFromOpenAIMessage(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extraArgs parameter is dead today — both call sites (this one and AzureOpenAIChatModelConnection.chat) pass Map.of(). Also, the body copies caller args first (response.getExtraArgs().putAll(extraArgs)) and then writes refusal after, so a caller-supplied "refusal" key would be silently overwritten by message.refusal(). The precedence isn't obvious from the signature.

Should we drop the parameter until there's a real caller that needs it? That would keep the API surface minimal and avoid locking in surprising precedence. Or, if we want to keep it as a forward-looking hook, should we document the override behavior in the Javadoc?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-needed Your PR changes impact docs. fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants