feat: codex integration by haoshan98 · Pull Request #84 · vllm-project/agentic-api

haoshan98 · 2026-07-02T11:01:11Z

Codex CLI Tool Call Responses Compatibility

Summary

This PR adds Codex CLI compatibility to agentic-api as a stateful Responses API gateway in front of an
OpenAI-compatible vLLM endpoint.

The primary runtime path is:

Codex CLI -> agentic-api /v1 -> vLLM /v1 -> served model

A typical local validation path is:

Codex CLI -> agentic-api http://127.0.0.1:3018/v1 -> vLLM http://<vllm-host>:<port> -> served model

The gateway keeps Codex-facing request and response shapes stable while adapting tool declarations and tool calls to
the upstream model server's stricter function-tool surface. This lets Codex use agentic-api with both HTTP/SSE and
WebSocket Responses transports, while preserving stateful continuation through previous_response_id and stored
conversation history.

What Changed

Codex Responses Gateway

- Added and merged Codex-compatible /v1/responses behavior for the latest main HTTP and WebSocket server paths.
- Preserved latest main's more general tool framework and routed Codex-specific namespace flattening through that
  framework instead of bypassing it.
- Kept HTTP store=false requests without continuation IDs on the proxy path, with compatibility normalization for model
  aliases, instructions, namespace tools, tool choice, and returned tool calls.
- Kept HTTP store=true and WebSocket sessions on the stateful executor path with response storage, continuation, and
  hydration.
- Added optional model alias rewriting, for example:

codex-compatible=Qwen/Qwen3.6-35B-A3B
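
The path selection and alias rewriting described above can be sketched roughly as follows. This is an illustrative Python sketch, not the gateway's Rust code: the names parse_alias, rewrite_model, and choose_path are hypothetical, and sending a store=false request that carries a continuation ID to the stateful path is an assumption inferred from the wording above.

```python
from typing import Optional


def parse_alias(spec: str) -> tuple[str, str]:
    """Split an 'alias=target' spec into (alias, target)."""
    alias, sep, target = spec.partition("=")
    if not sep or not alias or not target:
        raise ValueError(f"expected alias=target, got {spec!r}")
    return alias, target


def rewrite_model(model: str, aliases: dict[str, str]) -> str:
    """Return the upstream model name, leaving unknown names untouched."""
    return aliases.get(model, model)


def choose_path(
    transport: str,
    store: bool,
    previous_response_id: Optional[str] = None,
    conversation_id: Optional[str] = None,
) -> str:
    """Pick 'proxy' or 'stateful' for a request (hypothetical helper)."""
    if transport == "websocket":
        return "stateful"
    # Assumption: any continuation ID forces the stateful path.
    if store or previous_response_id or conversation_id:
        return "stateful"
    return "proxy"
```

For example, rewrite_model("codex-compatible", dict([parse_alias("codex-compatible=Qwen/Qwen3.6-35B-A3B")])) yields the served model name, while a model name with no alias passes through unchanged.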

Tool Shape Support

The typed Responses request path now accepts the Codex tool shapes emitted by Codex CLI:

function
namespace
tool_search
custom
unknown/raw tool objects

Known tool types preserve extra fields, and unknown tool types remain raw JSON rather than being discarded. On the typed
stateful executor path, namespace members are flattened into upstream-safe function tools for vLLM. Non-function Codex
declarations remain accepted and preserved in metadata for continuations.
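
One way to picture the "preserve extra fields, keep unknown types raw" behavior is the Python sketch below. It is only an illustration under assumed names (KNOWN_TOOL_TYPES, classify_tool); the real typed RequestPayload lives in the Rust crate and may be structured differently.

```python
# The four tool types listed above; anything else is treated as raw JSON.
KNOWN_TOOL_TYPES = {"function", "namespace", "tool_search", "custom"}


def classify_tool(tool: dict) -> dict:
    """Tag a tool declaration without dropping any of its fields."""
    kind = tool.get("type")
    if kind in KNOWN_TOOL_TYPES:
        # Copying the whole dict keeps fields the gateway does not model.
        return {"kind": kind, "tool": dict(tool)}
    # Unknown types are carried along verbatim rather than discarded.
    return {"kind": "raw", "tool": dict(tool)}
```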

Namespace Flattening And Restoration

Codex namespace tools, such as MCP tools, are represented as grouped namespace/member declarations:

{
  "type": "namespace",
  "name": "mcp__agentic_fixture",
  "tools": [
    { "type": "function", "name": "add_numbers" }
  ]
}

To avoid upstreams that reserve or reject mcp__ function names, namespace members are flattened using an
upstream-visible name with an agentic-specific prefix:

agentic_ns__mcp__agentic_fixture__add_numbers

When the upstream model returns that flat tool call, agentic-api restores the Codex-facing shape:

{
  "type": "function_call",
  "namespace": "mcp__agentic_fixture",
  "name": "add_numbers"
}
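
The round trip above can be sketched in Python as follows. The prefix and separator come from the example name in this description; the function names and the lookup-table approach are assumptions made for illustration, not the crate's actual API.

```python
PREFIX = "agentic_ns__"
SEP = "__"


def flatten_tools(tools: list) -> tuple:
    """Return (upstream_tools, index) where index maps flat name -> (namespace, member)."""
    upstream = []
    index = {}
    for tool in tools:
        if tool.get("type") == "function":
            upstream.append(tool)
        elif tool.get("type") == "namespace":
            for member in tool.get("tools", []):
                if member.get("type") != "function":
                    continue  # only function members are sent upstream
                flat = f"{PREFIX}{tool['name']}{SEP}{member['name']}"
                index[flat] = (tool["name"], member["name"])
                upstream.append({**member, "name": flat})
    return upstream, index


def restore_call(call: dict, index: dict) -> dict:
    """Rewrite a flat upstream tool call back into the Codex-facing shape."""
    hit = index.get(call.get("name"))
    if hit is None:
        return call  # plain function call, nothing to restore
    namespace, member = hit
    return {**call, "namespace": namespace, "name": member}
```

The sketch restores through a lookup table instead of splitting the flat name, because namespace names such as mcp__agentic_fixture already contain the __ separator and cannot be split unambiguously.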

The normalization layer also accepts observed variants where safe:

legacy dotted flat names such as mcp__agentic_fixture.add_numbers
underscore aliases such as mcp__agentic_fixture_add_numbers
unambiguous bare member names
namespace container calls when the intended member is unambiguous

Collision checks prevent unsafe flattening when a namespaced member would conflict with a top-level function name.
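
A rough Python sketch of how the accepted variants and the collision check could fit together is shown below. The helper name build_aliases and the rule of simply dropping ambiguous or colliding spellings are assumptions; the PR may instead reject such requests or resolve them differently.

```python
def build_aliases(namespaces: dict, top_level: set) -> dict:
    """Map every accepted spelling to (namespace, member).

    namespaces: namespace name -> list of member function names.
    top_level:  names of top-level function tools in the same request.
    """
    candidates = {}
    for ns, members in namespaces.items():
        for member in members:
            target = (ns, member)
            spellings = [
                f"agentic_ns__{ns}__{member}",  # canonical flat name
                f"{ns}.{member}",               # legacy dotted form
                f"{ns}_{member}",               # underscore alias
                member,                         # bare member name
            ]
            if len(members) == 1:
                spellings.append(ns)            # namespace container call
            for spelling in spellings:
                candidates.setdefault(spelling, set()).add(target)
    # Keep a spelling only if it is unambiguous and does not shadow a top-level tool.
    return {
        spelling: next(iter(targets))
        for spelling, targets in candidates.items()
        if len(targets) == 1 and spelling not in top_level
    }
```

With two namespaces that both expose a member named run, the bare name run is dropped from the table while a.run and b.run remain resolvable.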

Tool Choice Preservation

tool_choice now preserves optional namespace information. Explicit namespaced choices are flattened only for upstream
requests and restored for Codex-facing response/storage paths.
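
As a minimal sketch of the upstream-only flattening, assuming a tool_choice object that carries type, name, and an optional namespace field (the exact field layout is an assumption, as is the helper name):

```python
def flatten_tool_choice(choice):
    """Return the upstream form of tool_choice; the original is what gets stored."""
    if isinstance(choice, dict) and choice.get("namespace"):
        flat = {k: v for k, v in choice.items() if k != "namespace"}
        flat["name"] = f"agentic_ns__{choice['namespace']}__{choice['name']}"
        return flat
    # String modes such as "auto" and un-namespaced choices pass through unchanged.
    return choice
```

Because the function returns a new object, the caller can keep the original namespaced choice for storage and Codex-facing responses and send only the flattened copy upstream.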

Stateful Continuation

The executor stores the effective request metadata needed for continuations:

tools
tool choice
instructions
previous response linkage
conversation linkage

Later requests using previous_response_id or conversation_id hydrate the prior context and reuse the effective
Codex-compatible tool metadata unless the client explicitly overrides it.
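
The "reuse unless overridden" rule can be illustrated with the Python sketch below. It handles only previous_response_id, models the store as a plain dict, and assumes input is a list of items; the field names history and INHERITED_FIELDS are hypothetical.

```python
# Metadata a continuation inherits when the client does not resend it.
INHERITED_FIELDS = ("tools", "tool_choice", "instructions")


def hydrate(request: dict, store: dict) -> dict:
    """Build the effective request for a continuation turn."""
    effective = dict(request)
    prev_id = request.get("previous_response_id")
    if prev_id is None:
        return effective  # first turn: nothing to hydrate
    stored = store[prev_id]
    for field in INHERITED_FIELDS:
        # An explicit value in the new request always wins.
        if field not in request and field in stored:
            effective[field] = stored[field]
    # Prior conversation items come first, then the new input items.
    effective["input"] = list(stored.get("history", [])) + list(request.get("input", []))
    return effective
```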

Storage Rehydration Cleanup

Stored input/output item markers are used internally to avoid ambiguity, but raw rehydrated items strip internal markers
such as _agentic_item_kind before returning data to the client or forwarding hydrated history upstream.
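
A minimal Python sketch of the marker stripping follows. Only _agentic_item_kind is named in this description; treating the markers as a set of keys and stripping them recursively are assumptions.

```python
# Only _agentic_item_kind is named in the PR text; other markers would be added here.
INTERNAL_KEYS = {"_agentic_item_kind"}


def strip_internal(value):
    """Return a copy of a stored item with internal marker keys removed."""
    if isinstance(value, dict):
        return {k: strip_internal(v) for k, v in value.items() if k not in INTERNAL_KEYS}
    if isinstance(value, list):
        return [strip_internal(v) for v in value]
    return value
```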

Model Listing And Readiness

- Supports the latest main model-list route by adapting /v1/models behavior for Codex clients while preserving
  upstream proxy behavior where applicable.
- Keeps skip_llm_ready_check support for the startup LLM readiness check used by hosted/OpenAI-compatible vLLM
  endpoints where /health may not exist. The /ready route still keeps its existing upstream health probe semantics.

WebSocket Support

Codex can now be configured with:

[model_providers.agentic-local]
name = "agentic-api local"
base_url = "http://127.0.0.1:3018/v1"
wire_api = "responses"
supports_websockets = true

With supports_websockets = true, Codex uses the Responses WebSocket path when available. The gateway preserves the same
tool normalization, model aliasing, and stateful continuation semantics across HTTP/SSE and WebSocket transports.

Running A Codex Session

The PR includes scripts/codex-start-gateway.sh to start agentic-server against a configured vLLM endpoint and register
the Codex-facing model alias.

Start the gateway:

GATEWAY_PORT=3018 \
DATABASE_URL="sqlite:///tmp/agentic_api_codex_3018.db" \
V_API_BASE="http://<vllm-host>:<port>" \
V_API_KEY="" \
V_MODEL="Qwen/Qwen3.6-35B-A3B" \
./scripts/codex-start-gateway.sh

From the repo root, run a Codex HTTP/SSE session:

codex \
  --disable image_generation \
  -C "$PWD" \
  -m codex-compatible \
  -c model_reasoning_effort=low \
  -c model_provider=agentic-local \
  -c 'model_providers.agentic-local.name="agentic-api local"' \
  -c 'model_providers.agentic-local.base_url="http://127.0.0.1:3018/v1"' \
  -c 'model_providers.agentic-local.wire_api="responses"'

From the repo root, run a Codex WebSocket session:

codex \
  --disable image_generation \
  -C "$PWD" \
  -m codex-compatible \
  -c model_reasoning_effort=low \
  -c model_provider=agentic-local \
  -c 'model_providers.agentic-local.name="agentic-api local"' \
  -c 'model_providers.agentic-local.base_url="http://127.0.0.1:3018/v1"' \
  -c 'model_providers.agentic-local.wire_api="responses"' \
  -c model_providers.agentic-local.supports_websockets=true

Cassette Recording

This PR adds Codex cassette recording support under:

crates/agentic-core/tests/cassettes/

The main recorder wrapper is:

./crates/agentic-core/tests/cassettes/record_codex_cli_tool_call_cassettes.sh

It records YAML replay cassettes for the matrix covered by tests:

gateway HTTP/SSE function tool
gateway HTTP/SSE namespace tool
gateway WebSocket function tool
gateway WebSocket namespace tool
direct vLLM HTTP/SSE function tool
direct vLLM HTTP/SSE flattened namespace tool
OpenAI HTTPS/SSE function tool baseline
OpenAI HTTPS/SSE namespace tool baseline
OpenAI WebSocket function tool baseline
OpenAI WebSocket namespace tool baseline

The all target now regenerates every cassette used by the Codex cassette tests. Direct vLLM recording requires an
explicit VLLM_URL or V_API_BASE instead of defaulting to a private LAN endpoint.
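
The "explicit endpoint or fail" rule could look like the Python sketch below. The recorder itself is a shell script, so this is only an illustration; the helper name and the precedence of VLLM_URL over V_API_BASE are assumptions.

```python
def resolve_vllm_url(env: dict) -> str:
    """Return the direct vLLM endpoint, refusing to fall back to a default."""
    # Assumption: VLLM_URL takes precedence when both variables are set.
    for key in ("VLLM_URL", "V_API_BASE"):
        value = env.get(key, "").strip()
        if value:
            return value
    raise RuntimeError("direct vLLM recording requires VLLM_URL or V_API_BASE")
```

In a real script the env argument would be os.environ; passing a dict keeps the sketch easy to exercise in isolation.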

Example:

OPENAI_API_KEY="$OPENAI_API_KEY" \
VLLM_URL="http://<vllm-host>:<port>" \
V_MODEL="Qwen/Qwen3.6-35B-A3B" \
GATEWAY_URL="http://127.0.0.1:3018" \
GATEWAY_CASSETTE_MODEL="Qwen/Qwen3.6-35B-A3B" \
./crates/agentic-core/tests/cassettes/record_codex_cli_tool_call_cassettes.sh all

Cassette Replay Tests

Replay and normalization tests were added for the recorded Codex cassettes:

crates/agentic-core/tests/accumulator_cassette_test.rs
- verifies gateway HTTP and WebSocket function/namespace responses
- verifies direct vLLM function and flat namespace upstream behavior
- verifies OpenAI HTTP and WebSocket function/namespace baselines
- verifies second-turn previous_response_id plus function_call_output continuation shape
crates/agentic-core/tests/tool_normalization_test.rs
- parses all recorded Codex request payloads as typed RequestPayload
- verifies WebSocket request bodies do not carry the HTTP stream field
- verifies namespace tools flatten to agentic_ns__mcp__agentic_fixture__add_numbers
- verifies direct vLLM flat namespace cassettes remain plain function tools

These tests make the Codex integration behavior replayable without requiring live OpenAI, vLLM, or Codex CLI access in
normal CI-style test runs.

Scope Boundaries

This PR does not implement:

gateway-side MCP execution
a full Codex runtime
hosted tools such as file search, web search, or code interpreter
a complete automatic tool-dispatch loop inside agentic-api
chat completions compatibility

Codex still owns MCP tool discovery and execution. agentic-api preserves, normalizes, forwards, stores, and restores
Responses API data so Codex can continue its loop correctly.

Test Plan

Formatting and diff checks:

cargo fmt -- --check
git diff --check
git diff --staged --check

Core Rust tests:

cargo test -p agentic-core
cargo test -p agentic-core --test accumulator_cassette_test --test tool_normalization_test

Additional checks run during review:

cargo clippy -p agentic-core --tests -- -D warnings
bash -n crates/agentic-core/tests/cassettes/record_codex_cli_tool_call_cassettes.sh
bash -n scripts/codex-start-gateway.sh
python3 -m py_compile crates/agentic-core/tests/cassettes/record_cassette.py

Cassette regeneration was run successfully with:

./crates/agentic-core/tests/cassettes/record_codex_cli_tool_call_cassettes.sh all

and cargo test passed afterward.

Manual live validation during development covered:

HTTP/SSE text path through agentic-api
WebSocket path through agentic-api with supports_websockets = true
MCP namespace tool round trip for mcp__agentic_fixture.add_numbers
direct vLLM baseline cassettes for upstream-visible function shapes
OpenAI baseline cassettes for original Responses API function and namespace behavior

Signed-off-by: haoshan98 <haoshanw@gmail.com>

franciscojavierarceo · 2026-07-02T22:12:36Z

awesome @haoshan98 let me know when this is ready for review

haoshan98 added 7 commits June 22, 2026 03:25

Codex integration design

e1852fc

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Codex integration

18abb8d

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Merge branch 'main' into codex-integration

4cc79b6

Add debug logging

a927272

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Test function tool type

58b3f62

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Updates

67d628f

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Cassette recordings and tests

15c564a

Signed-off-by: haoshan98 <haoshanw@gmail.com>

haoshan98 requested review from bbrowning, franciscojavierarceo, jiahuei, leseb, maralbahari, noobHappylife, qandrew and tjtanaa as code owners July 2, 2026 11:01

haoshan98 changed the title from "Codex integration" to "feat: codex integration" Jul 2, 2026

make script executable

ef5a39a

Signed-off-by: haoshan98 <haoshanw@gmail.com>

haoshan98 marked this pull request as draft July 2, 2026 11:05

haoshan98 marked this pull request as ready for review July 2, 2026 12:50

haoshan98 marked this pull request as draft July 2, 2026 13:20

haoshan98 wants to merge 8 commits into vllm-project:main from EmbeddedLLM:codex-integration
