docs: Codex integration design by haoshan98 · Pull Request #68 · vllm-project/agentic-api

haoshan98 · 2026-06-19T10:10:51Z

No description provided.

ashwing

The Codex compatibility concepts here — tool registry, client vs gateway ownership, normalize for upstream, loop decision with client action — overlap significantly with #67 which proposes a generic tool framework covering all tool types. The registry, normalize pipeline, and RequiresAction loop variant are the same infrastructure. Worth coordinating so we don't end up with two parallel type systems solving the same routing problem.

Separately, this reads better as a standalone docs/design/codex-integration.md. The core-public-api.md doc is specifically about the executor loop phases and its ownership is already defined in the header.

ashwing · 2026-06-23T20:27:16Z

+
+```rust
+#[derive(Debug, Clone, Serialize, Deserialize)]
+#[serde(untagged)]


untagged with a Known/Unknown wrapper is fragile — serde tries variants in declaration order and takes the first successful parse. Debugging deserialization mismatches becomes painful. #[serde(tag = "type")] with a catch-all variant is more explicit and gives clearer errors when a new type appears. See #67 for that approach.

ashwing · 2026-06-23T20:27:25Z

+    pub defer_loading: bool,
+}
+
+pub struct CodexNamespaceTool {


namespace, tool_search, and custom are Codex CLI internals — they are not part of the OpenAI Responses API spec. Baking them as first-class types in agentic-core couples us to one client's implementation details. Alternative: accept via a catch-all variant, normalize to function tools for inference, and preserve the raw shape in the registry for passthrough on response. That way we don't break if Codex changes their internal wire format.

ashwing · 2026-06-23T20:27:33Z

+The current `ResponsesTool = FunctionTool` alias is too narrow for Codex. Replace it with a tagged tool enum that
+preserves unknown shapes while giving Codex-used variants first-class names.
+
+Do not implement this as `#[serde(tag = "type")]` wrapping the existing `FunctionTool` struct directly, because


The issue described here — "FunctionTool already stores the wire type field" — is a fixable implementation detail, not a reason to avoid tagged enums. The fix is to remove the redundant type_: String from the payload struct (serde consumes the tag during deserialization, so it shouldn't also be a field). #67 takes this approach and it works cleanly with #[serde(tag = "type")].

ashwing · 2026-06-23T20:27:41Z

+    pub registry: ToolRegistry,
+}
+
+pub struct ToolName {


Namespace splitting adds complexity the gateway likely doesn't need. vLLM emits flat function names — the model doesn't know about namespaces. If Codex organizes tools into namespaces client-side, the gateway can preserve the original declaration verbatim in the registry and return it on requires_action without parsing or reconstructing namespace structure. Simpler and less coupled to Codex's naming convention.

ashwing · 2026-06-23T20:27:49Z

+```rust
+pub enum LoopDecision {
+    Continue(Vec<InputItem>),
+    RequiresClientAction(Vec<OutputItem>),


Vec<OutputItem> is too broad here. The only items that require client action are function calls — not messages, not reasoning. Using Vec<FunctionToolCall> makes the type system enforce this constraint rather than relying on callers to only put the right variants in.

ashwing · 2026-06-23T20:27:56Z

+    pub parameters: Option<Value>,
+    pub strict: Option<bool>,
+    #[serde(default)]
+    pub defer_loading: bool,


defer_loading appears as a named field but there's no description of what the gateway does with it. If it's a client hint the gateway ignores, it should live in the catch-all raw Value rather than as a typed field — otherwise it implies the gateway acts on it.

ashwing · 2026-06-23T20:28:03Z

+configuration:
+
+```toml
+[model_aliases]


Model aliases are practical, but the issue #54 context suggests this also changes approval behavior (auto-review = skip human confirmation). Conflating model routing with approval semantics in a single alias is surprising — a request to codex-auto-review silently changes safety behavior. If both are needed, they should be separate config knobs.

maralbahari · 2026-06-24T09:29:19Z

The Codex compatibility concepts here — tool registry, client vs gateway ownership, normalize for upstream, loop decision with client action — overlap significantly with #67 which proposes a generic tool framework covering all tool types. The registry, normalize pipeline, and RequiresAction loop variant are the same infrastructure. Worth coordinating so we don't end up with two parallel type systems solving the same routing problem.

Separately, this reads better as a standalone docs/design/codex-integration.md. The core-public-api.md doc is specifically about the executor loop phases and its ownership is already defined in the header.

@haoshan98 it would be good to have separate docs/design/codex-integration.md for codex integration.

@ashwing note that the codex integration is from tool angle using codex for tool calls. regarding the overlap what is shared between any tools are execution in the execution_loop. however at this stage for tool implementation we can have a generic interface Tool where each would expose common trait like normalize and execute this way each contributors/developer they can implement their tools without interfering and overlapping then after all components are in we add them into execution_loop.
for instance currently we are handling the stream events by categorizing the items into reasoning, function_call and message since they're being added into a buffer we can use the interface I just mentioned above to identify the tool type then call tool.execute().
and we would need tool.normalize() after receiving the RequestPayload to get user selected tools filter the tool options normalize before call_inference().
while the execution_loop and LoopDecision involves orchestrating the all tools on flight and live streaming. while currently we are buffering the inflight item responses for us to add support each tool types at a time and verify their execution and correctness in execution() (in agentic-core/executor/engine.rs) and ResponsesAcumulator.process_event() one all that done then we implement the whole live orchestration with execution_loop and LoopDecision.
codex as tool call would be mostly use to bridge the gap of agentic-api tool execution and we do not own it is considered as custom tool. since we do not have any support of any function_call execution for fastest prototype we use use codex to demo agentic-api with tool call until we start implementing agentic-api built-in tools for file_search, code_interpreter etc.

@haoshan98 @ashwing let me know your thoughts and feedback on this approach. then @haoshan98 in a separate docs/design/codex-integration.md would write a more specific implementation that would be implement by her for CustomCodexTool and what components would be shared with design in #67

ashwing · 2026-06-25T06:45:43Z

@maralbahari @haoshan98 Agree with this layered approach — it maps cleanly to what #67 proposes. A few thoughts from the tool framework design side:

On the Tool trait with normalize() and execute(): This is exactly the ToolHandler trait in #67. The interface is validate → discover → normalize → execute → emit. Each tool type (Codex custom tool, MCP, web_search, etc.) implements that trait independently without touching the executor. Haoshan's CustomCodexTool would be one ToolHandler impl — discovery returns Codex's tool manifest, normalize flattens to function defs for vLLM, execute dispatches to Codex's sandbox.

On phased implementation: Agree we don't need ContinuePartial or live orchestration in phase 1. The order I'd propose (aligns with what you described):

Tool types + registry (PR A from docs: tool framework design — multi-type tool support #67's shipping plan) — ResponsesTool enum, ToolRegistry, ToolHandler trait skeleton, FunctionHandler as the trivial passthrough
Simple execution() dispatch — registry lookup determines gateway vs client routing, returns requires_action for function tools
Codex handler (haoshan) + MCP handler — first real execute() impls against the trait
Full execution_loop + LoopDecision with streaming — the live orchestration layer

On Codex as "custom tool" / bridge: Makes sense as a stopgap. From #67's perspective, Codex is just another ToolHandler that happens to route execute() to an external agent runtime. It slots in at step 3 above without any special-casing in the executor.

On the shared interface: The pieces both PRs need are identical — ToolRegistry for routing, normalize() for the vLLM boundary, execute() for dispatch. Happy to have #67 land the trait + registry skeleton, then haoshan builds CustomCodexTool against it. Avoids two parallel type systems.

The multi-turn tool call cassettes are already recorded (PR #77, now merged) — 12 cassettes covering linear, streaming, branching, parallel, and tool-output-only patterns against both vLLM and OpenAI. Those can feed directly into the simple execution_loop implementation at step 2.

haoshan98 · 2026-06-25T11:07:23Z

Thanks for the thoughtful feedbacks @ashwing @maralbahari and for clarifying the overlap with #67.

I agree that the generic tool framework should own the formal normalization, registry, ownership classification, execution, and loop-decision abstractions. I do not want this PR to create a parallel tool framework.

I updated the Codex design to frame this PR as a small MVP compatibility slice: accept and preserve Codex-used Responses tool shapes, preserve namespaced/function/custom/tool-search call items through request/response/storage paths, support previous_response_id continuation, and add model alias routing for vLLM-backed models.

The helper types in this PR are intentionally lightweight. They document the Codex requirements and let us verify the wire-shape/continuation behavior now, but #67 should formalize the canonical ToolHandler, registry, normalization lifecycle, ownership model, and requires-action loop decision. Once #67 lands, this slice can plug into or be refactored onto those shared abstractions.

franciscojavierarceo · 2026-06-25T15:15:36Z

+This PR should do only the minimum needed for Codex compatibility:
+
+- Add this standalone design doc.
+- Accept Codex-used tool declarations without rejecting requests.


probably log the tool call so we can create an issue for it

@franciscojavierarceo by log do you mean like record a cassettes with some request, response payloads using codex cli?

oh no i meant we can emit a log on the server so the inference request can re read and reused for debugging.

Signed-off-by: haoshan98 <haoshanw@gmail.com>

This reverts commit e1852fc. Signed-off-by: haoshan98 <haoshanw@gmail.com>

Signed-off-by: haoshan98 <haoshanw@gmail.com>

## Summary Implements PR A from the [tool framework design](docs/design/tool-framework.md): - **`ResponsesTool`** — replaces `pub type ResponsesTool = FunctionTool` alias with a `#[serde(tag = "type")]` tagged enum covering all five Responses API tool types (`function`, `mcp`, `web_search_preview`, `file_search`, `code_interpreter`). Wire-compatible: existing `{"type":"function",...}` requests deserialize unchanged. Marked `#[non_exhaustive]` for future API additions. - **`ToolRegistry`** — request-scoped `HashMap<name → ToolEntry>` built from `RequestPayload.tools` + any server-discovered tools. Provides `gateway_owned` / `client_owned` split for post-inference dispatch routing (PR B). - **`ToolHandler` trait** — `validate / discover / normalize / execute` lifecycle. `Send + Sync` for `Arc<dyn ToolHandler>`. Default `discover()` is a no-op; `execute()` must be implemented per handler type. - **`FunctionHandler`** — trivial `ToolHandler` impl for `type: "function"` tools. Normalizes to `FunctionTool`, never gateway-executes (client-owned). - **normalize pipeline** — `ResponsesTool::to_function_tool()` wired into `RequestPayload::to_upstream_request()`. vLLM always receives `Vec<FunctionTool>`, never raw `ResponsesTool` variants. Function tools pass through unchanged; gateway-managed types (`mcp`, `web_search_preview`, `file_search`, `code_interpreter`) are dropped with a `tracing::debug!` until their handlers land in PR B/C. - **`From<ToolOutput> for FunctionToolResultMessage`** — zero-friction conversion for PR B's dispatch loop. - `McpToolParam` includes `headers: Option<HashMap<String,String>>` for per-server auth forwarding. ### What vLLM sees after this PR | Client sends | vLLM receives | |---|---| | `{"type":"function","name":"foo",...}` | `{"type":"function","name":"foo",...}` ✅ unchanged | | `{"type":"mcp","server_label":"db",...}` | nothing — dropped until PR C | | `{"type":"web_search_preview"}` | nothing — dropped until PR C | Gateway-type tools are parsed, validated, and stored in `ResponseMetadata.effective_tools` for rehydration across turns — they just don't reach vLLM until real `normalize()` implementations land. No execution logic in this PR: no `dispatch_tools`, no `LoopDecision`, no changes to `engine.rs` / `rehydrate.rs` / `persist.rs`. Depends on: #64 (io types refactor — merged). Unblocks: PR B (dispatch), PR C (MCP handler), CustomCodexTool (#68). ## Test Plan ```bash cargo build --workspace # 0 errors cargo clippy --workspace --all-targets -- -D warnings # 0 warnings cargo test --workspace # 246 tests pass ``` Key new tests in `types::tools::tests`: - All 5 `ResponsesTool` variants round-trip through serde - `ToolRegistry` lookup, `gateway_owned` / `client_owned` split, `allowed_tools` filtering, duplicate-name warning, empty-name skip - `FunctionHandler::validate()` rejects empty names - `to_upstream_request()` with mixed tools — only function tool reaches vLLM - All-gateway tools → `tools: null` (empty array never sent to vLLM) --------- Signed-off-by: Ashwin Giridharan <girida@amazon.com>

haoshan98 requested review from bbrowning, franciscojavierarceo, jiahuei, leseb, maralbahari, noobHappylife, qandrew and tjtanaa as code owners June 19, 2026 10:10

haoshan98 force-pushed the designdocs-codex-integration branch from d32858a to e1852fc Compare June 22, 2026 03:26

ashwing reviewed Jun 23, 2026

View reviewed changes

franciscojavierarceo reviewed Jun 25, 2026

View reviewed changes

franciscojavierarceo approved these changes Jun 25, 2026

View reviewed changes

haoshan98 force-pushed the designdocs-codex-integration branch from 4ec3d54 to 2a6f2ab Compare June 26, 2026 03:25

haoshan98 added 3 commits June 26, 2026 03:29

Codex integration design

39c294b

Signed-off-by: haoshan98 <haoshanw@gmail.com>

Revert "Codex integration design"

6d7ce49

This reverts commit e1852fc. Signed-off-by: haoshan98 <haoshanw@gmail.com>

Codex compatibility layer

80fe882

Signed-off-by: haoshan98 <haoshanw@gmail.com>

haoshan98 force-pushed the designdocs-codex-integration branch from 2a6f2ab to 80fe882 Compare June 26, 2026 03:30

maralbahari merged commit 647e49d into vllm-project:main Jun 29, 2026
1 check passed

ashwing mentioned this pull request Jun 30, 2026

feat: tool types + registry + ToolHandler trait (PR A) #80

Merged

Uh oh!

Conversation

haoshan98 commented Jun 19, 2026

Uh oh!

ashwing left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

maralbahari commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ashwing commented Jun 25, 2026

Uh oh!

haoshan98 commented Jun 25, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

maralbahari commented Jun 24, 2026 •

edited

Loading