docs: Codex integration design#68
Conversation
d32858a to
e1852fc
Compare
ashwing
left a comment
There was a problem hiding this comment.
The Codex compatibility concepts here — tool registry, client vs gateway ownership, normalize for upstream, loop decision with client action — overlap significantly with #67 which proposes a generic tool framework covering all tool types. The registry, normalize pipeline, and RequiresAction loop variant are the same infrastructure. Worth coordinating so we don't end up with two parallel type systems solving the same routing problem.
Separately, this reads better as a standalone docs/design/codex-integration.md. The core-public-api.md doc is specifically about the executor loop phases and its ownership is already defined in the header.
|
|
||
| ```rust | ||
| #[derive(Debug, Clone, Serialize, Deserialize)] | ||
| #[serde(untagged)] |
There was a problem hiding this comment.
untagged with a Known/Unknown wrapper is fragile — serde tries variants in declaration order and takes the first successful parse. Debugging deserialization mismatches becomes painful. #[serde(tag = "type")] with a catch-all variant is more explicit and gives clearer errors when a new type appears. See #67 for that approach.
| pub defer_loading: bool, | ||
| } | ||
|
|
||
| pub struct CodexNamespaceTool { |
There was a problem hiding this comment.
namespace, tool_search, and custom are Codex CLI internals — they are not part of the OpenAI Responses API spec. Baking them as first-class types in agentic-core couples us to one client's implementation details. Alternative: accept via a catch-all variant, normalize to function tools for inference, and preserve the raw shape in the registry for passthrough on response. That way we don't break if Codex changes their internal wire format.
| The current `ResponsesTool = FunctionTool` alias is too narrow for Codex. Replace it with a tagged tool enum that | ||
| preserves unknown shapes while giving Codex-used variants first-class names. | ||
|
|
||
| Do not implement this as `#[serde(tag = "type")]` wrapping the existing `FunctionTool` struct directly, because |
There was a problem hiding this comment.
The issue described here — "FunctionTool already stores the wire type field" — is a fixable implementation detail, not a reason to avoid tagged enums. The fix is to remove the redundant type_: String from the payload struct (serde consumes the tag during deserialization, so it shouldn't also be a field). #67 takes this approach and it works cleanly with #[serde(tag = "type")].
| pub registry: ToolRegistry, | ||
| } | ||
|
|
||
| pub struct ToolName { |
There was a problem hiding this comment.
Namespace splitting adds complexity the gateway likely doesn't need. vLLM emits flat function names — the model doesn't know about namespaces. If Codex organizes tools into namespaces client-side, the gateway can preserve the original declaration verbatim in the registry and return it on requires_action without parsing or reconstructing namespace structure. Simpler and less coupled to Codex's naming convention.
| ```rust | ||
| pub enum LoopDecision { | ||
| Continue(Vec<InputItem>), | ||
| RequiresClientAction(Vec<OutputItem>), |
There was a problem hiding this comment.
Vec<OutputItem> is too broad here. The only items that require client action are function calls — not messages, not reasoning. Using Vec<FunctionToolCall> makes the type system enforce this constraint rather than relying on callers to only put the right variants in.
| pub parameters: Option<Value>, | ||
| pub strict: Option<bool>, | ||
| #[serde(default)] | ||
| pub defer_loading: bool, |
There was a problem hiding this comment.
defer_loading appears as a named field but there's no description of what the gateway does with it. If it's a client hint the gateway ignores, it should live in the catch-all raw Value rather than as a typed field — otherwise it implies the gateway acts on it.
| configuration: | ||
|
|
||
| ```toml | ||
| [model_aliases] |
There was a problem hiding this comment.
Model aliases are practical, but the issue #54 context suggests this also changes approval behavior (auto-review = skip human confirmation). Conflating model routing with approval semantics in a single alias is surprising — a request to codex-auto-review silently changes safety behavior. If both are needed, they should be separate config knobs.
@haoshan98 it would be good to have separate @ashwing note that the codex integration is from tool angle using codex for tool calls. regarding the overlap what is shared between any tools are execution in the @haoshan98 @ashwing let me know your thoughts and feedback on this approach. then @haoshan98 in a separate |
|
@maralbahari @haoshan98 Agree with this layered approach — it maps cleanly to what #67 proposes. A few thoughts from the tool framework design side: On the On phased implementation: Agree we don't need
On Codex as "custom tool" / bridge: Makes sense as a stopgap. From #67's perspective, Codex is just another On the shared interface: The pieces both PRs need are identical — The multi-turn tool call cassettes are already recorded (PR #77, now merged) — 12 cassettes covering linear, streaming, branching, parallel, and tool-output-only patterns against both vLLM and OpenAI. Those can feed directly into the simple |
|
Thanks for the thoughtful feedbacks @ashwing @maralbahari and for clarifying the overlap with #67. I agree that the generic tool framework should own the formal normalization, registry, ownership classification, execution, and loop-decision abstractions. I do not want this PR to create a parallel tool framework. I updated the Codex design to frame this PR as a small MVP compatibility slice: accept and preserve Codex-used Responses tool shapes, preserve namespaced/function/custom/tool-search call items through request/response/storage paths, support The helper types in this PR are intentionally lightweight. They document the Codex requirements and let us verify the wire-shape/continuation behavior now, but #67 should formalize the canonical |
| This PR should do only the minimum needed for Codex compatibility: | ||
|
|
||
| - Add this standalone design doc. | ||
| - Accept Codex-used tool declarations without rejecting requests. |
There was a problem hiding this comment.
probably log the tool call so we can create an issue for it
There was a problem hiding this comment.
@franciscojavierarceo by log do you mean like record a cassettes with some request, response payloads using codex cli?
There was a problem hiding this comment.
oh no i meant we can emit a log on the server so the inference request can re read and reused for debugging.
4ec3d54 to
2a6f2ab
Compare
Signed-off-by: haoshan98 <haoshanw@gmail.com>
This reverts commit e1852fc. Signed-off-by: haoshan98 <haoshanw@gmail.com>
Signed-off-by: haoshan98 <haoshanw@gmail.com>
2a6f2ab to
80fe882
Compare
## Summary
Implements PR A from the [tool framework
design](docs/design/tool-framework.md):
- **`ResponsesTool`** — replaces `pub type ResponsesTool = FunctionTool`
alias with a `#[serde(tag = "type")]` tagged enum covering all five
Responses API tool types (`function`, `mcp`, `web_search_preview`,
`file_search`, `code_interpreter`). Wire-compatible: existing
`{"type":"function",...}` requests deserialize unchanged. Marked
`#[non_exhaustive]` for future API additions.
- **`ToolRegistry`** — request-scoped `HashMap<name → ToolEntry>` built
from `RequestPayload.tools` + any server-discovered tools. Provides
`gateway_owned` / `client_owned` split for post-inference dispatch
routing (PR B).
- **`ToolHandler` trait** — `validate / discover / normalize / execute`
lifecycle. `Send + Sync` for `Arc<dyn ToolHandler>`. Default
`discover()` is a no-op; `execute()` must be implemented per handler
type.
- **`FunctionHandler`** — trivial `ToolHandler` impl for `type:
"function"` tools. Normalizes to `FunctionTool`, never gateway-executes
(client-owned).
- **normalize pipeline** — `ResponsesTool::to_function_tool()` wired
into `RequestPayload::to_upstream_request()`. vLLM always receives
`Vec<FunctionTool>`, never raw `ResponsesTool` variants. Function tools
pass through unchanged; gateway-managed types (`mcp`,
`web_search_preview`, `file_search`, `code_interpreter`) are dropped
with a `tracing::debug!` until their handlers land in PR B/C.
- **`From<ToolOutput> for FunctionToolResultMessage`** — zero-friction
conversion for PR B's dispatch loop.
- `McpToolParam` includes `headers: Option<HashMap<String,String>>` for
per-server auth forwarding.
### What vLLM sees after this PR
| Client sends | vLLM receives |
|---|---|
| `{"type":"function","name":"foo",...}` |
`{"type":"function","name":"foo",...}` ✅ unchanged |
| `{"type":"mcp","server_label":"db",...}` | nothing — dropped until PR
C |
| `{"type":"web_search_preview"}` | nothing — dropped until PR C |
Gateway-type tools are parsed, validated, and stored in
`ResponseMetadata.effective_tools` for rehydration across turns — they
just don't reach vLLM until real `normalize()` implementations land.
No execution logic in this PR: no `dispatch_tools`, no `LoopDecision`,
no changes to `engine.rs` / `rehydrate.rs` / `persist.rs`.
Depends on: #64 (io types refactor — merged). Unblocks: PR B (dispatch),
PR C (MCP handler), CustomCodexTool (#68).
## Test Plan
```bash
cargo build --workspace # 0 errors
cargo clippy --workspace --all-targets -- -D warnings # 0 warnings
cargo test --workspace # 246 tests pass
```
Key new tests in `types::tools::tests`:
- All 5 `ResponsesTool` variants round-trip through serde
- `ToolRegistry` lookup, `gateway_owned` / `client_owned` split,
`allowed_tools` filtering, duplicate-name warning, empty-name skip
- `FunctionHandler::validate()` rejects empty names
- `to_upstream_request()` with mixed tools — only function tool reaches
vLLM
- All-gateway tools → `tools: null` (empty array never sent to vLLM)
---------
Signed-off-by: Ashwin Giridharan <girida@amazon.com>
No description provided.