feat: add Codex-compatible GET v1/models and WebSocket session bug fixes #79
Merged
franciscojavierarceo merged 2 commits on Jun 30, 2026
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>
…t requests instead of closing connection Signed-off-by: maral <maralbahari.98@gmail.com>
franciscojavierarceo approved these changes on Jun 30, 2026
ashwing added a commit to ashwing/agentic-api that referenced this pull request on Jun 30, 2026
Rebased on main after vllm-project#79 and vllm-project#81 merged.

Adds src/tool/, the behavioral layer that complements the wire types already in types/tools/ (merged via PR vllm-project#79):
- tool/handler.rs: ToolHandler trait, ToolOutput, ToolError
- tool/registry.rs: ToolType, ToolEntry, ToolRegistry::build/lookup/etc
- tool/function.rs: FunctionHandler + From<&FunctionToolParam> for FunctionTool
- tool/normalize.rs: ResponsesTool::to_function_tool(), From<ToolOutput>

Also adds types/tools/ wire types (params.rs with ResponsesTool enum, param structs, NonEmptyToolName), EmptyToolNameError, and wires normalize into RequestPayload::to_upstream_request() so vLLM always receives Vec<FunctionTool>.

12 cassette-based tests in tool_normalization_test.rs validate the full pipeline against real multi-turn tool-call cassettes.

Addresses all PR vllm-project#80 review feedback:
- types/ → wire shapes only; tool/ → behaviors
- From<&FunctionToolParam> for FunctionTool (typed conversion)
- MCP registry entries deferred to PR C (discovery not yet wired)
- EmptyToolNameError in types/ (no cross-layer import)
- ToolOutput derives Debug + Clone

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Summary
handler.rs had grown to 630 lines mixing WebSocket logic, HTTP handlers, proxy utilities, and model transformation code in a single file. This PR splits it into a focused module tree and adds a Codex CLI-compatible /v1/models endpoint.

GET /v1/models: Codex CLI compatibility

- proxy_get(path, headers, state) to agentic-core/proxy.rs: a GET variant of proxy_request that applies the same header filtering and auth injection.
- ?client_version=<ver> query param (typed ModelsParams extractor).
- proxy_get.
- { "object": "list", "data": [...] } transformed to { "models": [...] } with the full ModelInfo shape Codex expects (slug, context_window, auto_review_model_override, apply_patch_tool_type, capabilities, etc.).
- ModelInfo fields built once via OnceLock and cloned per model; only the five per-model fields are patched at response time.

WebSocket session bug fixes
Identified and fixed two bugs in websocket/responses.rs that prevented Codex CLI from persisting history to the database:

- Pipelined requests caused connection resets: Codex sends the next response.create on the same WebSocket connection while the current stream is still active. The old handler returned a ConcurrentMessage error and closed the connection, forcing Codex to reconnect on every turn. Fixed by introducing a VecDeque queue: incoming requests that arrive mid-stream are enqueued and processed in order after the current stream completes. The ConcurrentMessage error variant was removed as it is no longer reachable.
- store: false bypassed the database: Codex CLI explicitly sends "store": false in every request body, which caused the gateway to skip persistence entirely and proxy straight to vLLM. Fixed by forcing payload.store = true on the WebSocket path; the gateway is the stateful layer and should always persist regardless of what the client sends.

Together these fixes ensure every completed Codex turn is written to the DB with its full SSE history, and that multi-turn conversations chain correctly via previous_response_id.

Handler reorganization
handler.rs deleted. Replaced by handler/ module tree:

- common.rs: shared utilities used across handlers: convert_response, executor_error_response, read_bytes, resolve_exec_ctx, sse_response
- http/conversations.rs: POST /v1/conversations
- http/models.rs: GET /health, GET /ready, GET /v1/models + Codex model transform logic
- http/responses.rs: POST /v1/responses (proxy and stateful paths)
- websocket/responses.rs: WebSocket /v1/responses handler and streaming loop
- websocket/error.rs: WsError enum with status/code/frame helpers

handler/mod.rs re-exports all public handlers; no import paths in app.rs changed.

Codex CLI setup
Create $CODEX_HOME/config.toml (e.g. ~/.codex/config.toml) pointing at agentic-api.

Run with:

codex --disable image_generation -c model_provider=agentic-api -m "model_name"

Test Plan
- cargo clippy --all-targets -- -D warnings: clean
- cargo test: all tests pass, 0 failed
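Note on the GET /v1/models transform: the following is a minimal, std-only sketch of the "build once via OnceLock, clone and patch per model" approach described in the summary. It is not the code in this PR. The UpstreamModel struct, its max_model_len field, the template() and to_codex_models() helpers, the placeholder default value, and the choice of which fields are patched are all assumptions made for illustration; the real ModelInfo has more fields and patches five of them.

```rust
use std::sync::OnceLock;

/// Hypothetical subset of one upstream list entry (one element of "data").
struct UpstreamModel {
    id: String,
    max_model_len: u64, // assumed name for the upstream context-length field
}

/// Hypothetical subset of the ModelInfo shape that Codex expects.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    slug: String,                               // per-model (assumed)
    context_window: u64,                        // per-model (assumed)
    auto_review_model_override: Option<String>, // shared default
    apply_patch_tool_type: String,              // shared default, placeholder value
}

static TEMPLATE: OnceLock<ModelInfo> = OnceLock::new();

/// Builds the shared defaults on first use and reuses them afterwards.
fn template() -> &'static ModelInfo {
    TEMPLATE.get_or_init(|| ModelInfo {
        slug: String::new(),
        context_window: 0,
        auto_review_model_override: None,
        apply_patch_tool_type: "placeholder".to_string(),
    })
}

/// Maps the upstream "data" entries to the entries of the "models" array.
fn to_codex_models(data: &[UpstreamModel]) -> Vec<ModelInfo> {
    data.iter()
        .map(|m| {
            // Clone the shared template, then patch only the per-model fields.
            let mut info = template().clone();
            info.slug = m.id.clone();
            info.context_window = m.max_model_len;
            info
        })
        .collect()
}

fn main() {
    let upstream = vec![
        UpstreamModel { id: "model-a".to_string(), max_model_len: 8192 },
        UpstreamModel { id: "model-b".to_string(), max_model_len: 32768 },
    ];
    for info in to_codex_models(&upstream) {
        println!("{} -> context_window {}", info.slug, info.context_window);
    }
}
```

Cloning a prebuilt template keeps the per-request work proportional to the number of models and keeps the shared defaults in one place.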
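Note on the WebSocket session fixes: the two behaviours described under "WebSocket session bug fixes" can be sketched roughly as below. This is a simplified, synchronous model and not the handler in websocket/responses.rs. The Session struct, the reduced RequestPayload, and the on_request / on_stream_complete methods are hypothetical names introduced for illustration, and persistence is reduced to recording an id.

```rust
use std::collections::VecDeque;

/// Hypothetical, reduced stand-in for a parsed response.create payload.
#[derive(Debug, Clone, PartialEq)]
struct RequestPayload {
    id: u32,
    store: bool,
}

/// Hypothetical session state: one active stream plus a FIFO of waiting requests.
#[derive(Default)]
struct Session {
    active: Option<RequestPayload>,
    pending: VecDeque<RequestPayload>,
    persisted: Vec<u32>, // stands in for rows written to the database
}

impl Session {
    /// Called for every incoming response.create frame.
    fn on_request(&mut self, mut payload: RequestPayload) {
        // The gateway is the stateful layer, so the client's store flag is overridden.
        payload.store = true;
        if self.active.is_some() {
            // A stream is still running: enqueue instead of rejecting the
            // message and closing the connection.
            self.pending.push_back(payload);
        } else {
            self.active = Some(payload);
        }
    }

    /// Called when the current upstream stream has finished.
    fn on_stream_complete(&mut self) {
        if let Some(done) = self.active.take() {
            if done.store {
                self.persisted.push(done.id);
            }
        }
        // Start the next queued request, if any, in arrival order.
        self.active = self.pending.pop_front();
    }
}

fn main() {
    let mut session = Session::default();
    session.on_request(RequestPayload { id: 1, store: false });
    // Requests 2 and 3 arrive while request 1 is still streaming.
    session.on_request(RequestPayload { id: 2, store: false });
    session.on_request(RequestPayload { id: 3, store: false });

    while session.active.is_some() {
        session.on_stream_complete();
    }
    println!("{:?}", session.persisted); // prints [1, 2, 3]
}
```

Because a pipelined request is never rejected in this model, there is no state in which a concurrent-message error could be produced, which matches the removal of the ConcurrentMessage variant.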