feat: add dispatch_tools + execute_loop (PR B — tool dispatch layer) by ashwing · Pull Request #83 · vllm-project/agentic-api

ashwing · 2026-07-02T05:56:11Z

Summary

Adds the gateway-side agentic tool loop on top of the ToolRegistry and GatewayExecutor traits from PR A (#80). The executor can now run multi-turn inference with gateway-owned tool calls (MCP, web_search, file_search) dispatched between turns.

This is PR B in the tool framework sequence from the design doc (#67):

PR A feat: tool types + registry + ToolHandler trait (PR A) #80 ✅ — ToolType, ResponsesTool, ToolRegistry, ToolHandler, GatewayExecutor
PR B (this) — dispatch_tools, LoopDecision, execute_loop
PR C (next) — HttpMcpHandler: first real GatewayExecutor impl + MCP discovery

Changes

executor/dispatch.rs (new, 143 lines)

LoopDecision enum — #[non_exhaustive]: Continue(Vec<InputItem>) / Done / Incomplete(String)
dispatch_tools() — classifies FunctionToolCall items via ToolRegistry::gateway_owned(), executes gateway-owned calls in parallel with 30s per-call timeout, isolates failures as error-JSON FunctionCallOutput items (never aborts the loop on tool error)
Client-owned type: "function" calls return Done for MVP; RequiresAction is deferred per staging recommendation in docs: tool framework design — multi-type tool support #67

executor/agentic_loop.rs (new, 173 lines)

execute_loop() — multi-turn orchestrator: clears all three persistence triggers before looping, restores original IDs on the final payload, hard guard of 128 iterations (inclusive 0..=effective_max range so Incomplete fires correctly), rejects stream=true
Fixes orphaned FC history: only injects FunctionCall input items whose call_id has a matching FunctionCallOutput — unmatched pairs cause vLLM to reject the conversation history

executor/mod.rs / lib.rs — wired and re-exported

Test Plan

cargo test --workspace — 248 tests pass (0 failures)
cargo test --workspace -- --include-ignored — 252 tests pass (vLLM cassettes included)
cargo clippy --workspace --all-targets -- -D warnings — clean
cargo fmt --check — clean

Test files:

tests/dispatch_test.rs (new, 18 unit tests) — moved from dispatch.rs inline block:

All dispatch_tools edge cases: Done/Incomplete/Continue paths, parallel execution, error JSON isolation, call_id preservation, mixed gateway+client batches

executor/agentic_loop.rs (8 unit tests inline):

Streaming rejection, persistence trigger clearing, loop guard boundary, constant assertions

tests/dispatch_loop_test.rs (new, 12 integration tests via MockServer):

7 behavioral tests: no-tools baseline, 1+2 parallel gateway tools, max_iterations → incomplete, tool error recovery, prev_id/conv_id restoration
5 gpt-4o cassette tests (required, always run): parallel 2-FC, 3-turn FC-only, tool-output-only Items input, 5-turn, branch
4 vLLM cassette tests (#[ignore] — run with --include-ignored)

Shared test infrastructure — MockGatewayExecutor and FailingExecutor added to tests/support/mod.rs for reuse by future integration tests.

Notes

Relation to PR #82 (Maral's MCP design): PR #82 proposes embedding handler inside ToolEntry and adding ToolRegistry::dispatch(). This PR uses a separate HashMap<ToolType, Arc<dyn GatewayExecutor>> passed to dispatch_tools. These two dispatch models need alignment before either the MCP loop PR or PR C can proceed. Raised this directly in PR #82 comments.

MCP discovery: ToolRegistry::build() runs without tools/list discovery; MCP tool names are absent until PR C. Any function_call for an MCP tool name is treated as client-owned (skipped) — documented in execute_loop doc comment.

Persistence: execute_loop never writes to the DB. Caller is responsible for persist_response using the RequestContext from rehydrate_conversation.

RequiresAction / ContinuePartial: Deferred per Maral's feedback in #67. LoopDecision is #[non_exhaustive] so adding these variants later won't break existing match arms.

AI assistance was used in drafting this PR.

Implements the gateway-side agentic tool loop on top of the ToolRegistry and GatewayExecutor traits landed in PR A (vllm-project#80): - executor/dispatch.rs: LoopDecision enum (#[non_exhaustive]) + dispatch_tools() — classifies FunctionToolCall items via ToolRegistry::gateway_owned(), executes in parallel with 30s per-call timeout, maps failures to error-JSON FunctionCallOutput items (never aborts the loop on tool error). - executor/agentic_loop.rs: execute_loop() — multi-turn orchestrator that clears all three persistence triggers before looping and restores original IDs on the final payload. Rejects stream=true (StreamTee is a future PR). Hard guard of 128 iterations, soft cap via max_iterations param (default: 10). Client-owned function tools (ToolType::Function) return Done for now; RequiresAction and ContinuePartial are deferred per staging agreement in PR vllm-project#67 — LoopDecision is #[non_exhaustive] to make the addition safe. MCP tool names are absent from the registry until PR C adds discovery; any function_call for an MCP tool name is treated as client-owned. 244 tests pass; cargo clippy --workspace --all-targets -- -D warnings clean. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

- Fix loop guard bug: change `0..MAX_LOOP_GUARD` to `0..=effective_max` so dispatch_tools is called at iteration == effective_max and returns Incomplete rather than silently exiting with status "completed" - Add #[must_use] to dispatch_tools and execute_loop (api-must-use rule) - Fix incorrect # Errors doc on dispatch_tools (iteration limit returns Ok(Incomplete), not Err) - Correct persistence contract doc: RequestContext is not returned, so describe the caller's responsibility accurately - Fix mixed gateway+client tool history: only include FunctionCall input items whose call_id has a matching tool result, preventing orphaned function_call entries that most LLM backends reject - Remove dead `let _ = original_store` (Concern-4) - Fix line exceeding 120-char rustfmt limit in dispatch.rs Signed-off-by: Ashwin Giridharan <girida@amazon.com>

… + execute_loop Adds 28 new tests across three files: dispatch.rs (inline, +7 tests): - zero_max_iterations_returns_incomplete - mixed_batch_gateway_and_client_owned - incomplete_message_contains_iteration_counts - multiple_tool_types_dispatch_to_correct_executor - error_json_output_is_valid_json - call_id_preserved_in_output - output_with_only_non_fc_items_returns_done agentic_loop.rs (inline, +3 tests): - streaming_request_returns_err_without_llm_call - loop_guard_boundary_formula_is_correct - default_and_guard_constants_are_sane (const assert) tests/dispatch_loop_test.rs (integration, 10 tests): - No-tool baseline, one + two parallel gateway tools - max_iterations → Incomplete with 3 LLM calls verified - Tool error feeds error JSON to model; loop continues - prev_id and conv_id restored on final payload - 3 cassette tests: responses_tool_calls_{3turn,parallel,5turn} verify all function-type tools → Done in 1 LLM call, no executor invocations Also adds inline code comments to dispatch.rs and agentic_loop.rs explaining the clone rationale, inclusive loop bound, orphaned FC filtering, and payload stub initialisation. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

…esolution) PR vllm-project#75 split engine.rs into inference.rs, persist.rs, rehydrate.rs and removed client_auth from ExecutionContext::new(). Update executor/mod.rs re-exports and test helpers accordingly. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

Two new integration tests in dispatch_loop_test.rs: - test_cassette_openai_parallel_two_fcs_in_output: verifies the openai_responses_tool_calls_parallel cassette produces exactly 2 FunctionCall items (get_job_status + web_search) — the only cassette with confirmed parallel FCs. Proves the accumulator handles genuine parallel output. - test_cassette_tool_output_only_items_input_path: exercises the ResponsesInput::Items input path. Turn 1 uses Text input, extracts the resulting FC call_id, then turn 2 starts with function_call_output as Items input. Validates the accumulator handles item-list input correctly end-to-end — a path all other cassette tests skip. Also adapts ExecutionContext::new() calls in unit tests to the 4-arg signature introduced by PR vllm-project#75 (client_auth removed from constructor). Signed-off-by: Ashwin Giridharan <girida@amazon.com>

Two new cassette tests covering the remaining OpenAI-recorded scenarios: - test_cassette_openai_3turn_no_reasoning_fc_only: gpt-4o produces no reasoning items — FC is at output[0] directly. Confirms the accumulator handles FC-only output (no preceding Reasoning item) and terminates in 1 LLM call with the correct name. - test_cassette_openai_tool_output_only_items_input: gpt-4o variant of the tool-output-only scenario. Two-turn test: t1 Text input -> FC (no reasoning), t2 Items(function_call_output) input -> message. Validates Items input path against gpt-4o-recorded data, distinct from the vLLM variant which has reasoning in t1. All 6 OpenAI cassettes are now covered (parallel 2-FC, 3turn, tool-output- only). Streaming cassettes remain untested (blocked by StreamTee PR). Signed-off-by: Ashwin Giridharan <girida@amazon.com>

…red baseline The 4 vLLM-recorded cassette tests are now skipped by default with #[ignore]. The 3 gpt-4o cassette tests (openai_parallel_two_fcs, openai_3turn_no_reasoning, openai_tool_output_only) run on every `cargo test` as the required baseline. vLLM cassettes still pass — run with `--include-ignored` for full coverage (e.g. when validating against a live vLLM deployment). Signed-off-by: Ashwin Giridharan <girida@amazon.com>

All 5 non-streaming gpt-4o cassettes are now covered: - openai_parallel (2 FCs) ✓ existing - openai_3turn (1 FC, no reasoning) ✓ existing - openai_tool_output (Items input path) ✓ existing - openai_5turn (1 FC, t1 only) ✓ new - openai_branch (1 FC, t1 only) ✓ new openai_3turn_streaming is blocked (execute_loop rejects stream=true). vLLM cassettes remain present but #[ignore]d — run with --include-ignored. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

Three changes: 1. Move dispatch.rs inline tests → tests/dispatch_test.rs dispatch.rs was 624 lines (77% tests). Production module is now 144 lines of pure logic. 18 unit tests live in the dedicated file, consistent with engine.rs, registry.rs, inference.rs (all zero inline tests). 2. Add MockGatewayExecutor + FailingExecutor to tests/support/mod.rs Both were defined locally in dispatch_loop_test.rs. Moving them to the shared support module makes them available to any future integration test without redefinition. 3. Extract assert_single_fc_no_reasoning helper in dispatch_loop_test.rs The 3-turn, 5-turn, and branch gpt-4o cassette tests had identical assertion bodies (4 assertions each). One helper call replaces ~20 lines in each test. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

ashwing · 2026-07-02T23:40:05Z

cc @maralbahari @franciscojavierarceo — PR B from the tool framework design (#67). Builds on the types/registry from #80. The main open question is the dispatch model conflict with #82 (raised there already).

ashwing mentioned this pull request Jul 2, 2026

docs: MCP gateway integration design #82

Open

ashwing added 4 commits July 2, 2026 15:31

ashwing force-pushed the feat/tool-dispatch-loop branch from f66ae62 to fb075f3 Compare July 2, 2026 22:34

ashwing added 5 commits July 2, 2026 15:43

ashwing marked this pull request as ready for review July 2, 2026 23:36

ashwing requested review from bbrowning, franciscojavierarceo, jiahuei, leseb, maralbahari, noobHappylife, qandrew and tjtanaa as code owners July 2, 2026 23:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add dispatch_tools + execute_loop (PR B — tool dispatch layer)#83

feat: add dispatch_tools + execute_loop (PR B — tool dispatch layer)#83
ashwing wants to merge 9 commits into
vllm-project:mainfrom
ashwing:feat/tool-dispatch-loop

ashwing commented Jul 2, 2026 •

edited

Loading

Uh oh!

ashwing commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ashwing commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Test Plan

Notes

Uh oh!

ashwing commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ashwing commented Jul 2, 2026 •

edited

Loading