refactor: split `executor/engine.rs` and eliminate `ExecutionContext` clone per request by maralbahari · Pull Request #75 · vllm-project/agentic-api

maralbahari · 2026-06-23T08:17:53Z

Summary

`ExecutionContext` owns long-lived shared state: the DB connection pool, the HTTP client, and the server config. Cloning it per request means each clone holds independent references to those resources, making it easy for future fields to silently diverge between the shared instance and the per-request copy. This refactor removes `client_auth` (the only per-request field) from `ExecutionContext`, making the struct purely shared and immutable for the server lifetime.

- `client_auth: Option<String>` removed from `ExecutionContext`. It is now supplied per request via `ExecuteRequest::with_auth()` and threaded as a parameter through `run_blocking` and `run_stream`.
- `ExecuteRequest` builder added: `ExecuteRequest::new(payload, exec_ctx).with_auth(token).run().await`. The boundary between shared config and per-request state is now explicit in the type.
- `resolve_exec_ctx` in agentic-server/handlers/common.rs deleted. It is replaced by `extract_bearer(headers, config_key)` and a direct `ExecuteRequest` construction, with zero clones per request. `extract_bearer` returns the request-level bearer token if present and otherwise falls back to `openai_api_key` from `AppState`.
- `execute()` kept as a one-liner shim over `ExecuteRequest` so existing call sites compile without changes.
- `engine.rs` split into four focused modules: `inference.rs` (HTTP transport), `rehydrate.rs` (history rehydration), `persist.rs` (response persistence), and `engine.rs` (orchestration only). All public names are re-exported from `executor/mod.rs`, so no import paths changed.
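To make the shared versus per-request boundary concrete, here is a minimal sketch of the shape described above. It is not the code from this PR: the struct fields, the `Option<String>` argument to `with_auth`, the synchronous `run()`, and the header map with lowercase keys are all assumptions chosen so the example compiles with the standard library alone. Only the names `ExecutionContext`, `ExecuteRequest`, `with_auth`, `run`, and `extract_bearer` come from the description.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the shared, immutable server state.
/// The real struct holds a DB pool, an HTTP client, and server config.
struct ExecutionContext {
    inference_url: String,
}

/// Per-request state lives on the request, not on `ExecutionContext`.
struct ExecuteRequest {
    payload: String,
    exec_ctx: Arc<ExecutionContext>, // shared handle, never a deep copy
    client_auth: Option<String>,     // the only per-request field
}

impl ExecuteRequest {
    fn new(payload: impl Into<String>, exec_ctx: Arc<ExecutionContext>) -> Self {
        Self { payload: payload.into(), exec_ctx, client_auth: None }
    }

    /// Attach the per-request bearer token, if any (assumed signature).
    fn with_auth(mut self, token: Option<String>) -> Self {
        self.client_auth = token;
        self
    }

    /// Synchronous placeholder for the real async `run()`.
    /// It only reports what would be sent.
    fn run(self) -> String {
        let auth = self.client_auth.as_deref().unwrap_or("<none>");
        format!(
            "POST {} auth={} body={}",
            self.exec_ctx.inference_url, auth, self.payload
        )
    }
}

/// Return the request-level bearer token if present, else the configured key.
/// Headers are modelled as a plain map with lowercase keys (an assumption).
fn extract_bearer(
    headers: &HashMap<String, String>,
    config_key: Option<&str>,
) -> Option<String> {
    headers
        .get("authorization")
        .and_then(|value| value.strip_prefix("Bearer "))
        .map(str::trim)
        .filter(|token| !token.is_empty())
        .map(str::to_owned)
        .or_else(|| config_key.map(str::to_owned))
}
```

In this sketch a handler would call `extract_bearer` once, then build `ExecuteRequest::new(payload, Arc::clone(&ctx)).with_auth(token)`. Cloning the `Arc` only bumps a reference count, so every request observes the same `ExecutionContext` instance.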

Test Plan

- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo test -- --test-threads=$(nproc)`: 201 tests pass, 0 failed

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

ashwing

run_blocking and run_stream are private which is fine — rehydrate_conversation, call_inference, and persist_response are pub so Praxis filters can still compose steps individually. One thing worth thinking about before tool dispatch lands: ExecuteRequest::run() is currently a straight pipeline. Tool calls need to re-enter inference after execution, so run() will eventually need to loop or accept some kind of decision callback. Might be worth leaving that door open in the shape of ExecuteRequest now rather than having to reshape a stable API later.
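For readers unfamiliar with the executor, the "straight pipeline" mentioned here can be pictured roughly as follows. This is an illustrative sketch only: the step names come from the comment above, but every signature, the in-memory `Vec<String>` store, and the echoing inference stub are assumptions, and the real functions are async and take the shared context.

```rust
/// Hypothetical step 1: rebuild the message list from stored history
/// plus the new input.
fn rehydrate_conversation(history: &[String], input: &str) -> Vec<String> {
    let mut messages = history.to_vec();
    messages.push(input.to_owned());
    messages
}

/// Hypothetical step 2: stand-in for the HTTP call to the inference
/// backend. It simply echoes the last message.
fn call_inference(messages: &[String]) -> String {
    let last = messages.last().map(String::as_str).unwrap_or("");
    format!("echo: {last}")
}

/// Hypothetical step 3: record the response in an in-memory store.
fn persist_response(store: &mut Vec<String>, response: &str) {
    store.push(response.to_owned());
}

/// Straight pipeline: each step runs exactly once, in order, with no
/// re-entry into inference.
fn run_blocking(history: &[String], input: &str, store: &mut Vec<String>) -> String {
    let messages = rehydrate_conversation(history, input);
    let response = call_inference(&messages);
    persist_response(store, &response);
    response
}
```

Because the three steps are plain functions, a caller can also invoke them individually and insert its own logic between them, which is the composition the comment refers to.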

maralbahari · 2026-06-29T09:43:56Z

@ashwing Thank you for the feedback. I think we should modify APIs as needed when the feature actually lands. Right now we have no tool calls, so any shape we add to ExecuteRequest would be based on guesses about how tool dispatch will work in practice. Those guesses will almost certainly be wrong once we see the real requirements.
Adding a callback or loop shape today means future implementors have to work around our speculation rather than design from real constraints. The "open door" would end up being dead weight or worse, the wrong door entirely.
When tool execution is actually implemented, reshaping run() is cheap. ExecuteRequest is not a public stable API with many external callers. We can make any structural changes the real implementation needs at that point.

ashwing · 2026-06-29T17:43:48Z

Makes sense — no point adding shape we'll likely have to undo. Will revisit when tool dispatch is real.

maralbahari · 2026-07-01T06:20:24Z

@franciscojavierarceo @leseb ready for review.

…esolution) PR vllm-project#75 split engine.rs into inference.rs, persist.rs, rehydrate.rs and removed client_auth from ExecutionContext::new(). Update executor/mod.rs re-exports and test helpers accordingly. Signed-off-by: Ashwin Giridharan <girida@amazon.com>

Two new integration tests in dispatch_loop_test.rs:
- test_cassette_openai_parallel_two_fcs_in_output: verifies the openai_responses_tool_calls_parallel cassette produces exactly 2 FunctionCall items (get_job_status + web_search). It is the only cassette with confirmed parallel FCs, so it proves the accumulator handles genuine parallel output.
- test_cassette_tool_output_only_items_input_path: exercises the ResponsesInput::Items input path. Turn 1 uses Text input and extracts the resulting FC call_id, then turn 2 starts with function_call_output as Items input. Validates that the accumulator handles item-list input correctly end-to-end, a path all other cassette tests skip.

Also adapts ExecutionContext::new() calls in unit tests to the 4-arg signature introduced by PR vllm-project#75 (client_auth removed from constructor). Signed-off-by: Ashwin Giridharan <girida@amazon.com>

maralbahari and others added 3 commits June 23, 2026 14:20

refactor: split and eliminate clone per request

34e8579

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: maral <maralbahari.98@gmail.com>

fix proxy_bench after refactor

d245747

Signed-off-by: maral <maralbahari.98@gmail.com>

Merge remote-tracking branch 'origin/main' into refactor-stateful-exe…

adec7f7

maralbahari marked this pull request as ready for review June 24, 2026 03:07

maralbahari requested review from bbrowning, franciscojavierarceo, jiahuei, leseb, noobHappylife, qandrew and tjtanaa as code owners June 24, 2026 03:07

ashwing reviewed Jun 28, 2026

Comment thread crates/agentic-server/src/handler.rs Outdated

ashwing reviewed Jun 28, 2026

Comment thread crates/agentic-core/src/executor/request.rs

ashwing reviewed Jun 28, 2026

Comment thread crates/agentic-server/tests/common/mod.rs

maralbahari added 2 commits June 29, 2026 16:35

Merge remote-tracking branch 'origin/main' into refactor-stateful-exe…

58cb8bf

Signed-off-by: maral <maralbahari.98@gmail.com>

add key for fallback

79876e3

Signed-off-by: maral <maralbahari.98@gmail.com>

maralbahari added 2 commits June 30, 2026 20:22

Merge remote-tracking branch 'origin/main' into refactor-stateful-exe…

3ca462c

Signed-off-by: maral <maralbahari.98@gmail.com>

Merge remote-tracking branch 'origin/main' into refactor-stateful-exe…

25b123a

leseb approved these changes Jul 2, 2026

leseb merged commit a15e762 into vllm-project:main Jul 2, 2026
3 checks passed

leseb merged 7 commits into vllm-project:main from EmbeddedLLM:refactor-stateful-executor
