fix(kiro): reuse Anthropic preflight for direct upstream#59
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a shared preflight preprocessing module for Anthropic-compatible requests, transitioning from raw JSON value manipulation to typed MessagesRequest payloads during upstream dispatch. The review feedback highlights two key performance optimizations in the new anthropic_upstream_payload module: first, avoiding parsing the entire request body into a generic serde_json::Value before converting it to MessagesRequest, and second, using std::borrow::Cow to prevent unnecessary cloning of the request payload when no model mapping is applied.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| pub(super) fn prepare_direct_anthropic_payload( | ||
| body: &[u8], | ||
| ) -> Result<DirectAnthropicPreparedPayload, DirectAnthropicPayloadError> { | ||
| let raw = serde_json::from_slice::<Value>(body) | ||
| .map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?; | ||
| let original_model = raw | ||
| .get("model") | ||
| .and_then(Value::as_str) | ||
| .map(str::trim) | ||
| .filter(|value| !value.is_empty()) | ||
| .ok_or(DirectAnthropicPayloadError::MissingModel)? | ||
| .to_string(); | ||
| let request = serde_json::from_value::<MessagesRequest>(raw) | ||
| .map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?; | ||
| let preflight = preprocess_messages_request(&request) | ||
| .map_err(|err| DirectAnthropicPayloadError::InvalidRequest(err.to_string()))?; | ||
|
|
||
| Ok(DirectAnthropicPreparedPayload { | ||
| original_model, | ||
| preflight, | ||
| }) | ||
| } |
There was a problem hiding this comment.
Parsing the entire request body (which can be up to 32MB) into a generic serde_json::Value and then converting it to MessagesRequest is highly inefficient and causes excessive memory allocations on a performance-critical hot path. We can optimize this by deserializing only the model field first using a lightweight helper struct, and then deserializing the full MessagesRequest directly from the byte slice.
pub(super) fn prepare_direct_anthropic_payload(
body: &[u8],
) -> Result<DirectAnthropicPreparedPayload, DirectAnthropicPayloadError> {
#[derive(serde::Deserialize)]
struct ModelOnly {
model: Option<serde_json::Value>,
}
let model_only: ModelOnly = serde_json::from_slice(body)
.map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
let model_val = model_only.model.ok_or(DirectAnthropicPayloadError::MissingModel)?;
let original_model = model_val
.as_str()
.map(str::trim)
.filter(|value| !value.is_empty())
.ok_or(DirectAnthropicPayloadError::MissingModel)?
.to_string();
let request = serde_json::from_slice::<MessagesRequest>(body)
.map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
let preflight = preprocess_messages_request(&request)
.map_err(|err| DirectAnthropicPayloadError::InvalidRequest(err.to_string()))?;
Ok(DirectAnthropicPreparedPayload {
original_model,
preflight,
})
}| fn apply_model_mapping_to_request( | ||
| model_name_map_json: &str, | ||
| payload: &mut MessagesRequest, | ||
| ) -> anyhow::Result<Option<String>> { | ||
| let trimmed = model_name_map_json.trim(); | ||
| if trimmed.is_empty() || trimmed == "{}" { | ||
| return Ok(None); | ||
| } | ||
| let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?; | ||
| let Some(target) = map.get(&payload.model).cloned() else { | ||
| return Ok(None); | ||
| }; | ||
| if target == payload.model { | ||
| return Ok(None); | ||
| } | ||
| payload.model = target.clone(); | ||
| Ok(Some(target)) | ||
| } | ||
|
|
||
| pub(super) fn build_route_payload( | ||
| model_name_map_json: &str, | ||
| payload: &MessagesRequest, | ||
| ) -> anyhow::Result<(MessagesRequest, Option<String>)> { | ||
| let mut route_payload = payload.clone(); | ||
| let mapped_model = apply_model_mapping_to_request(model_name_map_json, &mut route_payload)?; | ||
| Ok((route_payload, mapped_model)) | ||
| } |
There was a problem hiding this comment.
The current implementation of build_route_payload unconditionally clones the entire MessagesRequest (which can be very large due to conversation history, system prompts, and tool definitions) for every route in the loop, even when no model mapping is applied. We can avoid this overhead by returning a std::borrow::Cow<'a, MessagesRequest> and only cloning/modifying the payload when a mapping is actually applied.
| fn apply_model_mapping_to_request( | |
| model_name_map_json: &str, | |
| payload: &mut MessagesRequest, | |
| ) -> anyhow::Result<Option<String>> { | |
| let trimmed = model_name_map_json.trim(); | |
| if trimmed.is_empty() || trimmed == "{}" { | |
| return Ok(None); | |
| } | |
| let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?; | |
| let Some(target) = map.get(&payload.model).cloned() else { | |
| return Ok(None); | |
| }; | |
| if target == payload.model { | |
| return Ok(None); | |
| } | |
| payload.model = target.clone(); | |
| Ok(Some(target)) | |
| } | |
| pub(super) fn build_route_payload( | |
| model_name_map_json: &str, | |
| payload: &MessagesRequest, | |
| ) -> anyhow::Result<(MessagesRequest, Option<String>)> { | |
| let mut route_payload = payload.clone(); | |
| let mapped_model = apply_model_mapping_to_request(model_name_map_json, &mut route_payload)?; | |
| Ok((route_payload, mapped_model)) | |
| } | |
| pub(super) fn build_route_payload<'a>( | |
| model_name_map_json: &str, | |
| payload: &'a MessagesRequest, | |
| ) -> anyhow::Result<(std::borrow::Cow<'a, MessagesRequest>, Option<String>)> { | |
| let trimmed = model_name_map_json.trim(); | |
| if trimmed.is_empty() || trimmed == "{}" { | |
| return Ok((std::borrow::Cow::Borrowed(payload), None)); | |
| } | |
| let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?; | |
| if let Some(target) = map.get(&payload.model) { | |
| if target != &payload.model { | |
| let mut route_payload = payload.clone(); | |
| route_payload.model = target.clone(); | |
| return Ok((std::borrow::Cow::Owned(route_payload), Some(target.clone()))); | |
| } | |
| } | |
| Ok((std::borrow::Cow::Borrowed(payload), None)) | |
| } |
Summary
Direct Anthropic upstream requests for Kiro-owned keys now reuse the shared Kiro Anthropic preflight normalization path before dispatch, and usage diagnostics now expose what that preflight changed.
Root cause
The direct Anthropic branch parsed requests as raw JSON and forwarded them after route-specific model mapping, bypassing the Kiro Anthropic request normalization that already handles malformed-but-recoverable request shapes such as invalid historical
tool_use.idvalues. Because the branch also only emitted basic route diagnostics, operators could not see when a direct request had been normalized before reaching upstream.What changed
anthropic::preflightmodule fromllm-access-kirothat returns a normalized AnthropicMessagesRequestplus normalization metadata.anthropic_upstream_payloadmodule inllm-accessfor direct request parsing, shared preflight invocation, and typed route model mapping.MessagesRequestinstead of mutating raw JSON.tool_use.idnormalization, diagnostics, and UI summary text.Test Plan
CARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p llm-access provider::anthropic_upstream_dispatch::tests --jobs 4 -- --nocaptureCARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p static-flow-frontend anthropic_routing_summary_includes_preflight_change_count --jobs 4 -- --nocaptureCARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p llm-access-kiro preflight_normalizes_tool_use_ids_for_shared_kiro_anthropic_surface --jobs 4 -- --nocaptureCARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo clippy -p llm-access-kiro -p llm-access -p static-flow-frontend --jobs 4 -- -D warnings