Skip to content

fix(kiro): reuse Anthropic preflight for direct upstream#59

Merged
acking-you merged 3 commits into
masterfrom
fix/kiro-direct-anthropic-preflight
Jun 30, 2026
Merged

fix(kiro): reuse Anthropic preflight for direct upstream#59
acking-you merged 3 commits into
masterfrom
fix/kiro-direct-anthropic-preflight

Conversation

@acking-you

@acking-you acking-you commented Jun 30, 2026

Copy link
Copy Markdown
Owner

Summary

Direct Anthropic upstream requests for Kiro-owned keys now reuse the shared Kiro Anthropic preflight normalization path before dispatch, and usage diagnostics now expose what that preflight changed.

Root cause

The direct Anthropic branch parsed requests as raw JSON and forwarded them after route-specific model mapping, bypassing the Kiro Anthropic request normalization that already handles malformed-but-recoverable request shapes such as invalid historical tool_use.id values. Because the branch also only emitted basic route diagnostics, operators could not see when a direct request had been normalized before reaching upstream.

What changed

  • Exposed a focused anthropic::preflight module from llm-access-kiro that returns a normalized Anthropic MessagesRequest plus normalization metadata.
  • Added a small anthropic_upstream_payload module in llm-access for direct request parsing, shared preflight invocation, and typed route model mapping.
  • Updated direct Anthropic dispatch to forward the preprocessed MessagesRequest instead of mutating raw JSON.
  • Added direct Anthropic routing diagnostics with preflight counts, capped rewrite/event details, and tool schema summary.
  • Added conditional structured logs when direct Anthropic preflight performs an actual normalization.
  • Updated the Kiro admin usage summary to show direct Anthropic preflight change counts from routing diagnostics.
  • Added regression coverage for invalid tool_use.id normalization, diagnostics, and UI summary text.

Test Plan

  • CARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p llm-access provider::anthropic_upstream_dispatch::tests --jobs 4 -- --nocapture
  • CARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p static-flow-frontend anthropic_routing_summary_includes_preflight_change_count --jobs 4 -- --nocapture
  • CARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo test -p llm-access-kiro preflight_normalizes_tool_use_ids_for_shared_kiro_anthropic_surface --jobs 4 -- --nocapture
  • CARGO_TARGET_DIR=/mnt/wsl/data4tb/static-flow-data/cargo-target/static_flow cargo clippy -p llm-access-kiro -p llm-access -p static-flow-frontend --jobs 4 -- -D warnings

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a shared preflight preprocessing module for Anthropic-compatible requests, transitioning from raw JSON value manipulation to typed MessagesRequest payloads during upstream dispatch. The review feedback highlights two key performance optimizations in the new anthropic_upstream_payload module: first, avoiding parsing the entire request body into a generic serde_json::Value before converting it to MessagesRequest, and second, using std::borrow::Cow to prevent unnecessary cloning of the request payload when no model mapping is applied.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +17 to +38
pub(super) fn prepare_direct_anthropic_payload(
body: &[u8],
) -> Result<DirectAnthropicPreparedPayload, DirectAnthropicPayloadError> {
let raw = serde_json::from_slice::<Value>(body)
.map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
let original_model = raw
.get("model")
.and_then(Value::as_str)
.map(str::trim)
.filter(|value| !value.is_empty())
.ok_or(DirectAnthropicPayloadError::MissingModel)?
.to_string();
let request = serde_json::from_value::<MessagesRequest>(raw)
.map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
let preflight = preprocess_messages_request(&request)
.map_err(|err| DirectAnthropicPayloadError::InvalidRequest(err.to_string()))?;

Ok(DirectAnthropicPreparedPayload {
original_model,
preflight,
})
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Parsing the entire request body (which can be up to 32MB) into a generic serde_json::Value and then converting it to MessagesRequest is highly inefficient and causes excessive memory allocations on a performance-critical hot path. We can optimize this by deserializing only the model field first using a lightweight helper struct, and then deserializing the full MessagesRequest directly from the byte slice.

pub(super) fn prepare_direct_anthropic_payload(
    body: &[u8],
) -> Result<DirectAnthropicPreparedPayload, DirectAnthropicPayloadError> {
    #[derive(serde::Deserialize)]
    struct ModelOnly {
        model: Option<serde_json::Value>,
    }

    let model_only: ModelOnly = serde_json::from_slice(body)
        .map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
    let model_val = model_only.model.ok_or(DirectAnthropicPayloadError::MissingModel)?;
    let original_model = model_val
        .as_str()
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .ok_or(DirectAnthropicPayloadError::MissingModel)?
        .to_string();

    let request = serde_json::from_slice::<MessagesRequest>(body)
        .map_err(|_| DirectAnthropicPayloadError::InvalidPayload)?;
    let preflight = preprocess_messages_request(&request)
        .map_err(|err| DirectAnthropicPayloadError::InvalidRequest(err.to_string()))?;

    Ok(DirectAnthropicPreparedPayload {
        original_model,
        preflight,
    })
}

Comment on lines +67 to +93
fn apply_model_mapping_to_request(
model_name_map_json: &str,
payload: &mut MessagesRequest,
) -> anyhow::Result<Option<String>> {
let trimmed = model_name_map_json.trim();
if trimmed.is_empty() || trimmed == "{}" {
return Ok(None);
}
let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?;
let Some(target) = map.get(&payload.model).cloned() else {
return Ok(None);
};
if target == payload.model {
return Ok(None);
}
payload.model = target.clone();
Ok(Some(target))
}

pub(super) fn build_route_payload(
model_name_map_json: &str,
payload: &MessagesRequest,
) -> anyhow::Result<(MessagesRequest, Option<String>)> {
let mut route_payload = payload.clone();
let mapped_model = apply_model_mapping_to_request(model_name_map_json, &mut route_payload)?;
Ok((route_payload, mapped_model))
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of build_route_payload unconditionally clones the entire MessagesRequest (which can be very large due to conversation history, system prompts, and tool definitions) for every route in the loop, even when no model mapping is applied. We can avoid this overhead by returning a std::borrow::Cow<'a, MessagesRequest> and only cloning/modifying the payload when a mapping is actually applied.

Suggested change
fn apply_model_mapping_to_request(
model_name_map_json: &str,
payload: &mut MessagesRequest,
) -> anyhow::Result<Option<String>> {
let trimmed = model_name_map_json.trim();
if trimmed.is_empty() || trimmed == "{}" {
return Ok(None);
}
let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?;
let Some(target) = map.get(&payload.model).cloned() else {
return Ok(None);
};
if target == payload.model {
return Ok(None);
}
payload.model = target.clone();
Ok(Some(target))
}
pub(super) fn build_route_payload(
model_name_map_json: &str,
payload: &MessagesRequest,
) -> anyhow::Result<(MessagesRequest, Option<String>)> {
let mut route_payload = payload.clone();
let mapped_model = apply_model_mapping_to_request(model_name_map_json, &mut route_payload)?;
Ok((route_payload, mapped_model))
}
pub(super) fn build_route_payload<'a>(
model_name_map_json: &str,
payload: &'a MessagesRequest,
) -> anyhow::Result<(std::borrow::Cow<'a, MessagesRequest>, Option<String>)> {
let trimmed = model_name_map_json.trim();
if trimmed.is_empty() || trimmed == "{}" {
return Ok((std::borrow::Cow::Borrowed(payload), None));
}
let map = serde_json::from_str::<BTreeMap<String, String>>(trimmed)?;
if let Some(target) = map.get(&payload.model) {
if target != &payload.model {
let mut route_payload = payload.clone();
route_payload.model = target.clone();
return Ok((std::borrow::Cow::Owned(route_payload), Some(target.clone())));
}
}
Ok((std::borrow::Cow::Borrowed(payload), None))
}

@acking-you acking-you merged commit 822bb14 into master Jun 30, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant