feat(go/anthropic): add prompt caching support via message metadata#5103
mikewiacek wants to merge 2 commits into genkit-ai:main from
Conversation
Enable Anthropic's prompt caching feature for system messages. When a
system message has Metadata["cache"] = map[string]any{"type": "ephemeral"},
the corresponding TextBlockParam gets CacheControl set to ephemeral.
This follows the same pattern as the googlegenai plugin's cache support
(which uses Metadata["cache"]["ttlSeconds"]).
Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking.
Changes:
- toAnthropicRequest: check system message metadata for cache config
- toGenkitResponse: add cache creation tokens to Custom metrics
Code Review
This pull request implements prompt caching for Anthropic system messages and adds tracking for cache creation tokens within the model response usage metrics. The review feedback recommends simplifying the metadata parsing logic to improve readability and support generic TTL fields, as well as ensuring the custom metrics map is updated safely without overwriting existing entries.
```go
if message.Metadata != nil {
	if cacheVal, ok := message.Metadata["cache"]; ok && cacheVal != nil {
		if cacheMap, ok := cacheVal.(map[string]any); ok {
			if cacheType, ok := cacheMap["type"].(string); ok && cacheType == "ephemeral" {
				block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
			}
		}
	}
}
```
The metadata check can be simplified and should support the ttlSeconds field used by the WithCacheTTL helper in the ai package. This ensures consistency across different model providers as mentioned in the PR description. Additionally, the nested if statements can be flattened for better readability.
Suggested change:

```diff
-if message.Metadata != nil {
-	if cacheVal, ok := message.Metadata["cache"]; ok && cacheVal != nil {
-		if cacheMap, ok := cacheVal.(map[string]any); ok {
-			if cacheType, ok := cacheMap["type"].(string); ok && cacheType == "ephemeral" {
-				block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
-			}
-		}
-	}
-}
+if cache, ok := message.Metadata["cache"].(map[string]any); ok {
+	// Anthropic only supports "ephemeral" caching. We enable it if explicitly
+	// requested or if the generic "ttlSeconds" metadata is present.
+	t, _ := cache["type"].(string)
+	_, hasTTL := cache["ttlSeconds"]
+	if t == "ephemeral" || hasTTL {
+		block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
+	}
+}
```
Applied — simplified the metadata check and added ttlSeconds support for cross-provider consistency. Thanks for the suggestion.
```go
r.Usage.Custom = map[string]float64{
	"cacheCreationInputTokens": float64(m.Usage.CacheCreationInputTokens),
}
```
It is safer to initialize the Custom map only if it's nil and then set the specific key, rather than overwriting the entire map. This prevents accidentally losing other custom metrics that might be added in the future or by middleware.
```go
if r.Usage.Custom == nil {
	r.Usage.Custom = make(map[string]float64)
}
r.Usage.Custom["cacheCreationInputTokens"] = float64(m.Usage.CacheCreationInputTokens)
```
Applied — using nil check + key set instead of overwriting the map. Good catch.
Enable Anthropic's prompt caching feature for system messages in the Go plugin. When a system message has `Metadata["cache"] = map[string]any{"type": "ephemeral"}`, the corresponding `TextBlockParam` gets `CacheControl` set to ephemeral.

This follows the same pattern as the googlegenai plugin's cache support (which uses `Metadata["cache"]["ttlSeconds"]`), giving Go developers a consistent API for enabling caching across providers.

Also tracks `CacheCreationInputTokens` in `Usage.Custom` for cost tracking (cache reads were already tracked via `CachedContentTokens`).

Fixes: #817

Changes:
- `toAnthropicRequest`: check system message `Metadata` for cache config, set `CacheControl` on matching `TextBlockParam`
- `toGenkitResponse`: add `CacheCreationInputTokens` to `Usage.Custom` metrics

Usage example:
Anthropic's prompt caching reduces input token costs by up to 90% for repeated system prompts (cache reads are $0.30/MTok vs $3/MTok for Sonnet). Unlike Gemini's 32K-token minimum for context caching, Anthropic's minimum cacheable prompt length is much smaller (1,024 tokens for most Claude models).
Checklist: