feat(go/anthropic): add prompt caching support via message metadata #5103

Open
mikewiacek wants to merge 2 commits into genkit-ai:main from mikewiacek:anthropic-cache-control
Conversation

@mikewiacek

Enable Anthropic's prompt caching feature for system messages in the Go plugin. When a system message has Metadata["cache"] = map[string]any{"type": "ephemeral"}, the corresponding TextBlockParam gets CacheControl set to ephemeral.

This follows the same pattern as the googlegenai plugin's cache support (which uses Metadata["cache"]["ttlSeconds"]), giving Go developers a consistent API for enabling caching across providers.

Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking (cache reads were already tracked via CachedContentTokens).

Fixes: #817

Changes:

  • toAnthropicRequest: check system message Metadata for cache config, set CacheControl on matching TextBlockParam
  • toGenkitResponse: add CacheCreationInputTokens to Usage.Custom metrics

Usage example:

resp, err := genkit.Generate(ctx, g,
    ai.WithMessages(&ai.Message{
        Role:    ai.RoleSystem,
        Content: []*ai.Part{ai.NewTextPart(systemPrompt)},
        Metadata: map[string]any{
            "cache": map[string]any{"type": "ephemeral"},
        },
    }),
    ai.WithPrompt(userPrompt),
    ai.WithModelName("anthropic/claude-sonnet-4-5-20250929"),
)
// resp.Usage.CachedContentTokens > 0 on cache hits
// resp.Usage.Custom["cacheCreationInputTokens"] > 0 on first call (cache write)
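
The two comments above can be turned into a small check. The following is a minimal sketch, not code from this PR: `usage` is a hypothetical stand-in for the two usage fields mentioned in the description (it is not the real Genkit type), and `cacheOutcome` is an illustrative helper name.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the two usage fields this PR
// populates; it is NOT the real Genkit usage type.
type usage struct {
	CachedContentTokens int
	Custom              map[string]float64
}

// cacheOutcome classifies a response as a cache write, a cache hit,
// both, or neither, based on the token counters.
func cacheOutcome(u usage) string {
	// Reading a missing key (or a nil map) yields 0, so no nil check is needed.
	created := u.Custom["cacheCreationInputTokens"]
	switch {
	case u.CachedContentTokens > 0 && created > 0:
		return "hit+write"
	case u.CachedContentTokens > 0:
		return "hit"
	case created > 0:
		return "write"
	default:
		return "none"
	}
}

func main() {
	first := usage{Custom: map[string]float64{"cacheCreationInputTokens": 2048}}
	second := usage{CachedContentTokens: 2048}
	fmt.Println(cacheOutcome(first), cacheOutcome(second), cacheOutcome(usage{}))
	// prints: write hit none
}
```

In real code the same comparison would be made against `resp.Usage` directly; the stand-in struct only keeps the sketch self-contained.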

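Anthropic's prompt caching reduces input token costs by up to 90% for repeated system prompts: for Sonnet, cache reads cost $0.30/MTok versus $3/MTok for regular input. Cache writes are billed at a premium over regular input, so the savings come from the hits that follow the first call. Anthropic's minimum cacheable prompt length is also far lower than Gemini's 32K: per Anthropic's documentation it is 1,024 tokens for Sonnet models, and shorter prompts are processed without caching.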
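
To make the cost claim concrete, here is a rough worked example. It is a sketch under stated assumptions: the base and cache-read prices are the Sonnet figures quoted above, the cache-write price of $3.75/MTok (1.25x base) is an assumption not stated in this PR, and every call after the first is assumed to hit the cache. `systemPromptCost` is an illustrative name.

```go
package main

import "fmt"

// Prices in USD per million input tokens.
const (
	basePerMTok       = 3.00 // from the PR description
	cacheReadPerMTok  = 0.30 // from the PR description
	cacheWritePerMTok = 3.75 // ASSUMPTION: 1.25x base for a cache write
)

// systemPromptCost returns the input cost of sending the same system prompt
// `calls` times, without and with caching. It assumes the first call writes
// the cache and every later call is a cache hit.
func systemPromptCost(tokens, calls int) (uncached, cached float64) {
	if calls <= 0 {
		return 0, 0
	}
	mtok := float64(tokens) / 1e6
	uncached = mtok * basePerMTok * float64(calls)
	cached = mtok*cacheWritePerMTok + mtok*cacheReadPerMTok*float64(calls-1)
	return uncached, cached
}

func main() {
	uncached, cached := systemPromptCost(10_000, 100)
	fmt.Printf("uncached=$%.4f cached=$%.4f saved=%.1f%%\n",
		uncached, cached, 100*(1-cached/uncached))
}
```

With a single call the cached path is more expensive than the uncached one, because only the write premium is paid; the savings approach 90% as the number of hits grows.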

Checklist:

Enable Anthropic's prompt caching feature for system messages. When a
system message has Metadata["cache"] = map[string]any{"type": "ephemeral"},
the corresponding TextBlockParam gets CacheControl set to ephemeral.

This follows the same pattern as the googlegenai plugin's cache support
(which uses Metadata["cache"]["ttlSeconds"]).

Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking.

Changes:
- toAnthropicRequest: check system message metadata for cache config
- toGenkitResponse: add cache creation tokens to Custom metrics
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request implements prompt caching for Anthropic system messages and adds tracking for cache creation tokens within the model response usage metrics. The review feedback recommends simplifying the metadata parsing logic to improve readability and support generic TTL fields, as well as ensuring the custom metrics map is updated safely without overwriting existing entries.

Comment on lines +231 to +239
if message.Metadata != nil {
	if cacheVal, ok := message.Metadata["cache"]; ok && cacheVal != nil {
		if cacheMap, ok := cacheVal.(map[string]any); ok {
			if cacheType, ok := cacheMap["type"].(string); ok && cacheType == "ephemeral" {
				block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
			}
		}
	}
}
Contributor

medium

The metadata check can be simplified and should support the ttlSeconds field used by the WithCacheTTL helper in the ai package. This ensures consistency across different model providers as mentioned in the PR description. Additionally, the nested if statements can be flattened for better readability.

Suggested change
if cache, ok := message.Metadata["cache"].(map[string]any); ok {
	// Anthropic only supports "ephemeral" caching. We enable it if explicitly
	// requested or if the generic "ttlSeconds" metadata is present.
	t, _ := cache["type"].(string)
	_, hasTTL := cache["ttlSeconds"]
	if t == "ephemeral" || hasTTL {
		block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
	}
}
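
The flattened check relies on two Go properties: indexing a nil map is safe, and a failed type assertion in the two-value form returns `false` instead of panicking. The sketch below isolates that decision so it can be exercised without the Anthropic SDK. `wantsEphemeralCache` is a hypothetical helper name, not a function in the plugin.

```go
package main

import "fmt"

// wantsEphemeralCache mirrors the decision in the suggested change:
// caching is enabled when the "cache" metadata is a map that either sets
// type "ephemeral" or carries the generic "ttlSeconds" key.
// The function name is illustrative only.
func wantsEphemeralCache(metadata map[string]any) bool {
	// Safe for a nil metadata map, a missing key, a nil value,
	// and a value of the wrong type: ok is false in all four cases.
	cache, ok := metadata["cache"].(map[string]any)
	if !ok {
		return false
	}
	t, _ := cache["type"].(string)
	_, hasTTL := cache["ttlSeconds"]
	return t == "ephemeral" || hasTTL
}

func main() {
	cases := []map[string]any{
		nil,
		{"cache": nil},
		{"cache": "ephemeral"}, // wrong shape: a string, not a map
		{"cache": map[string]any{"type": "ephemeral"}},
		{"cache": map[string]any{"ttlSeconds": 300}},
		{"cache": map[string]any{"type": "persistent"}},
	}
	for _, m := range cases {
		fmt.Printf("%v -> %t\n", m, wantsEphemeralCache(m))
	}
}
```

Note that in this form the `ttlSeconds` value only acts as a switch: its numeric value is not read, which matches the suggested change as written.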

Author

Applied — simplified the metadata check and added ttlSeconds support for cross-provider consistency. Thanks for the suggestion.

Comment on lines +492 to +494
r.Usage.Custom = map[string]float64{
	"cacheCreationInputTokens": float64(m.Usage.CacheCreationInputTokens),
}
Contributor

medium

It is safer to initialize the Custom map only if it's nil and then set the specific key, rather than overwriting the entire map. This prevents accidentally losing other custom metrics that might be added in the future or by middleware.

		if r.Usage.Custom == nil {
			r.Usage.Custom = make(map[string]float64)
		}
		r.Usage.Custom["cacheCreationInputTokens"] = float64(m.Usage.CacheCreationInputTokens)
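
The difference between the two approaches only shows up when the map already holds entries. A minimal, self-contained sketch of the nil-check pattern follows; `setCustomMetric` and `someOtherMetric` are hypothetical names used for illustration.

```go
package main

import "fmt"

// setCustomMetric sets one key on a metrics map, allocating the map only
// when it is nil so that existing entries survive. Hypothetical helper.
func setCustomMetric(custom map[string]float64, key string, value float64) map[string]float64 {
	if custom == nil {
		custom = make(map[string]float64)
	}
	custom[key] = value
	return custom
}

func main() {
	// A map that already carries a metric keeps it after the update.
	existing := map[string]float64{"someOtherMetric": 7}
	existing = setCustomMetric(existing, "cacheCreationInputTokens", 2048)

	// A nil map is allocated on first use instead of panicking on write.
	fresh := setCustomMetric(nil, "cacheCreationInputTokens", 2048)

	fmt.Println(len(existing), len(fresh))
	// prints: 2 1
}
```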

Author

Applied — using nil check + key set instead of overwriting the map. Good catch.
