Anthropic Prompt caching by danhorner · Pull Request #24 · pasky/claude.vim

danhorner · 2024-12-04T09:54:06Z

This commit adds prompt caching to claude.vim.

I added cache points for the buffers as well as for the conversation, so it should be possible to wipe the claude chat buffer and still get the benefit of cached data.
I also updated the usage display to include costing for cached tokens.
I don't think I broke the bedrock mode, but I didn't test. I'm counting on it to ignore the cache_control instructions in the python helper.

Hope this is helpful.

- Expand user messages to contain content blocks instead of strings
- Move the shared buffers into the content block of the first user message
- Sort buffers by modification time into old (>2min) and recent (<2min)
- Apply cache_control break points to old buffers, new buffers, and the most recent conversation message
- Update cost reporting message with pricing for new cached prompts
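As a rough illustration of the layout described in this list, here is a minimal Python sketch of the first user message's content blocks. It is not the code from this PR: the function names, the `Buffer name:` text layout, and the assumption that buffers arrive already split by age are all hypothetical. Only the `{"type": "text", ...}` block shape and `cache_control: {"type": "ephemeral"}` come from the Anthropic Messages API.

```python
def text_block(text, cache=False):
    """Return a text content block, optionally marked as a cache breakpoint."""
    block = {"type": "text", "text": text}
    if cache:
        block["cache_control"] = {"type": "ephemeral"}
    return block


def first_user_content(old_buffers, recent_buffers, prompt):
    """Build the content-block list for the first user message.

    old_buffers / recent_buffers: lists of (name, text) pairs, already split
    by age (hypothetical input format). The last block of each non-empty
    group carries a breakpoint, so the old group can stay cached while the
    recent group changes.
    """
    content = []
    for group in (old_buffers, recent_buffers):
        for i, (name, text) in enumerate(group):
            is_last = i == len(group) - 1
            content.append(text_block(f"Buffer {name}:\n{text}", cache=is_last))
    # The user's actual prompt follows the shared buffers.
    content.append(text_block(prompt))
    return content
```

Putting the old group first matters because a cache hit requires an identical prefix; anything placed before a frequently edited buffer would be invalidated along with it.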

- extract pricing and usage categories into a config var
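One way such a pricing config could look, sketched in Python rather than the plugin's own language. The keys are the usage fields the Anthropic API reports; the `PRICING` name, the `usage_cost` helper, and the rates (USD per million tokens) are placeholders assumed for illustration and should be checked against current pricing.

```python
# Hypothetical config: USD per million tokens, keyed by usage category.
# The rates are illustrative placeholders, not taken from this PR.
PRICING = {
    "input_tokens": 3.00,
    "cache_creation_input_tokens": 3.75,
    "cache_read_input_tokens": 0.30,
    "output_tokens": 15.00,
}


def usage_cost(usage, pricing=PRICING):
    """Sum the cost over every priced category; missing categories count as 0."""
    return sum(usage.get(key, 0) * rate / 1_000_000 for key, rate in pricing.items())
```

Keeping the categories and rates in one table means the usage display can iterate over it instead of hard-coding each token type.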

- Previous version broke tool results and diffs

pasky · 2024-12-05T03:32:14Z

Oh, nice ideas there!

I'll test the bedrock mode once I get a moment.

In the example https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching the context is part of the system prompt, but you have it as part of messages (and also inflate all other messages to content blocks) - does it matter? (Keeping it as part of the system prompt would feel a bit cleaner to me, unless there's a reason not to.)

What about instead of busy/quiet buffer distinction simply sorting buffers by lastmodified?

danhorner · 2024-12-05T05:30:54Z

> In the example https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching the context is part of the system prompt, but you have it as part of messages ... does it matter?

I'm trying to remember. I think I tried to use the system prompt first. But the system prompt needs to be inflated in order to apply the cache breakpoints. Later, it's passed as a string to the bedrock helper and when I tried to modify the bedrock helper I got scared.
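To make the "inflated" system prompt concrete, here is a small Python sketch under stated assumptions. The Messages API accepts `system` as a list of text blocks, which is where a breakpoint can be attached. The two helper names are hypothetical, and `system_as_string` only illustrates one way a string-only helper could still be fed; it is not how the bedrock helper in this repository works.

```python
def system_blocks(instructions, context):
    """Expand a system prompt into content blocks so the context can be cached."""
    return [
        {"type": "text", "text": instructions},
        # Breakpoint on the last block: everything up to here is cacheable.
        {"type": "text", "text": context, "cache_control": {"type": "ephemeral"}},
    ]


def system_as_string(blocks):
    """Flatten blocks back to a plain string for a helper that only takes strings.

    The cache_control markers are dropped, so no caching happens on that path.
    """
    return "\n\n".join(block["text"] for block in blocks)
```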

I'm not sure if it makes a difference. It's hard to understand from the anthropic documentation whether it really matters -- different examples show both.

> (and also inflate all other messages to content blocks)

I inflate all user messages to content blocks for the same reason. The most recent user message needs to have a cache breakpoint. Actually, the multi-turn example says that the second-to-last user message also needs one, but it seems to work fine without.
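A minimal Python sketch of that step, assuming messages are plain dicts in the Messages API shape. The function name and the `how_many` parameter are hypothetical; `how_many=2` would correspond to also marking the second-to-last user message.

```python
import copy


def mark_conversation_breakpoints(messages, how_many=1):
    """Return a copy of messages with a breakpoint on the last content block
    of the last `how_many` user messages.

    String content is expanded to a single text block first, since
    cache_control can only be attached to a content block.
    Assumes every user message has non-empty content.
    """
    out = copy.deepcopy(messages)
    remaining = how_many
    for msg in reversed(out):
        if remaining == 0:
            break
        if msg["role"] != "user":
            continue
        if isinstance(msg["content"], str):
            msg["content"] = [{"type": "text", "text": msg["content"]}]
        msg["content"][-1]["cache_control"] = {"type": "ephemeral"}
        remaining -= 1
    return out
```

Working on a copy keeps the stored conversation free of stale markers, so the breakpoint can simply be recomputed for each request.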

> What about instead of busy/quiet buffer distinction simply sorting buffers by lastmodified?

I'm really not sure what is best here. I think sorting alone is not sufficient. The problem is we only have 4 cache breakpoints, and I'm using one for the final message. Even if we sort, we still need to arbitrarily decide where to put the cache breakpoints and how many to use.

I decided on 2 breakpoints: one for quiet buffers (documentation / background code) and one for busy buffers that claude will edit. It might be helpful to go one more step and separate the single most recently edited buffer into its own block.
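The grouping could be sketched as follows in Python. This is an illustration, not the PR's implementation: the function name, the `(name, mtime)` input format, and the `isolate_latest` flag (the "one more step" idea) are assumptions, and the 120-second threshold is taken from the 2-minute split in the commit message.

```python
import time

QUIET_AFTER_SECONDS = 120  # 2-minute threshold from the commit message


def split_buffers(buffers, now=None, isolate_latest=False):
    """Split (name, mtime) pairs into cache groups, oldest group first.

    Returns [quiet, busy], or [quiet, busy_rest, latest] when isolate_latest
    is set. Empty groups are dropped, so each returned group needs exactly
    one breakpoint, on its last buffer.
    """
    now = time.time() if now is None else now
    ordered = sorted(buffers, key=lambda b: b[1])  # oldest modification first
    quiet = [b for b in ordered if now - b[1] > QUIET_AFTER_SECONDS]
    busy = [b for b in ordered if now - b[1] <= QUIET_AFTER_SECONDS]
    groups = [quiet, busy]
    if isolate_latest and busy:
        groups = [quiet, busy[:-1], busy[-1:]]
    return [g for g in groups if g]
```

The total number of breakpoints used is then the number of returned groups plus one for the conversation, which is the budget trade-off discussed above.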

It's hard to optimize this stuff for unknown use cases! Mostly I think it is useful to cache documentation and read-only source files that Claude will not edit.

danhorner added 3 commits December 4, 2024 01:22

Make pricing configurable

ac334ac

- extract pricing and usage categories into a config var

Fixup: Correctly expand message text

674b4cb

- Previous version broke tool results and diffs

danhorner wants to merge 3 commits into pasky:main from danhorner:prompt_caching
