
feat(client): SEP-2549 — honor cacheHints (ttlMs/scope) on the response-cache substrate#2340

Open
felixweinberger wants to merge 2 commits into v2-2026-07-28 from fweinberger/cachehints-honoring

Conversation

@felixweinberger

Contributor

Client-side honoring of the SEP-2549 CacheableResult freshness hints (ttlMs, cacheScope) on the response-cache substrate from #2336.

Motivation and Context

The 2026-07-28 spec requires tools/list, prompts/list, resources/list, resources/templates/list, resources/read, and server/discover results to carry ttlMs and cacheScope. The server SDK already stamps these (the examples/caching/ story); this PR makes the client honor them: a still-fresh entry is served from the response cache without a round-trip; list_changed and resources/updated evict regardless of TTL.

How Has This Been Tested?

Client suite (581, +34 over base), full e2e (2594 passed, 157 expected failures), run:examples 63/63 (the caching story now asserts the second listTools() is cache-served via a server-side request counter). Partition isolation is covered by an adversarial-server-name test (a server crafting serverInfo.name to collide with another server's principal partition does not succeed).

Breaking Changes

None — all options additive. With no options set, the only observable change is that a second no-arg listTools() within the server's ttlMs is served from cache (no round-trip).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

New options: ClientOptions.cachePartition?: string (the principal slice — e.g., the auth subject — for stores shared across principals), ClientOptions.defaultCacheTtlMs?: number (applied when the server omits the hint; default 0 = always fetch but still store for SEP-2243 mirroring), RequestOptions.cacheMode?: 'use' | 'refresh' | 'bypass'.
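To make the three cacheMode values concrete, here is a minimal sketch of how they might gate the cache read and write. The names `callWithCache` and `Held` and the Map-backed cache are hypothetical and are not the SDK's internals; the fetch is synchronous here only to keep the sketch short, whereas the real verbs are async.

```typescript
// Sketch only: `callWithCache`, `Held`, and the Map-backed cache are hypothetical.
type CacheMode = 'use' | 'refresh' | 'bypass';

interface Held<T> {
  value: T;
  expiresAt: number; // absolute time in ms after which the entry is stale
}

function callWithCache<T>(
  mode: CacheMode,
  cache: Map<string, Held<T>>,
  key: string,
  ttlMs: number,
  now: () => number,
  fetchFresh: () => T,
): T {
  // Only 'use' may read the cache, and only while the entry is still fresh.
  if (mode === 'use') {
    const held = cache.get(key);
    if (held !== undefined && now() < held.expiresAt) return held.value;
  }
  const value = fetchFresh();
  // 'use' and 'refresh' re-store the result; 'bypass' leaves the cache untouched.
  if (mode !== 'bypass') cache.set(key, { value, expiresAt: now() + ttlMs });
  return value;
}
```

With ttlMs of 0 the entry is written already stale, so it is stored but never served, which matches the "always fetch but still store" default described above.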

Partition model: every cache entry is automatically scoped by the connected server's identity (derived from serverInfo.name@version). The full partition is JSON.stringify([serverIdentity, principal]) — collision-free by construction regardless of what a server puts in its name. public entries land at [serverIdentity, ''] (shared across principals on this server); private at [serverIdentity, cachePartition]. The shared-partition fallback only serves entries with scope === 'public'. Note: serverIdentity is self-reported by the server — two distinct origins claiming the same Implementation would share a public slice on a shared store; treat the store boundary accordingly.
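A small sketch of why the JSON tuple encoding is collision-free where a delimiter join is not. Only the `JSON.stringify([serverIdentity, principal])` shape comes from the description above; `partitionKey`, `naiveKey`, and `partitionFor` are illustrative names, not SDK functions.

```typescript
// Sketch only: helper names are hypothetical.
type CacheScope = 'public' | 'private';

// The encoding described above: a JSON tuple of [serverIdentity, principal].
function partitionKey(serverIdentity: string, principal: string): string {
  return JSON.stringify([serverIdentity, principal]);
}

// A naive delimiter join, shown only for contrast.
function naiveKey(serverIdentity: string, principal: string): string {
  return `${serverIdentity}|${principal}`;
}

// 'public' entries land at the shared '' slice; 'private' at the principal's slice.
function partitionFor(
  serverIdentity: string,
  cachePartition: string,
  scope: CacheScope,
): string {
  return partitionKey(serverIdentity, scope === 'public' ? '' : cachePartition);
}

// A crafted server name can collide under the naive join...
const naiveCollides = naiveKey('a|b', 'c') === naiveKey('a', 'b|c');
// ...but not under the JSON tuple, because the array structure and string
// escaping keep the two components separate.
const jsonCollides = partitionKey('a|b', 'c') === partitionKey('a', 'b|c');
```

Here `naiveCollides` is true and `jsonCollides` is false.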

InMemoryResponseCacheStore now has a maxEntries cap (default 512, oldest-out) so per-URI resources/read writes cannot grow unbounded. notifications/resources/updated evicts the matching resources/read entry. ttlMs is clamped to 24h. keyOf uses JSON encoding so NUL/quote in resource URIs cannot cause key collisions.
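For illustration, a bounded store with oldest-out eviction can be sketched on top of Map insertion order. `BoundedStore` is a hypothetical stand-in, not the actual InMemoryResponseCacheStore, and moving an overwritten key to the newest position is a choice made for this sketch rather than documented behaviour. The `keyOf` below is likewise only a guess at the JSON-encoded key shape.

```typescript
// Sketch only: `BoundedStore` and this `keyOf` are hypothetical stand-ins.
class BoundedStore<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries = 512) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: V): void {
    // Delete first so an overwrite moves the key to the newest position.
    this.entries.delete(key);
    this.entries.set(key, value);
    // Map iterates in insertion order, so the first key is the oldest.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  // Per-key invalidation, e.g. when a resources/updated notification arrives.
  delete(key: string): boolean {
    return this.entries.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

// JSON-encode the key parts so NUL or quote characters in a URI stay inside
// their own string and cannot be mistaken for a separator.
function keyOf(method: string, uri?: string): string {
  return JSON.stringify(uri === undefined ? [method] : [method, uri]);
}
```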

@felixweinberger felixweinberger requested a review from a team as a code owner June 22, 2026 21:44
@changeset-bot

changeset-bot Bot commented Jun 22, 2026


🦋 Changeset detected

Latest commit: bf3bc85

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| @modelcontextprotocol/client | Major |
| @modelcontextprotocol/core | Patch |


@pkg-pr-new

pkg-pr-new Bot commented Jun 22, 2026



@modelcontextprotocol/client

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/client@2340

@modelcontextprotocol/codemod

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/codemod@2340

@modelcontextprotocol/server

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server@2340

@modelcontextprotocol/server-legacy

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server-legacy@2340

@modelcontextprotocol/express

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/express@2340

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/fastify@2340

@modelcontextprotocol/hono

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/hono@2340

@modelcontextprotocol/node

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/node@2340

commit: bf3bc85

@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 80597d6 to 10f7d17 Compare June 22, 2026 21:53
Comment thread packages/client/src/client/client.ts
Comment thread packages/client/src/client/responseCache.ts
Comment thread packages/client/src/client/responseCache.ts
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 10f7d17 to de00587 Compare June 22, 2026 22:15
Comment thread packages/client/test/client/mcpParamMirroring.test.ts
Comment thread packages/client/src/client/responseCache.ts
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from de00587 to 3a2b2b4 Compare June 22, 2026 22:35
Comment thread packages/client/src/client/client.ts
Comment thread packages/core/src/shared/protocol.ts
Comment thread packages/client/src/client/client.ts
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 3a2b2b4 to 362d306 Compare June 22, 2026 22:54
Comment thread packages/client/src/client/responseCache.ts
Comment thread packages/client/src/client/responseCache.ts
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 362d306 to f3e7d47 Compare June 22, 2026 23:09
Comment thread packages/client/src/client/client.ts
Comment thread packages/client/src/client/responseCache.ts Outdated
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from f3e7d47 to 3fd9970 Compare June 22, 2026 23:38
Comment thread packages/client/src/client/client.ts
Comment thread .changeset/client-honor-cache-hints.md
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 3fd9970 to 16bb283 Compare June 23, 2026 00:01
Comment thread packages/core/src/shared/protocol.ts
Comment thread packages/client/src/client/client.ts
…st*/readResource

The four list verbs and readResource now serve a still-fresh ResponseCacheStore
entry without a round trip when the server-stamped ttlMs has not elapsed.
Additive on the substrate (#2336): _listAllPages now stamps {expiresAt, scope}
on the aggregate write; a _serveFromCache front gates each verb on freshness;
readResource is newly cached (URI-keyed; only stored when ttl > 0, since the
URI keyspace is unbounded and there is no derived index).

Per-call CacheableRequestOptions.cacheMode ('use' | 'refresh' | 'bypass') maps
to mcp.d's CacheMode. ClientOptions.cachePartition is the per-principal slot
for 'private'-scoped entries (the spec's MUST-NOT-share-across-authz-contexts);
'public' entries always live at partition '' so a shared store serves them to
every co-tenant. ClientResponseCache reads probe own-partition then '' (mcp.d's
two-probe order — own-first because scope is only known after a fetch); the
toolDefinition/outputValidator derived indices use the same probe so SEP-2243
mirroring works under partitioning. readResource applies the same partition
derivation as the list verbs and treats absent cacheScope as 'private', so a
shared store cannot serve one principal's resource body to another.
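
The two-probe read order described above can be sketched as follows. This is an illustration under assumptions: `probe`, `Entry`, and the flat Map keyed by a JSON tuple are hypothetical and do not reflect the actual ClientResponseCache layout.

```typescript
// Sketch only: `probe`, `Entry`, and the key layout are hypothetical.
type CacheScope = 'public' | 'private';

interface Entry {
  value: unknown;
  scope: CacheScope;
}

function slot(serverIdentity: string, principal: string, key: string): string {
  return JSON.stringify([serverIdentity, principal, key]);
}

function probe(
  store: Map<string, Entry>,
  serverIdentity: string,
  cachePartition: string,
  key: string,
): Entry | undefined {
  // Probe 1: the caller's own partition. It goes first because the scope of
  // an entry is only known once it has been fetched and stored.
  const own = store.get(slot(serverIdentity, cachePartition, key));
  if (own !== undefined) return own;
  // Probe 2: the shared '' partition, served only when the stored entry
  // actually claims 'public' scope.
  const shared = store.get(slot(serverIdentity, '', key));
  return shared !== undefined && shared.scope === 'public' ? shared : undefined;
}
```

The scope gate on the second probe is what keeps a private entry that ended up at the shared slot from being served to a different principal.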

ClientOptions.defaultCacheTtlMs (default 0) supplies the TTL when the result
lacks one (e.g. a legacy-era response); an explicit server-sent ttlMs:0 is
honoured as immediately stale. List aggregates are always stored regardless of
TTL (mcp.d's retainForSchema posture) so callTool's mirroring/output-validation
index keeps working at any TTL while the freshness gate never serves a stale
entry. A list_changed eviction beats TTL (the existing partition-agnostic
evict). Clock seam (now) injectable on ClientResponseCache for tests.

New exports: CacheMode, CacheableRequestOptions.
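
A minimal sketch of the freshness resolution the commit message describes: the default TTL applies only when the hint is absent, an explicit ttlMs of 0 stays immediately stale, and the clock is injected. `resolveFreshness` and `isFresh` are hypothetical names, and treating an absent cacheScope as 'private' for every verb is an assumption extrapolated from the readResource wording.

```typescript
// Sketch only: function names are hypothetical; the 24 h clamp value is from the PR text.
type CacheScope = 'public' | 'private';

interface CacheableResult {
  ttlMs?: number;
  cacheScope?: CacheScope;
}

interface Freshness {
  expiresAt: number;
  scope: CacheScope;
}

const MAX_CACHE_TTL_MS = 24 * 60 * 60 * 1000;

function resolveFreshness(
  result: CacheableResult,
  defaultCacheTtlMs: number,
  now: number,
): Freshness {
  // `??` falls back only on undefined, so an explicit ttlMs: 0 is honoured.
  const ttl = result.ttlMs ?? defaultCacheTtlMs;
  const clamped = Math.min(Math.max(ttl, 0), MAX_CACHE_TTL_MS);
  // Assumption: an absent scope is treated as the stricter 'private'.
  return { expiresAt: now + clamped, scope: result.cacheScope ?? 'private' };
}

// The freshness gate: an entry stored with ttl 0 is never served.
function isFresh(freshness: Freshness, now: number): boolean {
  return now < freshness.expiresAt;
}
```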
…dds cacheMode + custom-store sections

The client now calls listTools() and readResource() twice each and asserts the
second of each pair is cache-served — the server's resource handler counts how
many times it ran and exposes that via a read-count tool, so the example
verifies (server-side) that the cache hit never reached the wire. Demonstrates
cacheMode:'refresh' and the post-refresh return to cache-serving.

README drops the follow-up note (honouring is shipped), adds a §cacheMode
section, and adds a §Custom store section showing the four-method
ResponseCacheStore interface shape with the cachePartition guidance for shared
stores.
@felixweinberger felixweinberger force-pushed the fweinberger/cachehints-honoring branch from 16bb283 to bf3bc85 Compare June 23, 2026 01:04
Comment thread docs/migration.md
Comment on lines +578 to +595
### Client honours server cache hints (SEP-2549)

On a 2026-07-28 connection the cacheable verbs — `listTools()`, `listPrompts()`, `listResources()`, `listResourceTemplates()`, and `readResource()` — now serve a still-fresh held entry without a round trip when the server-stamped `ttlMs` has not elapsed. The behaviour is opt-in **by server hint**: a server that sends `ttlMs: 0` (the conservative default the SDK's `McpServer` stamps unless configured otherwise) sees byte-identical behaviour — every call fetches. A `list_changed` notification still evicts immediately regardless of TTL.

Per-call control via the new `CacheableRequestOptions.cacheMode` (`'use'` is the default):

```typescript
await client.listTools(); // serve from cache if fresh
await client.listTools(undefined, { cacheMode: 'refresh' }); // always fetch, then re-store
await client.listTools(undefined, { cacheMode: 'bypass' }); // fetch; do not read or write the cache
```

New `ClientOptions`:

- `cachePartition?: string` — the opaque per-principal identifier for `'private'`-scoped entries (the spec's "MUST NOT share across authorization contexts"). Entries are automatically scoped by connected-server identity (derived from `serverInfo`), so one `responseCacheStore` may back several clients without consumer-side encoding; set `cachePartition` to your principal identifier (e.g. the auth subject) when sharing a store across principals. With the default `''` every entry — public or private — lives at the connected server's shared partition (the safe single-tenant posture). Note `serverInfo` is self-reported, so a server that deliberately impersonates another's `name`/`version` shares its `'public'` slot; the per-principal isolation holds regardless.
- `defaultCacheTtlMs?: number` — applied when a cacheable result lacks `ttlMs` (e.g. a legacy-era response). Default `0` — never serve from cache; the list aggregate is still **stored** so `callTool`'s mirroring/output-validation index keeps working regardless. The server-supplied `ttlMs` is clamped at 24 h (`MAX_CACHE_TTL_MS`).

The `ResponseCacheStore` interface gained `delete(key)` (the per-URI invalidation `notifications/resources/updated` drives) — custom stores written against the alpha substrate need to add it. The default `InMemoryResponseCacheStore` is now bounded (default 512 entries, oldest-first eviction; configurable via `{ maxEntries }`).


🟡 The new SEP-2549 cache-honouring behaviour (cache-served listTools()/readResource(), the per-call cacheMode option, and the new ClientOptions cachePartition/defaultCacheTtlMs) is documented here in migration.md and in examples/caching/README.md, but the canonical client feature guide docs/client.md was not updated — its Tools and Resources sections still describe these verbs as always reaching the server, with no mention of cache-serving or how to force a fetch. Consider adding a short 'Response caching (SEP-2549)' subsection (or a sentence in the Tools/Resources sections) of docs/client.md covering the cache-serving behaviour, cacheMode, and cachePartition/defaultCacheTtlMs.

Extended reasoning...

The gap. This PR introduces user-visible client behaviour: listTools(), listPrompts(), listResources(), listResourceTemplates(), and readResource() may now be served from the response cache without a round trip when the server-stamped ttlMs has not elapsed, plus the new per-call cacheMode option ('use' | 'refresh' | 'bypass'), the new ClientOptions.cachePartition / defaultCacheTtlMs semantics, and new public exports (CacheMode, CacheableRequestOptions, InMemoryResponseCacheStoreOptions, MAX_CACHE_TTL_MS). Prose was added to docs/migration.md (this hunk), docs/migration-SKILL.md, and examples/caching/README.md — but docs/client.md, the canonical client feature reference, is not touched by the PR.

What docs/client.md says today. A grep of docs/client.md for cacheMode / cacheHint / ttlMs / cachePartition / defaultCacheTtlMs / responseCacheStore returns nothing related to this feature; the only cache-related prose is the SEP-2243 'internal tools/list cache' paragraph and the listChanged local-cache option. Its Tools section (~line 255) describes listTools() as 'walks every page on your behalf' and the Resources section (~lines 318–332) describes listResources()/readResource() purely as discovering and reading server-provided data — the readResource example there even uses the same config://app URI the caching example now cache-serves. Nothing tells a reader of the feature guide that, after this PR, a second call within the server's ttlMs may never reach the server, or that cacheMode: 'refresh' / 'bypass' exists to force a fetch.

Why this matters. The migration guide targets upgraders and the example README targets the example; a user consulting the feature reference for listTools()/readResource() (e.g. while debugging why a request never hit their server) will not learn that calls can be cache-served, nor how to opt out per call. The repo's review checklist asks for prose documentation of new features and for updating docs that describe the pre-change behaviour.

Why this is an inconsistency, not a different convention. The repo's own precedent is that comparable client-side behaviour changes in this stack got prose in docs/client.md: the predecessor PR's auto-aggregation behaviour is the source of the 'walks every page on your behalf' wording there, and the SEP-2243 mirroring feature has its own subsection. SEP-2549 cache honouring is the same kind of user-visible Client behaviour change and is the only one of the set absent from client.md.

Concrete walkthrough of the reader-facing gap. (1) A developer's host calls client.readResource({ uri: 'config://app' }) against a server stamping ttlMs: 60_000; they then change the resource server-side and call readResource again within 60 s. (2) The second call returns the old body with no wire request — by design. (3) They open docs/client.md → Resources to understand why; the section describes readResource() as reading server data with no mention of the response cache, ttlMs, or cacheMode, so the behaviour looks like a bug rather than a documented feature with a documented escape hatch ({ cacheMode: 'refresh' }).

Suggested fix. Add a short 'Response caching (SEP-2549)' subsection to docs/client.md (or a sentence each in the Tools and Resources sections) stating that the cacheable verbs serve a still-fresh entry without a round trip when the server stamps a positive ttlMs, that cacheMode: 'refresh' | 'bypass' forces a fetch, and pointing at ClientOptions.cachePartition / defaultCacheTtlMs / responseCacheStore for shared-store setups — largely a condensed copy of the migration.md section added in this PR. Filed as a nit: the feature is documented (migration guide, example README, JSDoc), just not in the feature reference users of these verbs actually consult.

Comment on lines +1568 to 1575
// The aggregate is ALWAYS written: even when the resolved TTL is ≤0
// the entry is stored already-stale (mcp.d's `retainForSchema`
// posture) so the `tools/list`-derived index keeps working regardless,
// while the freshness gate in `_serveFromCache` never serves it.
// Page-1 carries the result-level `ttlMs`/`cacheScope` (`acc` IS the
// mutated page-1 object).
await this._cache.write(method, acc, generation, this._freshness(acc));
return acc;


🟡 When _listAllPages aggregates a multi-page list, the terminal cache write computes freshness from this._freshness(acc) where acc is the page-1 result object, so ttlMs/cacheScope hints carried by pages 2..N are silently discarded. A later page's stricter hint is therefore ignored: a page-2 ttlMs: 0 ("do not cache") aggregate is served from cache for page-1's full TTL, and a page-2 cacheScope: 'private' is downgraded to page-1's 'public', storing the private-scoped page contents at the shared [serverIdentity, ''] partition where another principal's shared-partition probe will serve them on a shared store. Resolving most-restrictively while walking (min ttlMs across pages, 'private' if any page is private) is a small change in the loop and matches the conservative posture this PR takes everywhere else.

Extended reasoning...

The mechanism. _listAllPages (packages/client/src/client/client.ts:1568-1575) aggregates every page into acc, which IS the page-1 result object: append(acc, page) only pushes the later pages' items, and the per-page page objects are then discarded. The terminal cache write is await this._cache.write(method, acc, generation, this._freshness(acc)), and _freshness (lines 1592-1600) reads acc.ttlMs / acc.cacheScope, i.e. page 1's hints only. The ttlMs/cacheScope fields carried by pages 2..N are never consulted anywhere. The inline comment "Page-1 carries the result-level ttlMs/cacheScope" asserts hint uniformity rather than choosing a resolution for the heterogeneous case.

Why heterogeneous per-page hints are spec-legal and expressible. SEP-2549's CacheableResult fields are per-result, and each page of a paginated walk is an independent result that the 2026-07-28 codec requires to carry them. This SDK's own server resolves hints most-specific-author-first (attachCacheHintFallback only fills fields the handler did not set), so a low-level paginated list handler can legitimately return different ttlMs/cacheScope per page (e.g. a volatile or per-principal tail page), and third-party servers can too. Every multi-page test in responseCache.test.ts / mcpParamMirroring.test.ts stamps identical hints on all pages, which is why nothing pins this.

Consequence 1: TTL over-caching. Page 1 stamps ttlMs: 60_000; page 2 stamps ttlMs: 0 (the spec's "immediately stale" / do-not-cache). The aggregate, including the page-2 items the server asked not to cache, is stored with expiresAt = now + 60s and served from cache for the full minute. The server's only remaining lever is a list_changed notification, which it has no reason to send (the list did not change; it just declared part of it uncacheable).

Consequence 2: scope downgrade onto the shared/public partition. Page 1 stamps cacheScope: 'public'; a later page stamps 'private' (per-principal items mixed into the tail). _freshness(acc) resolves scope: 'public', so write() stores the whole aggregate, including the private-scoped page's contents, at the shared partition [serverIdentity, ''] with scope: 'public'. On a responseCacheStore shared across principals (the arrangement this PR's docs/README explicitly endorse, with cachePartition set per principal), another principal's client then gets a shared-partition hit (_probe's fallback is gated on the stored scope === 'public', which this entry now claims) and is served the private-scoped page contents without a round trip. That is exactly the cross-authorization-context sharing the spec's private scope forbids and that the rest of this PR's partition design (the two-probe scope gate, the misconfigured-co-tenant guard test, the JSON-encoded partition) is built to prevent.

Step-by-step proof. (1) Configure a scripted modern server whose tools/list page 1 returns { ttlMs: 60_000, cacheScope: 'public', tools: [...], nextCursor: '1' } and page 2 returns { ttlMs: 0, cacheScope: 'private', tools: [privateTool] }. (2) Client (cachePartition: 'alice') calls listTools() → _listAllPages walks both pages, acc is the page-1 object with page-2's tools appended. (3) _freshness(acc) reads acc.ttlMs = 60_000, acc.cacheScope = 'public' → write() stores the aggregate at [serverIdentity, ''] with scope: 'public', expiresAt = now + 60s. (4) A second client (cachePartition: 'bob') on the same store calls listTools() → _probe's own-partition miss falls through to the shared partition, finds the entry with scope === 'public', and serves it, including privateTool from the page the server marked private and ttlMs: 0, with no wire request, for up to 60 s.

Why nothing else prevents it. The freshness seam is the single _freshness(acc) call; no other code reads later pages' hint fields. The list_changed / HEADER_MISMATCH evictions are orthogonal. The single-page case and the SDK server with uniform per-method ServerOptions.cacheHints are unaffected, which is why no test catches it.

Fix. Resolve the aggregate's freshness most-restrictively while walking: track ttlMs = min(...) across pages and scope = 'private' if any page is private (one or two extra lines in the page loop), and pass that to _freshness/write instead of reading acc (page 1) alone. Alternatively, document that only page-1 hints are honoured, but the most-restrictive resolution is cheap and matches the conservative posture the PR takes everywhere else (24h clamp, private-by-default, scope-gated shared probe).

Severity. Filed as a non-blocking nit: the trigger requires a server emitting heterogeneous per-page hints (unusual but spec-legal), and consequence 2 additionally requires a store shared across principals. That sharing is an explicitly documented configuration of this same PR, though, so the scope-downgrade half is a genuine gap in the isolation story the PR documents.
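
The most-restrictive resolution suggested above could look roughly like this. It is a sketch under assumptions, not a patch against client.ts: `mergePageHints` and `PageHints` are hypothetical names, and treating a page with no cacheScope as 'private' is a conservative choice made here.

```typescript
// Sketch only: `mergePageHints` is a hypothetical helper, not SDK code.
type CacheScope = 'public' | 'private';

interface PageHints {
  ttlMs?: number;
  cacheScope?: CacheScope;
}

function mergePageHints(
  pages: readonly PageHints[],
  defaultTtlMs = 0,
): Required<PageHints> {
  // With no pages there is nothing safe to cache.
  if (pages.length === 0) return { ttlMs: 0, cacheScope: 'private' };
  let ttlMs = Infinity;
  let cacheScope: CacheScope = 'public';
  for (const page of pages) {
    // The shortest TTL across pages wins; a missing hint uses the default.
    ttlMs = Math.min(ttlMs, page.ttlMs ?? defaultTtlMs);
    // Any page that is not explicitly 'public' makes the aggregate 'private'.
    if (page.cacheScope !== 'public') cacheScope = 'private';
  }
  return { ttlMs, cacheScope };
}
```

Applied to the scenario in the step-by-step proof (page 1 at 60_000/'public', page 2 at 0/'private'), this resolves to ttlMs 0 and 'private', so the aggregate would be stored already stale in the caller's own partition.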
