feat(core,server,client): SEP-2106 JSON Schema 2020-12 — $ref guard, per-dialect validation, non-object structuredContent#2337
Conversation
🦋 Changeset detectedLatest commit: dfdc1a7 The changes in this PR will be included in the next version bump. This PR includes changesets to release 7 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
…peless-root stamping, result mutation, dead-surface JSDoc
| `The default validator supports JSON Schema 2020-12 only; pass a pre-configured ` + | ||
| `Ajv instance to AjvJsonSchemaValidator(ajv) to validate other dialects.` | ||
| ); | ||
| } | ||
|
|
||
| const ajvValidator = | ||
| '$id' in schema && typeof schema.$id === 'string' | ||
| ? (this._ajv.getSchema(schema.$id) ?? this._ajv.compile(schema)) |
There was a problem hiding this comment.
🟣 The retained getSchema($id) ?? compile(schema) fast-path on a single long-lived Ajv2020 instance means the first schema compiled under a given $id wins for the life of the provider — a later, different schema that reuses the same $id (two tools sharing one $id, or a list_changed update that changes a tool's outputSchema while keeping its $id) is silently ignored and data keeps validating against the stale schema, even though the new SEP-2106 guards run on the new schema text. This pattern is pre-existing (not introduced by this PR), but since this function is being rewritten as the untrusted-schema hardening point, consider a follow-up: drop the $id fast-path, key the cache by full schema identity, or removeSchema/re-add when the schema text differs.
Extended reasoning...
What the bug is
AjvJsonSchemaValidator.getValidator() (packages/core/src/validators/ajvProvider.ts:119-126) keeps the pre-existing pattern:
const ajvValidator =
'$id' in schema && typeof schema.$id === 'string'
? (this._ajv.getSchema(schema.$id) ?? this._ajv.compile(schema))
: this._ajv.compile(schema);Ajv registers every compiled schema that declares $id in its instance-level registry, and the provider holds a single Ajv2020 instance for its entire lifetime (per-connection on the Client, per-server on McpServer; nothing ever calls removeSchema). So the first schema compiled under a given $id wins forever: any later, different schema declaring the same $id resolves through getSchema($id) to the old compiled validator and the new schema text is silently ignored. (The getSchema-first ordering exists precisely to dodge Ajv's "schema with key or id already exists" duplicate-id error, which is why this never throws — it just returns the stale validator.)
The code path that triggers it
Schemas reaching this function come from an untrusted peer (a server's advertised tool outputSchemas on the client side). Two realistic triggers:
- Two tools in one listing reuse one
$id— the second tool'sstructuredContentis validated against the first tool's schema. - A
list_changedre-list changes a tool'soutputSchemawhile keeping its$id— the response cache's stamp-keyed index correctly re-invokes_compileOutputValidator, the SEP-2106 guards (assertSchemaSafeToCompile, the dialect check) run on the new schema text, but the validator actually returned is the stalegetSchema($id)entry compiled from the old schema.
Why the new code in this PR doesn't prevent it
This PR's hardening all happens before the lookup: assertSchemaSafeToCompile(schema) and the $schema dialect check inspect the new schema object, then the lookup may discard that object entirely in favor of the cached compile. The PR's invalidation story ("compile state invalidates with the cached tool definition — list_changed, reconnect, re-list", and the new substrate-held compile-error lifecycle in responseCache.ts/client.ts) therefore does not hold for $id-bearing schemas: the recompile resolves to the same stale entry on the persistent Ajv instance. The CfWorkerJsonSchemaValidator is unaffected (it constructs a fresh Validator per schema).
Step-by-step proof
- Server lists tool
AwithoutputSchema: { $id: 'https://example.com/s', type: 'object', required: ['x'] }. Client callsA;getValidatorfinds no registered$id, falls through tocompile(schema), and Ajv registers the compiled function underhttps://example.com/s. - Server emits
tools/list_changedand re-listsAwith a changed schema keeping the same$id:{ $id: 'https://example.com/s', type: 'object', required: ['y'] }. - Client re-lists; the cache stamp changes, so
outputValidator()re-derives its index and calls_compileOutputValidatorwith the new schema.assertSchemaSafeToCompileand the dialect check pass on the new text. getValidatorhitsthis._ajv.getSchema('https://example.com/s'), which returns the validator compiled in step 1. Therequired: ['y']schema is never compiled.- The tool returns
structuredContent: { x: 1 }(valid against the old schema, invalid against the advertised one) — the client accepts it. Conversely,{ y: 1 }(valid against the advertised schema) is rejected withInvalidParams. The validator and the advertised schema have permanently diverged for the rest of the connection.
Impact
Wrong-schema validation rather than a crash: data is accepted/rejected against a schema other than the one currently advertised, undermining the per-tool isolation and recompile-on-relist guarantees this PR documents. The security angle is limited (both schemas come from the same peer, which could already advertise anything), but the benign stale-after-list_changed case is a genuine correctness gap.
Why pre-existing, and how to fix
The getSchema($id) ?? compile line is unchanged context in this diff — the pattern predates the PR — so this is filed as a follow-up rather than a blocker. Note that simply deleting the getSchema lookup is not a drop-in fix (Ajv would then throw the duplicate-$id error on the second compile of the same $id). Reasonable options:
- Key the validator cache by full schema identity (e.g. serialized schema text) instead of
$id, compiling $id-bearing schemas on a throwaway/scoped basis. removeSchema(schema.$id)and re-compilewhen the registered schema's text differs from the incoming one.- Drop the $id fast-path and strip/namespace
$idbefore compiling, so untrusted peers cannot collide in the shared registry at all.
There was a problem hiding this comment.
Pre-existing (this PR doesn't introduce it — same pattern at base). Tracked as a baseline follow-up: drop the $id fast-path or key the cache by full schema identity. Out of scope here per the security-hardening boundary; will address separately.
…ialect validation schemaBounds.ts (SchemaCompileError, isSameDocumentRef, assertSchemaSafeToCompile cycle-guarded). Both built-in providers reject non-same-document $ref/$dynamicRef and over-bound schemas before compile, read $schema and route to a matching engine (Ajv2020/Ajv2019/draft-07; cfworker by draft), and reject unknown dialects with a typed error. Default (no $schema) is 2020-12 (SEP-1613). Custom-instance ctor bypasses routing — caller owns dialect.
…on tools/list
cacheToolMetadata wraps getValidator in try/catch and stores any throw per-tool; callTool fails fast (before the request) with ProtocolError(InvalidParams, {reason}) when a compileError is present. listTools resolves with every tool; only the bad one is unreachable.
standardSchemaToJsonSchema('output') no longer forces type:'object'. Public CallToolResult.structuredContent widens to unknown via a types.ts override; legacy runtime Zod parse stays z.record (Q10-L2 byte-identity). Type-level pins narrowed to the structuredContent field only.
…back, legacy projection/strip structuredContent presence check is === undefined (null/0/false/'' are legal values, validated against outputSchema). McpServer auto-appends a TextContent JSON block when structuredContent is non-object and the handler authored none (any text block opts out); on the legacy era the non-object value is then omitted (invalid 2025 wire data). Non-object outputSchema is dropped from the legacy tools/list projection with a warn-once.
…n-schema expected-failures
everythingClient registers json-schema-ref-no-deref handler; everythingServer's json_schema_2020_12_tool advertises the full keyword set via fromJsonSchema. Both expected-failures entries removed. e2e: jsonschema.test.ts covers ref-denied, same-document-ref, unsupported-dialect, bad-schema-isolates, non-object-output, prefixItems (Ajv2020 pin), falsy-validated, dialect:{draft-07,default-2020-12}, array textfallback (+author opt-out), primitive sc, legacy projection/strip.
…schema-validators array example migration.md §JSON Schema covers the three breaks (ref-denied, Ajv default 2020-12 with per-dialect routing, structuredContent unknown) with opt-backs. schema-validators example gains list-forecasts (array outputSchema) demonstrating the known-server cast idiom and the auto-TextContent fallback / legacy strip on all four legs. Thanks @mattzcarey (#2249).
Drop per-dialect Ajv routing. The default validator supports JSON Schema
2020-12 only — the spec's only MUST (SEP-1613). Schemas declaring a
different $schema are rejected with SchemaCompileError{unsupported-dialect};
the existing custom-engine constructor (Ajv instance / {draft}) skips the
check so callers who own their dialect still can.
The pre-PR draft-07 'support' was accidental (the engine never read
$schema), and 2020-12-only matches the Go and C# SDKs. Net: one Ajv
class on the default path instead of three — restores the
@modelcontextprotocol/node bundle build at default heap.
- ajvProvider: single Ajv2020 instance; drop Ajv2019 import, AjvDialect,
KNOWN_DIALECT_URIS, _byDialect/_instanceFor; drop Ajv2019/Ajv2020
re-exports (consumers import from ajv directly; Ajv stays for the
draft-07 opt-back)
- cfWorkerProvider: same $schema check; explicit {draft} bypasses it
- client/server validators/ajv: re-export addFormats, Ajv,
AjvJsonSchemaValidator only
- dialect.test: 2020-12 compiles, draft-07/2019-09 reject, custom
engine bypasses
- e2e: drop client:jsonschema:dialect:draft-07-enforced; keep
default-is-2020-12
- migration docs: replace per-dialect routing with 2020-12-only +
opt-back
…peless-root stamping, result mutation, dead-surface JSDoc
…f (discriminatedUnion case)
…SCHEMA_KEYWORDS JSDoc accuracy isProvablyObjectShapedRoot: a typeless oneOf/anyOf/allOf root whose every member is type:'object' is also stamped, so z.discriminatedUnion / z.union of objects / z.intersection of objects keep their 2025-era outputSchema advertisement (the previous unconditional stamp happened to work for these; the fix-#2 narrow heuristic regressed them). A typeless union with any non-object member is still returned as-is. 3 new output-arm unit tests. SUBSCHEMA_KEYWORDS JSDoc: 'consumed only within this module's tests' → 'has no consumers' (the test file does not import it either).
21988e6 to
8623fba
Compare
…solve, isCallToolResult widening, isError skip, opt-back snippet
…le-level warn-once, recursive object-shape check, Ajv2020 customization docs
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
packages/core/src/types/types.ts:401-409— Two sibling docs were missed in the structuredContent-widening sweep and now contradict the behavior this PR ships: (1) examples/guides/clientGuide.examples.ts (#callTool_structuredOutput, ~lines 221-224) and the docs/client.md snippet rendered from it still use the old falsy presence checkif (result.structuredContent)and call structuredContent "a machine-readable JSON object", while the identical example in client.examples.ts / the callTool JSDoc was migrated to the!== undefined+ narrowing pattern in this PR; (2) docs/server.md's NOTE (~lines 131-141) still marks a named interface for structuredContent as a type error (not assignable to{ [key: string]: unknown }) — with structuredContent now typedunknownthat type error no longer occurs. Update the guide example/prose to match client.examples.ts and delete or rewrite the server.md NOTE (TypeScript no longer checks the value; runtime validateToolOutput is the guard).Extended reasoning...
What is stale
This PR widens
CallToolResult.structuredContenttounknownviaWidenStructuredContent(packages/core/src/types/types.ts:401-409) and migrates the structured-output examples to the SEP-2106 pattern: presence checked with!== undefinedand the value narrowed before property access. The PR updatespackages/client/src/client/client.examples.ts(#Client_callTool_structuredOutput) and thecallToolJSDoc accordingly, anddocs/migration.md/docs/migration-SKILL.mdexplicitly instruct users to replaceif (!result.structuredContent)with the=== undefinedform. Two sibling docs that describe the same surfaces were not updated and now describe the pre-PR behavior.1.
examples/guides/clientGuide.examples.ts+docs/client.mdThe
#callTool_structuredOutputregion inexamples/guides/clientGuide.examples.ts(~lines 221-224) still reads:// Machine-readable output for the client application if (result.structuredContent) { console.log(result.structuredContent); // e.g. { bmi: 22.86 } }
and
docs/client.md(~lines 272-283) renders that snippet verbatim, introduced by prose callingstructuredContent"a machine-readable JSON object". After this PRstructuredContentis typedunknownand may legally be any JSON value — includingnull,0,false,''— which the falsy check misclassifies as absent. This is the byte-identical example the PR already migrated inclient.examples.tsand thecallToolJSDoc, so the surviving copy is exactly the partial-migration shape: the prose guide now teaches the deprecated pattern that the PR's own migration table tells users to remove.Step-by-step: a reader follows docs/client.md, copies the snippet, and connects to a SEP-2106 server whose tool returns
structuredContent: 0(the PR's ownfalsy-structured-content-validatede2e fixture). Theif (result.structuredContent)gate is falsy → their app silently treats a present, schema-valid value as missing. Meanwhile the migration guide they are also reading tells them to use=== undefined— the two docs contradict each other.2.
docs/server.mdinterface-vs-type-alias NOTEdocs/server.md(~lines 131-141) contains a NOTE telling authors to use a type alias rather than an interface forstructuredContent, because named interfaces lack implicit index signatures and so are not assignable to{ [key: string]: unknown }; it marksinterface BmiResult { bmi: number }as a type error and recommends thestructuredContent: { ...result }spread workaround. That was true pre-PR (the field wasRecord<string, unknown>). After this PR the handler return path uses the widened publicCallToolResult(ToolCallback's result type), sostructuredContentisunknown— any value, including a named-interface-typed object, is assignable. The PR's own new test inpackages/server/test/server/mcp.compat.test.tsdemonstrates this: evenstructuredContent: 'not-an-array'compiles, with a comment noting that runtimevalidateToolOutputis the guard. The documented type error therefore no longer occurs, and the type-alias guidance and spread workaround are obsolete.Why this matters / why nothing catches it
Neither file is in the diff and neither is exercised by tests, so the suite stays green while the published guides describe behavior the PR removed. Both verifier passes confirmed these are the only surviving instances of the old falsy-check example and the only doc still describing the old assignability constraint; no existing review comment covers them (the prior comments target migration.md, the JSDoc, and the changesets).
How to fix
- In
examples/guides/clientGuide.examples.ts#callTool_structuredOutput, apply the same!== undefined+ narrowing pattern used inclient.examples.ts, and adjust the docs/client.md introducing sentence ("a machine-readable JSON object" → "any JSON value, typedunknown"); regenerate the rendered snippet. - Delete the docs/server.md NOTE, or replace it with the new guidance:
structuredContentis typedunknown, so TypeScript no longer checks the handler's value against the declaredoutputSchema— runtimevalidateToolOutputis the guard.
Docs/example consistency only — no runtime impact — but worth fixing in this PR since it is the same example/constraint the PR already updated elsewhere.
- In
…ceiling), ToolResultContent widen, migration prose alignment
…ing-surface structuredContent widen
…eaks cf-workers), reword to 'install ajv yourself', clientGuide/server.md SEP-2106 alignment
Implements SEP-2106 (tool
inputSchema/outputSchemaconform to JSON Schema 2020-12;structuredContentmay be any JSON value) and the SEP-1613 dialect default, with the spec's$ref/composition-bounds security guidance.Motivation and Context
The 2026-07-28 spec widens tool schemas to the full JSON Schema 2020-12 vocabulary and requires implementations to (a) default to a 2020-12 validator, (b) refuse to auto-dereference network
$refs, (c) handle unsupported$schemadialects gracefully, and (d) accept any JSON value asstructuredContent. The Node default validator was draft-07 withstrict:false(2020-12 keywords silently ignored), the providers did not read$schema, andstructuredContentwas typed/parsed as a record.How Has This Been Tested?
schemaBounds.test.ts(ref classifier, bounds, cycle-guard, truncation),dialect.test.ts(no$schema→ 2020-12 default; non-2020-12$schema→SchemaCompileError; custom-instance ctor bypasses the check),importGuard.test.ts(no HTTP client reachable fromvalidators/**).test/e2e/scenarios/jsonschema.test.ts):ref-denied,same-document-ref-ok,unsupported-dialect-graceful,bad-schema-isolates-tool,non-object-output,2020-12:prefixItems,falsy-structured-content-validated,dialect:default-is-2020-12,array-structured-content-textfallback(+ author opt-out),primitive-structured-content,2025:jsonschema:{object-root-still-enforced, non-object-structured-content-stripped}.json-schema-ref-no-deref(client) andjson-schema-2020-12(server) burned from expected-failures.examples/schema-validatorsruns on all four transport×era legs (run:examples):list-forecastsreturns arraystructuredContent; modern clients receive the array, legacy clients receive the auto-injected JSON text block.=== undefinedfalsy guard, or the$refguard each fails the suite.Breaking Changes
Three, all documented in
docs/migration.md§ JSON Schema:CallToolResult.structuredContentis now typedunknown(wasRecord<string, unknown>). Source-breaking for typed consumers; runtime parse on the 2025 path is unchanged. Cast or narrow before property access.strict:false). Absent$schemadefaults to 2020-12;$schemadeclaring anything other than 2020-12 is rejected withSchemaCompileError{kind:'unsupported-dialect'}. Pass a pre-configured Ajv instance toAjvJsonSchemaValidator(ajv)to validate other dialects.$ref/$dynamicRefare rejected withSchemaCompileError{ref-denied}before compile (the spec's MUST-NOT auto-dereference). Per-tool failure —tools/listresolves with every tool;callToolon the bad one fails withInvalidParamsbefore the request is sent. Use#/$defs/…or#anchor.Types of changes
Checklist
Additional context
AjvJsonSchemaValidatorvalidates as JSON Schema 2020-12; declared$schemavalues other than 2020-12 are rejected with a typed error pointing at the custom-instance escape hatch. (Per-dialect routing for draft-07/2019-09 is a tracked follow-up.)schemaBounds.ts.assertSchemaSafeToCompile(cycle-guarded; depth 64 / subschema-count 10 000),isSameDocumentRef,SchemaCompileError(attacker-controlled$ref/$schemastrings truncated to 200 chars). Exported for customjsonSchemaValidatorimplementations to call.outputSchemafails to compile (ref-denied, unsupported dialect, bounds, engine error) does not poisontools/list; the failure is stored on the response cache's per-tool entry and surfaced asProtocolError(InvalidParams)when that tool is called. Compile state invalidates with the cached tool definition (list_changed, reconnect, re-list).structuredContent, McpServer returns a result with a{type:'text', text: JSON.stringify(sc)}block appended (any author-supplied text block opts out) and, on the 2025 era, with the non-object value omitted so a strictly-conformant 2025 client doesn't reject the entire response. A non-object (or typeless non-object-shaped)outputSchemais dropped from the legacytools/listprojection with a warn-once.schemaBounds.tsand several runtime fixes are adapted from feat(core,server,client): implement SEP-2106 (tool schemas conform to JSON Schema 2020-12) #2249.