feat(agent-core): progressive tool disclosure via select_tools#1369
feat(agent-core): progressive tool disclosure via select_tools#1369starquakee wants to merge 3 commits into
Conversation
🦋 Changeset detectedLatest commit: 24c9006 The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Keep MCP tool schemas out of the immutable top-level tools[] and let the model load them on demand, preserving the provider prompt cache: - kosong: Message.tools (append-only load primitive, serialized as Kimi messages[].tools with type:function wrapping and no content), Tool.deferred (stripped once in generate() so loaded tools stay executable without re-entering the top level), select_tools capability bit (UNKNOWN/catalog default false). - select_tools builtin: load-by-exact-name, three-branch semantics settled per name (Loaded / Already available / Unknown), schemas read from the live registry, injection-origin schema messages survive undo. - ToolsDiffInjector: <tools_added>/<tools_removed> announcements at turn boundaries and post-compaction, folded from history (undo/compaction/ resume self-heal), appended only when the loadable set changes. - Loaded-tools ledger = history scan + defer-window pending set (cleared on /clear); loop re-reads the executable table per step so a selected tool dispatches on the next step of the same turn; preflight distinguishes not-loaded from loaded-but-disconnected. - Cross-cuts: projection strips protocol context for non-select_tools models (lossless mid-session model switch both ways), compaction filters it from the summarizer input and rebuilds loaded schemas keep-all after folding, token estimation counts message.tools, request logging reflects the post-strip wire tools. - Three-condition gate: capability.select_tools x capability.tool_use x tool-select experimental flag (KIMI_CODE_EXPERIMENTAL_TOOL_SELECT). Any gate closed reproduces the inline request byte-for-byte; all current models keep the capability off, so behavior is unchanged until a supporting model is catalogued. The SDK catalog-to-alias mapping forwards the capability so catalog-driven setups can enable it.
55afb83 to
ed5958f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 55afb8319e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Deliberately not main-only: subagents run their own disclosure and | ||
| // need select_tools just as much (their ToolManager computes its own | ||
| // loadable set from its profile's mcp__* patterns). | ||
| this.agent.toolSelectEnabled && new b.SelectToolsTool(this.agent), |
There was a problem hiding this comment.
Refresh builtins when the tool-select flag flips
When a user enables experimental.tool-select through the existing settings/config RPC in a running session, the shared FlagResolver changes immediately but initializeBuiltinTools() is not called. That makes agent.toolSelectEnabled true for loopTools and injectToolsDiff, so MCP tools are removed from the top-level list and announcements tell the model to call select_tools; however the builtin map was built while the flag was off, so this conditional never added SelectToolsTool and selectToolsName stays empty. The session loses MCP access until another model/cwd change or restart refreshes builtins.
Useful? React with 👍 / 👎.
| tools, | ||
| // Fetch the tool table fresh each step so mid-turn loads are | ||
| // dispatchable immediately; fall back to the static snapshot. | ||
| tools: buildTools !== undefined ? buildTools() : tools, |
There was a problem hiding this comment.
Rebuild the tool table after beforeStep mutations
In each loop iteration, buildTools() runs here before executeLoopStep invokes hooks.beforeStep; TurnFlow.beforeStep can run full compaction, which now clears the dynamic-tool ledger and may trim schema rebuilds. In that case the request messages are built after compaction and no longer contain a loaded MCP schema, but preflight still uses this pre-compaction tools array and can execute a direct MCP call that should have been rejected/re-selected.
Useful? React with 👍 / 👎.
Message-level tool declarations (messages[].tools) are a Kimi wire feature. The other providers' explicit field construction already keeps the tools field off the wire, but the content-free leftover message would be rejected (OpenAI: system message without content) or serialize as a garbage <system></system> turn (Anthropic/Google system-to-user wrapping). Skip such messages entirely via a shared predicate; a message that also carries content only loses the tools field, as before. Unreachable in kimi-code (the projection gate strips dynamic-tool context for models without the select_tools capability before any provider sees it) — defense-in-depth for direct kosong consumers.
… post-compaction state Two fixes from PR review: - Register select_tools unconditionally and gate only its exposure in loopTools. The tool-select flag can flip at runtime (config reload calls setConfigOverrides on the live resolver) without initializeBuiltinTools re-running; previously the disclosure shape activated while the tool itself was unregistered, cutting the session off from MCP entirely until a model/cwd change rebuilt the builtins. A profile listing the name explicitly still never surfaces it in inline mode, and execution guards the flip race defensively. - Resolve the per-step tool table AFTER beforeStep, next to buildMessages. beforeStep can run full compaction, which trims loaded schemas and rewrites the ledger; a table captured before it could still dispatch a tool whose schema the model no longer has. The executable table and the request messages now always reflect the same state, so a trimmed tool is rejected with select guidance instead of executed.
|
Both Codex findings verified as real and fixed in 24c9006:
|
Related Issue
No GitHub issue — this implements an internally reviewed design for progressive tool disclosure, targeting the message-level tool declaration contract (
messages[].tools). Problem statement below.Problem
Sessions with many MCP servers pay for every tool schema inside the request's top-level
tools[]on every step: token cost scales with the number of registered tools, tool-choice accuracy degrades with schema noise, and any change to the tool set mutates the request prefix and busts the provider prompt cache.Message-level tool declarations (
messages[].toolson arole:"system"message) make a different shape possible: keep the top-leveltools[]immutable (core tools only) and load MCP tool schemas on demand as append-only conversation context.What changed
Client-side progressive tool disclosure, gated by three conditions (
capability.select_tools×capability.tool_use×tool-selectexperimental flag, envKIMI_CODE_EXPERIMENTAL_TOOL_SELECT). Behavior is unchanged by default: every current model keeps the capability off, and with any gate closed the outbound request stays byte-identical to main (pinned by gate-closed regression tests).Message.tools— the append-only load primitive, serialized by the Kimi provider asmessages[].tools({type:"function", function:{...}}wrapping, nocontent, shared schema normalization and$builtin branch);Tool.deferred— client-internal marker stripped once ingenerate()so loaded tools stay executable without re-entering the top level;select_toolscapability bit (UNKNOWN/catalog default false).select_toolsbuiltin tool: load-by-exact-name with per-name settlement (Loaded / Already available / Unknown); schemas always read from the live registry, never from history; the injected schema message hasinjectionorigin so undo keeps it (tool loading is protocol state, not conversation). Not main-agent-only — subagents run their own disclosure.<tools_added>/<tools_removed>announcements maintained by turn-boundary diffs (ToolsDiffInjector; also after full compaction). Announcements fold from history — no separate persisted ledger — so undo/compaction/resume self-heal. Quiet turns append nothing, keeping the prompt cache warm.core + select_toolsforever; loaded MCP tools joinloopToolsasdeferredextras (dispatchable, stripped from the wire). The loop re-reads the table per step (RunTurnInput.buildTools), so a selected tool dispatches on the very next step of the same turn. Preflight distinguishes "available but not loaded" from "loaded but its server is disconnected".message.tools; the loaded-set pending ledger is cleared at/clearand at the compaction boundary.Review notes:
RunTurnInputgains optionalbuildTools(wins over the statictoolssnapshot). Deliberate semantic change pinned inconfig.test.ts: a mid-turnsetActiveToolsnow applies from the next step instead of the next turn.lastCompactedTokenCount) is baselined on the counter that includes the schema rebuild;tokensAfterkeeps its persisted users+summary semantics (folding the rebuild in would double-count against the pending estimate on both live and restore paths).toolsoff the wire, but the content-free leftover would be rejected (OpenAI: system message without content) or serialize as an empty<system></system>turn. Unreachable in kimi-code — the projection gate strips first; this is defense-in-depth for direct kosong consumers.message.tools.400 tokenization failedon select — the client ships ahead of server support; the capability stays off for all catalogued models until a supporting model lands.Checklist
gen-changesetsskill, or this PR needs no changeset. (Changeset hand-authored per.changeset/README.md:@moonshot-ai/kimi-codeminor +@moonshot-ai/kimi-code-sdkminor.)gen-docsskill, or this PR needs no doc update. (No doc update: the feature is inert until a model with the capability is catalogued — the[experimental]config-files section is currently commented out on main; the user-facing description lives in the changeset. Docs to follow with the first supporting model.)