Skip to content

feat(agent-core): progressive tool disclosure via select_tools#1369

Open
starquakee wants to merge 3 commits into
MoonshotAI:mainfrom
starquakee:feat/select-tools
Open

feat(agent-core): progressive tool disclosure via select_tools#1369
starquakee wants to merge 3 commits into
MoonshotAI:mainfrom
starquakee:feat/select-tools

Conversation

@starquakee

@starquakee starquakee commented Jul 4, 2026

Copy link
Copy Markdown

Related Issue

No GitHub issue — this implements an internally reviewed design for progressive tool disclosure, targeting the message-level tool declaration contract (messages[].tools). Problem statement below.

Problem

Sessions with many MCP servers pay for every tool schema inside the request's top-level tools[] on every step: token cost scales with the number of registered tools, tool-choice accuracy degrades with schema noise, and any change to the tool set mutates the request prefix and busts the provider prompt cache.

Message-level tool declarations (messages[].tools on a role:"system" message) make a different shape possible: keep the top-level tools[] immutable (core tools only) and load MCP tool schemas on demand as append-only conversation context.

What changed

Client-side progressive tool disclosure, gated by three conditions (capability.select_tools × capability.tool_use × tool-select experimental flag, env KIMI_CODE_EXPERIMENTAL_TOOL_SELECT). Behavior is unchanged by default: every current model keeps the capability off, and with any gate closed the outbound request stays byte-identical to main (pinned by gate-closed regression tests).

  • kosong: Message.tools — the append-only load primitive, serialized by the Kimi provider as messages[].tools ({type:"function", function:{...}} wrapping, no content, shared schema normalization and $ builtin branch); Tool.deferred — client-internal marker stripped once in generate() so loaded tools stay executable without re-entering the top level; select_tools capability bit (UNKNOWN/catalog default false).
  • select_tools builtin tool: load-by-exact-name with per-name settlement (Loaded / Already available / Unknown); schemas always read from the live registry, never from history; the injected schema message has injection origin so undo keeps it (tool loading is protocol state, not conversation). Not main-agent-only — subagents run their own disclosure.
  • Loadable-tools manifest: <tools_added>/<tools_removed> announcements maintained by turn-boundary diffs (ToolsDiffInjector; also after full compaction). Announcements fold from history — no separate persisted ledger — so undo/compaction/resume self-heal. Quiet turns append nothing, keeping the prompt cache warm.
  • Executable-table split: top-level request view stays core + select_tools forever; loaded MCP tools join loopTools as deferred extras (dispatchable, stripped from the wire). The loop re-reads the table per step (RunTurnInput.buildTools), so a selected tool dispatches on the very next step of the same turn. Preflight distinguishes "available but not loaded" from "loaded but its server is disconnected".
  • Cross-cuts: projection strips schema messages + announcements for models without the capability (lossless mid-session model switch in both directions — canonical history is never rewritten); compaction excludes protocol context from the summarizer input and rebuilds loaded schemas keep-all afterwards, with a budget guard (rebuilt floor stays within half the compaction trigger, dropped schemas are simply re-selectable) so the rebuild can never push the context back into the auto-compaction band; token estimation counts message.tools; the loaded-set pending ledger is cleared at /clear and at the compaction boundary.

Review notes:

  • RunTurnInput gains optional buildTools (wins over the static tools snapshot). Deliberate semantic change pinned in config.test.ts: a mid-turn setActiveTools now applies from the next step instead of the next turn.
  • Compaction anti-loop guard (lastCompactedTokenCount) is baselined on the counter that includes the schema rebuild; tokensAfter keeps its persisted users+summary semantics (folding the rebuild in would double-count against the pending estimate on both live and restore paths).
  • Non-Kimi providers (OpenAI / Anthropic / Google) default-skip tool-declaration-only messages (second commit): their explicit field construction already keeps tools off the wire, but the content-free leftover would be rejected (OpenAI: system message without content) or serialize as an empty <system></system> turn. Unreachable in kimi-code — the projection gate strips first; this is defense-in-depth for direct kosong consumers.
  • Deliberately out of scope: schema offloading, budgeted (partial) rebuild beyond the trigger guard, server-catalog capability passthrough, vis rendering of message.tools.
  • Expected against current production endpoints: enabling the gate anyway yields 400 tokenization failed on select — the client ships ahead of server support; the capability stays off for all catalogued models until a supporting model lands.

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset. (Changeset hand-authored per .changeset/README.md: @moonshot-ai/kimi-code minor + @moonshot-ai/kimi-code-sdk minor.)
  • Ran gen-docs skill, or this PR needs no doc update. (No doc update: the feature is inert until a model with the capability is catalogued — the [experimental] config-files section is currently commented out on main; the user-facing description lives in the changeset. Docs to follow with the first supporting model.)

@changeset-bot

changeset-bot Bot commented Jul 4, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 24c9006

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@moonshot-ai/kimi-code Minor
@moonshot-ai/kimi-code-sdk Minor
@moonshot-ai/acp-adapter Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Keep MCP tool schemas out of the immutable top-level tools[] and let the
model load them on demand, preserving the provider prompt cache:

- kosong: Message.tools (append-only load primitive, serialized as Kimi
  messages[].tools with type:function wrapping and no content),
  Tool.deferred (stripped once in generate() so loaded tools stay
  executable without re-entering the top level), select_tools capability
  bit (UNKNOWN/catalog default false).
- select_tools builtin: load-by-exact-name, three-branch semantics
  settled per name (Loaded / Already available / Unknown), schemas read
  from the live registry, injection-origin schema messages survive undo.
- ToolsDiffInjector: <tools_added>/<tools_removed> announcements at turn
  boundaries and post-compaction, folded from history (undo/compaction/
  resume self-heal), appended only when the loadable set changes.
- Loaded-tools ledger = history scan + defer-window pending set (cleared
  on /clear); loop re-reads the executable table per step so a selected
  tool dispatches on the next step of the same turn; preflight
  distinguishes not-loaded from loaded-but-disconnected.
- Cross-cuts: projection strips protocol context for non-select_tools
  models (lossless mid-session model switch both ways), compaction
  filters it from the summarizer input and rebuilds loaded schemas
  keep-all after folding, token estimation counts message.tools, request
  logging reflects the post-strip wire tools.
- Three-condition gate: capability.select_tools x capability.tool_use x
  tool-select experimental flag (KIMI_CODE_EXPERIMENTAL_TOOL_SELECT).
  Any gate closed reproduces the inline request byte-for-byte; all
  current models keep the capability off, so behavior is unchanged
  until a supporting model is catalogued. The SDK catalog-to-alias
  mapping forwards the capability so catalog-driven setups can enable it.
@starquakee starquakee force-pushed the feat/select-tools branch from 55afb83 to ed5958f Compare July 4, 2026 09:45

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55afb8319e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Deliberately not main-only: subagents run their own disclosure and
// need select_tools just as much (their ToolManager computes its own
// loadable set from its profile's mcp__* patterns).
this.agent.toolSelectEnabled && new b.SelectToolsTool(this.agent),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Refresh builtins when the tool-select flag flips

When a user enables experimental.tool-select through the existing settings/config RPC in a running session, the shared FlagResolver changes immediately but initializeBuiltinTools() is not called. That makes agent.toolSelectEnabled true for loopTools and injectToolsDiff, so MCP tools are removed from the top-level list and announcements tell the model to call select_tools; however the builtin map was built while the flag was off, so this conditional never added SelectToolsTool and selectToolsName stays empty. The session loses MCP access until another model/cwd change or restart refreshes builtins.

Useful? React with 👍 / 👎.

tools,
// Fetch the tool table fresh each step so mid-turn loads are
// dispatchable immediately; fall back to the static snapshot.
tools: buildTools !== undefined ? buildTools() : tools,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Rebuild the tool table after beforeStep mutations

In each loop iteration, buildTools() runs here before executeLoopStep invokes hooks.beforeStep; TurnFlow.beforeStep can run full compaction, which now clears the dynamic-tool ledger and may trim schema rebuilds. In that case the request messages are built after compaction and no longer contain a loaded MCP schema, but preflight still uses this pre-compaction tools array and can execute a direct MCP call that should have been rejected/re-selected.

Useful? React with 👍 / 👎.

fengchenchen added 2 commits July 4, 2026 18:02
Message-level tool declarations (messages[].tools) are a Kimi wire
feature. The other providers' explicit field construction already keeps
the tools field off the wire, but the content-free leftover message
would be rejected (OpenAI: system message without content) or serialize
as a garbage <system></system> turn (Anthropic/Google system-to-user
wrapping). Skip such messages entirely via a shared predicate; a message
that also carries content only loses the tools field, as before.

Unreachable in kimi-code (the projection gate strips dynamic-tool
context for models without the select_tools capability before any
provider sees it) — defense-in-depth for direct kosong consumers.
… post-compaction state

Two fixes from PR review:

- Register select_tools unconditionally and gate only its exposure in
  loopTools. The tool-select flag can flip at runtime (config reload
  calls setConfigOverrides on the live resolver) without
  initializeBuiltinTools re-running; previously the disclosure shape
  activated while the tool itself was unregistered, cutting the session
  off from MCP entirely until a model/cwd change rebuilt the builtins.
  A profile listing the name explicitly still never surfaces it in
  inline mode, and execution guards the flip race defensively.

- Resolve the per-step tool table AFTER beforeStep, next to
  buildMessages. beforeStep can run full compaction, which trims loaded
  schemas and rewrites the ledger; a table captured before it could
  still dispatch a tool whose schema the model no longer has. The
  executable table and the request messages now always reflect the same
  state, so a trimmed tool is rejected with select guidance instead of
  executed.
@starquakee

Copy link
Copy Markdown
Author

Both Codex findings verified as real and fixed in 24c9006:

  1. Runtime flag flip — confirmed: config reload calls setConfigOverrides on the live resolver (rpc/core-impl.ts), so the gate could open without initializeBuiltinTools() re-running, activating the disclosure shape while select_tools itself was unregistered (session loses MCP access). Fix: SelectToolsTool is now registered unconditionally; only its exposure is gated in loopTools (plus a defensive guard in execute for the flip race, and explicit-listing in a profile still never surfaces it in inline mode). Regression test: full off→on→off flip at runtime with the select→dispatch chain exercised.

  2. Stale tool table across beforeStep — confirmed: buildTools() was evaluated at the executeLoopStep call site, before hooks.beforeStep (where compaction can trim schema rebuilds and rewrite the ledger), so preflight could dispatch a tool the freshly-built messages no longer contain. Fix: the table is now resolved inside the step, after beforeStep and next to buildMessages, so tools and messages always come from the same state. Regression test: compaction-trim at step boundary followed by a direct call to the trimmed tool → rejected with select guidance, MCP client never invoked (verified the test fails under the old evaluation order).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant