MoonshotAI · RealKai42 · Mar 2, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -11,6 +11,8 @@ Only write entries that are worth mentioning to users.
 
 ## Unreleased
 
+- Core: Add `compaction_trigger_ratio` config option (default `0.85`, valid range `0.5`–`0.99`) to control when auto-compaction triggers; compaction now fires when context usage reaches the configured ratio or when remaining space drops to `reserved_context_size` or below, whichever comes first
+- Core: Support custom instructions in `/compact` command (e.g., `/compact keep database discussions`) to guide what the compaction preserves
 - Web: Add URL action parameters (`?action=create` to open create-session dialog, `?action=create-in-dir&workDir=xxx` to create a session directly) for external integrations, and support Cmd/Ctrl+Click on new-session buttons to open session creation in a new browser tab
 - Web: Add todo list display in prompt toolbar — shows task progress with expandable panel when the `SetTodoList` tool is active
 - ACP: Add authentication check for session operations with `AUTH_REQUIRED` error responses for terminal-based login flow

diff --git a/docs/en/configuration/config-files.md b/docs/en/configuration/config-files.md
@@ -57,6 +57,7 @@ max_steps_per_turn = 100
 max_retries_per_step = 3
 max_ralph_iterations = 0
 reserved_context_size = 50000
+compaction_trigger_ratio = 0.85
 
 [services.moonshot_search]
 base_url = "https://api.kimi.com/coding/v1/search"
@@ -123,6 +124,7 @@ capabilities = ["thinking", "image_in"]
 | `max_retries_per_step` | `integer` | `3` | Maximum retries per step |
 | `max_ralph_iterations` | `integer` | `0` | Extra iterations after each user message; `0` disables; `-1` is unlimited |
 | `reserved_context_size` | `integer` | `50000` | Reserved token count for LLM response generation; auto-compaction triggers when `context_tokens + reserved_context_size >= max_context_size` |
+| `compaction_trigger_ratio` | `float` | `0.85` | Context usage ratio threshold for auto-compaction (0.5–0.99); auto-compaction triggers when `context_tokens >= max_context_size * compaction_trigger_ratio` or when the `reserved_context_size` condition is met, whichever comes first |
 
 ### `services`
 

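To see which of the two documented conditions fires first for a given context window, it can help to compute both token thresholds side by side. The sketch below is illustrative only: `effective_threshold` is a hypothetical helper that is not part of this PR, and the window sizes are example values rather than the limits of any particular model.

```python
import math


def effective_threshold(
    max_context_size: int,
    *,
    trigger_ratio: float = 0.85,
    reserved_context_size: int = 50_000,
) -> tuple[int, str]:
    """Return the smallest token count that triggers auto-compaction and the rule behind it.

    Hypothetical helper for illustration; it mirrors the two documented conditions.
    """
    # Ratio rule: context_tokens >= max_context_size * compaction_trigger_ratio
    ratio_threshold = math.ceil(max_context_size * trigger_ratio)
    # Reserved rule: context_tokens + reserved_context_size >= max_context_size
    # (clamped at 0 for windows smaller than the reserved size)
    reserved_threshold = max(0, max_context_size - reserved_context_size)
    if ratio_threshold <= reserved_threshold:
        return ratio_threshold, "ratio"
    return reserved_threshold, "reserved"


if __name__ == "__main__":
    for window in (128_000, 262_144, 1_000_000):
        threshold, rule = effective_threshold(window)
        print(f"{window:>9} tokens: compaction at {threshold} ({rule} rule)")
```

With the defaults, the reserved rule is the binding one for any window under roughly 333,000 tokens, because `0.85 * max <= max - 50000` only holds once `max` is at least about 333,334. The ratio rule matters mainly for larger windows or for a smaller `reserved_context_size`.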
diff --git a/docs/en/customization/wire-mode.md b/docs/en/customization/wire-mode.md
@@ -73,7 +73,7 @@ interface JSONRPCError {
 Added in Wire 1.1. Legacy clients can skip this request and send `prompt` directly.
 :::
 
-- **Direction**: client → agent
+- **Direction**: Client → Agent
 - **Type**: Request (requires response)
 
 Optional handshake request for negotiating protocol version, submitting external tool definitions, and retrieving the slash command list.
@@ -330,7 +330,7 @@ If no turn is in progress:
 
 ### `event`
 
-- **Direction**: agent → client
+- **Direction**: Agent → Client
 - **Type**: Notification (no response needed)
 
 Events emitted by the agent during a turn. No `id` field, client doesn't need to respond.
@@ -351,7 +351,7 @@ interface EventParams {
 
 ### `request`
 
-- **Direction**: agent → client
+- **Direction**: Agent → Client
 - **Type**: Request (requires response)
 
 Requests from the agent to the client, used for approval confirmation or external tool calls. The client must respond before the agent can continue execution.

diff --git a/docs/en/guides/interaction.md b/docs/en/guides/interaction.md
@@ -87,7 +87,7 @@ Each question supports 2–4 predefined options, and the AI will set appropriate
 The AI only uses this tool when your choice genuinely affects subsequent actions. For decisions that can be inferred from context, the AI will decide on its own and continue execution.
 :::
 
-## Approvals
+## Approvals and confirmations
 
 When the AI needs to perform operations that may have an impact (such as modifying files or running commands), Kimi Code CLI will request your confirmation.
 

diff --git a/docs/en/guides/sessions.md b/docs/en/guides/sessions.md
@@ -72,6 +72,12 @@ Enter `/compact` to have the AI summarize the current conversation and replace t
 /compact
 ```
 
+You can also append custom instructions after the command to tell the AI which content to prioritize when compacting:
+
+```
+/compact keep the database-related discussion
+```
+
 Compacting preserves key information while reducing token consumption. This is useful when the conversation is long but you still want to retain some context.
 
 ::: tip

diff --git a/docs/en/reference/kimi-web.md b/docs/en/reference/kimi-web.md
@@ -186,6 +186,7 @@ Web UI provides a unified prompt toolbar above the input box, displaying various
 - **Activity status**: Shows the current agent state (processing, waiting for approval, etc.)
 - **Message queue**: Queue follow-up messages while the AI is processing; queued messages are sent automatically when the current response completes
 - **File changes**: Detects Git repository status, showing the number of new, modified, and deleted files (including untracked files). Click to view a detailed list of changes
+- **Todo list**: When the `SetTodoList` tool is active, shows task progress and can be expanded to view the full list
 
 ::: info Changed
 Git diff status bar added in version 1.5. Activity status indicator added in version 1.9. Version 1.10 unified it into the prompt toolbar. Version 1.11 moved the context usage indicator to the prompt toolbar.

diff --git a/docs/en/reference/slash-commands.md b/docs/en/reference/slash-commands.md
@@ -63,7 +63,7 @@ This command is only available when using the default configuration file. If a c
 
 ### `/editor`
 
-Set the default external editor. When called without arguments, displays an interactive selection interface; you can also specify the editor command directly, e.g., `/editor vim`. After configuration, pressing `Ctrl-O` will open this editor to edit the current input content. See [Keyboard shortcuts](./keyboard.md#external-editor) for details.
+Set the external editor. When called without arguments, displays an interactive selection interface; you can also specify the editor command directly, e.g., `/editor vim`. After configuration, pressing `Ctrl-O` will open this editor to edit the current input content. See [Keyboard shortcuts](./keyboard.md#external-editor) for details.
 
 ### `/reload`
 
@@ -82,7 +82,7 @@ Debug information is displayed in a pager, press `q` to exit.
 
 Display API usage and quota information, showing quota usage with progress bars and remaining percentages.
 
-Aliases: `/status`
+Alias: `/status`
 
 ::: tip
 This command only works with the Kimi Code platform.
@@ -118,7 +118,7 @@ Alias: `/reset`
 
 ### `/compact`
 
-Manually compact the context to reduce token usage.
+Manually compact the context to reduce token usage. You can append custom instructions after the command to tell the AI which information to prioritize preserving during compaction, e.g., `/compact preserve database-related discussions`.
 
 When the context is too long, Kimi Code CLI will automatically trigger compaction. This command allows manually triggering the compaction process.
 

diff --git a/docs/en/release-notes/changelog.md b/docs/en/release-notes/changelog.md
@@ -4,6 +4,8 @@ This page documents the changes in each Kimi Code CLI release.
 
 ## Unreleased
 
+- Core: Add `compaction_trigger_ratio` config option (default `0.85`, valid range `0.5`–`0.99`) to control when auto-compaction triggers; compaction now fires when context usage reaches the configured ratio or when remaining space drops to `reserved_context_size` or below, whichever comes first
+- Core: Support custom instructions in `/compact` command (e.g., `/compact keep database discussions`) to guide what the compaction preserves
 - Web: Add URL action parameters (`?action=create` to open create-session dialog, `?action=create-in-dir&workDir=xxx` to create a session directly) for external integrations, and support Cmd/Ctrl+Click on new-session buttons to open session creation in a new browser tab
 - Web: Add todo list display in prompt toolbar — shows task progress with expandable panel when the `SetTodoList` tool is active
 - ACP: Add authentication check for session operations with `AUTH_REQUIRED` error responses for terminal-based login flow

diff --git a/docs/zh/configuration/config-files.md b/docs/zh/configuration/config-files.md
@@ -57,6 +57,7 @@ max_steps_per_turn = 100
 max_retries_per_step = 3
 max_ralph_iterations = 0
 reserved_context_size = 50000
+compaction_trigger_ratio = 0.85
 
 [services.moonshot_search]
 base_url = "https://api.kimi.com/coding/v1/search"
@@ -123,6 +124,7 @@ capabilities = ["thinking", "image_in"]
 | `max_retries_per_step` | `integer` | `3` | 单步最大重试次数 |
 | `max_ralph_iterations` | `integer` | `0` | 每个 User 消息后额外自动迭代次数；`0` 表示关闭；`-1` 表示无限 |
 | `reserved_context_size` | `integer` | `50000` | 预留给 LLM 响应生成的 token 数量；当 `context_tokens + reserved_context_size >= max_context_size` 时自动触发压缩 |
+| `compaction_trigger_ratio` | `float` | `0.85` | 触发自动压缩的上下文使用率阈值（0.5–0.99）；当 `context_tokens >= max_context_size * compaction_trigger_ratio` 时自动触发压缩，与 `reserved_context_size` 条件取先触发者 |
 
 ### `services`
 

diff --git a/docs/zh/guides/sessions.md b/docs/zh/guides/sessions.md
@@ -72,6 +72,12 @@ kimi --session abc123
 /compact
 ```
 
+你也可以在命令后附带自定义指引，告诉 AI 在压缩时优先保留哪些内容：
+
+```
+/compact 保留数据库相关的讨论
+```
+
 压缩会保留关键信息，同时减少 token 消耗。这在对话很长但你还想保留一些上下文时很有用。
 
 ::: tip 提示

diff --git a/docs/zh/reference/kimi-web.md b/docs/zh/reference/kimi-web.md
@@ -186,6 +186,7 @@ Web UI 在输入框上方提供统一的提示工具栏，以可折叠标签页
 - **活动状态**：显示 Agent 当前状态（处理中、等待审批等）
 - **消息队列**：在 AI 处理过程中可以排队发送后续消息，待当前回复完成后自动发送
 - **文件变更**：检测 Git 仓库状态，显示新增、修改和删除的文件数量（包含未跟踪文件），点击可查看详细的变更列表
+- **待办事项**：当 `SetTodoList` 工具处于活动状态时，显示任务进度，支持展开查看详细列表
 
 ::: info 变更
 Git diff 状态栏新增于 1.5 版本。1.9 版本添加了活动状态指示器。1.10 版本将其统一为提示工具栏。1.11 版本将上下文用量指示器移至提示工具栏。

diff --git a/docs/zh/reference/slash-commands.md b/docs/zh/reference/slash-commands.md
@@ -118,7 +118,7 @@
 
 ### `/compact`
 
-手动压缩上下文，减少 token 使用。
+手动压缩上下文，减少 token 使用。可以在命令后附带自定义指引，告诉 AI 在压缩时优先保留哪些信息，例如 `/compact 保留数据库相关的讨论`。
 
 当上下文过长时，Kimi Code CLI 会自动触发压缩。此命令可手动触发压缩过程。
 

diff --git a/docs/zh/release-notes/changelog.md b/docs/zh/release-notes/changelog.md
@@ -4,6 +4,8 @@
 
 ## 未发布
 
+- Core：新增 `compaction_trigger_ratio` 配置项（默认 `0.85`），用于控制自动压缩的触发时机——当上下文用量达到配置比例或剩余空间低于 `reserved_context_size` 时触发压缩，以先满足的条件为准
+- Core：`/compact` 命令支持自定义指令（如 `/compact keep database discussions`），可指导压缩时重点保留的内容
 - Web：新增 URL 操作参数（`?action=create` 打开创建会话对话框，`?action=create-in-dir&workDir=xxx` 直接创建会话）用于外部集成，支持 Cmd/Ctrl+点击新建会话按钮在新标签页中打开会话创建
 - Web：在提示输入工具栏中添加待办列表显示——当 `SetTodoList` 工具激活时，显示任务进度并支持展开面板查看详情
 - ACP：为会话操作添加认证检查，未认证时返回 `AUTH_REQUIRED` 错误响应，支持终端登录流程

diff --git a/src/kimi_cli/config.py b/src/kimi_cli/config.py
@@ -80,7 +80,12 @@ class LoopControl(BaseModel):
     """Extra iterations after the first turn in Ralph mode. Use -1 for unlimited."""
     reserved_context_size: int = Field(default=50_000, ge=1000)
     """Reserved token count for LLM response generation. Auto-compaction triggers when
-    context_tokens + reserved_context_size >= max_context_size. Default is 50000."""
+    either context_tokens + reserved_context_size >= max_context_size or
+    context_tokens >= max_context_size * compaction_trigger_ratio. Default is 50000."""
+    compaction_trigger_ratio: float = Field(default=0.85, ge=0.5, le=0.99)
+    """Context usage ratio threshold for auto-compaction. Default is 0.85 (85%).
+    Auto-compaction triggers when context_tokens >= max_context_size * compaction_trigger_ratio
+    or when context_tokens + reserved_context_size >= max_context_size."""
 
 
 class MoonshotSearchConfig(BaseModel):

diff --git a/src/kimi_cli/soul/compaction.py b/src/kimi_cli/soul/compaction.py
@@ -53,15 +53,37 @@ def _estimate_text_tokens(messages: Sequence[Message]) -> int:
     return total_chars // 4
 
 
+def should_auto_compact(
+    token_count: int,
+    max_context_size: int,
+    *,
+    trigger_ratio: float,
+    reserved_context_size: int,
+) -> bool:
+    """Determine whether auto-compaction should be triggered.
+
+    Returns True when either condition is met (whichever fires first):
+    - Ratio-based: token_count >= max_context_size * trigger_ratio
+    - Reserved-based: token_count + reserved_context_size >= max_context_size
+    """
+    return (
+        token_count >= max_context_size * trigger_ratio
+        or token_count + reserved_context_size >= max_context_size
+    )
+
+
 @runtime_checkable
 class Compaction(Protocol):
-    async def compact(self, messages: Sequence[Message], llm: LLM) -> CompactionResult:
+    async def compact(
+        self, messages: Sequence[Message], llm: LLM, *, custom_instruction: str = ""
+    ) -> CompactionResult:
         """
         Compact a sequence of messages into a new sequence of messages.
 
         Args:
             messages (Sequence[Message]): The messages to compact.
             llm (LLM): The LLM to use for compaction.
+            custom_instruction (str): Optional user instruction to guide compaction focus.
 
         Returns:
             CompactionResult: The compacted messages and token usage from the compaction LLM call.
@@ -82,8 +104,10 @@ class SimpleCompaction:
     def __init__(self, max_preserved_messages: int = 2) -> None:
         self.max_preserved_messages = max_preserved_messages
 
-    async def compact(self, messages: Sequence[Message], llm: LLM) -> CompactionResult:
-        compact_message, to_preserve = self.prepare(messages)
+    async def compact(
+        self, messages: Sequence[Message], llm: LLM, *, custom_instruction: str = ""
+    ) -> CompactionResult:
+        compact_message, to_preserve = self.prepare(messages, custom_instruction=custom_instruction)
         if compact_message is None:
             return CompactionResult(messages=to_preserve, usage=None)
 
@@ -118,7 +142,9 @@ class PrepareResult(NamedTuple):
         compact_message: Message | None
         to_preserve: Sequence[Message]
 
-    def prepare(self, messages: Sequence[Message]) -> PrepareResult:
+    def prepare(
+        self, messages: Sequence[Message], *, custom_instruction: str = ""
+    ) -> PrepareResult:
         if not messages or self.max_preserved_messages <= 0:
             return self.PrepareResult(compact_message=None, to_preserve=messages)
 
@@ -151,5 +177,13 @@ def prepare(self, messages: Sequence[Message]) -> PrepareResult:
             compact_message.content.extend(
                 part for part in msg.content if not isinstance(part, ThinkPart)
             )
-        compact_message.content.append(TextPart(text="\n" + prompts.COMPACT))
+        prompt_text = "\n" + prompts.COMPACT
+        if custom_instruction:
+            prompt_text += (
+                "\n\n**User's Custom Compaction Instruction:**\n"
+                "The user has specifically requested the following focus during compaction. "
+                "You MUST prioritize this instruction above the default compression priorities:\n"
+                f"{custom_instruction}"
+            )
+        compact_message.content.append(TextPart(text=prompt_text))
         return self.PrepareResult(compact_message=compact_message, to_preserve=to_preserve)
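The prompt assembly in the last hunk of `compaction.py` can be tried in isolation. This is a minimal sketch under stated assumptions: `COMPACT_PROMPT` is a placeholder string standing in for `prompts.COMPACT` (whose real text is not shown in this diff), and `build_compaction_prompt` is a hypothetical function name used only for illustration. The appended instruction text is copied from the diff.

```python
# Placeholder standing in for prompts.COMPACT; the real prompt text is not shown in the diff.
COMPACT_PROMPT = "Summarize the conversation above."


def build_compaction_prompt(custom_instruction: str = "") -> str:
    """Hypothetical standalone version of the prompt assembly done in prepare()."""
    prompt_text = "\n" + COMPACT_PROMPT
    # An empty string is falsy, so a bare `/compact` leaves the default prompt unchanged.
    if custom_instruction:
        prompt_text += (
            "\n\n**User's Custom Compaction Instruction:**\n"
            "The user has specifically requested the following focus during compaction. "
            "You MUST prioritize this instruction above the default compression priorities:\n"
            f"{custom_instruction}"
        )
    return prompt_text


if __name__ == "__main__":
    # Mirrors the slash command, which passes args.strip() as the instruction.
    print(build_compaction_prompt("  keep database discussions ".strip()))
```

Because the slash command passes `args.strip()`, a command followed only by whitespace yields an empty instruction and behaves like a plain `/compact`.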
diff --git a/src/kimi_cli/soul/kimisoul.py b/src/kimi_cli/soul/kimisoul.py
@@ -32,7 +32,7 @@
     wire_send,
 )
 from kimi_cli.soul.agent import Agent, Runtime
-from kimi_cli.soul.compaction import CompactionResult, SimpleCompaction
+from kimi_cli.soul.compaction import CompactionResult, SimpleCompaction, should_auto_compact
 from kimi_cli.soul.context import Context
 from kimi_cli.soul.message import check_message, system, tool_result_to_message
 from kimi_cli.soul.slash import registry as soul_slash_registry
@@ -392,8 +392,12 @@ async def _pipe_approval_to_wire():
             step_outcome: StepOutcome | None = None
             try:
                 # compact the context if needed
-                reserved = self._loop_control.reserved_context_size
-                if self._context.token_count + reserved >= self._runtime.llm.max_context_size:
+                if should_auto_compact(
+                    self._context.token_count,
+                    self._runtime.llm.max_context_size,
+                    trigger_ratio=self._loop_control.compaction_trigger_ratio,
+                    reserved_context_size=self._loop_control.reserved_context_size,
+                ):
                     logger.info("Context too long, compacting...")
                     await self.compact_context()
 
@@ -544,7 +548,7 @@ async def _grow_context(self, result: StepResult, tool_results: list[ToolResult]
         await self._context.append_message(tool_messages)
         # token count of tool results are not available yet
 
-    async def compact_context(self) -> None:
+    async def compact_context(self, *, custom_instruction: str = "") -> None:
         """
         Compact the context.
 
@@ -558,7 +562,9 @@ async def compact_context(self) -> None:
         async def _run_compaction_once() -> CompactionResult:
             if self._runtime.llm is None:
                 raise LLMNotSet()
-            return await self._compaction.compact(self._context.history, self._runtime.llm)
+            return await self._compaction.compact(
+                self._context.history, self._runtime.llm, custom_instruction=custom_instruction
+            )
 
         @tenacity.retry(
             retry=retry_if_exception(self._is_retryable_error),

diff --git a/src/kimi_cli/soul/slash.py b/src/kimi_cli/soul/slash.py
@@ -51,13 +51,13 @@ async def init(soul: KimiSoul, args: str):
 
 @registry.command
 async def compact(soul: KimiSoul, args: str):
-    """Compact the context"""
+    """Compact the context (optionally with a custom focus, e.g. /compact keep db discussions)"""
     if soul.context.n_checkpoints == 0:
         wire_send(TextPart(text="The context is empty."))
         return
 
     logger.info("Running `/compact`")
-    await soul.compact_context()
+    await soul.compact_context(custom_instruction=args.strip())
     wire_send(TextPart(text="The context has been compacted."))
     wire_send(StatusUpdate(context_usage=soul.status.context_usage))
 

diff --git a/tests/core/test_config.py b/tests/core/test_config.py
@@ -32,6 +32,7 @@ def test_default_config_dump():
                 "max_retries_per_step": 3,
                 "max_ralph_iterations": 0,
                 "reserved_context_size": 50000,
+                "compaction_trigger_ratio": 0.85,
             },
             "services": {"moonshot_search": None, "moonshot_fetch": None},
             "mcp": {"client": {"tool_call_timeout_ms": 60000}},
@@ -92,3 +93,23 @@ def test_load_config_max_steps_per_run():
 def test_load_config_reserved_context_size_too_low():
     with pytest.raises(ConfigError, match="reserved_context_size"):
         load_config_from_string('{"loop_control": {"reserved_context_size": 500}}')
+
+
+def test_load_config_compaction_trigger_ratio():
+    config = load_config_from_string('{"loop_control": {"compaction_trigger_ratio": 0.8}}')
+    assert config.loop_control.compaction_trigger_ratio == 0.8
+
+
+def test_load_config_compaction_trigger_ratio_default():
+    config = load_config_from_string("{}")
+    assert config.loop_control.compaction_trigger_ratio == 0.85
+
+
+def test_load_config_compaction_trigger_ratio_too_low():
+    with pytest.raises(ConfigError, match="compaction_trigger_ratio"):
+        load_config_from_string('{"loop_control": {"compaction_trigger_ratio": 0.3}}')
+
+
+def test_load_config_compaction_trigger_ratio_too_high():
+    with pytest.raises(ConfigError, match="compaction_trigger_ratio"):
+        load_config_from_string('{"loop_control": {"compaction_trigger_ratio": 1.0}}')
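The tests in this diff cover config validation only. The trigger function itself could be exercised along the following lines. This is a sketch, not part of the PR: the body of `should_auto_compact` is copied from the `compaction.py` hunk so the snippet runs standalone, and the test names and token counts are illustrative.

```python
def should_auto_compact(
    token_count: int,
    max_context_size: int,
    *,
    trigger_ratio: float,
    reserved_context_size: int,
) -> bool:
    # Copied from the compaction.py hunk so this snippet is self-contained.
    return (
        token_count >= max_context_size * trigger_ratio
        or token_count + reserved_context_size >= max_context_size
    )


def test_reserved_rule_fires_first_on_small_window():
    # 200k window: reserved threshold is 150k, ratio threshold is 170k.
    kwargs = dict(trigger_ratio=0.85, reserved_context_size=50_000)
    assert not should_auto_compact(149_999, 200_000, **kwargs)
    assert should_auto_compact(150_000, 200_000, **kwargs)


def test_ratio_rule_fires_first_on_large_window():
    # 1M window: ratio threshold is about 850k, reserved threshold is 950k.
    kwargs = dict(trigger_ratio=0.85, reserved_context_size=50_000)
    assert not should_auto_compact(840_000, 1_000_000, **kwargs)
    assert should_auto_compact(860_000, 1_000_000, **kwargs)


def test_ratio_rule_fires_first_with_small_reserve():
    # With only 1k reserved, a 0.5 ratio triggers long before the reserve is reached.
    kwargs = dict(trigger_ratio=0.5, reserved_context_size=1_000)
    assert not should_auto_compact(99_000, 200_000, **kwargs)
    assert should_auto_compact(101_000, 200_000, **kwargs)
```

The ratio cases deliberately stay a little away from the exact boundary, since `max_context_size * trigger_ratio` is a floating-point product.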