
feat(agent-core): detect stalled turns and force text-only recovery #1312

Open
flame4 wants to merge 2 commits into MoonshotAI:main from flame4:feat/progress-detector

flame4 commented Jul 2, 2026

Related Issue

Resolves #1314

Problem

In long-running turns the model can fall into a tool-use loop where it emits placeholder/no-op calls instead of answering the user. The existing ToolCallDeduplicator only catches exact same-step duplicates, so loops with varied but meaningless calls are not stopped.

Anonymized excerpt from a stuck turn (session 0a8e1647-edc1-4cf4-a25b-d11a6cbba943):

step  2: Read('/dev/null')
step  3: Read('/dev/null')
step  4: Read('/dev/null')
step  5: Read('')
step  9: Bash('true')
step 10: Bash("printf ''")
step 14: Bash('true')
step 22: Bash(':')
step 23: Bash('true')
step 24: Bash(': ')
step 25: Bash(':')
step 26: Bash('true')
step 27: Bash(':')
step 29: Bash(':')
step 30: Bash('true')
step 31: Bash(':')
step 32: Bash(':')
step 33: Bash(':')
step 34: Bash(':')
step 35: Bash(':')
step 36: Bash(':')
step 37: Bash(':')
step 38: Bash(':')

The turn kept running Bash(':'), Bash('true'), Read('/dev/null'), and echo placeholders without changing any file or returning useful new information, eventually exhausting max_steps_per_turn.

What changed

Added a ProgressDetector that measures progress from external, observable state instead of interpreting model intent:

  1. External state change
    • git status --porcelain in the working directory.
    • Lifecycle/status changes of background tasks.
  2. Information gain
    • Successful tool outputs that are non-trivial and have not been seen before in the current turn.
    • Successful Edit and Write tool results are always counted as progress, even when their output is short, so repeated edits to an already-dirty file are not misclassified as stalled.
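The information-gain rule above can be sketched roughly as follows. This is a minimal illustration, not the code in progress-detector.ts: the `ToolResult` shape, the `InfoGainTracker` name, and the FNV-1a hash are all assumptions made so the example is self-contained. Only the 60-character default and the Edit/Write exception come from the description.

```typescript
// Hypothetical result shape; the PR's real types are not shown in this thread.
interface ToolResult {
  toolName: string;
  ok: boolean;
  output: string;
}

// FNV-1a (32-bit), used only as a dependency-free stand-in for the real hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical helper: tracks which outputs were already seen in this turn.
class InfoGainTracker {
  private readonly seen = new Set<string>();
  private readonly minLength: number;

  constructor(minLength = 60) {
    this.minLength = minLength;
  }

  /** Returns true when a tool result counts as progress for the current turn. */
  isProgress(result: ToolResult): boolean {
    if (!result.ok) return false;
    // Successful edits and writes always count, even when the output is short.
    if (result.toolName === 'Edit' || result.toolName === 'Write') return true;
    const output = result.output.trim();
    // Outputs shorter than the minimum are treated as trivial.
    if (output.length < this.minLength) return false;
    // An output already seen in this turn carries no new information.
    const digest = fnv1a(output);
    if (this.seen.has(digest)) return false;
    this.seen.add(digest);
    return true;
  }
}
```

Under these assumptions, calls such as Bash(':') or Read('/dev/null') never register as progress, because their output is empty.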

If a configurable number of consecutive steps pass without either signal, the harness:

  • appends a system reminder telling the model to stop calling tools and respond in text;
  • clears the available tool list for the next model step ({ tools: [] }), so the model can only produce text.
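A rough sketch of the stall counter and the text-only override, assuming hook names that mirror the afterStep/beforeStep hooks listed under "Files changed". The `StallCounter` class, the reminder wording, and the `reminder` field are hypothetical. Only the `{ tools: [] }` override and the default threshold of 8 are taken from this description.

```typescript
// Hypothetical shape; the PR only states that BeforeStepResult gains a `tools` field.
interface BeforeStepResult {
  reminder?: string;
  tools?: string[];
}

class StallCounter {
  private idleSteps = 0;
  private forceTextMode = false;
  private readonly threshold: number;

  constructor(threshold = 8) {
    this.threshold = threshold;
  }

  /** Call after each step with whether either progress signal fired. */
  afterStep(madeProgress: boolean): void {
    // Any progress resets the streak; otherwise the idle streak grows by one.
    this.idleSteps = madeProgress ? 0 : this.idleSteps + 1;
    if (this.idleSteps >= this.threshold) this.forceTextMode = true;
  }

  /** Call before each step; an empty `tools` array leaves the model with text only. */
  beforeStep(): BeforeStepResult {
    if (!this.forceTextMode) return {};
    return {
      reminder: 'No progress detected. Stop calling tools and respond in text.',
      tools: [],
    };
  }
}
```

In this sketch the counter resets on any step that shows progress, so only an unbroken run of idle steps triggers text-only mode.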

Two new loop_control options are exposed:

| Config key | Default | Description |
| --- | --- | --- |
| progress_stall_threshold | 8 | Consecutive idle steps before forcing text-only mode. |
| progress_min_info_gain_length | 60 | Minimum successful output length (chars) to count as information gain. |

Example config.toml:

[loop_control]
max_steps_per_turn = 100
progress_stall_threshold = 12
progress_min_info_gain_length = 120
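For illustration, the parsed [loop_control] table could map onto the camelCase fields named in LoopControlSchema roughly like this. The `resolveLoopControl` and `positiveIntOr` helpers are hypothetical. Falling back to the default on a missing or invalid value is also an assumption; the real schema may reject bad input instead.

```typescript
// Field names follow the PR's LoopControlSchema additions; the rest is a sketch.
interface LoopControl {
  progressStallThreshold: number;
  progressMinInfoGainLength: number;
}

// Defaults as documented in the table above.
const DEFAULTS: LoopControl = {
  progressStallThreshold: 8,
  progressMinInfoGainLength: 60,
};

// Hypothetical helper: accept only positive integers, otherwise use the fallback.
function positiveIntOr(value: unknown, fallback: number): number {
  return typeof value === 'number' && Number.isInteger(value) && value > 0
    ? value
    : fallback;
}

// Hypothetical helper: map snake_case TOML keys onto the camelCase config fields.
function resolveLoopControl(raw: Record<string, unknown>): LoopControl {
  return {
    progressStallThreshold: positiveIntOr(
      raw['progress_stall_threshold'],
      DEFAULTS.progressStallThreshold,
    ),
    progressMinInfoGainLength: positiveIntOr(
      raw['progress_min_info_gain_length'],
      DEFAULTS.progressMinInfoGainLength,
    ),
  };
}
```

With the example config above, this would yield a threshold of 12 and a minimum length of 120. An empty table would yield the defaults 8 and 60.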

Files changed

  • packages/agent-core/src/agent/turn/progress-detector.ts (new)
    • ProgressDetector class with snapshot-taking, output hashing, and stall counting.
  • packages/agent-core/src/agent/turn/index.ts
    • Instantiates ProgressDetector per turn with config-driven threshold and min length.
    • afterStep records progress and triggers forceTextMode after threshold.
    • beforeStep injects reminder and returns { tools: [] } when forceTextMode is set.
  • packages/agent-core/src/loop/types.ts / turn-step.ts
    • Adds BeforeStepResult.tools so the harness can override the tool set.
  • packages/agent-core/src/config/schema.ts
    • Adds progressStallThreshold and progressMinInfoGainLength to LoopControlSchema.
  • packages/agent-core/test/agent/turn/progress-detector.test.ts (new)
    • Unit tests for state-change detection, information-gain hashing, stall counting, and Edit/Write progress.
  • packages/agent-core/test/config/configs.test.ts
    • Updates COMPLETE_TOML fixture and round-trip assertions for the new loop_control keys.

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset. (Added .changeset/progress-detector-stalled-turns.md for @moonshot-ai/kimi-code.)
  • Ran gen-docs skill, or this PR needs no doc update. (No user-facing docs change; behavior is internal to the agent loop.)

Co-authored-by: Kimi <kimi@moonshot.cn>

changeset-bot Bot commented Jul 2, 2026

🦋 Changeset detected

Latest commit: 40579ac

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @moonshot-ai/kimi-code | Minor |


flame4 force-pushed the feat/progress-detector branch from e2f029d to 6e30841 on July 2, 2026 10:53

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b95a98d184


private async runGitStatus(cwd: string): Promise<string> {
  try {
    const proc = await this.agent.kaos.exec('git', '-C', cwd, 'status', '--porcelain');


P2 Badge Detect content changes in already-dirty files

When a turn keeps editing a file that is already modified or untracked, git status --porcelain stays identical (for example, M src/foo.ts) even though the file contents changed; Edit/Write successes also often return short outputs below the 60-character information-gain threshold. In that common single-file refactor case, eight real edits can be classified as stalled and the next step is forced into text-only mode, preventing the agent from making further needed changes. Please include a content-sensitive signal (e.g. diff/hash/mtime for dirty paths) or otherwise count successful write/edit tool results as progress.
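One way to picture the content-sensitive signal suggested here is to fingerprint the contents of every dirty path rather than comparing the porcelain text alone. The sketch below is an illustration of that alternative, not what this PR implements. The `dirtyPaths` and `dirtyFingerprint` helpers, the injected `ReadFile` callback, and the FNV-1a hash are assumptions. It also ignores the quoting git applies to paths with special characters.

```typescript
// Dependency-free FNV-1a (32-bit); a stand-in for a real content hash or mtime check.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Returns a file's content, or undefined if it cannot be read.
// Injected so the sketch itself makes no filesystem calls.
type ReadFile = (path: string) => string | undefined;

// Extracts paths from `git status --porcelain` (v1) output: two status
// characters, a space, then the path. Renames appear as "old -> new".
function dirtyPaths(porcelain: string): string[] {
  return porcelain
    .split('\n')
    .filter((line) => line.length > 3)
    .map((line) => {
      const path = line.slice(3);
      const arrow = path.indexOf(' -> ');
      return arrow === -1 ? path : path.slice(arrow + 4);
    });
}

// Combines each dirty path with a hash of its content, so the fingerprint
// changes when a file is edited even if its status line stays the same.
function dirtyFingerprint(porcelain: string, read: ReadFile): string {
  return dirtyPaths(porcelain)
    .sort()
    .map((path) => `${path}:${fnv1a(read(path) ?? '')}`)
    .join('\n');
}
```

Comparing fingerprints between steps would register a second edit to src/foo.ts as a state change even though its status line is still M src/foo.ts.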


flame4 force-pushed the feat/progress-detector branch from 6e30841 to 75f04a8 on July 2, 2026 10:55
flame4 and others added 2 commits July 2, 2026 19:03
Add a ProgressDetector that watches external state (git status, background
tasks) and information gain (new non-trivial tool outputs) to detect when a
turn is spinning without progress. After 8 consecutive idle steps, the harness
injects a system reminder and forces the next model step to run with no tools
available, requiring a text-only response.

Successful Edit and Write tool results are now counted as progress even when
their output is short, so repeated edits to the same already-dirty file are
not misclassified as stalled.

The stall threshold and minimum information-gain length are configurable via
loop_control.progress_stall_threshold and
loop_control.progress_min_info_gain_length.

This prevents the no-op tool loops seen with commands like Bash(:), Read
/dev/null, and echo placeholders, where the model keeps emitting tool calls
instead of responding to the user.

- packages/agent-core/src/agent/turn/progress-detector.ts (new)
- packages/agent-core/src/agent/turn/index.ts
- packages/agent-core/src/loop/turn-step.ts
- packages/agent-core/src/loop/types.ts
- packages/agent-core/src/config/schema.ts
- packages/agent-core/test/agent/turn/progress-detector.test.ts (new)
- packages/agent-core/test/config/configs.test.ts

Co-authored-by: Kimi <kimi@moonshot.cn>
flame4 force-pushed the feat/progress-detector branch from 75f04a8 to 40579ac on July 2, 2026 11:03
flame4 commented Jul 2, 2026

@chatgpt-codex-connector Thanks for the review. Addressed the P2 feedback: ProgressDetector now treats successful Edit and Write results as progress regardless of output length, so repeated edits to an already-dirty file are not misclassified as stalled. Added a test covering this case as well.

@chatgpt-codex-connector


To use Codex here, create an environment for this repo.



Development

Successfully merging this pull request may close these issues.

feat: detect stalled turns and force text-only recovery

1 participant