fix: prevent agent loops when ReadMediaFile is called on text files (#1218)#1263
fix: prevent agent loops when ReadMediaFile is called on text files (#1218)#1263Goodeve0 wants to merge 2 commits into
Conversation
…oonshotAI#1218) - read-media.md: move file-type constraint to first line with CRITICAL prefix - read-media.ts: strengthen error messages and parameter descriptions - tool-dedup.ts: add wrong-tool error detection that escalates faster (r2 reminder at streak 2, force-stop at streak 4 for category errors) - Add unit tests for wrong-tool error classification and escalation - Update existing test assertions for new error message wording
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5c89c6d8eb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| // Wrong-tool errors (category errors that retries will never fix) | ||
| // escalate faster than generic transient failures. | ||
| if (isWrongToolError(result.output)) { |
There was a problem hiding this comment.
Constrain wrong-tool detection to tool errors
When the repeated tool result is successful output that happens to contain one of these phrases (for example Reading a file that says "is a text file" or rerunning a command whose stdout includes "must use the Read tool"), this branch treats it as a wrong-tool error because it only inspects result.output. That changes normal repeat handling from the generic thresholds to an R2 reminder on the 2nd repeat and stopTurn on the 4th, even though no tool error occurred; gate this path on result.isError === true (and ideally the relevant tool/error class) before fast-stopping the turn.
Useful? React with 👍 / 👎.
| @@ -1,13 +1,14 @@ | |||
| **CRITICAL: This tool ONLY reads image files (PNG, JPG, GIF, WebP, SVG, etc.) and video files (MP4, MOV, AVI, etc.). It CANNOT read text files — never pass a .md, .ts, .json, .txt, or any other text file path to this tool. For text files, use the Read tool. To list directories, use `ls` via Bash for a known directory, or Glob for pattern search.** | |||
There was a problem hiding this comment.
Remove unsupported SVG from media examples
For SVG inputs, this new leading description now tells the model that ReadMediaFile supports SVG, but detectFileType maps .svg through TEXT_MIME_BY_SUFFIX as text and ReadMediaFile.execution rejects fileType.kind === 'text'. In the common case where a user asks to inspect an SVG asset, this prompt encourages the exact wrong tool and will produce the text-file error instead of reading the file with Read; either remove SVG from the supported examples or add real SVG support.
Useful? React with 👍 / 👎.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
Update (second commit)Addressed both P2 Codex review items in a follow-up commit:
New tests added:
|
Related Issue
Resolve #1218
Problem
When an agent uses
ReadMediaFileon a text file (e.g. a UTF-16 LE encodedAGENTS.md), the tool returns an error like "is a text file". On retry the model often picksReadMediaFileagain, leading to an unproductive loop until the repeat-dedup escalation finally kicks in.What changed
Multi-layered fix:
read-media.md: Moved the file-type constraint to the first line with a clear statement that this tool only supports image and video files — to reduce the probability the model picks it for text files in the first place.
read-media.ts: Strengthened the error messages and parameter descriptions to be more explicit about supported vs unsupported file types.
tool-dedup.ts: Added wrong-tool error detection — a new fast-escalation path for categorical tool errors where the model fundamentally chose the wrong tool (e.g. ReadMediaFile on text). These errors escalate in 2 reminder rounds instead of waiting for generic repeat-dedup thresholds (r2 reminder at streak 2, force-stop at streak 4).
Tests: Added
tool-dedup.test.tscovering wrong-tool error classification and escalation timing; updatedread-media.test.tsfor new error message wording;tool-call.e2e.test.tsgets a new assertion for the e2e path.Benchmark note
A/B benchmark was run (5 runs per group, UTF-16 LE AGENTS.md + "请阅读 AGENTS.md 并总结" prompt). Both groups showed 0% ReadMediaFile hit rate — the bug is intermittent in this setup and could not be reproduced with a single-shot prompt. The fix remains semantically correct and the multi-layer mitigation (description + error messages + fast escalation) addresses the root cause.