fix(run): raise review findings JSON cap 5000→20000 chars #167
Open
johnkattenhorn wants to merge 1 commit into
The review loop parser truncates the extracted REVIEW_FINDINGS JSON to 5000 chars before handing it to jq for validation. In practice, any review with 5+ detailed findings (each with a non-trivial `suggestion` field) exceeds 5000 chars, gets cut mid-object, and then fails jq's syntax check. The only user-visible signal is a WARN line in ralph.log:

    Review findings JSON is malformed — skipping

Findings are silently dropped, never reach the next implementation loop, and Ralph never acts on them.

Observed in a real project with `REVIEW_MODE=ultimate` and a project-specific REVIEW_PROMPT.md: 4 of 6 reviews lost their output this way (loops producing 5073, 5546, 5931, 5500 chars of JSON, all just over the cap).

20000 chars (≈8–12 detailed findings) gives comfortable headroom for richer review prompts while keeping the cap in place as a safety rail against a runaway Claude response.

No behavior change for smaller reviews. Also applies to the fallback extraction path (when the first sed does not find the markers in the raw file and we extract via `jq -r .result`).
johnkattenhorn added a commit to johnkattenhorn/bmalph that referenced this pull request Apr 17, 2026
Patch release for the 5000-char REVIEW_FINDINGS truncation fix. Upstream PR: LarsCowe#167
Problem
In `run_review_loop` (ralph/ralph_loop.sh:1957, 1963), the extracted JSON between the `---REVIEW_FINDINGS---` and `---END_REVIEW_FINDINGS---` markers is truncated to 5000 chars before being passed to `jq` for validation.

    findings_json=$(sed -n '/---REVIEW_FINDINGS---/,/---END_REVIEW_FINDINGS---/{//!p;}' "$review_output_file" 2>/dev/null | tr -d '\n' | head -c 5000)

Any review that produces more than 5000 chars of JSON gets cut mid-object. That is trivial to hit with a project-specific `REVIEW_PROMPT.md` that asks for `file`, `line`, `category`, `issue`, and a substantive `suggestion` per detail entry. `jq` then rejects the truncated JSON as malformed and the whole batch of findings is silently dropped; the only trace is a WARN line in ralph.log (`Review findings JSON is malformed — skipping`).

Ralph never sees the findings. The review loop burns full tokens but provides no signal to the next implementation loop.
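The truncation described above can be reproduced outside Ralph with synthetic data. The following is a minimal sketch, not the project's test: the `details` key, the file paths, and the 800-char `suggestion` padding are assumptions made up for illustration, and the `extract` helper simply wraps the pipeline quoted above with a configurable cap.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the mid-object truncation with synthetic findings.
# The JSON shape ("details", "src/fN.sh") is illustrative, not Ralph's real schema.
set -u

review_output_file=$(mktemp)

# An 800-char filler string standing in for a substantive suggestion.
suggestion=$(printf '%800s' '' | tr ' ' 'x')

{
  echo "reviewer chatter before the block"
  echo "---REVIEW_FINDINGS---"
  printf '{"issues_found":8,"details":['
  for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -gt 1 ]; then printf ','; fi
    printf '{"file":"src/f%s.sh","line":%s,"issue":"demo","suggestion":"%s"}' \
      "$i" "$i" "$suggestion"
  done
  printf ']}\n'
  echo "---END_REVIEW_FINDINGS---"
} > "$review_output_file"

# Same pipeline as in run_review_loop, with the cap passed in as $1.
extract() {
  sed -n '/---REVIEW_FINDINGS---/,/---END_REVIEW_FINDINGS---/{//!p;}' \
    "$review_output_file" 2>/dev/null | tr -d '\n' | head -c "$1"
}

full=$(extract 1000000)   # effectively uncapped
cut_5000=$(extract 5000)  # the cap before this PR

echo "full length:         ${#full}"
echo "truncated length:    ${#cut_5000}"
echo "full ends with:      ${full: -2}"
echo "truncated ends with: ${cut_5000: -2}"

# Optional: if jq is installed, confirm it rejects the truncated string.
if command -v jq >/dev/null 2>&1; then
  echo "$cut_5000" | jq empty 2>/dev/null || echo "jq rejects the truncated JSON"
fi

rm -f "$review_output_file"
```

The full payload ends with the closing `]}`, while the 5000-char cut ends in the middle of a `suggestion` string, which is why validation fails for the whole batch rather than for one finding.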
Repro
Observed in a real project running `REVIEW_MODE=ultimate` with a customised `REVIEW_PROMPT.md`. Six reviews ran, and four of them lost their findings, having produced 5073, 5546, 5931, and 5500 chars of JSON. Every review over 5000 chars failed. The lost findings included real HIGH-severity items the reviewer caught (missing cross-user isolation tests, pattern drift, architectural deviations) that Ralph would otherwise have addressed in the following loop.
Fix
Raise both `head -c 5000` occurrences to `head -c 20000`. That's comfortable headroom for ~8–12 detailed findings per review while preserving the cap as a safety rail against a runaway model response.

No behaviour change for smaller reviews.
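As a rough check of the new limit, the sketch below runs the same extraction pipeline with a 20000-char cap against synthetic payloads sized like the four failing reviews, plus one oversized payload to show the safety rail still applies. The variable `REVIEW_FINDINGS_MAX_CHARS` and the helpers `extract_findings` and `make_review_file` are hypothetical names for illustration; the PR itself only changes the two literal `head -c` values.

```shell
#!/usr/bin/env bash
# Sketch: the extraction pipeline with the raised cap.
# REVIEW_FINDINGS_MAX_CHARS is a hypothetical name, not a variable in ralph_loop.sh.
REVIEW_FINDINGS_MAX_CHARS=20000

# Extract the JSON between the markers in file $1, capped at the limit above.
extract_findings() {
  sed -n '/---REVIEW_FINDINGS---/,/---END_REVIEW_FINDINGS---/{//!p;}' "$1" 2>/dev/null \
    | tr -d '\n' | head -c "$REVIEW_FINDINGS_MAX_CHARS"
}

# Write a synthetic review file to $1 whose JSON payload is exactly $2 chars long.
# The payload is {"pad":"xxx..."}: 10 chars of structure plus filler.
make_review_file() {
  local path=$1 size=$2 pad
  pad=$(printf "%$((size - 10))s" '' | tr ' ' 'x')
  printf '%s\n{"pad":"%s"}\n%s\n' \
    '---REVIEW_FINDINGS---' "$pad" '---END_REVIEW_FINDINGS---' > "$path"
}

tmp=$(mktemp)
# The first four sizes are the failing reviews reported in this PR; 25000 is synthetic.
for size in 5073 5546 5931 5500 25000; do
  make_review_file "$tmp" "$size"
  findings_json=$(extract_findings "$tmp")
  case $findings_json in
    *'"}') status="intact" ;;
    *)     status="truncated" ;;
  esac
  echo "payload=$size extracted=${#findings_json} $status"
done
rm -f "$tmp"
```

Payloads at the four observed sizes pass through unchanged, and only the 25000-char payload is cut, at 20000 chars. Keeping the limit in one named variable would also avoid the two call sites drifting apart, though that is a design suggestion rather than part of this change.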
Test plan
`jq .issues_found` returns the correct count.
`tests/bash/` structure is mature but I didn't want to speculate.
Related
Discovered alongside #165 (prepare-script fix) while running a real project against a custom REVIEW_PROMPT.md. Both are independent.