fix(hooks): remove interim truncation caps — reducer is the size gate by aaronsb · Pull Request #99 · aaronsb/agent-ways

aaronsb · 2026-05-21T22:26:54Z

Summary

Removes the character-based truncation blocks from check-bash-pre.sh (PR #95) and check-prompt.sh (PR #96). They were misframed in their own commit messages as "belt and suspenders" — in practice they ran strictly upstream of the ADR-130 reducer (PR #98), starving it by chopping the input before it could score sentence salience.

The custom-agent discriminator in check-task-pre.sh (PR #94) is not a cap — it stops ways scan task from running at all for dispatches to custom agents. Untouched.

Why now, not after an observation window

The caps make every signal we'd want to observe worse:

- They pre-clamp inputs to a safe range, so a reducer-bounds bug would never surface in practice
- They pre-cut prose, so any match-quality regression after removal would be conflated with reducer behavior
- They add small shell-side overhead that masks reducer perf

Keeping them was actively delaying the observation we wanted to do.

What's still guaranteed

ADR-130's reduce_for_embed enforces a per-hook token budget (110 for prompt/task, 75 for command, 30 for file) and bounds output regardless of input size. The CJK-aware approximate tokenizer (added in the PR #98 review fixes) prevents the Japanese/Chinese under-counting that would have been a regression here.
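As a rough illustration of why CJK-aware counting matters for these budgets, here is a minimal sketch. Only the budget numbers come from this PR; the `approx_tokens` heuristic (one token per CJK character, roughly four characters per token otherwise) and the names `HOOK_BUDGETS` and `naive_tokens` are assumptions made for the example, not the tokenizer from PR #98.

```python
import re

# Per-hook token budgets as stated in this PR description.
HOOK_BUDGETS = {"prompt": 110, "task": 110, "command": 75, "file": 30}

# Hypothetical heuristic, NOT the real tokenizer: hiragana, katakana and
# CJK ideographs count as one token each.
_CJK = re.compile(r"[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff]")


def naive_tokens(text: str) -> int:
    """Whitespace-only count; undercounts text that has no spaces."""
    return len(text.split())


def approx_tokens(text: str) -> int:
    """CJK-aware estimate: 1 token per CJK char, ~4 chars per token otherwise."""
    cjk = len(_CJK.findall(text))
    rest = _CJK.sub(" ", text)
    # Ceiling division: a 5-letter word costs 2 tokens under this heuristic.
    other = sum(-(-len(word) // 4) for word in rest.split())
    return cjk + other


def fits_budget(text: str, hook: str) -> bool:
    """True when the estimated token count is within the hook's budget."""
    return approx_tokens(text) <= HOOK_BUDGETS[hook]
```

Under this heuristic a seven-character Japanese string such as `日本語のテスト` counts as seven tokens, where a whitespace-only count would report one and let an oversized query slip past the budget.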

Test plan

- bash -n clean on both scripts
- 6.4KB heredoc-bodied gh pr create through bash hook → 0 crashes
- 6KB task-notification through prompt hook → 0 crashes
- Hook context still emitted normally (no scan path regression)

PRs #95 (bash 256-char cap) and #96 (prompt 1024-char cap) were character-based pre-clamps that ran strictly upstream of the ADR-130 sentence-salience reducer. They weren't redundant; they were *starving* the reducer by chopping off the back of the input before the reducer could score it.

Concrete shape of the bug, on a 4000-char `gh pr create --body "$(cat <<EOF…)"`:

    hook receives 4000-char CMD
      ↓
    check-bash-pre.sh truncates to 256 chars     ← cap fires here
      ↓
    ways scan command --command "<256 chars>"
      ↓
    reduce_for_embed("<256 chars>", 75)          ← input fits, passthrough
      ↓
    batch_embed_score("<256 chars>")

The reducer's whole pitch (preserve prose distributed across the whole document) was defeated as long as the caps ran first. The caps also masked any reducer-bounds bug by clamping inputs to a safe range before the reducer could exercise its full path.

ADR-130's reducer provides the size guarantee these caps were meant to provide. With the caps gone, the reducer sees the full input, scores sentence salience across it, and hands the embedder a bounded query.

Custom-agent discriminator in check-task-pre.sh (PR #94) is *not* a cap: it's a "don't invoke ways scan task at all for dispatches to custom agents" gate. Unchanged.

Tested live: 6.4KB bash command, 6KB task-notification prompt both run cleanly through the production hook chain, 0 SIGABRTs.
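The starvation effect can be reproduced with a toy model. The sketch below is not the ADR-130 implementation: `toy_reduce_for_embed`, its salience score (distinct words of six or more letters), and the whitespace token count are all stand-ins invented for illustration. It only shows that a character cap applied first removes the tail sentence before any salience-based selection can keep it.

```python
import re


def approx_tokens(text: str) -> int:
    """Stand-in token count: whitespace-delimited words."""
    return len(text.split())


def salience(sentence: str) -> int:
    """Toy salience score: number of distinct words with 6+ letters."""
    return len({w.lower() for w in re.findall(r"\w+", sentence) if len(w) >= 6})


def toy_reduce_for_embed(text: str, budget: int) -> str:
    """Keep the highest-salience sentences that fit in `budget` tokens."""
    if approx_tokens(text) <= budget:
        return text  # input already fits: passthrough
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Visit sentences from most to least salient (stable for ties).
    order = sorted(range(len(sentences)), key=lambda i: -salience(sentences[i]))
    kept, used = [], 0
    for i in order:
        cost = approx_tokens(sentences[i])
        if used + cost <= budget:
            kept.append(i)
            used += cost
    # Re-emit the kept sentences in their original order.
    return " ".join(sentences[i] for i in sorted(kept))


# Hypothetical PR body: filler first, the informative sentence last.
body = " ".join(["Minor cleanup here."] * 40)
body += " Adds database migration rollback safeguards for production deploys."

# Old path: 256-char cap runs first, so the reducer only ever sees filler.
capped_query = toy_reduce_for_embed(body[:256], 75)

# New path: the reducer sees the whole body and can keep the tail sentence.
full_query = toy_reduce_for_embed(body, 75)
```

In this toy run `capped_query` never contains the word "rollback", while `full_query` does and still stays within the 75-token command budget.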

aaronsb merged commit 15f0952 into main May 21, 2026

aaronsb merged 1 commit into main from fix/remove-truncation-caps
