skills(review-gates): structural classification doesn't override Gate 1 by tend-agent · Pull Request #602 · max-sixty/tend

tend-agent · 2026-05-25T16:47:30Z

Problem

The structural-vs-stochastic classification in review-gates.md currently includes a 1-occurrence bypass: "One clear occurrence is sufficient evidence for a targeted fix." This lets non-Critical structural findings clear Gate 1 with a single occurrence, even when the evidence level says High (needs 2–3) or Medium (needs 5+).

In practice this routes findings into PRs when they should sit in the evidence gist until more occurrences accumulate. Feedback on #593:

no! we need to change our criteria; this is structural and it's not critical
— @max-sixty

PR #593 was structural + High (1 occurrence) and passed gates only because the structural carve-out overrode the High threshold.

Fix

Tighten the structural bullet so the classification only affects recurrence confidence, not the magnitude bar. Non-Critical structural failures fall back to their evidence-level thresholds (High = 2–3, Medium = 5+). Only Critical structural failures act on a single occurrence — which matches what Gate 1 already says for Critical.

Stochastic guidance is unchanged: still needs 5+ occurrences.
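The revised rule can be sketched as a small decision function. This is only an illustration of the prose above, not code from the tend repository: the names `Level`, `required_occurrences`, and `clears_gate_1` are hypothetical, the lower bound of "2–3" is taken as 2, and the sketch assumes a stochastic finding needs at least 5 occurrences regardless of evidence level.

```python
from enum import Enum


class Level(Enum):
    """Evidence levels named in the PR text (hypothetical type)."""
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"


# Minimum occurrences per evidence level, read off the PR text:
# Critical acts on 1, High needs 2-3 (lower bound assumed), Medium needs 5+.
LEVEL_MIN = {Level.CRITICAL: 1, Level.HIGH: 2, Level.MEDIUM: 5}

# Stochastic findings still need 5+ occurrences (unchanged by this PR).
STOCHASTIC_MIN = 5


def required_occurrences(level: Level, structural: bool) -> int:
    """Return how many occurrences a finding needs to clear Gate 1."""
    base = LEVEL_MIN[level]
    if structural:
        # Structural only raises recurrence confidence; the evidence-level
        # threshold still applies, so there is no 1-occurrence bypass.
        return base
    # Assumption: stochastic findings need at least 5 at any level.
    return max(base, STOCHASTIC_MIN)


def clears_gate_1(level: Level, structural: bool, occurrences: int) -> bool:
    """True if the finding may become a PR; False means record it in the gist."""
    return occurrences >= required_occurrences(level, structural)
```

Under these assumptions, `clears_gate_1(Level.HIGH, True, 1)` is `False`, while `clears_gate_1(Level.CRITICAL, True, 1)` is `True`.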

Effect on the rejected PR

Applying the revised criteria to #593: evidence level High, structural, 1 occurrence → falls short of High's 2–3 threshold → record in evidence gist, no PR.

The structural carve-out lets a single occurrence pass any evidence level for a targeted fix, even when Gate 1 would require 2-3 or 5+. Tighten so structural raises recurrence confidence but doesn't lower the magnitude bar; only Critical structural failures act on a single occurrence.

## Why

Cut the 0.1.2 release so consumer repos (and tend's own nightly regen) pick up the new `claude-interactive` harness and per-workflow `harness`/`model` overrides.

## What's new since 0.1.1

**New `claude-interactive` harness** (`max-sixty/tend/interactive@0.1.2`): opt-in alternative to the released `claude` harness. PTY-supervised interactive `claude` via `script(1)`, end-of-turn detected through Stop/StopFailure hooks. Built as the trial path ahead of Anthropic's June 15 billing split between Agent-SDK metering and the flat Claude Code subscription. Smoke tested end-to-end on tend itself. PRs: #609, #611, #613, #614, #615, #616.

**Per-workflow `harness` / `model` override**: adopters can flip a single workflow to a different harness or model without changing `.config/tend.yaml` defaults. #612.

**Skill refinements**: nightly upstream-bot rebases (#605), running-in-ci PR bar (#604) and recheck (#573), env-filter loophole fix (#599), authorAssociation warning (#600), review-gates Gate 1 (#602).

**Bug fix**: mention queue-delay now uses `comment.updated_at` so edit events report accurately (#595).

## Compatibility

Released `claude` harness path is byte-identical; `claude-interactive` is strictly additive and opt-in. Consumer repos that don't touch `harness:` see no change beyond the new skill text and the mention-edit fix.

tend-agent mentioned this pull request May 25, 2026

skills(running-in-ci): forbid ScheduleWakeup and fire-and-forget background bash in CI #593

Closed

max-sixty merged commit 33a9cbb into main May 25, 2026
4 checks passed

max-sixty deleted the fix/gate-criteria-structural-26410769945 branch May 25, 2026 16:49

tend-agent mentioned this pull request May 26, 2026

review-runs-tracking: 2026-05 #365

Closed

max-sixty mentioned this pull request May 27, 2026

chore: release 0.1.2 #618

Merged

tend-agent mentioned this pull request Jun 8, 2026

skills(triage): drop "run full suite" — defer to PR CI per "ship before end_turn" #671

Merged
