skills(running-in-ci): raise the bar for repo-overlay PRs #604

Merged
max-sixty merged 1 commit into main from fix/issue-603-26411035567
May 26, 2026

@tend-agent
Collaborator

Problem

The bundled running-in-ci skill's Learning from Feedback → When to propose section currently lets a single observed incident qualify for a repo-overlay PR (the third "Signals" bullet reads "The same correction has surfaced before, or would plausibly surface again"; the "or would plausibly surface again" clause covers any one-off failure). This produces low-signal skill churn: stochastic failures get codified as repo-specific rules that every future session then has to read past.

Reported in #603. Triggering thread: PRQL/prql#5945 comment, where a single ~5h40m orphan-loop incident triggered an overlay PR that was closed at the maintainer's request, with the ask that overlay PRs be held to the same bar as tend's own bundled-skill changes.

Solution

Tighten When to propose so opening a repo-overlay PR requires generalizability and at least one of:

  • Recurrence: same correction observed at least twice, or evidence the failure mode is recurring.
  • Invisible failure mode: bad behavior would not surface as a future CI failure (cancelled/timed-out runs whose work succeeded), so without codification it would not be caught next time.
  • Maintainer-explicit codification request: a maintainer has explicitly asked for the rule after a single occurrence.

The rationale: bundled-skill changes pass through human review on the tend repo, which acts as an implicit recurrence/impact filter. Per-repo overlays don't get that scrutiny, so the bar belongs in the bundled skill.
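
As a rough illustration only (the skill itself is prose, not code), the tightened bar can be read as a predicate: generalizability is required, plus at least one of the three signals above. The `Correction` type, its field names, and `should_propose_overlay_pr` are hypothetical names introduced for this sketch; they do not exist in tend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Hypothetical summary of one maintainer correction (illustrative fields only)."""

    generalizable: bool           # applies beyond the single incident
    times_observed: int           # how often the same correction has surfaced
    evidence_of_recurrence: bool  # other evidence that the failure mode recurs
    invisible_in_ci: bool         # would not surface as a future CI failure
    maintainer_requested: bool    # maintainer explicitly asked for the rule


def should_propose_overlay_pr(c: Correction) -> bool:
    """Return True only if the correction clears the bar described above."""
    recurrence = c.times_observed >= 2 or c.evidence_of_recurrence
    # Generalizability is mandatory; any one of the three signals suffices.
    return c.generalizable and (
        recurrence or c.invisible_in_ci or c.maintainer_requested
    )
```

Under this reading, a generalizable correction seen once, with no invisible failure mode and no explicit maintainer request, does not qualify for an overlay PR; the same correction seen twice does.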

Testing

Skill text change only; no executable code touched. The rule will first be exercised on the next maintainer correction in any consumer repo.


Closes #603 — automated triage

Closes #603

Co-Authored-By: Claude <noreply@anthropic.com>
@max-sixty max-sixty merged commit e60f5ae into main May 26, 2026
4 checks passed
@max-sixty max-sixty deleted the fix/issue-603-26411035567 branch May 26, 2026 01:15
@max-sixty max-sixty mentioned this pull request May 27, 2026
max-sixty added a commit that referenced this pull request May 27, 2026
## Why

Cut the 0.1.2 release so consumer repos (and tend's own nightly regen) pick up the new `claude-interactive` harness and per-workflow `harness`/`model` overrides.

## What's new since 0.1.1

**New `claude-interactive` harness** (`max-sixty/tend/interactive@0.1.2`): opt-in alternative to the released `claude` harness. PTY-supervised interactive `claude` via `script(1)`, end-of-turn detected through Stop/StopFailure hooks. Built as the trial path ahead of Anthropic's June 15 billing split between Agent-SDK metering and the flat Claude Code subscription. Smoke tested end-to-end on tend itself. PRs: #609, #611, #613, #614, #615, #616.

**Per-workflow `harness` / `model` override**: adopters can flip a single workflow to a different harness or model without changing `.config/tend.yaml` defaults. #612.

**Skill refinements**: nightly upstream-bot rebases (#605), running-in-ci PR bar (#604) and recheck (#573), env-filter loophole fix (#599), authorAssociation warning (#600), review-gates Gate 1 (#602).

**Bug fix**: mention queue-delay now uses `comment.updated_at` so edit events report accurately (#595).

## Compatibility

Released `claude` harness path is byte-identical; `claude-interactive` is strictly additive and opt-in. Consumer repos that don't touch `harness:` see no change beyond the new skill text and the mention-edit fix.

Development

Successfully merging this pull request may close these issues.

Apply same threshold to repo-local skill overlay PRs as bundled-skill changes