skills(running-tend): baseline integration-test run polling to avoid stale-run false fails #720

Open
tend-agent wants to merge 1 commit into main from skills/integration-test-stale-run-27901476441

@tend-agent

Collaborator

What

The weekly integration-test recipe's §3 (triage) and §4 (review) verification steps poll for the freshly triggered workflow run with gh run list --workflow tend-{triage,review} --limit 1, but never baseline against the run that already existed. On any week where a prior run is present (every week after the first), the register-wait loop matches last week's completed run instead of this week's, treats it as "registered and successful" immediately, and then fails the downstream assertion: a false failure, since the workflow under test is healthy.

§1 (the integration-secrets reseed) already does this correctly: it captures PREV_ID before dispatch and loops until a run with a different id appears. §3 and §4 just never got the same treatment.

Symptom this caught

A weekly run reported FAIL at §3 — no bot comment on issue #N. Reconstructed from run ids and timestamps:

  • §3 grabbed a tend-triage run from the prior week (a much smaller run id than the just-dispatched secrets run), already completed/success, so it skipped the wait entirely.
  • The comment assertion then ran against the brand-new issue before any triage had happened → false fail.
  • The false fail short-circuited to §6 reset, which closed the test issue at the same second the real tend-triage run was only just registering (in_progress). So even the real run ended up acting on an already-closed issue.

tend-triage itself was healthy the whole time — the real run triggered correctly. The bug is purely in the test harness's run-matching.

Fix

Apply §1's baseline-and-compare pattern to §3 and §4: capture the latest existing run id before the trigger, and require the polled run id to differ from it before treating it as this run.

PREV_RUN=$(gh run list ... --workflow tend-triage --limit 1 --json databaseId --jq '.[0].databaseId // empty')
# ... open the issue ...
[ -n "$RUN_ID" ] && [ "$RUN_ID" != "$PREV_RUN" ] && break
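For reviewers who want the excerpt above as one piece, here is a minimal sketch of the baseline-and-compare wait loop. It is not the exact text in integration-test.md: the helper names latest_run_id and wait_for_new_run, the attempt count, and the sleep interval are all illustrative assumptions, and the gh call assumes an authenticated gh running inside the target repository.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, attempt count, and delay are assumptions,
# not the contents of integration-test.md.

# latest_run_id WORKFLOW
# Prints the newest run id for WORKFLOW, or nothing if no run exists yet.
latest_run_id() {
  gh run list --workflow "$1" --limit 1 \
    --json databaseId --jq '.[0].databaseId // empty'
}

# wait_for_new_run WORKFLOW PREV_RUN [ATTEMPTS] [DELAY_SECONDS]
# Polls until the newest run id is non-empty and differs from PREV_RUN,
# then prints it. Returns 1 if no new run shows up within ATTEMPTS polls.
wait_for_new_run() {
  local workflow="$1" prev_run="$2" attempts="${3:-30}" delay="${4:-5}"
  local run_id i=0
  while [ "$i" -lt "$attempts" ]; do
    run_id=$(latest_run_id "$workflow")
    # A stale run has the same id as the baseline, so it is skipped here.
    if [ -n "$run_id" ] && [ "$run_id" != "$prev_run" ]; then
      printf '%s\n' "$run_id"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (needs an authenticated gh; shown as comments, not executed):
#   PREV_RUN=$(latest_run_id tend-triage)   # baseline BEFORE the trigger
#   # ... open the issue ...
#   RUN_ID=$(wait_for_new_run tend-triage "$PREV_RUN") \
#     || echo "no new tend-triage run registered" >&2
```

On the very first week PREV_RUN is empty, so any non-empty run id counts as new. That matches the `// empty` fallback in the excerpt.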

Doc-only change to .claude/skills/running-tend/references/integration-test.md; no workflow or generator code touched.

…stale-run false fails

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>