skills(running-tend): baseline integration-test run polling to avoid stale-run false fails #720
Open
tend-agent wants to merge 1 commit into
…stale-run false fails Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What
The weekly integration-test recipe's §3 (triage) and §4 (review) verification steps poll for the freshly triggered workflow run with `gh run list --workflow tend-{triage,review} --limit 1` but never baseline against the run that already existed. So on any week where a prior run is present (i.e. every week after the first), the register-wait loop matches last week's completed run instead of this week's, instantly treats it as "registered and successful", and then fails the downstream assertion: a false negative.

§1 (the `integration-secrets` reseed) already does this correctly: it captures `PREV_ID` before dispatch and loops until a run with a different id appears. §3 and §4 just never got the same treatment.

Symptom this caught
A weekly run reported `FAIL at §3 — no bot comment on issue #N`. Reconstructed from run ids and timestamps:

- The register-wait loop matched a `tend-triage` run from the prior week (a much smaller run id than the just-dispatched secrets run), already `completed`/`success`, so it skipped the wait entirely.
- This week's `tend-triage` run was only just registering (`in_progress`) at that point. So even the real run ended up acting on an already-closed issue.
- `tend-triage` itself was healthy the whole time: the real run triggered correctly. The bug is purely in the test harness's run-matching.

Fix
Apply §1's baseline-and-compare pattern to §3 and §4: capture the latest existing run id before the trigger, and require the polled run id to differ from it before treating it as this run.
Doc-only change to `.claude/skills/running-tend/references/integration-test.md`; no workflow or generator code touched.
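
For reviewers who want to picture the baseline-and-compare pattern described under Fix, here is a minimal sketch. It is not the text of the recipe: the helper names `latest_run_id` and `wait_for_new_run`, the retry count, and the sleep interval are all hypothetical, and the only assumption about the `gh` CLI is that `gh run list` accepts `--json databaseId` with a `--jq` filter.

```shell
# Sketch only: helper names, retry count, and delay are illustrative assumptions,
# not the contents of integration-test.md.

# Print the id of the newest run of a workflow, or nothing if no run exists yet.
latest_run_id() {
  gh run list --workflow "$1" --limit 1 --json databaseId --jq '.[0].databaseId // empty'
}

# Poll until the newest run id differs from the baseline captured before the trigger.
# Usage: wait_for_new_run <workflow> <baseline-id> [attempts] [delay-seconds]
# Prints the new run id on success; returns 1 if only the stale run is ever seen.
wait_for_new_run() (
  workflow=$1 baseline=$2 attempts=${3:-30} delay=${4:-10}
  while [ "$attempts" -gt 0 ]; do
    id=$(latest_run_id "$workflow")
    if [ -n "$id" ] && [ "$id" != "$baseline" ]; then
      printf '%s\n' "$id"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
)

# Intended order of operations (commented out so the sketch has no side effects):
#   PREV_ID=$(latest_run_id tend-triage)        # baseline BEFORE the trigger
#   ...trigger the workflow...
#   RUN_ID=$(wait_for_new_run tend-triage "$PREV_ID") || echo "no fresh run registered"
```

The key design point is the ordering: the baseline must be read before the trigger, so that a completed run from a previous week can never satisfy the wait. An empty baseline (the first week, with no prior run) still works, because any non-empty id then differs from it.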