fix(ci): comprehensive overhaul of release-drafter setup #15623

Open
jamesfredley wants to merge 1 commit into 7.0.x from fix/release-drafter-overhaul

@jamesfredley
Contributor

Summary

Comprehensive overhaul of the Release - Drafter setup that fixes a long-standing set of compounding issues across 7.0.x, 7.1.x, 7.2.x, and 8.0.x. Lands first on 7.0.x and is intended to be merged forward into the higher branches in the usual cascade.

The drafter has been quietly broken for months. Symptoms users have hit:

  • The 8.0.x draft was never created (no draft for any branch currently exists).
  • Recent runs sat in the queue for 1,400-2,000+ minutes (roughly a day or more) before being cancelled.
  • Release notes occasionally bumped from stale baselines (e.g. v7.0.10 instead of v7.0.11) because the latest releases were excluded from the "last release" lookup.
  • continue-on-error: true made all of the above invisible: every run reported "success" even when it produced nothing.

This PR addresses all nine root causes in a single change so the next merge cascade gives every release branch a working drafter at once.

Root causes and fixes

1. Concurrency lock with release.yml (the day-long queue delays)

release-notes.yml and release.yml both used release-pipeline-${branch}. release.yml has manual approval gates (environment: release, environment: docs, environment: sdkman) which routinely keep a release run in waiting state for days until a maintainer approves the next stage. Every push to a release branch during that window queued behind the waiting release run.

Evidence from the workflow history:

| Run | Branch | Duration | Result |
| --- | --- | --- | --- |
| 25214035979 | 7.0.x | 1,400 min | cancelled |
| 25197284620 | 7.1.x-stop-4x-exceptionlogging | 2,091 min | cancelled |
| 25167124818 | 7.0.x | 1,358 min | cancelled |

Fix: switch to release-drafter-${branch} with cancel-in-progress: true. The drafter and release.yml never touch the same release object - the drafter targets the next-version draft (e.g. v7.0.12), release.yml the currently-published tag (e.g. v7.0.11) - so splitting the groups is safe.
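
For reference, the resulting concurrency block in .github/workflows/release-notes.yml looks roughly like the fragment below. The group expression is the one used in this PR; the comments are explanatory only.

```yaml
# Workflow-level concurrency for the drafter only. release.yml keeps its own
# release-pipeline-* group, so its approval gates no longer block this workflow.
concurrency:
  # Base branch for pull_request events, pushed branch name otherwise.
  group: release-drafter-${{ github.event.pull_request.base.ref || github.ref_name }}
  # A newer run in the same group cancels the one in progress, so the latest push wins.
  cancel-in-progress: true
```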

2. Prereleases excluded from "last release" detection (the missing 8.0.x draft)

This is the single biggest fix and the reason no draft exists for 8.0.x.

Every Apache Grails release - v7.0.11, v7.1.1, v8.0.0-M1, ... - is published on GitHub with prerelease=true during the ASF vote process. release-drafter's default include-pre-releases=false filters those out when finding the "last release", which means:

  • 7.0.x bumped from v7.0.10 (last non-prerelease) instead of v7.0.11.

  • 7.1.x bumped from v7.1.0 instead of v7.1.1.

  • 8.0.x had no last release at all (only v8.0.0-M1 exists, excluded as a prerelease) - so the action found 265 releases, matched none of them, and fell back to walking the branch's entire commit history, which exhausted the GitHub API rate limit (visible verbatim in the workflow logs):

    Found 265 releases
    No draft release found
    No last release found
    Fetching parent commits of 8.0.x...
    ##[error]Request failed due to following response errors:
     - API rate limit already exceeded for site ID installation.
    

Simulation of the new filter pipeline against today's release set:

| Branch | Current lastRelease | After fix |
| --- | --- | --- |
| 7.0.x | v7.0.10 (wrong) | v7.0.11 |
| 7.1.x | v7.1.0 (wrong) | v7.1.1 |
| 7.2.x | NONE (rate limit) | NONE (bounded by initial-commits-since) |
| 8.0.x | NONE (rate limit) | v8.0.0-M1 |

Fix: include-pre-releases: true in .github/release-drafter.yml.

3. Unbounded git history walk leading to rate-limit exhaustion

When no last release is found, release-drafter walks parent commit history through GraphQL, paging until the entire history is consumed.

Fix: initial-commits-since: '2026-04-29T00:00:00Z' (just before the most recent releases). This is consulted only when no last release matches the filters - so it does not affect 7.0.x/7.1.x/8.0.x (which now find their last release thanks to fix 2 above), and it bounds 7.2.x's walk to days instead of years.
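
Taken together, fixes 2 and 3 amount to two keys in .github/release-drafter.yml. The fragment below is a sketch of just those keys, with the values given in this description; every other key in the file is omitted.

```yaml
# .github/release-drafter.yml (fragment, other keys omitted)

# Consider releases flagged prerelease=true when looking for the "last release".
# Every Grails release carries that flag during the ASF vote.
include-pre-releases: true

# Date floor for the commit walk. Per this PR it is consulted only when no
# last release matches the filters (7.2.x today).
initial-commits-since: '2026-04-29T00:00:00Z'
```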

4. release-drafter v7.2.0 bug

initial-commits-since was silently ignored when set only in release-drafter.yml (not also as a workflow input) - exactly the path we use. Fixed upstream in release-drafter#1593, shipped in v7.2.1.

Fix: pin to release-drafter@563bf132657a13ded0b01fcb723c5a58cdd824e2 (v7.2.1).
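
As a sketch, the pinned step would look something like the fragment below. The step name is illustrative, and inputs and token configuration are left out because they are not described here.

```yaml
- name: Run release-drafter   # illustrative step name
  id: drafter
  # Full commit SHA for v7.2.1. The trailing comment records the version,
  # since the SHA alone is not human-readable.
  uses: release-drafter/release-drafter@563bf132657a13ded0b01fcb723c5a58cdd824e2 # v7.2.1
```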

5. Floating action tag on 7.0.x

7.0.x used release-drafter@v7 while 8.0.x was already pinned to a SHA. This violates the ASF security policy enforced in #15523.

Fix: SHA pin on every branch via the merge cascade.

6. continue-on-error swallowed all failures silently

A transient or permanent drafter failure looked identical to a successful run; that is precisely why this broken state went unnoticed for months.

Fix: keep continue-on-error (so transient API blips do not turn every PR check red) but add an explicit verification step that:

  • Logs a structured workflow summary with the draft id, tag, name, and URL on success.
  • Emits a ::warning:: annotation and a workflow-summary warning when the action produced no draft id (the failure mode that made this invisible).
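
A minimal sketch of such a verification step follows. It assumes the drafter step has id: drafter and that the action exposes id, tag_name, name, and html_url outputs; the step name, environment variable names, and message wording are hypothetical and may differ from the committed workflow.

```yaml
# Sketch only: assumes the preceding release-drafter step has `id: drafter`
# and exposes `id`, `tag_name`, `name`, and `html_url` outputs.
- name: Verify that a draft release was produced
  if: always()
  env:
    DRAFT_ID: ${{ steps.drafter.outputs.id }}
    DRAFT_TAG: ${{ steps.drafter.outputs.tag_name }}
    DRAFT_NAME: ${{ steps.drafter.outputs.name }}
    DRAFT_URL: ${{ steps.drafter.outputs.html_url }}
  run: |
    if [ -z "$DRAFT_ID" ]; then
      # Annotation shown on the run page, plus a line in the run summary.
      echo "::warning title=Release drafter::No draft id was produced for this run"
      echo "**Warning:** release-drafter produced no draft release." >> "$GITHUB_STEP_SUMMARY"
    else
      {
        echo "### Release draft updated"
        echo "- id: $DRAFT_ID"
        echo "- tag: $DRAFT_TAG"
        echo "- name: $DRAFT_NAME"
        echo "- url: $DRAFT_URL"
      } >> "$GITHUB_STEP_SUMMARY"
    fi
```

Passing the outputs through env rather than interpolating them directly into the script keeps release names and tags from being evaluated by the shell.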

7. Workflow ran on every PR

pull_request: types: [opened, reopened, synchronize, labeled] matched PRs targeting any branch, including feature-to-feature PRs that can never affect a release.

Fix: branches: ['[0-9]+.[0-9]+.x'] on the trigger.
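
In context, the restricted trigger would look roughly like this. Only the pull_request trigger is shown; the other triggers in the workflow are omitted rather than guessed at.

```yaml
on:
  pull_request:
    types: [opened, reopened, synchronize, labeled]
    # For pull_request events, `branches` filters on the base (target) branch.
    # GitHub filter patterns are globs: [0-9]+ means one or more digits,
    # so this matches 7.0.x, 7.1.x, 8.0.x, and so on.
    branches:
      - '[0-9]+.[0-9]+.x'
```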

8. issues trigger

The drafter has nothing to do with issue lifecycle; the trigger fired on every issue close/reopen.

Fix: removed.

9. Cross-branch leakage prevention (defense in depth)

Combine three filters so that release-drafter cannot match a release from a different release line, whether it is looking for the "last release" to bump from or for an existing draft to update:

  • filter-by-commitish: true (release's target_commitish must equal the branch).
  • filter-by-range: ~MAJOR.MINOR.0 (computed dynamically from the branch name in the workflow step).
  • tag-prefix: v (release's tag must start with v).
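
The dynamic part of the range filter can be sketched as a small workflow step. The step id, output name, and environment variable name below are hypothetical, and how the value reaches release-drafter (for example as a filter-by-range input under with:) is an assumption rather than something shown in this description.

```yaml
# Sketch: derive the release range from the branch name.
# `range`, `value`, and BRANCH are hypothetical names.
- name: Derive the release range from the branch name
  id: range
  env:
    BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }}
  run: |
    # "7.0.x" -> "~7.0.0": strip the trailing ".x", then append ".0".
    echo "value=~${BRANCH%.x}.0" >> "$GITHUB_OUTPUT"

# Assumed consumption in the drafter step:
#   with:
#     filter-by-range: ${{ steps.range.outputs.value }}
```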

Files changed

  • .github/release-drafter.yml
    • Added tag-prefix: v, include-pre-releases: true, initial-commits-since.
    • Existing config (filter-by-commitish: true, autolabeler, categories, version-resolver, template) unchanged.
  • .github/workflows/release-notes.yml
    • Removed issues: trigger.
    • Restricted pull_request trigger to release branches.
    • Switched concurrency group to release-drafter-${branch} with cancel-in-progress: true.
    • Pinned release-drafter to v7.2.1 by commit SHA.
    • Added id: drafter and a verification step that surfaces missing-draft failures via workflow summary and annotations.
    • Heavily commented so future maintainers do not have to re-derive the rationale for any of the above.

How to verify after merge

  1. Watch the next push to 7.0.x trigger the drafter and complete in seconds (no longer queued behind release.yml).
  2. Confirm a draft release v7.0.12 (or whatever the version-resolver picks) appears at https://github.com/apache/grails-core/releases with target_commitish=7.0.x.
  3. After the merge cascade reaches 8.0.x, confirm a v8.0.0-M2 (or v8.0.0) draft appears with target_commitish=8.0.x - this is the missing draft from the original report.
  4. If a draft ever fails to appear again, the workflow summary will now contain a clearly visible warning instead of silently passing.

Out of scope

  • The 9 historical v7.0.x releases whose target_commitish is refs/heads/7.0.x instead of 7.0.x are benign: find-previous-releases.ts strips refs/heads/ before comparing, so they still match filter-by-commitish: true. No cleanup needed.
  • The 16 older 3.x/4.x releases with SHA target_commitish cannot match any current branch filter and are correctly excluded.

Assisted-by: claude-code:claude-4.6-opus

Fixes a long-standing set of issues with the Release - Drafter workflow that
collectively prevented per-branch draft releases from working correctly
across 7.0.x, 7.1.x, 7.2.x, and 8.0.x. Symptoms included drafter runs queued
for hours behind the release pipeline, the 8.0.x draft never being created,
and release notes silently bumping from stale baselines.

Root causes addressed:

1. Concurrency lock with release.yml: release-notes.yml shared the
   "release-pipeline-${branch}" concurrency group with release.yml, whose
   manual approval gates routinely keep release runs in the "waiting" state
   for days. Drafter runs queued behind those gates routinely lasted
   1400-2000+ minutes before being cancelled, leaving drafts stale. Fix:
   switch to "release-drafter-${branch}" group with cancel-in-progress so
   the latest push wins. Drafter and release.yml never touch the same
   release object: drafter targets the next-version draft, release.yml the
   currently-published tag, so splitting the groups is safe.

2. Prereleases excluded from "last release" detection: every Apache Grails
   release (v7.0.11, v7.1.1, v8.0.0-M1, ...) is published with
   prerelease=true on GitHub during the ASF vote process. With
   release-drafter's default include-pre-releases=false, those releases
   were filtered out when looking for the "last release", so 7.0.x bumped
   from v7.0.10 instead of v7.0.11 and 8.0.x reported "No last release
   found" (because v8.0.0-M1 was the only release on that branch and was
   excluded), falling back to walking the entire 265-release commit
   history and exhausting the GitHub API rate limit. Fix: set
   include-pre-releases: true.

3. Unbounded git history walk on rate-limit exhaustion: when no last
   release was found, release-drafter walked unbounded parent commit
   history calling the GraphQL API per commit. Fix: set
   initial-commits-since to bound the walk to a recent date floor for
   branches with no prior matching release (e.g. 7.2.x today).

4. Release-drafter v7.2.0 bug: initial-commits-since was silently ignored
   when set only in release-drafter.yml (not also as a workflow input).
   Fix: pin to v7.2.1 (commit 563bf132657a13ded0b01fcb723c5a58cdd824e2)
   which ships the fix from upstream PR #1593.

5. Floating action tag on 7.0.x: 7.0.x used release-drafter@v7 while
   8.0.x was pinned to a SHA. Fix: pin all branches to the v7.2.1 commit
   SHA per ASF security policy (PR #15523).

6. continue-on-error swallowed all failures silently: a transient or
   permanent drafter failure looked identical to a successful run, which
   is why the broken state went unnoticed for so long. Fix: keep
   continue-on-error so PR checks stay green for transient API blips, but
   add an explicit verification step that loudly logs a workflow-summary
   warning and a GitHub Actions annotation when no draft id was produced.

7. Workflow ran on PRs targeting any branch, including feature-to-feature
   PRs that can never affect a release. Fix: restrict the pull_request
   trigger to PRs whose base ref is a release branch.

8. issues trigger ran on every issue close/reopen even though release
   drafting has nothing to do with issue lifecycle. Fix: removed.

9. Cross-branch leakage prevention: combine filter-by-commitish: true,
   filter-by-range: ~MAJOR.MINOR.0 (derived dynamically from the branch
   name) and tag-prefix: v so release-drafter can never match a release
   from a different release line when looking for either the "last
   release" to bump from or an existing draft to update.

The fix is being landed on 7.0.x and will be merged forward into 7.1.x,
7.2.x, and 8.0.x. The 8.0.x branch's existing release-drafter@5de93583
pin (v7.2.0) will be superseded by the new v7.2.1 pin during merge.

Assisted-by: claude-code:claude-4.6-opus
Copilot AI review requested due to automatic review settings May 2, 2026 15:42
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@testlens-app

testlens-app Bot commented May 2, 2026

✅ All tests passed ✅

🏷️ Commit: d4608a4
▶️ Tests: 31769 executed
⚪️ Checks: 37/37 completed



 concurrency:
-  group: release-pipeline-${{ github.event.pull_request.base.ref || github.ref_name }}
-  cancel-in-progress: false
+  group: release-drafter-${{ github.event.pull_request.base.ref || github.ref_name }}
Contributor

I don't think this is correct. We want the concurrency to be shared with the release, otherwise release drafter will modify a release that is being released. We do not want that to ever happen.

@bito-code-review

The change separates the concurrency group from 'release-pipeline-${branch}' to 'release-drafter-${branch}' to prevent drafter runs from queuing behind long-waiting release pipeline jobs with manual approvals. The comment explains drafter updates a draft for the next version (e.g., v7.0.12), while release.yml handles the current published tag (e.g., v7.0.11), so they don't conflict. Sharing concurrency isn't needed since they target different releases.

.github/workflows/release-notes.yml

concurrency:
  group: release-drafter-${{ github.event.pull_request.base.ref || github.ref_name }}
  cancel-in-progress: true

@jdaugherty
Contributor

The change separates the concurrency group from 'release-pipeline-${branch}' to 'release-drafter-${branch}' to prevent drafter runs from queuing behind long-waiting release pipeline jobs with manual approvals. The comment explains drafter updates a draft for the next version (e.g., v7.0.12), while release.yml handles the current published tag (e.g., v7.0.11), so they don't conflict. Sharing concurrency isn't needed since they target different releases.

.github/workflows/release-notes.yml

concurrency:
  group: release-drafter-${{ github.event.pull_request.base.ref || github.ref_name }}
  cancel-in-progress: true

This is incorrect. Historically, creating the tag itself was enough to trigger the release drafter workflow, so a release in progress would overlap with the release drafter run too.
