From d4608a4b659e9844c0d9f1e128bd6b0232290988 Mon Sep 17 00:00:00 2001
From: James Fredley
Date: Sat, 2 May 2026 11:40:19 -0400
Subject: [PATCH] fix(ci): comprehensive overhaul of release-drafter setup

Fixes a long-standing set of issues with the Release - Drafter workflow
that collectively prevented per-branch draft releases from working
correctly across 7.0.x, 7.1.x, 7.2.x, and 8.0.x. Symptoms included
drafter runs queued for hours behind the release pipeline, the 8.0.x
draft never being created, and release notes silently bumping from
stale baselines.

Root causes addressed:

1. Concurrency lock with release.yml: release-notes.yml shared the
   "release-pipeline-${branch}" concurrency group with release.yml,
   whose manual approval gates routinely keep release runs in the
   "waiting" state for days. Drafter runs queued behind those gates
   routinely lasted 1400-2000+ minutes before being cancelled, leaving
   drafts stale. Fix: switch to "release-drafter-${branch}" group with
   cancel-in-progress so the latest push wins. Drafter and release.yml
   never touch the same release object: drafter targets the
   next-version draft, release.yml the currently-published tag, so
   splitting the groups is safe.

2. Prereleases excluded from "last release" detection: every Apache
   Grails release (v7.0.11, v7.1.1, v8.0.0-M1, ...) is published with
   prerelease=true on GitHub during the ASF vote process. With
   release-drafter's default include-pre-releases=false, those releases
   were filtered out when looking for the "last release", so 7.0.x
   bumped from v7.0.10 instead of v7.0.11 and 8.0.x reported "No last
   release found" (because v8.0.0-M1 was the only release on that
   branch and was excluded), falling back to walking the entire
   265-release commit history and exhausting the GitHub API rate limit.
   Fix: set include-pre-releases: true.

3. Unbounded git history walk on rate-limit exhaustion: when no last
   release was found, release-drafter walked unbounded parent commit
   history calling the GraphQL API per commit. Fix: set
   initial-commits-since to bound the walk to a recent date floor for
   branches with no prior matching release (e.g. 7.2.x today).

4. Release-drafter v7.2.0 bug: initial-commits-since was silently
   ignored when set only in release-drafter.yml (not also as a workflow
   input). Fix: pin to v7.2.1 (commit
   563bf132657a13ded0b01fcb723c5a58cdd824e2) which ships the fix from
   upstream PR #1593.

5. Floating action tag on 7.0.x: 7.0.x used release-drafter@v7 while
   8.0.x was pinned to a SHA. Fix: pin all branches to the v7.2.1
   commit SHA per ASF security policy (PR #15523).

6. continue-on-error swallowed all failures silently: a transient or
   permanent drafter failure looked identical to a successful run,
   which is why the broken state went unnoticed for so long. Fix: keep
   continue-on-error so PR checks stay green for transient API blips,
   but add an explicit verification step that loudly logs a
   workflow-summary warning and a GitHub Actions annotation when no
   draft id was produced.

7. Workflow ran on PRs targeting any branch, including
   feature-to-feature PRs that can never affect a release. Fix:
   restrict the pull_request trigger to PRs whose base ref is a release
   branch.

8. issues trigger ran on every issue close/reopen even though release
   drafting has nothing to do with issue lifecycle. Fix: removed.

9. Cross-branch leakage prevention: combine filter-by-commitish: true,
   filter-by-range: ~MAJOR.MINOR.0 (derived dynamically from the branch
   name) and tag-prefix: v so release-drafter can never match a release
   from a different release line when looking for either the "last
   release" to bump from or an existing draft to update.

The fix is being landed on 7.0.x and will be merged forward into 7.1.x,
7.2.x, and 8.0.x.
The 8.0.x branch's existing release-drafter@5de93583 pin (v7.2.0) will be
superseded by the new v7.2.1 pin during merge.

Assisted-by: claude-code:claude-4.6-opus
---
 .github/release-drafter.yml         |  20 ++++
 .github/workflows/release-notes.yml | 138 ++++++++++++++++++++++++++--
 2 files changed, 148 insertions(+), 10 deletions(-)

diff --git a/.github/release-drafter.yml b/.github/release-drafter.yml
index c96e8efb209..eefa8f3ff66 100644
--- a/.github/release-drafter.yml
+++ b/.github/release-drafter.yml
@@ -130,7 +130,27 @@ categories:
 exclude-labels:
   - 'skip-changelog'
 change-template: '- $TITLE @$AUTHOR (#$NUMBER)'
+# Multi-branch isolation:
+# - filter-by-commitish: only consider releases targeting this branch (e.g. 7.0.x)
+# - tag-prefix: only consider releases tagged with the "v" prefix (defense-in-depth)
+# - include-pre-releases: REQUIRED for Apache Grails - releases are marked
+#   prerelease=true on GitHub during the ASF vote process. Without this,
+#   release-drafter excludes them when finding the "last release", causing it
+#   to either bump from a stale baseline (e.g. v7.0.10 instead of v7.0.11) or,
+#   on branches where ALL releases are prereleases (e.g. 8.0.x at v8.0.0-M1),
+#   report "No last release" and fall back to walking unbounded git history,
+#   which exhausts the GitHub API rate limit and produces no draft at all.
 filter-by-commitish: true
+tag-prefix: v
+include-pre-releases: true
+# Bound API usage when no prior release matches the filters above (e.g. on a
+# brand-new release branch like 7.2.x before its first tag). Without this,
+# release-drafter walks the entire commit history and exhausts the rate limit.
+# - initial-commits-since: hard date floor used only when no last release is
+#   found. Set just before 7.1.1 / 7.0.11 / 8.0.0-M1 (all published 2026-04-30)
+#   so newly-cut branches walk only days, not years, of history. Bump this
+#   when forking a new release line to a date close to the fork point.
+initial-commits-since: '2026-04-29T00:00:00Z'
 version-resolver:
   major:
     labels:
diff --git a/.github/workflows/release-notes.yml b/.github/workflows/release-notes.yml
index 6490a5d9b17..2dda32849c4 100644
--- a/.github/workflows/release-notes.yml
+++ b/.github/workflows/release-notes.yml
@@ -13,41 +13,159 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Maintains one draft GitHub Release per active release branch (7.0.x, 7.1.x,
+# 7.2.x, 8.0.x, ...). Each branch produces an independent draft because the
+# release-drafter config in .github/release-drafter.yml combines
+# `filter-by-commitish: true`, `filter-by-range: ~MAJOR.MINOR.0`, and
+# `tag-prefix: v` so that drafts created on one branch never leak into another.
+#
+# Companion config: .github/release-drafter.yml
 name: "Release - Drafter"
 on:
-  issues:
-    types: [closed,reopened]
+  # Runs on every push to a release branch so the draft for that branch is
+  # always up to date with the latest merged PRs.
   push:
     branches:
       - '[0-9]+.[0-9]+.x'
+  # Runs on PRs whose BASE branch is a release branch so the autolabeler can
+  # apply labels (bug/feature/docs/...) and the draft picks up new PRs as soon
+  # as they are opened. Feature-to-feature PRs (e.g. fix/foo -> feat/bar)
+  # are intentionally excluded - they cannot affect any release.
   pull_request:
     types: [opened, reopened, synchronize, labeled]
+    branches:
+      - '[0-9]+.[0-9]+.x'
+  # Manual recovery: rerun against any branch (e.g. to recreate a draft after
+  # one was accidentally deleted, or to seed an initial draft on a new branch).
   workflow_dispatch:
-# queue jobs and only allow 1 run per branch due to the likelihood of hitting GitHub resource limits
+
+# Per-branch concurrency. Critically: this group MUST NOT collide with the
+# `release-pipeline-${branch}` group used by .github/workflows/release.yml.
+# The release pipeline has manual approval gates (`environment: release`,
+# `environment: docs`, `environment: sdkman`) which routinely keep a release
+# run in `waiting` state for HOURS or DAYS until a maintainer approves the
+# next stage. When the drafter shared that group, every push to a release
+# branch queued behind those waiting runs - producing drafter runs of
+# 1400-2000+ minutes that ultimately got cancelled, leaving drafts stale.
+#
+# Drafter and release.yml never touch the same release object: the drafter
+# maintains a DRAFT for the *next* version (e.g. v7.0.12), while release.yml
+# uploads assets to the *current published* tag (e.g. v7.0.11). Splitting the
+# concurrency groups is therefore safe.
+#
+# `cancel-in-progress: true`: if multiple pushes land on the same branch in
+# quick succession, only the latest matters - the latest run sees every PR
+# the older one would have seen, so cancelling pending runs is correct.
 concurrency:
-  group: release-pipeline-${{ github.event.pull_request.base.ref || github.ref_name }}
-  cancel-in-progress: false
+  group: release-drafter-${{ github.event.pull_request.base.ref || github.ref_name }}
+  cancel-in-progress: true
+
 jobs:
   update_release_draft:
+    name: "Update Release Draft"
     permissions:
-      # write permission is required to create a github release
+      # Required to create or update the draft GitHub Release
       contents: write
-      # write permission is required for autolabeler
+      # Required for the autolabeler to add labels to PRs
       pull-requests: write
     runs-on: ubuntu-latest
     steps:
+      # release-drafter's `filter-by-range` keeps the action looking only at
+      # releases whose tag falls inside this branch's MAJOR.MINOR series. This
+      # is what prevents 7.0.x's draft from being computed against 7.1.x's
+      # tags (or vice versa). It is derived dynamically from the branch name
+      # so this workflow file works identically on every release branch.
       - name: "🔢 Derive semver range from branch"
         id: version
         run: |
+          set -euo pipefail
           BRANCH="${{ github.event.pull_request.base.ref || github.ref_name }}"
-          if [[ "$BRANCH" =~ ^[0-9]+\.[0-9]+\.x$ ]]; then
-            echo "range=~${BRANCH%.x}.0" >> "$GITHUB_OUTPUT"
+          if [[ "$BRANCH" =~ ^([0-9]+)\.([0-9]+)\.x$ ]]; then
+            RANGE="~${BASH_REMATCH[1]}.${BASH_REMATCH[2]}.0"
+            echo "Branch $BRANCH -> filter-by-range $RANGE"
+            echo "range=$RANGE" >> "$GITHUB_OUTPUT"
+          else
+            echo "Branch $BRANCH is not a release branch; skipping range filter (will only match the configured prerelease/commitish filters)"
+            echo "range=" >> "$GITHUB_OUTPUT"
           fi
+
+      # Pinned to v7.2.1 by commit SHA per the ASF security policy (matches
+      # the pinning convention already used by other ASF-approved actions in
+      # this repo). v7.2.1 is required because it ships PR #1593 - the bug
+      # fix for `initial-commits-since` being silently ignored when set only
+      # in the release-drafter.yml config (not also as a workflow input).
+      # Our release-drafter.yml relies on that exact config-only path to
+      # bound history walking on brand-new release branches like 7.2.x.
+      #
+      # Earlier minor releases also contributed key options we depend on:
+      #   - v7.1.0 (PR #1451): adds the `initial-commits-since` config option.
+      #   - v7.0.0 (PR #1470): adds the `history-limit` config option.
+      #
+      # Bump checklist: when updating, verify the new tag is signed, read the
+      # release notes for any breaking changes to the config schema, and
+      # update both the SHA and the `# v...` comment together. Resolve the
+      # commit SHA via:
+      #   gh api repos/release-drafter/release-drafter/git/tags/ \
+      #     --jq '.object.sha'
       - name: "📝 Update Release Draft"
-        uses: release-drafter/release-drafter@v7
+        id: drafter
+        uses: release-drafter/release-drafter@563bf132657a13ded0b01fcb723c5a58cdd824e2 # v7.2.1
+        # Drafting release notes is a best-effort, non-critical task - a
+        # transient GitHub API hiccup must not turn every PR check red. The
+        # explicit verification step below catches the case where the action
+        # silently produced no draft (e.g. rate-limit exhaustion).
         continue-on-error: true
         with:
+          # Explicit `commitish` is critical on `pull_request` events: without
+          # it, release-drafter would default to `refs/pull/N/merge` (a
+          # virtual ref) which the GitHub API rejects when creating a release,
+          # producing the "Validation Failed: target_commitish invalid" error
+          # historically seen on PRs (see INFRA-27602).
           commitish: ${{ github.event.pull_request.base.ref || github.ref_name }}
           filter-by-range: ${{ steps.version.outputs.range }}
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+      # Surface failures that `continue-on-error` would otherwise hide. The
+      # release-drafter action exposes the resulting release id as an output;
+      # an empty value means it failed to create or update a draft. We log a
+      # loud warning (visible in the workflow summary and as a GitHub Actions
+      # annotation) but do NOT fail the job - PR checks must stay green for
+      # transient API issues, while still alerting maintainers something is
+      # wrong if drafts go missing for multiple consecutive runs.
+      - name: "🔎 Verify draft was created or updated"
+        if: always()
+        env:
+          DRAFT_ID: ${{ steps.drafter.outputs.id }}
+          DRAFT_TAG: ${{ steps.drafter.outputs.tag_name }}
+          DRAFT_NAME: ${{ steps.drafter.outputs.name }}
+          DRAFT_URL: ${{ steps.drafter.outputs.html_url }}
+          DRAFT_OUTCOME: ${{ steps.drafter.outcome }}
+          BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }}
+        run: |
+          set -euo pipefail
+          {
+            echo "## Release Drafter Result"
+            echo ""
+            echo "- Branch: \`${BRANCH}\`"
+            echo "- Step outcome: \`${DRAFT_OUTCOME}\`"
+          } >> "$GITHUB_STEP_SUMMARY"
+          if [[ -z "${DRAFT_ID:-}" ]]; then
+            {
+              echo "- Status: ⚠️ **No draft created/updated**"
+              echo ""
+              echo "release-drafter ran but did not produce a draft release id."
+              echo "Common causes: GitHub API rate limit exhausted, no prior"
+              echo "release matched the commitish/range/tag-prefix filters, or"
+              echo "the action errored. Check the previous step's logs."
+            } >> "$GITHUB_STEP_SUMMARY"
+            echo "::warning title=Release draft missing::release-drafter produced no draft for branch ${BRANCH}. Step outcome was '${DRAFT_OUTCOME}'. See workflow summary."
+          else
+            {
+              echo "- Status: ✅ Draft maintained"
+              echo "- Tag: \`${DRAFT_TAG}\`"
+              echo "- Name: ${DRAFT_NAME}"
+              echo "- URL: ${DRAFT_URL}"
+            } >> "$GITHUB_STEP_SUMMARY"
+            echo "Draft ${DRAFT_TAG} (id ${DRAFT_ID}) maintained for branch ${BRANCH}: ${DRAFT_URL}"
+          fi
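
Reviewer note (not part of the patch): the branch-to-range mapping done by the "Derive semver range from branch" step can be checked locally without running the workflow. The sketch below is only an illustration under stated assumptions. `derive_range` is a hypothetical helper name, and it uses POSIX `grep -E` plus suffix stripping in place of the workflow's bash `[[ =~ ]]` and `BASH_REMATCH`, on the assumption that both forms give the same result for branch names of the form MAJOR.MINOR.x.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): map a branch name to the
# filter-by-range value that the workflow passes to release-drafter.
derive_range() {
  branch="$1"
  # Accept only release branches of the form MAJOR.MINOR.x (e.g. 7.0.x).
  if printf '%s\n' "$branch" | grep -Eq '^[0-9]+\.[0-9]+\.x$'; then
    # Strip the trailing ".x" and append ".0": 7.0.x -> ~7.0.0
    printf '~%s.0\n' "${branch%.x}"
  else
    # Not a release branch: emit an empty range, mirroring the else branch
    # of the workflow step.
    printf '\n'
  fi
}

derive_range "7.0.x"     # prints ~7.0.0
derive_range "8.0.x"     # prints ~8.0.0
derive_range "feat/bar"  # prints an empty line
```

An empty result corresponds to the workflow writing `range=` to `$GITHUB_OUTPUT`, in which case only the commitish, tag-prefix, and prerelease settings from the config file constrain which releases are considered.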