HYPERFLEET-633 - docs: Define release contract and integration testing strategy with GCP and ROSA teams #120
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,166 @@ | ||
| --- | ||
| Status: Draft | ||
| Owner: HyperFleet Team | ||
| Last Updated: 2026-06-02 | ||
| --- | ||
|
|
||
| # HyperFleet Release Contract | ||
|
|
||
| ## What & Why | ||
|
|
||
| ### What | ||
|
|
||
| This document defines the formal release contract between the HyperFleet team and its partner teams (GCP Offering Team and ROSA Regional Platform Team). It covers: | ||
|
|
||
| - **Release handoff contract**: what artifacts are produced, how partners are notified, and what SLAs apply | ||
| - **Integration testing strategy**: how each team gates HyperFleet changes and how tests are coordinated across teams | ||
| - **Test ownership map**: which team owns which layer of testing to eliminate redundancy and close coverage gaps | ||
|
|
||
| ### Why | ||
|
|
||
| - Without a defined contract, release handoffs are ad-hoc and require manual coordination between teams, slowing delivery and increasing error risk | ||
| - Unclear test ownership creates either coverage gaps (bugs reaching production) or overlapping test suites (longer pipelines without benefit) | ||
| - Partner teams deploying HyperFleet via Argo CD and Terraform need predictable, machine-consumable artifacts (OCI Helm charts) to automate rollout | ||
| - A shared testing strategy is required before building confidence in continuous delivery pipelines end-to-end | ||
|
|
||
| ### Out of scope | ||
|
|
||
| - No automated deployment of HyperFleet releases to partner integration environments | ||
|
|
||
| --- | ||
|
|
||
| ## Partner Teams | ||
|
|
||
| | Team | Platform | Deployment Method | | ||
| |------|----------|-------------------| | ||
| | GCP Offering Team | GCP | Argo CD + Terraform + Tekton Pipelines | | ||
| | ROSA Regional Platform | AWS ROSA | Argo CD + Terraform + AWS CodePipelines | | ||
|
|
||
| --- | ||
|
|
||
| ## Release Handoff Contract | ||
|
|
||
| ### Release Artifacts | ||
|
|
||
| For each HyperFleet release, the following artifacts are produced and made available to partner teams: | ||
|
|
||
| | Artifact | Location | Format | Notes | | ||
| |----------|----------|--------|-------| | ||
| | Container images | `quay.io/redhat-services-prod/hyperfleet-tenant/hyperfleet/hyperfleet-{component}:{version}` | OCI image | Built automatically by Konflux on release tag push (PaC tag-triggered pipeline); includes signed provenance (SLSA L3) and SBOM | | ||
| | Helm charts | `quay.io/redhat-services-prod/hyperfleet-tenant/hyperfleet-{component}-chart:{version}` (see [Helm Chart Distribution](#helm-chart-distribution)) | OCI artifact | Published via Konflux `build-helm-chart-oci-ta` task; required for ROSA/Argo CD consumption | | ||
| | Release notes | `hyperfleet-release` repo, `releases/release-X.Y/` | Markdown | Compatibility matrix, breaking changes, upgrade guide | | ||
| | Compatibility matrix | `hyperfleet-release` repo | Markdown table | Maps validated component version combinations | | ||
| | Git tags | Per-component repos + `hyperfleet-release` | `vX.Y.Z` / `release-X.Y` | See [Release Process](hyperfleet-release-process.md) | | ||
|
Contributor
Tried to navigate to this link but it's missing |
||
|
|
||
|
|
||
| When a GA release is published, it will list which ROSA/GCP versions passed the integration tests, and that list serves as the compatibility matrix. This makes it possible to introduce a breaking change in a release that may be deployable by only one of the pillars. | ||
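A minimal sketch of how a partner pipeline might consult such a compatibility matrix before pinning a release. The data shape, the version strings, and the function names are all assumptions for illustration; the actual matrix is a Markdown table in the `hyperfleet-release` repo and its format is not defined here.

```python
# Hypothetical in-memory form of the compatibility matrix:
# HyperFleet version -> platform -> set of platform versions that passed.
# All version strings below are placeholders, not real releases.
COMPATIBILITY_MATRIX = {
    "1.4.0": {"ROSA": {"rosa-a"}, "GCP": {"gcp-a", "gcp-b"}},
    "1.5.0": {"GCP": {"gcp-b"}},  # e.g. a breaking change only GCP can deploy
}


def is_validated(hyperfleet_version, platform, platform_version,
                 matrix=COMPATIBILITY_MATRIX):
    """Return True if this combination is listed as validated."""
    validated = matrix.get(hyperfleet_version, {}).get(platform, set())
    return platform_version in validated


def deployable_platforms(hyperfleet_version, matrix=COMPATIBILITY_MATRIX):
    """List the platforms with at least one validated version for a release."""
    entry = matrix.get(hyperfleet_version, {})
    return sorted(p for p, versions in entry.items() if versions)
```

A partner that finds its platform missing from `deployable_platforms(...)` would stay pinned to the previous validated release.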
|
ciaranRoche marked this conversation as resolved.
|
||
|
|
||
| ### Helm Chart Distribution | ||
|
|
||
| Target state: the HyperFleet team publishes charts as OCI artifacts to `quay.io/redhat-services-prod/hyperfleet-tenant/` via Konflux (using the `build-helm-chart-oci-ta` task), which eliminates the temporary workaround and the Git coupling. This is blocked on the Konflux team providing a managed release pipeline for external registries (RELEASE-2363). | ||
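The naming patterns in the Release Artifacts table can be expanded mechanically. The sketch below only fills in the two templates from that table; the component name `api` and the version `1.2.3` in the usage lines are placeholders, and whether tags carry a `v` prefix is not specified here, so the version string is passed through unchanged.

```python
# Templates copied from the Release Artifacts table.
IMAGE_TEMPLATE = (
    "quay.io/redhat-services-prod/hyperfleet-tenant/hyperfleet/"
    "hyperfleet-{component}:{version}"
)
CHART_TEMPLATE = (
    "quay.io/redhat-services-prod/hyperfleet-tenant/"
    "hyperfleet-{component}-chart:{version}"
)


def artifact_refs(component, version):
    """Build the container image and Helm chart references for one component."""
    if not component or not version:
        raise ValueError("component and version must be non-empty")
    return {
        "image": IMAGE_TEMPLATE.format(component=component, version=version),
        "chart": CHART_TEMPLATE.format(component=component, version=version),
    }


if __name__ == "__main__":
    # Placeholder inputs; real component names and versions come from the release.
    for kind, ref in artifact_refs("api", "1.2.3").items():
        print(f"{kind}: {ref}")
```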
|
|
||
|
|
||
| ### Notification SLA | ||
|
|
||
| HyperFleet notifies partner teams of release events as follows: | ||
|
|
||
| | Event | Channel | Timeline | Recipients | | ||
| |-------|---------|----------|------------| | ||
| | Release candidate available | `#hyperfleet-releases` Slack | RC cut day | GCP team, ROSA team | | ||
| | GA release published | `#hyperfleet-releases` Slack | GA day | GCP team, ROSA team | | ||
| | Breaking change in next release | `#hyperfleet-releases` Slack | ≥ 1 sprint before GA | GCP team, ROSA team | | ||
|
ciaranRoche marked this conversation as resolved.
|
||
| | Hotfix / patch release | `#hyperfleet-releases` Slack | Within 2 hours of GA tag | GCP team, ROSA team | | ||
|
|
||
|
|
||
| At present (Jun 26), breaking changes do not block HyperFleet releases, because the ROSA/GCP teams do not have to keep long-running clusters or migrate data. | ||
|
|
||
| ### Rollback / Recovery | ||
|
|
||
| HyperFleet uses a **roll-forward** strategy for MVP: issues are fixed via patch releases rather than rollback. See [Release Process — Release Recovery Strategy](hyperfleet-release-process.md#55-release-recovery-strategy). | ||
|
ciaranRoche marked this conversation as resolved.
|
||
|
|
||
| HyperFleet commits to (exact times TBD): | ||
|
|
||
| - Producing a patch release within **48 hours** for Blocker/Critical regressions | ||
| - Producing a patch release within **1 week** for Major regressions | ||
|
rh-amarin marked this conversation as resolved.
|
||
| - Maintaining N-1 backward compatibility so partner teams can remain pinned to the previous validated release while a fix is in flight | ||
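The patch-release commitments above reduce to a small lookup, sketched here. The start of the clock is an assumption: the contract does not say whether the window runs from the time a regression is reported, triaged, or confirmed, and the exact times are still marked TBD. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the commitments above (exact times are still TBD).
PATCH_SLA = {
    "blocker": timedelta(hours=48),
    "critical": timedelta(hours=48),
    "major": timedelta(weeks=1),
}


def patch_due(reported_at, severity):
    """Return the latest time a patch release should ship.

    Assumes the clock starts at `reported_at`; the contract does not
    define the start event.
    """
    try:
        window = PATCH_SLA[severity.lower()]
    except KeyError:
        raise ValueError(f"no patch SLA defined for severity {severity!r}")
    return reported_at + window


if __name__ == "__main__":
    reported = datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc)
    print(patch_due(reported, "Critical").isoformat())
```

Severities without a stated commitment raise an error instead of silently defaulting, so a gap in the contract surfaces early.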
|
|
||
| --- | ||
|
|
||
| ## Integration Testing Strategy | ||
|
|
||
| ### Decision: Nightly Runs with OCI Chart Injection | ||
|
|
||
| **Agreed approach** (as of the March 31, 2026 meeting): | ||
|
|
||
| - Start with **nightly runs** against HyperFleet `main` branch, not presubmit jobs | ||
| - Test against the **latest known-good stable version** of the ROSA regional platform (production Maestro version), replacing only the HyperFleet component under test | ||
| - The ROSA team will **temporarily enable OCI chart pushing** so the HyperFleet team can inject PR-built charts into the ROSA deployment pipeline | ||
| - Evaluate **non-blocking presubmit** integration with the HyperFleet release repository as a follow-up | ||
|
Contributor
Would it be better to add similar prow jobs—like the E2E testing Prow job—as a standard requirement and include them in the release X.Y checklist? |
||
|
|
||
| **Rationale**: Running full ROSA environment provisioning (~40 minutes + E2E duration) as a presubmit would significantly impact development velocity without proportional benefit at the current team scale. Nightly runs provide meaningful feedback without blocking day-to-day development. | ||
|
|
||
| **Note on ROSA's existing pre-merge capability**: The ROSA repo already has a working cross-component E2E pre-merge mechanism (triggered via Prow comment on PRs). The decision to start with nightly runs is about HyperFleet's readiness to onboard to that mechanism — not a limitation of the ROSA infrastructure. Per-PR testing remains the target once the OCI chart injection step is stable. | ||
|
|
||
|
|
||
| ### Team Test Ownership | ||
|
|
||
| | Layer | Owner | Scope | Runs on | | ||
| |-------|-------|-------|---------| | ||
| | Unit tests | HyperFleet | Each component in isolation | Every PR (presubmit) | | ||
| | Integration tests | HyperFleet | Cross-component API contracts | Every PR (presubmit) | | ||
| | HyperFleet E2E | HyperFleet | HyperFleet stack end-to-end | Nightly (main branch) | | ||
| | ROSA integration | ROSA Team | Full ROSA region + HyperFleet override | Nightly (HyperFleet main) | | ||
|
Contributor
Who is responsible to update the configuration of breaking changes for nightly integration testing? And how to notify? |
||
| | GCP integration | GCP Team | GCP deployment + HyperFleet | Nightly (HyperFleet main) via Tekton `gcp-region-e2e-pipeline` | | ||
| | Release gate | HyperFleet | All of the above must pass | Before GA tag | | ||
|
|
||
|
coderabbitai[bot] marked this conversation as resolved.
|
||
| **Important**: HyperFleet releases ship with a compatibility matrix of the ROSA/GCP versions tested. Failing integration tests against newer ROSA/GCP versions do not gate the HyperFleet release, as long as the release passes against the ROSA/GCP versions listed as compatible. | ||
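One possible reading of the gate, sketched as code. It is an interpretation, not the agreed rule: it assumes the HyperFleet-owned layers are always mandatory and that each partner needs at least one passing platform version, with the passing versions becoming the compatibility matrix. Layer names, partner version labels, and the function name are hypothetical.

```python
# HyperFleet-owned layers assumed to be mandatory for every release.
INTERNAL_LAYERS = ("unit", "integration", "hyperfleet-e2e")


def evaluate_gate(internal_results, partner_runs):
    """Decide whether a GA tag may be cut under the assumed rule.

    internal_results: layer name -> passed?
    partner_runs: partner name -> {platform version -> passed?}
    Returns (gate_open, compatible), where `compatible` maps each partner
    to the sorted platform versions that passed.
    """
    internal_ok = all(internal_results.get(layer, False) for layer in INTERNAL_LAYERS)
    compatible = {
        partner: sorted(v for v, passed in runs.items() if passed)
        for partner, runs in partner_runs.items()
    }
    # Each partner needs at least one passing version; failures on other
    # (for example newer) versions do not block the release.
    partners_ok = bool(compatible) and all(compatible.values())
    return internal_ok and partners_ok, compatible
```

Under this reading, a failure on a newer ROSA version leaves the gate open as long as an older ROSA version still passes, and the returned mapping is what would be published as the matrix.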
|
|
||
|
Comment on lines +114 to +117
Contributor
Make GA gate criteria internally operable and unambiguous. Line 114 says “all of the above must pass,” but Line 116 introduces an exception for failed newer ROSA/GCP versions. This creates two gate definitions in the same contract. Define one canonical rule in the table row itself (e.g., “all mandatory compatible-version layers must pass”). LanguageTool [grammar] ~116: Use a hyphen to join words (QB_NEW_EN_HYPHEN). |
||
| ### Testing Gaps Identified | ||
|
|
||
| | Gap | Owning Team | Mitigation | | ||
| |-----|-------------|------------| | ||
| | HyperFleet not yet onboarded to ROSA's pre-merge E2E mechanism | HyperFleet | Onboard to `openshift/release` Prow config + create `quay.io/rrp-dev-ci/` image repos (see onboarding steps above) | | ||
|
Contributor
What are the onboarding steps referencing here? |
||
| | Helm chart override (OCI) not yet wired into ROSA CI | ROSA + HyperFleet | Temporary OCI setup by ROSA team (Q2 2026, immediate action); replaced by Konflux `build-helm-chart-oci-ta` pipeline — blocked on Konflux managed release pipeline for external registries (RELEASE-2363); see [Helm OCI Distribution Design](../release/helm-oci-distribution-design.md) | | ||
| | GCP integration tests not yet publishing chart overrides into HyperFleet CI | GCP + HyperFleet | Blocked on GCP-334 (CLM/Argo CD integration in progress); nightly Tekton pipeline already exists but consumes pinned tags, not PR-built charts | | ||
| | Multi-component PR testing (API + Adapter in same PR) | HyperFleet | Nightly tests use `main` for all other components; single-component override per nightly run is the starting point | | ||
| | Presubmit integration gate for HyperFleet release repo | HyperFleet | Future action: non-blocking presubmit on `hyperfleet-release` repo | | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| ### 1. Non-blocking Presubmit on HyperFleet Release Repository | ||
|
|
||
| Run the full ROSA integration pipeline as an optional, non-blocking presubmit job triggered on the `hyperfleet-release` repo. | ||
|
|
||
| **Rejected for now**: A non-blocking job that runs for 40+ minutes provides a weak signal: developers may ignore it, especially if failures are infrequent. Starting with nightly runs builds confidence in the pipeline before promoting it to presubmit. This remains a **future action**. | ||
|
|
||
| ### 2. Consumer-Driven Contract Testing (Pact-style) | ||
|
|
||
| Define formal API contracts using a consumer-driven contract testing tool (e.g., Pact). ROSA and GCP publish their expectations; HyperFleet CI verifies them on every PR. | ||
|
|
||
| **Rejected for MVP**: The integration surface between HyperFleet and partner teams is primarily at the Helm chart / deployment configuration level, not a REST API contract boundary. Consumer-driven contract testing tools are better suited to service-to-service REST contracts. Helm value schema validation is a lighter-weight alternative to investigate post-MVP. | ||
|
|
||
| ### 3. Automated Rollout to Integration Environments on GA | ||
|
|
||
| Trigger automatic deployment of each HyperFleet GA release to ROSA and GCP integration environments via webhooks. | ||
|
|
||
| **Rejected for MVP**: ROSA's pipeline takes ~40 minutes per run and requires environment-specific configuration overrides. Automating this safely requires tooling (OCI charts via Konflux, pipeline webhooks) not yet in place. Deferred to post-Q2 2026. | ||
|
|
||
|
Comment on lines +131 to +150
Contributor
Refactor suggestion (Major, quick win): Add an explicit “Trade-offs” section for this architectural decision doc. You have “Alternatives Considered,” but there is no explicit “Trade-offs” section describing gains/losses/risks of the chosen approach. Please add one near the decision content. |
||
| --- | ||
|
|
||
| ## Related Documents | ||
|
|
||
| - [HyperFleet Release Process](../release/hyperfleet-release-process.md) — release cadence, branching, artifacts | ||
|
Contributor
Do these paths work? I think it should just be release/ instead of |
||
| - [Konflux Release Pipeline Design](../release/konflux-release-pipeline-design.md) — build and release pipeline architecture, CEL tag matching, version injection | ||
| - [Helm OCI Distribution Design](../release/helm-oci-distribution-design.md) — how Helm charts are published as OCI artifacts via Konflux | ||
| - [ADR 0014 — Konflux for Container Image Build and Release](../../adrs/0014-konflux-build-and-release.md) | ||
| - [ADR 0016 — Helm OCI Distribution via Konflux](../../adrs/0016-helm-oci-distribution.md) | ||
|
Contributor
Same as above, think we are jumping back an extra hop to get to adrs |
||
| - [Versioning Trade-offs](versioning-trade-offs.md) — SDK versioning, rollback considerations | ||
| - [E2E Testing Framework Spike Report](e2e-testing/e2e-testing-framework-spike-report.md) | ||
| - [E2E Run Strategy Spike Report](e2e-testing/e2e-run-strategy-spike-report.md) | ||
| - [ROSA — Adding a Component for Pre-merge E2E Testing](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/adding-component-pre-merge.md) — onboarding guide for the `/test rosa-regionality-compatibility-e2e` Prow trigger | ||
| - [ROSA — Testing Strategy Design](https://github.com/openshift-online/rosa-regional-platform/blob/main/docs/design/testing-strategy.md) — three CI workflows (pre-merge, nightly integration, nightly ephemeral) | ||
| - GCP-334 — CLM Components Deployment (linked Jira epic) | ||
| - HYPERFLEET-633 — Define release contract and integration testing strategy | ||