
[WIP] UPSTREAM: <carry>: automate hermetic build requirements generation#79

Open
mytreya-rh wants to merge 1 commit into openshift:main from mytreya-rh:automate-requirements-generation

Conversation

@mytreya-rh (Contributor)

@mytreya-rh mytreya-rh commented May 7, 2026

Work in Progress — feedback welcome; not ready to merge.

Summary

Replaces the hand-maintained ~100-line bash pipeline in
openshift/Dockerfile.requirements with
openshift/hack/generate_requirements.py, a Python script that derives all
four hermetic build requirements files entirely from the Pipfile — no
hardcoded package names, no manual conflict resolution.

  • Stage 1 — pipenv install + CVE auto-fix via Safety/pipenv update,
    then pip freeze to pin all runtime packages.
  • Stage 2 — Iterative pip-compile with dynamic conflict exclusion →
    requirements.txt. Packages causing compilation failures are detected from
    pip-compile's error output and handled automatically.
  • Stage 3 — pip_find_builddeps.py run per package to map each package
    to its build-system requirements individually.
  • Stage 4 — Recursive conflict detection and phase splitting. Merges all
    build-dep constraints; on failure identifies the conflicting dependency via
    three fallback strategies (direct upper-bound heuristic → per-package
    compilation for transitive conflicts → single-package bisection). Generates
    all three build requirements files. Build-isolation exact-version pins
    (e.g. wheel==0.45.1 from ansible-core's pyproject.toml) are
    auto-discovered from package metadata, injected into pre-build so cachi2
    pre-fetches them, and stripped from later phases so those phases resolve
    newer CVE-fixed versions.
  • Stage 5 — Safety CVE scan of generated build files; auto-fixes by adding
    minimum-version constraints and re-running pip-compile.

Full algorithm documented in openshift/hack/generate_requirements.md.
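
The build-isolation pin handling from Stage 4 can be sketched like this. It is an illustrative assumption about the mechanism, not the script's real data structures: `pkg_constraints` and the function name are hypothetical, and the real script handles full PEP 508 specifiers rather than plain `==` splitting.

```python
def split_isolation_pins(pkg_constraints):
    """Separate exact build-isolation pins from looser constraints.

    pkg_constraints: {package: [requirement strings]} gathered from each
    package's build-system metadata. Exact pins (e.g. wheel==0.45.1) go to
    requirements-pre-build.txt so cachi2 pre-fetches them; the remaining
    constraints stay in later phases, which are then free to resolve newer
    CVE-fixed versions.
    """
    pins, relaxed = {}, {}
    for pkg, reqs in pkg_constraints.items():
        kept = []
        for req in reqs:
            name, sep, ver = req.partition("==")
            if sep:
                pins[name.strip()] = ver.strip()  # exact pin -> pre-build
            else:
                kept.append(req)                  # loose bound -> later phase
        relaxed[pkg] = kept
    return pins, relaxed
```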

Requirements file changes

| File | Status | Notable changes |
| --- | --- | --- |
| requirements.txt | ✅ 28 active | pyasn1 0.6.2→0.6.3 (CVE), requests 2.32.5→2.33.1 (CVE), certifi/idna/charset-normalizer bumped |
| requirements-pre-build.txt | ✅ 10 active (unchanged count) | packaging 26.0→26.2; correct setuptools==70/scm==8.1/wheel==0.45.1 |
| requirements-build.txt | ✅ 19 active | wheel 0.46.3→0.47.0 (CVE-2026-24049 fixed); +setuptools-rust, +vcs-versioning; all old build1 packages absorbed |
| requirements-build1.txt | ○ Skipped | kubernetes 33.1.0 has an internal setuptools-scm conflict in its own build-system declaration; its build deps are excluded with a note |

Test plan

  • make -f openshift/Makefile generate-requirements builds and runs cleanly
  • make -f openshift/Makefile check-requirements passes (no unexpected diff)
  • Review generated requirements files for correctness
  • Verify ART/OSBS image build succeeds with these requirements

Made with Cursor

Summary by CodeRabbit

Release Notes

  • New Features

    • Added hermetic requirements generation system with automated dependency conflict resolution and security vulnerability scanning.
    • Added documentation for the requirements generation process.
  • Chores

    • Refactored build requirements generation approach.
    • Updated Python package dependencies including requests, certifi, packaging, wheel, setuptools, and related build tools.

Replace the manually maintained ~100-line bash pipeline in
openshift/Dockerfile.requirements with a Python script that derives
all four requirements files entirely from the Pipfile, with no
hardcoded package names.

openshift/hack/generate_requirements.py implements five stages:

Stage 1 – pipenv install + CVE auto-fix via Safety/pipenv update,
  then pip freeze to capture all pinned runtime packages.

Stage 2 – Iterative pip-compile with dynamic conflict exclusion to
  produce requirements.txt.  Packages that make pip-compile fail due
  to incompatible declared metadata (e.g. conflicting setuptools
  version ranges) are detected from the error output, excluded from
  compilation, and appended manually.  RPM-installed packages
  (cryptography, cffi, pycparser, maturin) are commented out in
  post-processing.
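
The Stage 2 exclusion loop can be sketched as follows. This is an assumption about the mechanism for illustration, not the script's actual code: `run_pip_compile` is a hypothetical callback standing in for invoking pip-compile, and the error pattern is a simplified stand-in for the real output parsing.

```python
import re

# Simplified stand-in for the real error parsing (an assumption; the actual
# pip-compile output parsed by generate_requirements.py may differ).
CONFLICT_RE = re.compile(r"Cannot install ([A-Za-z0-9._-]+)==")

def compile_with_exclusions(packages, run_pip_compile):
    """Iteratively drop packages whose declared metadata makes pip-compile
    fail, returning (compiled, excluded).

    run_pip_compile(pkgs) -> (ok, stderr) would build a requirements.in
    from pkgs and run pip-compile on it. Excluded packages would later be
    appended to requirements.txt manually.
    """
    remaining = list(packages)
    excluded = []
    while remaining:
        ok, stderr = run_pip_compile(remaining)
        if ok:
            break
        match = CONFLICT_RE.search(stderr)
        if not match or match.group(1) not in remaining:
            raise RuntimeError("unrecognised pip-compile failure:\n" + stderr)
        remaining.remove(match.group(1))   # exclude the conflicting package
        excluded.append(match.group(1))    # remember it for manual append
    return remaining, excluded
```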

Stage 3 – pip_find_builddeps.py is run once per runtime package so
  that every package's build-system requirements can be associated
  with it individually.
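
In sketch form, with `find_builddeps` as a hypothetical stand-in for shelling out to pip_find_builddeps.py on one pinned package, the per-package mapping amounts to:

```python
def collect_build_deps(runtime_pkgs, find_builddeps):
    """Map each runtime package to its own build-system requirements.

    Running the finder once per package (rather than once over the whole
    set) keeps every constraint attributable to the package that
    introduced it, which the later phase splitting relies on.
    """
    return {pkg: sorted(find_builddeps(pkg)) for pkg in runtime_pkgs}
```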

Stage 4 – Conflict detection and phase splitting.  Merging all
  build-dep constraints is attempted with pip-compile; when it fails
  the conflicting dependency is identified and packages split into an
  earlier phase (needing the older version) and a later phase
  (needing the newer version) using three fallback strategies in
  order: direct upper-bound heuristic, per-package compilation to
  detect transitive conflicts, and single-package bisection.
  N discovered phases are mapped to exactly three build files with a
  greedy merge that verifies compatibility before absorbing each
  middle phase into the main build group.
  Build-isolation exact-version pins (e.g. wheel==0.45.1 declared by
  ansible-core's pyproject.toml) are discovered automatically from
  pkg_constraints, injected into requirements-pre-build.txt so cachi2
  pre-fetches them, and stripped from later phases so those phases
  resolve newer CVE-fixed versions.  No version numbers are hardcoded.
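
The single-package bisection fallback can be illustrated as below. This is a sketch under the assumption that exactly one package's build deps break the merged compile; `compiles_ok` is a hypothetical stand-in for a pip-compile dry run over the merged constraints of a subset.

```python
def bisect_conflicting(pkgs, compiles_ok):
    """Find the one package whose build deps break the merged compile.

    compiles_ok(subset) -> bool stands in for pip-compiling the merged
    build-dep constraints of `subset`. Assumes the full set fails and a
    single package is responsible, so each probe halves the search space.
    """
    assert not compiles_ok(pkgs), "full set must fail for bisection to apply"
    lo, hi = 0, len(pkgs)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # If the left half compiles cleanly, the culprit is in the right
        # half; otherwise it is in the left half.
        if compiles_ok(pkgs[lo:mid]):
            lo = mid
        else:
            hi = mid
    return pkgs[lo]
```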

Stage 5 – Safety scans each generated build requirements file for
  CVEs and attempts to fix them by adding minimum-version constraints
  and re-running pip-compile.  Conflicts that prevent the fix are
  reported with the name of the blocking constraint.
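
The remediation step might look roughly like this. The constraint handling is deliberately simplified (plain string splitting rather than full requirement parsing), and the `vulns` mapping is an assumed stand-in for a parsed Safety report.

```python
def apply_min_versions(constraints, vulns):
    """Raise each vulnerable package's lower bound to its first fixed
    version; the caller then re-runs pip-compile.

    constraints: {name: specifier string, e.g. ">=0.40,<0.48"}
    vulns: {name: first fixed version} (stand-in for a Safety report)
    """
    fixed = dict(constraints)
    for name, min_ver in vulns.items():
        existing = fixed.get(name, "")
        # Preserve any upper bound already present (e.g. "<0.48" stays),
        # since removing it could mask the conflict Stage 5 must report.
        uppers = [c for c in existing.split(",") if c.strip().startswith("<")]
        fixed[name] = ",".join([f">={min_ver}", *uppers]) if uppers else f">={min_ver}"
    return fixed
```

If the subsequent pip-compile still fails, the name of the surviving upper bound is what gets surfaced as the blocking constraint.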

openshift/hack/generate_requirements.md documents the full algorithm.

openshift/Dockerfile.requirements is reduced to installing the
toolchain and invoking the script.  The generated requirements files
and Pipfile.lock (updated by any CVE auto-fixes) are exported to the
mounted volume by the ENTRYPOINT.

The regenerated requirements files reflect:
- pyasn1 0.6.2 → 0.6.3  (CVE-2026-30922)
- requests 2.32.5 → 2.33.1  (CVE-2026-25645)
- wheel 0.46.3 → 0.47.0 in requirements-build.txt (CVE-2026-24049;
  0.45.1 is retained in requirements-pre-build.txt for ansible-core
  build isolation as discovered from its build-system metadata)
- certifi, charset-normalizer, idna, packaging, pathspec,
  poetry-core, setuptools, trove-classifiers bumped to latest
- kubernetes 33.1.0 build deps skipped (internal setuptools-scm
  conflict in the package's own build-system declaration)

Co-authored-by: Cursor <cursoragent@cursor.com>
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2026

coderabbitai Bot commented May 7, 2026

Walkthrough

The PR introduces a new hermetic requirements generator for OpenShift, replacing a manual multi-step shell pipeline in the Dockerfile with a Python-based tool that automates dependency resolution, build-dependency discovery, phase splitting for conflicting constraints, and CVE remediation across four output requirements files.

Changes

Requirements Generation System Overhaul

| Layer / File(s) | Summary |
| --- | --- |
| Build Infrastructure: openshift/Dockerfile.requirements | Dockerfile simplified to copy in the new generator script and invoke it instead of running manual pipenv, pip-compile, and sed pipeline steps; artifact export and entrypoint remain unchanged. |
| Generator Core Implementation: openshift/hack/generate_requirements.py | New ~1259-line Python script implementing the five-stage hermetic requirements pipeline: Stage 1 pins runtime deps via pipenv; Stage 2 iteratively resolves conflicts and generates requirements.txt; Stage 3 collects per-package build dependencies via pip_find_builddeps.py; Stage 4 recursively phase-splits conflicting build constraints into three files with dynamic build-isolation pin injection; Stage 5 scans for CVEs and auto-fixes with minimum safe versions. |
| Documentation: openshift/hack/generate_requirements.md | Comprehensive guide covering the pipeline stages, configuration (RPM_INSTALLED), phase-splitting heuristics (direct bounds, transitive conflict detection, bisection fallback), build-isolation pin discovery, CVE remediation strategy, and local execution instructions. |
| Generated Artifacts: openshift/requirements.txt, openshift/requirements-pre-build.txt, openshift/requirements-build.txt, openshift/requirements-build1.txt | Regenerated requirements outputs reflect the new generator's output format (absolute paths in pip-compile headers) and updated pinned versions: certifi, charset-normalizer, idna, requests bumped in requirements.txt; packaging and poetry-core bumped in build files; setuptools-rust added to unsafe list; requirements-build1.txt reduced to comment-only (conflict documentation). |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Docker as Dockerfile
    participant Gen as Generator Script
    participant Pipenv as pipenv
    participant PipTools as pip-tools
    participant BuildDeps as pip_find_builddeps
    participant Safety as Safety
    participant Output as Artifact Files

    Docker->>Gen: invoke generate_requirements.py

    Gen->>Pipenv: Stage 1: install --deploy
    Pipenv-->>Gen: pinned runtime deps
    Gen->>Gen: CVE auto-fix via pipenv check

    Gen->>PipTools: Stage 2: iterative pip-compile
    PipTools-->>Gen: conflict errors (iteratively)
    Gen->>Gen: exclude conflicts, append manually
    Gen->>Output: requirements.txt

    Gen->>BuildDeps: Stage 3: per-package build deps
    BuildDeps-->>Gen: constraint sets

    Gen->>Gen: Stage 4: recursive phase-split
    Gen->>PipTools: compile each phase
    PipTools-->>Gen: success + phase outputs
    Gen->>Output: requirements-pre-build.txt<br/>requirements-build1.txt<br/>requirements-build.txt

    Gen->>Safety: Stage 5: CVE scan
    Safety-->>Gen: vuln reports
    Gen->>Gen: infer min safe versions
    Gen->>PipTools: recompile with min specs
    PipTools-->>Gen: updated pins
    Gen->>Output: updated build files

    Gen->>Output: export Pipfile.lock
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly identifies the main change: automating hermetic build requirements generation, replacing manual bash pipelines with a Python script. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.24%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | Ginkgo test names are all stable and deterministic. No dynamic values (pods, timestamps, UUIDs, nodes, namespaces, IPs) in test titles. One string concatenation found is compile-time constant. |
| Test Structure And Quality | ✅ Passed | Check requires reviewing Ginkgo test code. PR has no test files, only build changes (Dockerfile, Python script, docs, requirements). Not applicable. |
| Microshift Test Compatibility | ✅ Passed | PR contains no new Ginkgo e2e tests. Changes are limited to build infrastructure (Dockerfile), Python scripts, documentation, and dependency lock files. Check not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. All changes are build configuration and Python requirements generation scripts. The SNO compatibility check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR contains only build infrastructure changes (Docker, Python script, requirements files). No deployment manifests, operator code, or scheduling constraints introduced. |
| Ote Binary Stdout Contract | ✅ Passed | Not applicable. PR contains no Go code changes, only Python scripts, Dockerfile, and requirements files. No process-level stdout in Go binaries affected. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR contains no Ginkgo e2e tests. Changes include Docker config, Python automation script, and dependency files. The check is not applicable. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@openshift-ci openshift-ci Bot requested review from fabianvf and oceanc80 May 7, 2026 02:47

openshift-ci Bot commented May 7, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mytreya-rh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 7, 2026
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (3)
openshift/hack/generate_requirements.py (3)

454-461: 💤 Low value

Consider using itertools.pairwise() for successive pairs.

Since this script targets Python 3.12, itertools.pairwise() is a cleaner alternative to zip(unique, unique[1:]) for iterating over consecutive elements.

♻️ Suggested refactor
+from itertools import pairwise
+
 # ...
 
-    for lo, hi in zip(unique, unique[1:]):
+    for lo, hi in pairwise(unique):
         gap = (hi.major - lo.major) * 1000 + (hi.minor - lo.minor)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/hack/generate_requirements.py` around lines 454 - 461, Replace the
manual consecutive-pair loop using zip(unique, unique[1:]) with
itertools.pairwise for clarity: import pairwise from itertools and iterate "for
lo, hi in pairwise(unique):" while preserving the gap calculation and update of
split_after and max_gap (the variables split_after and max_gap and the gap
computation using hi.major/hi.minor and lo.major/lo.minor should remain
identical so behavior is unchanged).

991-991: 💤 Low value

Unused variable stderr2.

The variable stderr2 from the unpacking is never used. Prefix with underscore to indicate intentional discard.

♻️ Suggested fix
-                ok2, stderr2 = _pip_compile(in_path, txt_path, ["--allow-unsafe"])
+                ok2, _stderr2 = _pip_compile(in_path, txt_path, ["--allow-unsafe"])
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/hack/generate_requirements.py` at line 991, The unpacking of
_pip_compile(in_path, txt_path, ["--allow-unsafe"]) assigns stderr2 which is
never used; change the unpack to discard the second value by prefixing it with
an underscore (e.g., ok2, _stderr2 = _pip_compile(...)) so it's clear the stderr
result is intentionally ignored while leaving _pip_compile, ok2, in_path, and
txt_path unchanged.

880-884: 💤 Low value

Remove extraneous f prefix from string without placeholders.

Line 882 has an f-string prefix but no placeholders inside the first string portion.

♻️ Suggested fix
     if auto_iso_pins:
         print(
-            f"  Auto-detected build-isolation pins for pre-build: "
+            "  Auto-detected build-isolation pins for pre-build: "
             + ", ".join(f"{k}=={v}" for k, v in sorted(auto_iso_pins.items()))
         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift/hack/generate_requirements.py` around lines 880 - 884, The print
call that logs auto-detected pins uses an unnecessary f-string prefix on the
first literal; update the print in the auto_iso_pins branch so the first string
is a normal string (remove the leading f on "  Auto-detected build-isolation
pins for pre-build: ") while keeping the concatenation with ",
".join(f"{k}=={v}" for k, v in sorted(auto_iso_pins.items())); locate the print
using the auto_iso_pins variable to adjust the literal.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fe4f21d4-01ca-4770-9bdd-7ecdd39137a3

📥 Commits

Reviewing files that changed from the base of the PR and between 8c00498 and 3443837.

⛔ Files ignored due to path filters (1)
  • openshift/Pipfile.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • openshift/Dockerfile.requirements
  • openshift/hack/generate_requirements.md
  • openshift/hack/generate_requirements.py
  • openshift/requirements-build.txt
  • openshift/requirements-build1.txt
  • openshift/requirements-pre-build.txt
  • openshift/requirements.txt


openshift-ci Bot commented May 7, 2026

@mytreya-rh: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.
