Skip to content

feat: add /qualify AI qualification workflow#474

Open
myakove wants to merge 13 commits into
mainfrom
feature/qualify-workflow
Open

feat: add /qualify AI qualification workflow#474
myakove wants to merge 13 commits into
mainfrom
feature/qualify-workflow

Conversation

@myakove

@myakove myakove commented May 6, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds the /qualify AI qualification workflow — a fully automated end-to-end pipeline that takes a feature design doc or bug report and produces verified tests with cluster proof.

What's Included

Prompt Template

  • llm/qualify/prompts/qualify.md — Main /qualify command that orchestrates all 4 phases

Agents

  • llm/qualify/agents/test-planner.md — Reads feature/bug docs → produces structured test plans
  • llm/qualify/agents/cluster-verifier.md — Independently verifies OpenShift cluster state after test execution

Skill

  • llm/qualify/skills/proof-generator/SKILL.md — Assembles proof.md reports with test results + cluster evidence

Templates

  • llm/qualify/templates/test-plan-template.md — Test plan skeleton
  • llm/qualify/templates/proof-template.md — Proof report skeleton

Documentation

  • llm/qualify/README.md — Full usage guide with setup instructions for pi, Claude Code, Cursor, and other AI CLIs
  • llm/qualify/workflow-diagrams.md — Mermaid flowcharts (workflow, components, sequence diagram)

Workflow Overview

/qualify --type feature --source <url> --cluster ~/kubeconfig
  1. Phase 0: Parse args, validate cluster, collect versions (OCP/MTV/CNV)
  2. Phase 1: AI reads source → produces test plan → human reviews
  3. Phase 2: AI writes tests → runs on real cluster → cluster-verifier independently validates
  4. Phase 3: 3 parallel code reviewers → pre-commit → PR
  5. Phase 4: Generates self-contained proof.md with verdict (QUALIFIED / NOT QUALIFIED / BUG FIXED / BUG NOT FIXED)

Human Checkpoints

  • Test plan review (Phase 1)
  • Bug: permanent test or verify-only? (Phase 0)
  • AI stuck after 3 retries (Phase 2)
  • PR review (Phase 3)

Do NOT auto-merge. This needs human review.

Summary by CodeRabbit

  • New Features

    • Adds an end-to-end "qualify" workflow to generate tests, run them on a real cluster when explicitly invoked, verify cluster state independently, and produce a self-contained qualification proof.
  • Documentation

    • Adds comprehensive docs: workflow phases, CLI arguments/modes (feature vs bug verify-only), agent/skill/prompt specs, test-plan and proof templates, verification evidence rules, and workflow diagrams; includes a note about explicit user invocation for real-cluster test runs.
  • Chores

    • Updated ignore rules to exclude qualification output artifacts.

@coderabbitai

coderabbitai Bot commented May 6, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Adds an end-to-end /qualify workflow: orchestrator prompt, agent and skill specifications (test-planner, cluster-verifier, proof-generator), templates, diagrams, docs, and a .qualify/ gitignore entry for generated artifacts and outputs.

Changes

AI Qualification Workflow for MTV API Tests

Layer / File(s) Summary
Configuration and Policy
.gitignore, CLAUDE.md, README.md
Adds .qualify/ to .gitignore and documents that /qualify may run pytest on a real cluster when explicitly invoked with cluster credentials (pointer to llm/qualify/README.md).
Documentation Overview
llm/qualify/README.md, README.md
Adds the /qualify command overview, five-phase sequence, CLI args (--type,--source,--cluster,--name), usage examples, human checkpoints, AI-CLI setup instructions, expected directory structure, outputs, and prerequisites.
Workflow Architecture
llm/qualify/workflow-diagrams.md
Adds detailed Mermaid diagrams and sequence flows describing Phases 0–4, bug verify-only branching, component relationships, and key takeaways.
Main Orchestration Prompt
llm/qualify/prompts/qualify.md
Defines Phase 0 (arg parsing, cluster validation, version collection), Phase 1 (test-plan generation with human approval), Phase 2 (test writing, pytest execution, cluster verification with retry loop and verify-only path), Phase 3 (code review, pre-commit, PR creation), Phase 4 (proof assembly and verdict), and critical gating rules.
Test Planner Agent
llm/qualify/agents/test-planner.md
Specifies Phase 1 agent: required inputs/prereqs, exact test-plan.md structure (overview, scenarios, step→test mappings, expected outcomes, verification points, VM config, tests_params, pytest markers), quality checklist, and output behavior.
Templates
llm/qualify/templates/test-plan-template.md, llm/qualify/templates/proof-template.md
Adds Markdown templates for test-plan and proof reports: placeholders for metadata, prerequisites, scenarios, test configuration (tests_params), pytest marker guidance, test result tables with full output blocks, cluster verification evidence sections, and risk assessment.
Cluster Verifier Agent
llm/qualify/agents/cluster-verifier.md
Specifies independent oc-based verification: connectivity gating (oc whoami/oc cluster-info), mandatory version capture (OCP/MTV/CNV), migration verification checklist (VM readiness, disks/DVs/PVCs, networks, StorageMap/NetworkMap, Plans/Migrations), per-check evidence capture rules, structured Markdown report, bug verification mode (BUG FIXED / BUG NOT FIXED), and failure handling.
Proof Generator Skill
llm/qualify/skills/proof-generator/SKILL.md
Defines exact proof.md template and rules: Summary, Environment (versions), Test Execution Results (collapsible pytest output), Cluster Verification (verification table + raw evidence), Qualification Decision/verdict wording, evidence redaction rules, and output path conventions.
Orchestration: Write & Verify
llm/qualify/prompts/qualify.md
Describes delegation to test writers, running pytest on a real cluster (permanent test vs verify-only), logging pytest outputs, invoking cluster-verifier, retry/escalation loops, and branch to PR creation only for permanent tests.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Orchestrator
  participant TestPlannerAgent
  participant PythonWriter
  participant RealCluster
  participant ClusterVerifier
  participant ProofGenerator
  participant Git
  User->>Orchestrator: invoke /qualify --type --source [--cluster]
  Orchestrator->>TestPlannerAgent: generate test-plan.md from source
  Orchestrator->>User: request human approval of test-plan.md
  Orchestrator->>PythonWriter: write tests & config (branch for permanent tests)
  PythonWriter->>RealCluster: run pytest (capture output -> test-output.log)
  RealCluster-->>Orchestrator: pytest stdout/stderr + exit code
  Orchestrator->>ClusterVerifier: run independent cluster verification
  ClusterVerifier-->>Orchestrator: cluster verification report
  Orchestrator->>ProofGenerator: assemble proof.md (tests + verification + versions)
  Orchestrator->>Git: create PR (permanent-test path) with proof reference
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add /qualify AI qualification workflow' directly and clearly summarizes the main change: introduction of a new /qualify AI-driven qualification workflow system.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/qualify-workflow

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rh-bot-1

rh-bot-1 commented May 6, 2026

Copy link
Copy Markdown

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. Status Checks: All required status checks must pass
  3. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  4. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • solenoci

Reviewers:

  • krcmarik
  • myakove
  • solenoci
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])
  • Test Oracle: Triggers: approved (cursor/gpt-5.4-xhigh-fast); /test-oracle can be used anytime

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@rh-bot-1

Copy link
Copy Markdown

Clean rebase detected — no code changes compared to previous head (e42aac5).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Around line 146-150: Add mandatory redaction rules to the evidence collection
section that currently instructs to store "The full output" and raw evidence
into proof.md: update the cluster-verifier.md checklist to require automatic
redaction before persisting raw logs (e.g., mask secrets, API keys, tokens,
emails, IPs, and YAML anchors), provide a short canonical redaction policy and
example regex patterns, and add a one-liner command or reference to the
redaction utility to run prior to saving evidence so any step that calls out
"The full output" or writes to proof.md enforces redaction.
- Line 35: The document currently has contradictory failure semantics ("stop
immediately" vs "continue checking other items"); update the verification
guidance to clearly distinguish connectivity/authentication failures from
individual check failures: state that failure of any cluster-level connectivity
or authentication check (the sentence currently saying "stop immediately") must
abort the entire verification and report no-trust, whereas non-connectivity
per-check failures should be logged and verification should continue to collect
all failures (the area currently saying "continue checking other items"); change
the two conflicting sentences so the first explicitly names
"connectivity/authentication checks" as abort conditions and the later paragraph
(around the per-check rules) explicitly documents that per-check failures do not
abort but are aggregated as partial failures, and ensure the same clarified rule
text replaces the existing lines referenced in the doc.

In `@llm/qualify/prompts/qualify.md`:
- Around line 136-139: The doc has a conflict between the temporary test
location and the verify-only execution command: update the verify-only example
to run pytest directly against the temporary test file path
(/tmp/qualify-<name>/<test_file>.py) instead of using the repo test selector
(tests/<path>::<TestClass>); change the command shown (the "uv run pytest ..."
example) to point to /tmp/qualify-<name>/<test_file>.py, keep the same pytest
flags (--tc-file, --tc-format, -p no:xdist), and continue piping output to
.qualify/<type>/<name>/test-output.log so verify-only remains isolated from repo
state.

In `@llm/qualify/skills/proof-generator/SKILL.md`:
- Around line 130-132: Update SKILL.md's evidence policy for proof.md so raw
YAML/logs are required to be sanitized before inclusion: change the bullet that
mandates raw evidence in collapsible <details> to require "sanitized, redacted
evidence" and add a short checklist in SKILL.md/proof.md that instructs
redacting secrets/PII (API keys, tokens, passwords, private IPs, certs),
replacing values with placeholders like <REDACTED> and documenting what was
removed; ensure the collapsible details still include enough context (file
names, non-sensitive fields, and diffs) and add a statement on how to mark
omitted sections (e.g., "[…redacted…]") so reviewers know evidence was
intentionally redacted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b340adfa-9cf1-4110-860e-e91ae4f91673

📥 Commits

Reviewing files that changed from the base of the PR and between 922e9a8 and 077b62d.

📒 Files selected for processing (15)
  • .gitignore
  • CLAUDE.md
  • README.md
  • llm/qualify/README.md
  • llm/qualify/agents/.gitkeep
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/agents/test-planner.md
  • llm/qualify/prompts/.gitkeep
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/.gitkeep
  • llm/qualify/skills/proof-generator/SKILL.md
  • llm/qualify/templates/.gitkeep
  • llm/qualify/templates/proof-template.md
  • llm/qualify/templates/test-plan-template.md
  • llm/qualify/workflow-diagrams.md

Comment thread llm/qualify/agents/cluster-verifier.md Outdated
Comment thread llm/qualify/agents/cluster-verifier.md Outdated
Comment thread llm/qualify/prompts/qualify.md
Comment thread llm/qualify/skills/proof-generator/SKILL.md
@myakove

myakove commented May 11, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/prompts/qualify.md`:
- Around line 42-53: After the "Collect environment versions" block in
qualify.md, add explicit failure-handling instructions: if any of the three oc
commands (OCP, MTV, CNV) fail (missing CSV/namespace or RBAC denied), record
that specific version as `UNKNOWN` and capture the command's error message;
state that the proof report will include these `UNKNOWN` entries and that
missing versions will cause proof-generator/SKILL.md's qualification logic to
mark the report NOT QUALIFIED, mirroring the behavior described in
cluster-verifier.md.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b19bc28d-78cd-4c42-8328-1589dd033318

📥 Commits

Reviewing files that changed from the base of the PR and between 077b62d and 6453247.

📒 Files selected for processing (3)
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/SKILL.md

Comment thread llm/qualify/prompts/qualify.md Outdated
@qodo-code-review

qodo-code-review Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 227b8ee

@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

llm/qualify/prompts/qualify.md:139 (qodo bug) — Pytest exit not enforced

Addressed: Fixed: Added explicit PYTEST_EXIT gating logic to qualify.md.

llm/qualify/workflow-diagrams.md:310 (qodo bug) — Diagram example missing args

Addressed: Fixed: Added values to diagram command invocation.

llm/qualify/agents/cluster-verifier.md:7 (qodo bug) — Agent tools list incomplete

Addressed: Fixed: Made tool name mapping note actionable.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 2, 2026
@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
llm/qualify/agents/cluster-verifier.md (1)

261-261: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Clarify that error messages should also be redacted for sensitive content.

Line 261 instructs the agent to "Include error messages verbatim," but this could conflict with the redaction rules established in lines 158-160. Error messages may contain sensitive data (e.g., URLs with embedded tokens, credentials echoed in authentication failures, or kubeconfig paths with secrets).

WHY this matters: Without explicit guidance, an agent might include a verbatim error message like Error: authentication failed with token abc123xyz... directly into the proof.md artifact, leaking credentials.

Recommendation: Add a parenthetical note clarifying that redaction still applies:

-- **Include error messages verbatim** — do not paraphrase or summarize errors.
+- **Include error messages verbatim (with sensitive values redacted)** — do not paraphrase or summarize errors, but apply the same redaction rules as for evidence output.

This preserves the intent (don't lose diagnostic information by paraphrasing) while maintaining security.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/agents/cluster-verifier.md` at line 261, Update the guidance at
the "Include error messages verbatim" line (the Line 261 instruction) to
explicitly state that verbatim inclusion must still apply the existing redaction
rules (referenced earlier around lines 158-160); instruct the agent to redact
sensitive substrings (tokens, credentials, secret-containing URLs, kubeconfig
paths, etc.) by replacing them with consistent placeholders (e.g.,
[REDACTED_TOKEN]) while preserving the original error structure and exact
surrounding text for diagnostic fidelity, and to annotate proof.md with a short
note of what was redacted and why.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Line 261: Update the guidance at the "Include error messages verbatim" line
(the Line 261 instruction) to explicitly state that verbatim inclusion must
still apply the existing redaction rules (referenced earlier around lines
158-160); instruct the agent to redact sensitive substrings (tokens,
credentials, secret-containing URLs, kubeconfig paths, etc.) by replacing them
with consistent placeholders (e.g., [REDACTED_TOKEN]) while preserving the
original error structure and exact surrounding text for diagnostic fidelity, and
to annotate proof.md with a short note of what was redacted and why.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 41bb5d85-45ea-45b6-aa90-af6dce5b523a

📥 Commits

Reviewing files that changed from the base of the PR and between ea5dbed and 227b8ee.

📒 Files selected for processing (5)
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/prompts/qualify.md
  • llm/qualify/templates/proof-template.md
  • llm/qualify/templates/test-plan-template.md
  • llm/qualify/workflow-diagrams.md

@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai[bot]

The following review comments were reviewed and a decision was made:

llm/qualify/agents/cluster-verifier.md:261 (outside diff) — Clarify that error messages should also be redacted for sensitive content.

Addressed: Fixed: Updated to 'Include error messages verbatim (with sensitive values redacted)' to maintain consistency with the redaction rules defined earlier in the document.

@qodo-code-review

qodo-code-review Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 2da0e65

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

@myakove — good fix. Making error-message redaction explicit at the failure-reporting section (line 261) is important because:

  • WHY it matters: Error messages from oc commands can contain kubeconfig paths, tokens, namespace names, or IP addresses that shouldn't be committed into proof reports stored in the repo. The redaction rules earlier in the document cover output, but failure detail sections are a distinct capture point where sensitive data easily leaks.
  • The consistency with the earlier redaction rules means reviewers and AI agents have a single, uniform instruction: redact sensitive values everywhere, including in verbatim error output. ✅

@myakove

myakove commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

myakove added 13 commits June 4, 2026 13:01
Full qualification workflow: test plan → write tests → verify on cluster → PR with proof.

Components:
- /qualify prompt template (orchestrator)
- test-planner agent (reads feature/bug docs → test plans)
- cluster-verifier agent (independent cluster state verification)
- proof-generator skill (assembles proof.md reports)
- Templates for test plans and proof reports
- Clarify failure semantics in cluster-verifier agent
- Add redaction rules to proof-generator skill
- Fix verify-only path in qualify prompt
Safety, redaction, retry bounds, and doc consistency.
- require --cluster, normalize placeholders, capture version errors
Add explicit step in Phase 0 to extract canonical bug identifiers
from --source URLs (Jira/GitHub patterns), support --name override,
and prompt user on extraction failure.
- Add UNKNOWN+diagnostics pattern for OCP version retrieval,
  matching the existing MTV/CNV error-capture approach
- Introduce artifact_key concept (name for features, id for bugs)
  and unify all downstream path references to use it consistently
- Add set -o pipefail and PIPESTATUS to preserve pytest exit codes through tee
- Fix ask_user tool reference for cross-CLI compatibility
- Clarify UNKNOWN version handling links to NOT QUALIFIED verdict
- Clarify AGENTS.md scope in workflow references
- Move HTML comment outside YAML frontmatter in cluster-verifier
- Make step 6 explicitly conditional in test-plan template
- Add preserve_static_ips to tests_params example
- Add --cluster arg to sequence diagram
- Add Failure Details section to proof template
- Add explicit PYTEST_EXIT gating logic to enforce pass/fail branching
- Add concrete argument values to sequence diagram example
- Improve tool name mapping comment with actionable instructions
Add parenthetical noting that sensitive value redaction still applies
when including error messages verbatim.
@qodo-code-review

Copy link
Copy Markdown

Code Review by Qodo

Grey Divider

Looking for bugs?

Check back in a few minutes. An AI review agent is analyzing this pull request.

Grey Divider

Qodo Logo

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Around line 44-47: The OCP version example uses a bare command ("oc get
clusterversion version -o jsonpath='{.status.desired.version}'") but the
/qualify contract in llm/qualify/prompts/qualify.md requires recording UNKNOWN
with error details; update the example to capture command failure and stderr and
demonstrate recording UNKNOWN including the error output (e.g., show running the
oc command with stderr capture and an explicit check that emits UNKNOWN: <error>
when the command fails), and mention the exact behavior in cluster-verifier.md
so diagnostics match the MTV/CNV examples.

In `@llm/qualify/agents/test-planner.md`:
- Around line 157-171: Update the example `tests_params` entry to document the
repo convention: state that each VM object in the `virtual_machines` list should
use the same base `name` (e.g., the example key `test_feature_scenario_name`),
rely on the framework’s UUID-based cloning, and avoid adding an explicit
`clone_name` unless the test intentionally targets non-standard VM naming;
include this guidance next to the `virtual_machines` array and mention
`warm_migration`/`preserve_static_ips` as shown so maintainers editing
`tests/tests_config/config.py` and consumers of `tests_params` know to follow
the same-base-name + UUID-clone pattern and reserve `clone_name` for
special-case tests only.
- Around line 92-93: The documented 6-step copy-offload sequence currently lists
`storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used`, which
violates the repo standard; update the sequence string in the doc (the "6-step
copy-offload" entry) so XCOPY verification comes before VM checks—i.e., change
`... → check_vms → check_xcopy_used` to `... → test_check_xcopy_used →
test_check_vms` (or at minimum `... → check_xcopy_used → check_vms`) wherever
that exact sequence appears.

In `@llm/qualify/prompts/qualify.md`:
- Around line 166-173: The verify-only temp path example still uses the
placeholder <name> and must use the canonical identifier artifact_key to avoid
drift; update the instructions and examples that create and run temporary test
files to use /tmp/qualify-<artifact_key>/ (and referenced test paths like
/tmp/qualify-<artifact_key>/<test_file>.py) and any related mentions to
consistently reference artifact_key instead of <name> or <id> so verify-only
phases use the same canonical identifier as introduced elsewhere.
- Line 243: Resolve the contradiction between the Phase 0 policy and the later
rule by choosing one deterministic behavior and updating the document: either
change the rule at "3. **Always collect versions in Phase 0** — if versions
cannot be determined, report and ask user" to state that unknown versions are
recorded as "UNKNOWN" and processing continues (matching the behavior described
in the earlier lines "continue with `UNKNOWN` and let proof logic decide"), or
change the earlier lines 68–70 to require prompting the user when version cannot
be determined; then update all mentions (the Phase 0 rule and the paragraph
referencing `UNKNOWN`) so they consistently reflect the chosen behavior and
remove the conflicting sentence so only the single deterministic rule remains.

In `@llm/qualify/skills/proof-generator/SKILL.md`:
- Line 136: The doc currently hard-codes "❌ NOT QUALIFIED" for missing versions
in SKILL.md's "Versions are MANDATORY" rule, which conflicts with the precedence
for type=bug; change the rule text and any templates so missing version
information resolves to the verdict determined by report type (e.g., if
type==bug produce "🐛 BUG NOT FIXED", otherwise "❌ NOT QUALIFIED"), ensuring
verdict selection follows the type-based precedence referenced at Lines 116/125
and the /qualify orchestration contract; update the SKILL.md wording, any
example outputs, and the decision mapping used by the qualify logic to make the
behavior deterministic and mention that commit messages should be considered
when relevant.

In `@llm/qualify/workflow-diagrams.md`:
- Around line 128-131: Phase 2 diagram currently shows cluster-verifier (CV1)
always running after RUN/pytest; change the flow so CV1 is only reached when
pytest passes by gating on PASS1 / PYTEST_EXIT == 0 (i.e., only follow the
"yes"/passed branch from PASS1 to CV1); when PASS1 is "no" (pytest failed /
PYTEST_EXIT != 0) route to the fix/retry path instead of invoking CV1. Apply the
same gating fix to the other occurrence around lines 145-149 so both CV1
instances are conditional on pytest success.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 03927824-b8a0-4ca9-928d-7020bf1ca020

📥 Commits

Reviewing files that changed from the base of the PR and between 227b8ee and cc7886e.

📒 Files selected for processing (15)
  • .gitignore
  • CLAUDE.md
  • README.md
  • llm/qualify/README.md
  • llm/qualify/agents/.gitkeep
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/agents/test-planner.md
  • llm/qualify/prompts/.gitkeep
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/.gitkeep
  • llm/qualify/skills/proof-generator/SKILL.md
  • llm/qualify/templates/.gitkeep
  • llm/qualify/templates/proof-template.md
  • llm/qualify/templates/test-plan-template.md
  • llm/qualify/workflow-diagrams.md

Comment on lines +44 to +47
```bash
# OCP version
oc get clusterversion version -o jsonpath='{.status.desired.version}'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: OCP version example can’t satisfy your own UNKNOWN: <error> rule

Line 46 uses a direct oc get call, but Line 59 requires recording UNKNOWN with error details. This block can’t capture command stderr for OCP, unlike MTV/CNV. That weakens failure diagnostics and breaks parity with the /qualify contract in llm/qualify/prompts/qualify.md.

Proposed doc patch
 # OCP version
-oc get clusterversion version -o jsonpath='{.status.desired.version}'
+OCP_VERSION_RAW="$(oc get clusterversion version -o jsonpath='{.status.desired.version}' 2>&1)"
+OCP_VERSION="$(printf '%s\n' "$OCP_VERSION_RAW")"
+# if empty or command error -> record: UNKNOWN: <OCP_VERSION_RAW or "no clusterversion match">

Based on learnings: “always explicitly acknowledge and analyze commit messages… when relevant”; this aligns with the recent commit series that strengthened UNKNOWN diagnostics for version collection.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/agents/cluster-verifier.md` around lines 44 - 47, The OCP version
example uses a bare command ("oc get clusterversion version -o
jsonpath='{.status.desired.version}'") but the /qualify contract in
llm/qualify/prompts/qualify.md requires recording UNKNOWN with error details;
update the example to capture command failure and stderr and demonstrate
recording UNKNOWN including the error output (e.g., show running the oc command
with stderr capture and an explicit check that emits UNKNOWN: <error> when the
command fails), and mention the exact behavior in cluster-verifier.md so
diagnostics match the MTV/CNV examples.

Comment on lines +92 to +93
- **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Copy-offload step order conflicts with repository standard and will produce wrong plans.

The commit objective is standardizing /qualify workflow behavior, but this sequence regresses a core contract: copy-offload must verify XCOPY before final VM checks. Current order (check_vms then check_xcopy_used) contradicts project guidance and can mask transport-path failures until after endpoint assertions.

Proposed fix
-- **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used`
+- **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_xcopy_used → check_vms`
...
-Add step 5.5 (`test_verify_shared_disk_data`) or step 6 (`test_check_xcopy_used`) for 6-step patterns.
+Add step 5.5 (`test_verify_shared_disk_data`) for shared-disk, or step 5.5 (`test_check_xcopy_used`) for copy-offload before `test_check_vms`.

Based on learnings: “For copy-offload tests, use 6-step pattern: storagemap -> networkmap -> plan -> migrate -> test_check_xcopy_used -> test_check_vms.”

Also applies to: 106-106

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/agents/test-planner.md` around lines 92 - 93, The documented
6-step copy-offload sequence currently lists `storagemap → networkmap → plan →
migrate → check_vms → check_xcopy_used`, which violates the repo standard;
update the sequence string in the doc (the "6-step copy-offload" entry) so XCOPY
verification comes before VM checks—i.e., change `... → check_vms →
check_xcopy_used` to `... → test_check_xcopy_used → test_check_vms` (or at
minimum `... → check_xcopy_used → check_vms`) wherever that exact sequence
appears.

Comment on lines +157 to +171
The exact `tests_params` dict entry to add to `tests/tests_config/config.py`:

```python
"test_feature_scenario_name": {
"virtual_machines": [
{
"name": "vm-name",
"source_vm_power": "on",
"guest_agent": True,
},
],
"warm_migration": False,
"preserve_static_ips": False,
},
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

MEDIUM: Add explicit VM naming/clone guidance in tests_params example to prevent config anti-patterns.

This section should codify repo convention: same base VM name per test, rely on framework UUID cloning, and avoid clone_name except special-name tests. Without this, the agent may generate non-idiomatic config entries.

Proposed addition
 "test_feature_scenario_name": {
     "virtual_machines": [
         {
             "name": "vm-name",
             "source_vm_power": "on",
             "guest_agent": True,
         },
     ],
     "warm_migration": False,
     "preserve_static_ips": False,
 },
+
+# Naming convention:
+# - Keep a shared base `name` for VMs in the scenario; framework appends UUIDs when cloning.
+# - Do not set `clone_name` unless the scenario explicitly validates non-conforming/special VM names.

Based on learnings: “VM entries within a test should use the same base name… Explicit clone_name fields should only be used for tests that specifically verify operator behavior with non-conforming or special VM names.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/agents/test-planner.md` around lines 157 - 171, Update the
example `tests_params` entry to document the repo convention: state that each VM
object in the `virtual_machines` list should use the same base `name` (e.g., the
example key `test_feature_scenario_name`), rely on the framework’s UUID-based
cloning, and avoid adding an explicit `clone_name` unless the test intentionally
targets non-standard VM naming; include this guidance next to the
`virtual_machines` array and mention `warm_migration`/`preserve_static_ips` as
shown so maintainers editing `tests/tests_config/config.py` and consumers of
`tests_params` know to follow the same-base-name + UUID-clone pattern and
reserve `clone_name` for special-case tests only.

Comment on lines +166 to +173
1. Write a **temporary test file** in `/tmp/qualify-<name>/` (not in the repo)
2. Run it on the cluster using the temporary path, for example:

```bash
export KUBECONFIG=<path>
set -o pipefail
uv run pytest /tmp/qualify-<name>/<test_file>.py -v \
--tc-file=tests/tests_config/config.py --tc-format=python -p no:xdist \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Use one canonical identifier in verify-only temp paths (artifact_key), not <name>.

The recent commit series correctly introduced artifact_key, but this step still uses <name>. In bug flows, <id> can differ from derived <name>, which reintroduces identifier drift across phases.

Suggested patch
-1. Write a **temporary test file** in `/tmp/qualify-<name>/` (not in the repo)
+1. Write a **temporary test file** in `/tmp/qualify-<artifact_key>/` (not in the repo)
@@
-   uv run pytest /tmp/qualify-<name>/<test_file>.py -v \
+   uv run pytest /tmp/qualify-<artifact_key>/<test_file>.py -v \

Based on learnings: always explicitly acknowledge commit-level fixes when reviewing related changes (here, the prior artifact_key normalization fix).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/prompts/qualify.md` around lines 166 - 173, The verify-only temp
path example still uses the placeholder <name> and must use the canonical
identifier artifact_key to avoid drift; update the instructions and examples
that create and run temporary test files to use /tmp/qualify-<artifact_key>/
(and referenced test paths like /tmp/qualify-<artifact_key>/<test_file>.py) and
any related mentions to consistently reference artifact_key instead of <name> or
<id> so verify-only phases use the same canonical identifier as introduced
elsewhere.


1. **Never mark as QUALIFIED without cluster verification proof** — test passing alone is NOT sufficient
2. **Never skip Phase 1 human review** — the test plan MUST be approved before writing code
3. **Always collect versions in Phase 0** — if versions cannot be determined, report and ask user

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Critical Rules contradict Phase 0 version policy.

Line 243 says to “report and ask user” when versions are unknown, but Lines 68–70 explicitly require continuing with UNKNOWN and letting proof logic decide final verdict. Keep one deterministic rule.

Suggested patch
-3. **Always collect versions in Phase 0** — if versions cannot be determined, report and ask user
+3. **Always collect versions in Phase 0** — if versions cannot be determined, record `UNKNOWN` with diagnostics and continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/prompts/qualify.md` at line 243, Resolve the contradiction
between the Phase 0 policy and the later rule by choosing one deterministic
behavior and updating the document: either change the rule at "3. **Always
collect versions in Phase 0** — if versions cannot be determined, report and ask
user" to state that unknown versions are recorded as "UNKNOWN" and processing
continues (matching the behavior described in the earlier lines "continue with
`UNKNOWN` and let proof logic decide"), or change the earlier lines 68–70 to
require prompting the user when version cannot be determined; then update all
mentions (the Phase 0 rule and the paragraph referencing `UNKNOWN`) so they
consistently reflect the chosen behavior and remove the conflicting sentence so
only the single deterministic rule remains.

- **Always redact sensitive values** before writing proof artifacts
(tokens, passwords, secrets, auth headers, private keys, kubeconfig credentials).
Preserve only fields needed to validate behavior.
- **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as `❌ NOT QUALIFIED` with reason: `"Missing version information"`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Version-missing rule conflicts with bug verdict precedence

Line 136 hard-codes ❌ NOT QUALIFIED, but for type=bug this must resolve to 🐛 BUG NOT FIXED per your own precedence rules (Lines 116, 125) and /qualify orchestration contract. Keep verdict selection deterministic by type.

Proposed doc patch
-- **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as `❌ NOT QUALIFIED` with reason: `"Missing version information"`.
+- **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as:
+- `❌ NOT QUALIFIED` for `type=feature`
+- `🐛 BUG NOT FIXED` for `type=bug`
+  with reason: `"Missing version information"`.

Based on learnings: “always explicitly acknowledge and analyze commit messages… when relevant”; this guards the recent commit intent that clarified UNKNOWN/version-driven verdict behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/skills/proof-generator/SKILL.md` at line 136, The doc currently
hard-codes "❌ NOT QUALIFIED" for missing versions in SKILL.md's "Versions are
MANDATORY" rule, which conflicts with the precedence for type=bug; change the
rule text and any templates so missing version information resolves to the
verdict determined by report type (e.g., if type==bug produce "🐛 BUG NOT
FIXED", otherwise "❌ NOT QUALIFIED"), ensuring verdict selection follows the
type-based precedence referenced at Lines 116/125 and the /qualify orchestration
contract; update the SKILL.md wording, any example outputs, and the decision
mapping used by the qualify logic to make the behavior deterministic and mention
that commit messages should be considered when relevant.

Comment on lines +128 to +131
RUN["Run pytest\non cluster"]:::agent
LOG1[/"💾 test-output.log"/]:::artifact
CV1["cluster-verifier\nchecks state"]:::agent
PASS1{{"Passed?"}}:::decision

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Phase 2 diagram should gate cluster-verifier on pytest success (PYTEST_EXIT == 0)

The current flow implies cluster verification always runs after test execution. The prompt contract requires skipping cluster verification when pytest fails, then entering fix/retry. Given the commit intent (feat: add /qualify AI qualification workflow), keeping this gate explicit prevents incorrect orchestration implementations.

Suggested diagram patch
-    RUN["Run pytest\non cluster"]:::agent
-    LOG1[/"💾 test-output.log"/]:::artifact
-    CV1["cluster-verifier\nchecks state"]:::agent
-    PASS1{{"Passed?"}}:::decision
+    RUN["Run pytest\non cluster"]:::agent
+    LOG1[/"💾 test-output.log"/]:::artifact
+    PY1{{"Pytest\nexit == 0?"}}:::decision
+    CV1["cluster-verifier\nchecks state"]:::agent
+    PASS1{{"Cluster\nverification passed?"}}:::decision
@@
-    MODE -- "Feature /\npermanent" --> BRANCH --> WRITE --> RUN --> LOG1 --> CV1 --> PASS1
+    MODE -- "Feature /\npermanent" --> BRANCH --> WRITE --> RUN --> LOG1 --> PY1
+    PY1 -- "No ❌" --> RETRY1
+    PY1 -- "Yes ✅" --> CV1 --> PASS1
@@
-    RTEMP["Run temp test\non cluster"]:::agent
-    LOG2[/"💾 test-output.log"/]:::artifact
-    CV2["cluster-verifier\nchecks state"]:::agent
-    PASS2{{"Passed?"}}:::decision
+    RTEMP["Run temp test\non cluster"]:::agent
+    LOG2[/"💾 test-output.log"/]:::artifact
+    PY2{{"Pytest\nexit == 0?"}}:::decision
+    CV2["cluster-verifier\nchecks state"]:::agent
+    PASS2{{"Cluster\nverification passed?"}}:::decision
@@
-    MODE -- "Verify-only" --> WTEMP --> RTEMP --> LOG2 --> CV2 --> PASS2
+    MODE -- "Verify-only" --> WTEMP --> RTEMP --> LOG2 --> PY2
+    PY2 -- "No ❌" --> RETRY2
+    PY2 -- "Yes ✅" --> CV2 --> PASS2

Also applies to: 145-149

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@llm/qualify/workflow-diagrams.md` around lines 128 - 131, Phase 2 diagram
currently shows cluster-verifier (CV1) always running after RUN/pytest; change
the flow so CV1 is only reached when pytest passes by gating on PASS1 /
PYTEST_EXIT == 0 (i.e., only follow the "yes"/passed branch from PASS1 to CV1);
when PASS1 is "no" (pytest failed / PYTEST_EXIT != 0) route to the fix/retry
path instead of invoking CV1. Apply the same gating fix to the other occurrence
around lines 145-149 so both CV1 instances are conditional on pytest success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants