feat: add /qualify AI qualification workflow#474
Conversation
WalkthroughAdds an end-to-end /qualify workflow: orchestrator prompt, agent and skill specifications (test-planner, cluster-verifier, proof-generator), templates, diagrams, docs, and a ChangesAI Qualification Workflow for MTV API Tests
Sequence Diagram(s)sequenceDiagram
participant User
participant Orchestrator
participant TestPlannerAgent
participant PythonWriter
participant RealCluster
participant ClusterVerifier
participant ProofGenerator
participant Git
User->>Orchestrator: invoke /qualify --type --source [--cluster]
Orchestrator->>TestPlannerAgent: generate test-plan.md from source
Orchestrator->>User: request human approval of test-plan.md
Orchestrator->>PythonWriter: write tests & config (branch for permanent tests)
PythonWriter->>RealCluster: run pytest (capture output -> test-output.log)
RealCluster-->>Orchestrator: pytest stdout/stderr + exit code
Orchestrator->>ClusterVerifier: run independent cluster verification
ClusterVerifier-->>Orchestrator: cluster verification report
Orchestrator->>ProofGenerator: assemble proof.md (tests + verification + versions)
Orchestrator->>Git: create PR (permanent-test path) with proof reference
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
AI Features
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
33becb1 to
b8ea397
Compare
e42aac5 to
077b62d
Compare
|
Clean rebase detected — no code changes compared to previous head ( |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Around line 146-150: Add mandatory redaction rules to the evidence collection
section that currently instructs to store "The full output" and raw evidence
into proof.md: update the cluster-verifier.md checklist to require automatic
redaction before persisting raw logs (e.g., mask secrets, API keys, tokens,
emails, IPs, and YAML anchors), provide a short canonical redaction policy and
example regex patterns, and add a one-liner command or reference to the
redaction utility to run prior to saving evidence so any step that calls out
"The full output" or writes to proof.md enforces redaction.
- Line 35: The document currently has contradictory failure semantics ("stop
immediately" vs "continue checking other items"); update the verification
guidance to clearly distinguish connectivity/authentication failures from
individual check failures: state that failure of any cluster-level connectivity
or authentication check (the sentence currently saying "stop immediately") must
abort the entire verification and report no-trust, whereas non-connectivity
per-check failures should be logged and verification should continue to collect
all failures (the area currently saying "continue checking other items"); change
the two conflicting sentences so the first explicitly names
"connectivity/authentication checks" as abort conditions and the later paragraph
(around the per-check rules) explicitly documents that per-check failures do not
abort but are aggregated as partial failures, and ensure the same clarified rule
text replaces the existing lines referenced in the doc.
In `@llm/qualify/prompts/qualify.md`:
- Around line 136-139: The doc has a conflict between the temporary test
location and the verify-only execution command: update the verify-only example
to run pytest directly against the temporary test file path
(/tmp/qualify-<name>/<test_file>.py) instead of using the repo test selector
(tests/<path>::<TestClass>); change the command shown (the "uv run pytest ..."
example) to point to /tmp/qualify-<name>/<test_file>.py, keep the same pytest
flags (--tc-file, --tc-format, -p no:xdist), and continue piping output to
.qualify/<type>/<name>/test-output.log so verify-only remains isolated from repo
state.
In `@llm/qualify/skills/proof-generator/SKILL.md`:
- Around line 130-132: Update SKILL.md's evidence policy for proof.md so raw
YAML/logs are required to be sanitized before inclusion: change the bullet that
mandates raw evidence in collapsible <details> to require "sanitized, redacted
evidence" and add a short checklist in SKILL.md/proof.md that instructs
redacting secrets/PII (API keys, tokens, passwords, private IPs, certs),
replacing values with placeholders like <REDACTED> and documenting what was
removed; ensure the collapsible details still include enough context (file
names, non-sensitive fields, and diffs) and add a statement on how to mark
omitted sections (e.g., "[…redacted…]") so reviewers know evidence was
intentionally redacted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b340adfa-9cf1-4110-860e-e91ae4f91673
📒 Files selected for processing (15)
.gitignoreCLAUDE.mdREADME.mdllm/qualify/README.mdllm/qualify/agents/.gitkeepllm/qualify/agents/cluster-verifier.mdllm/qualify/agents/test-planner.mdllm/qualify/prompts/.gitkeepllm/qualify/prompts/qualify.mdllm/qualify/skills/proof-generator/.gitkeepllm/qualify/skills/proof-generator/SKILL.mdllm/qualify/templates/.gitkeepllm/qualify/templates/proof-template.mdllm/qualify/templates/test-plan-template.mdllm/qualify/workflow-diagrams.md
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@llm/qualify/prompts/qualify.md`:
- Around line 42-53: After the "Collect environment versions" block in
qualify.md, add explicit failure-handling instructions: if any of the three oc
commands (OCP, MTV, CNV) fail (missing CSV/namespace or RBAC denied), record
that specific version as `UNKNOWN` and capture the command's error message;
state that the proof report will include these `UNKNOWN` entries and that
missing versions will cause proof-generator/SKILL.md's qualification logic to
mark the report NOT QUALIFIED, mirroring the behavior described in
cluster-verifier.md.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b19bc28d-78cd-4c42-8328-1589dd033318
📒 Files selected for processing (3)
llm/qualify/agents/cluster-verifier.mdllm/qualify/prompts/qualify.mdllm/qualify/skills/proof-generator/SKILL.md
6453247 to
82e124d
Compare
|
Code review by qodo was updated up to the latest commit 227b8ee |
|
@qodo-code-review[bot] The following review comments were reviewed and a decision was made:
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
llm/qualify/agents/cluster-verifier.md (1)
261-261: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winClarify that error messages should also be redacted for sensitive content.
Line 261 instructs the agent to "Include error messages verbatim," but this could conflict with the redaction rules established in lines 158-160. Error messages may contain sensitive data (e.g., URLs with embedded tokens, credentials echoed in authentication failures, or kubeconfig paths with secrets).
WHY this matters: Without explicit guidance, an agent might include a verbatim error message like
Error: authentication failed with token abc123xyz...directly into the proof.md artifact, leaking credentials.Recommendation: Add a parenthetical note clarifying that redaction still applies:
-- **Include error messages verbatim** — do not paraphrase or summarize errors. +- **Include error messages verbatim (with sensitive values redacted)** — do not paraphrase or summarize errors, but apply the same redaction rules as for evidence output.This preserves the intent (don't lose diagnostic information by paraphrasing) while maintaining security.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@llm/qualify/agents/cluster-verifier.md` at line 261, Update the guidance at the "Include error messages verbatim" line (the Line 261 instruction) to explicitly state that verbatim inclusion must still apply the existing redaction rules (referenced earlier around lines 158-160); instruct the agent to redact sensitive substrings (tokens, credentials, secret-containing URLs, kubeconfig paths, etc.) by replacing them with consistent placeholders (e.g., [REDACTED_TOKEN]) while preserving the original error structure and exact surrounding text for diagnostic fidelity, and to annotate proof.md with a short note of what was redacted and why.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Line 261: Update the guidance at the "Include error messages verbatim" line
(the Line 261 instruction) to explicitly state that verbatim inclusion must
still apply the existing redaction rules (referenced earlier around lines
158-160); instruct the agent to redact sensitive substrings (tokens,
credentials, secret-containing URLs, kubeconfig paths, etc.) by replacing them
with consistent placeholders (e.g., [REDACTED_TOKEN]) while preserving the
original error structure and exact surrounding text for diagnostic fidelity, and
to annotate proof.md with a short note of what was redacted and why.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 41bb5d85-45ea-45b6-aa90-af6dce5b523a
📒 Files selected for processing (5)
llm/qualify/agents/cluster-verifier.mdllm/qualify/prompts/qualify.mdllm/qualify/templates/proof-template.mdllm/qualify/templates/test-plan-template.mdllm/qualify/workflow-diagrams.md
|
@coderabbitai[bot] The following review comments were reviewed and a decision was made:
|
|
Code review by qodo was updated up to the latest commit 2da0e65 |
|
Tip For best results, initiate chat on the files or code changes.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Full qualification workflow: test plan → write tests → verify on cluster → PR with proof. Components: - /qualify prompt template (orchestrator) - test-planner agent (reads feature/bug docs → test plans) - cluster-verifier agent (independent cluster state verification) - proof-generator skill (assembles proof.md reports) - Templates for test plans and proof reports
- Clarify failure semantics in cluster-verifier agent - Add redaction rules to proof-generator skill - Fix verify-only path in qualify prompt
Safety, redaction, retry bounds, and doc consistency.
- require --cluster, normalize placeholders, capture version errors
Add explicit step in Phase 0 to extract canonical bug identifiers from --source URLs (Jira/GitHub patterns), support --name override, and prompt user on extraction failure.
- Add UNKNOWN+diagnostics pattern for OCP version retrieval, matching the existing MTV/CNV error-capture approach - Introduce artifact_key concept (name for features, id for bugs) and unify all downstream path references to use it consistently
- Add set -o pipefail and PIPESTATUS to preserve pytest exit codes through tee - Fix ask_user tool reference for cross-CLI compatibility - Clarify UNKNOWN version handling links to NOT QUALIFIED verdict - Clarify AGENTS.md scope in workflow references - Move HTML comment outside YAML frontmatter in cluster-verifier - Make step 6 explicitly conditional in test-plan template - Add preserve_static_ips to tests_params example - Add --cluster arg to sequence diagram - Add Failure Details section to proof template
- Add explicit PYTEST_EXIT gating logic to enforce pass/fail branching - Add concrete argument values to sequence diagram example - Improve tool name mapping comment with actionable instructions
Add parenthetical noting that sensitive value redaction still applies when including error messages verbatim.
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Around line 44-47: The OCP version example uses a bare command ("oc get
clusterversion version -o jsonpath='{.status.desired.version}'") but the
/qualify contract in llm/qualify/prompts/qualify.md requires recording UNKNOWN
with error details; update the example to capture command failure and stderr and
demonstrate recording UNKNOWN including the error output (e.g., show running the
oc command with stderr capture and an explicit check that emits UNKNOWN: <error>
when the command fails), and mention the exact behavior in cluster-verifier.md
so diagnostics match the MTV/CNV examples.
In `@llm/qualify/agents/test-planner.md`:
- Around line 157-171: Update the example `tests_params` entry to document the
repo convention: state that each VM object in the `virtual_machines` list should
use the same base `name` (e.g., the example key `test_feature_scenario_name`),
rely on the framework’s UUID-based cloning, and avoid adding an explicit
`clone_name` unless the test intentionally targets non-standard VM naming;
include this guidance next to the `virtual_machines` array and mention
`warm_migration`/`preserve_static_ips` as shown so maintainers editing
`tests/tests_config/config.py` and consumers of `tests_params` know to follow
the same-base-name + UUID-clone pattern and reserve `clone_name` for
special-case tests only.
- Around line 92-93: The documented 6-step copy-offload sequence currently lists
`storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used`, which
violates the repo standard; update the sequence string in the doc (the "6-step
copy-offload" entry) so XCOPY verification comes before VM checks—i.e., change
`... → check_vms → check_xcopy_used` to `... → test_check_xcopy_used →
test_check_vms` (or at minimum `... → check_xcopy_used → check_vms`) wherever
that exact sequence appears.
In `@llm/qualify/prompts/qualify.md`:
- Around line 166-173: The verify-only temp path example still uses the
placeholder <name> and must use the canonical identifier artifact_key to avoid
drift; update the instructions and examples that create and run temporary test
files to use /tmp/qualify-<artifact_key>/ (and referenced test paths like
/tmp/qualify-<artifact_key>/<test_file>.py) and any related mentions to
consistently reference artifact_key instead of <name> or <id> so verify-only
phases use the same canonical identifier as introduced elsewhere.
- Line 243: Resolve the contradiction between the Phase 0 policy and the later
rule by choosing one deterministic behavior and updating the document: either
change the rule at "3. **Always collect versions in Phase 0** — if versions
cannot be determined, report and ask user" to state that unknown versions are
recorded as "UNKNOWN" and processing continues (matching the behavior described
in the earlier lines "continue with `UNKNOWN` and let proof logic decide"), or
change the earlier lines 68–70 to require prompting the user when version cannot
be determined; then update all mentions (the Phase 0 rule and the paragraph
referencing `UNKNOWN`) so they consistently reflect the chosen behavior and
remove the conflicting sentence so only the single deterministic rule remains.
In `@llm/qualify/skills/proof-generator/SKILL.md`:
- Line 136: The doc currently hard-codes "❌ NOT QUALIFIED" for missing versions
in SKILL.md's "Versions are MANDATORY" rule, which conflicts with the precedence
for type=bug; change the rule text and any templates so missing version
information resolves to the verdict determined by report type (e.g., if
type==bug produce "🐛 BUG NOT FIXED", otherwise "❌ NOT QUALIFIED"), ensuring
verdict selection follows the type-based precedence referenced at Lines 116/125
and the /qualify orchestration contract; update the SKILL.md wording, any
example outputs, and the decision mapping used by the qualify logic to make the
behavior deterministic and mention that commit messages should be considered
when relevant.
In `@llm/qualify/workflow-diagrams.md`:
- Around line 128-131: Phase 2 diagram currently shows cluster-verifier (CV1)
always running after RUN/pytest; change the flow so CV1 is only reached when
pytest passes by gating on PASS1 / PYTEST_EXIT == 0 (i.e., only follow the
"yes"/passed branch from PASS1 to CV1); when PASS1 is "no" (pytest failed /
PYTEST_EXIT != 0) route to the fix/retry path instead of invoking CV1. Apply the
same gating fix to the other occurrence around lines 145-149 so both CV1
instances are conditional on pytest success.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 03927824-b8a0-4ca9-928d-7020bf1ca020
📒 Files selected for processing (15)
.gitignoreCLAUDE.mdREADME.mdllm/qualify/README.mdllm/qualify/agents/.gitkeepllm/qualify/agents/cluster-verifier.mdllm/qualify/agents/test-planner.mdllm/qualify/prompts/.gitkeepllm/qualify/prompts/qualify.mdllm/qualify/skills/proof-generator/.gitkeepllm/qualify/skills/proof-generator/SKILL.mdllm/qualify/templates/.gitkeepllm/qualify/templates/proof-template.mdllm/qualify/templates/test-plan-template.mdllm/qualify/workflow-diagrams.md
| ```bash | ||
| # OCP version | ||
| oc get clusterversion version -o jsonpath='{.status.desired.version}' | ||
|
|
There was a problem hiding this comment.
HIGH: OCP version example can’t satisfy your own UNKNOWN: <error> rule
Line 46 uses a direct oc get call, but Line 59 requires recording UNKNOWN with error details. This block can’t capture command stderr for OCP, unlike MTV/CNV. That weakens failure diagnostics and breaks parity with the /qualify contract in llm/qualify/prompts/qualify.md.
Proposed doc patch
# OCP version
-oc get clusterversion version -o jsonpath='{.status.desired.version}'
+OCP_VERSION_RAW="$(oc get clusterversion version -o jsonpath='{.status.desired.version}' 2>&1)"
+OCP_VERSION="$(printf '%s\n' "$OCP_VERSION_RAW")"
+# if empty or command error -> record: UNKNOWN: <OCP_VERSION_RAW or "no clusterversion match">Based on learnings: “always explicitly acknowledge and analyze commit messages… when relevant”; this aligns with the recent commit series that strengthened UNKNOWN diagnostics for version collection.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/agents/cluster-verifier.md` around lines 44 - 47, The OCP version
example uses a bare command ("oc get clusterversion version -o
jsonpath='{.status.desired.version}'") but the /qualify contract in
llm/qualify/prompts/qualify.md requires recording UNKNOWN with error details;
update the example to capture command failure and stderr and demonstrate
recording UNKNOWN including the error output (e.g., show running the oc command
with stderr capture and an explicit check that emits UNKNOWN: <error> when the
command fails), and mention the exact behavior in cluster-verifier.md so
diagnostics match the MTV/CNV examples.
| - **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used` | ||
|
|
There was a problem hiding this comment.
HIGH: Copy-offload step order conflicts with repository standard and will produce wrong plans.
The commit objective is standardizing /qualify workflow behavior, but this sequence regresses a core contract: copy-offload must verify XCOPY before final VM checks. Current order (check_vms then check_xcopy_used) contradicts project guidance and can mask transport-path failures until after endpoint assertions.
Proposed fix
-- **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_vms → check_xcopy_used`
+- **6-step copy-offload**: `storagemap → networkmap → plan → migrate → check_xcopy_used → check_vms`
...
-Add step 5.5 (`test_verify_shared_disk_data`) or step 6 (`test_check_xcopy_used`) for 6-step patterns.
+Add step 5.5 (`test_verify_shared_disk_data`) for shared-disk, or step 5.5 (`test_check_xcopy_used`) for copy-offload before `test_check_vms`.Based on learnings: “For copy-offload tests, use 6-step pattern: storagemap -> networkmap -> plan -> migrate -> test_check_xcopy_used -> test_check_vms.”
Also applies to: 106-106
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/agents/test-planner.md` around lines 92 - 93, The documented
6-step copy-offload sequence currently lists `storagemap → networkmap → plan →
migrate → check_vms → check_xcopy_used`, which violates the repo standard;
update the sequence string in the doc (the "6-step copy-offload" entry) so XCOPY
verification comes before VM checks—i.e., change `... → check_vms →
check_xcopy_used` to `... → test_check_xcopy_used → test_check_vms` (or at
minimum `... → check_xcopy_used → check_vms`) wherever that exact sequence
appears.
| The exact `tests_params` dict entry to add to `tests/tests_config/config.py`: | ||
|
|
||
| ```python | ||
| "test_feature_scenario_name": { | ||
| "virtual_machines": [ | ||
| { | ||
| "name": "vm-name", | ||
| "source_vm_power": "on", | ||
| "guest_agent": True, | ||
| }, | ||
| ], | ||
| "warm_migration": False, | ||
| "preserve_static_ips": False, | ||
| }, | ||
| ``` |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
MEDIUM: Add explicit VM naming/clone guidance in tests_params example to prevent config anti-patterns.
This section should codify repo convention: same base VM name per test, rely on framework UUID cloning, and avoid clone_name except special-name tests. Without this, the agent may generate non-idiomatic config entries.
Proposed addition
"test_feature_scenario_name": {
"virtual_machines": [
{
"name": "vm-name",
"source_vm_power": "on",
"guest_agent": True,
},
],
"warm_migration": False,
"preserve_static_ips": False,
},
+
+# Naming convention:
+# - Keep a shared base `name` for VMs in the scenario; framework appends UUIDs when cloning.
+# - Do not set `clone_name` unless the scenario explicitly validates non-conforming/special VM names.Based on learnings: “VM entries within a test should use the same base name… Explicit clone_name fields should only be used for tests that specifically verify operator behavior with non-conforming or special VM names.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/agents/test-planner.md` around lines 157 - 171, Update the
example `tests_params` entry to document the repo convention: state that each VM
object in the `virtual_machines` list should use the same base `name` (e.g., the
example key `test_feature_scenario_name`), rely on the framework’s UUID-based
cloning, and avoid adding an explicit `clone_name` unless the test intentionally
targets non-standard VM naming; include this guidance next to the
`virtual_machines` array and mention `warm_migration`/`preserve_static_ips` as
shown so maintainers editing `tests/tests_config/config.py` and consumers of
`tests_params` know to follow the same-base-name + UUID-clone pattern and
reserve `clone_name` for special-case tests only.
| 1. Write a **temporary test file** in `/tmp/qualify-<name>/` (not in the repo) | ||
| 2. Run it on the cluster using the temporary path, for example: | ||
|
|
||
| ```bash | ||
| export KUBECONFIG=<path> | ||
| set -o pipefail | ||
| uv run pytest /tmp/qualify-<name>/<test_file>.py -v \ | ||
| --tc-file=tests/tests_config/config.py --tc-format=python -p no:xdist \ |
There was a problem hiding this comment.
HIGH: Use one canonical identifier in verify-only temp paths (artifact_key), not <name>.
The recent commit series correctly introduced artifact_key, but this step still uses <name>. In bug flows, <id> can differ from derived <name>, which reintroduces identifier drift across phases.
Suggested patch
-1. Write a **temporary test file** in `/tmp/qualify-<name>/` (not in the repo)
+1. Write a **temporary test file** in `/tmp/qualify-<artifact_key>/` (not in the repo)
@@
- uv run pytest /tmp/qualify-<name>/<test_file>.py -v \
+ uv run pytest /tmp/qualify-<artifact_key>/<test_file>.py -v \Based on learnings: always explicitly acknowledge commit-level fixes when reviewing related changes (here, the prior artifact_key normalization fix).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/prompts/qualify.md` around lines 166 - 173, The verify-only temp
path example still uses the placeholder <name> and must use the canonical
identifier artifact_key to avoid drift; update the instructions and examples
that create and run temporary test files to use /tmp/qualify-<artifact_key>/
(and referenced test paths like /tmp/qualify-<artifact_key>/<test_file>.py) and
any related mentions to consistently reference artifact_key instead of <name> or
<id> so verify-only phases use the same canonical identifier as introduced
elsewhere.
|
|
||
| 1. **Never mark as QUALIFIED without cluster verification proof** — test passing alone is NOT sufficient | ||
| 2. **Never skip Phase 1 human review** — the test plan MUST be approved before writing code | ||
| 3. **Always collect versions in Phase 0** — if versions cannot be determined, report and ask user |
There was a problem hiding this comment.
HIGH: Critical Rules contradict Phase 0 version policy.
Line 243 says to “report and ask user” when versions are unknown, but Lines 68–70 explicitly require continuing with UNKNOWN and letting proof logic decide final verdict. Keep one deterministic rule.
Suggested patch
-3. **Always collect versions in Phase 0** — if versions cannot be determined, report and ask user
+3. **Always collect versions in Phase 0** — if versions cannot be determined, record `UNKNOWN` with diagnostics and continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/prompts/qualify.md` at line 243, Resolve the contradiction
between the Phase 0 policy and the later rule by choosing one deterministic
behavior and updating the document: either change the rule at "3. **Always
collect versions in Phase 0** — if versions cannot be determined, report and ask
user" to state that unknown versions are recorded as "UNKNOWN" and processing
continues (matching the behavior described in the earlier lines "continue with
`UNKNOWN` and let proof logic decide"), or change the earlier lines 68–70 to
require prompting the user when version cannot be determined; then update all
mentions (the Phase 0 rule and the paragraph referencing `UNKNOWN`) so they
consistently reflect the chosen behavior and remove the conflicting sentence so
only the single deterministic rule remains.
| - **Always redact sensitive values** before writing proof artifacts | ||
| (tokens, passwords, secrets, auth headers, private keys, kubeconfig credentials). | ||
| Preserve only fields needed to validate behavior. | ||
| - **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as `❌ NOT QUALIFIED` with reason: `"Missing version information"`. |
There was a problem hiding this comment.
HIGH: Version-missing rule conflicts with bug verdict precedence
Line 136 hard-codes ❌ NOT QUALIFIED, but for type=bug this must resolve to 🐛 BUG NOT FIXED per your own precedence rules (Lines 116, 125) and /qualify orchestration contract. Keep verdict selection deterministic by type.
Proposed doc patch
-- **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as `❌ NOT QUALIFIED` with reason: `"Missing version information"`.
+- **Versions are MANDATORY.** If any version (OCP, MTV, or CNV) is missing, mark the report as:
+- `❌ NOT QUALIFIED` for `type=feature`
+- `🐛 BUG NOT FIXED` for `type=bug`
+ with reason: `"Missing version information"`.Based on learnings: “always explicitly acknowledge and analyze commit messages… when relevant”; this guards the recent commit intent that clarified UNKNOWN/version-driven verdict behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/skills/proof-generator/SKILL.md` at line 136, The doc currently
hard-codes "❌ NOT QUALIFIED" for missing versions in SKILL.md's "Versions are
MANDATORY" rule, which conflicts with the precedence for type=bug; change the
rule text and any templates so missing version information resolves to the
verdict determined by report type (e.g., if type==bug produce "🐛 BUG NOT
FIXED", otherwise "❌ NOT QUALIFIED"), ensuring verdict selection follows the
type-based precedence referenced at Lines 116/125 and the /qualify orchestration
contract; update the SKILL.md wording, any example outputs, and the decision
mapping used by the qualify logic to make the behavior deterministic and mention
that commit messages should be considered when relevant.
| RUN["Run pytest\non cluster"]:::agent | ||
| LOG1[/"💾 test-output.log"/]:::artifact | ||
| CV1["cluster-verifier\nchecks state"]:::agent | ||
| PASS1{{"Passed?"}}:::decision |
There was a problem hiding this comment.
HIGH: Phase 2 diagram should gate cluster-verifier on pytest success (PYTEST_EXIT == 0)
The current flow implies cluster verification always runs after test execution. The prompt contract requires skipping cluster verification when pytest fails, then entering fix/retry. Given the commit intent (feat: add /qualify AI qualification workflow), keeping this gate explicit prevents incorrect orchestration implementations.
Suggested diagram patch
- RUN["Run pytest\non cluster"]:::agent
- LOG1[/"💾 test-output.log"/]:::artifact
- CV1["cluster-verifier\nchecks state"]:::agent
- PASS1{{"Passed?"}}:::decision
+ RUN["Run pytest\non cluster"]:::agent
+ LOG1[/"💾 test-output.log"/]:::artifact
+ PY1{{"Pytest\nexit == 0?"}}:::decision
+ CV1["cluster-verifier\nchecks state"]:::agent
+ PASS1{{"Cluster\nverification passed?"}}:::decision
@@
- MODE -- "Feature /\npermanent" --> BRANCH --> WRITE --> RUN --> LOG1 --> CV1 --> PASS1
+ MODE -- "Feature /\npermanent" --> BRANCH --> WRITE --> RUN --> LOG1 --> PY1
+ PY1 -- "No ❌" --> RETRY1
+ PY1 -- "Yes ✅" --> CV1 --> PASS1
@@
- RTEMP["Run temp test\non cluster"]:::agent
- LOG2[/"💾 test-output.log"/]:::artifact
- CV2["cluster-verifier\nchecks state"]:::agent
- PASS2{{"Passed?"}}:::decision
+ RTEMP["Run temp test\non cluster"]:::agent
+ LOG2[/"💾 test-output.log"/]:::artifact
+ PY2{{"Pytest\nexit == 0?"}}:::decision
+ CV2["cluster-verifier\nchecks state"]:::agent
+ PASS2{{"Cluster\nverification passed?"}}:::decision
@@
- MODE -- "Verify-only" --> WTEMP --> RTEMP --> LOG2 --> CV2 --> PASS2
+ MODE -- "Verify-only" --> WTEMP --> RTEMP --> LOG2 --> PY2
+ PY2 -- "No ❌" --> RETRY2
+ PY2 -- "Yes ✅" --> CV2 --> PASS2Also applies to: 145-149
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@llm/qualify/workflow-diagrams.md` around lines 128 - 131, Phase 2 diagram
currently shows cluster-verifier (CV1) always running after RUN/pytest; change
the flow so CV1 is only reached when pytest passes by gating on PASS1 /
PYTEST_EXIT == 0 (i.e., only follow the "yes"/passed branch from PASS1 to CV1);
when PASS1 is "no" (pytest failed / PYTEST_EXIT != 0) route to the fix/retry
path instead of invoking CV1. Apply the same gating fix to the other occurrence
around lines 145-149 so both CV1 instances are conditional on pytest success.

Summary
Adds the
/qualifyAI qualification workflow — a fully automated end-to-end pipeline that takes a feature design doc or bug report and produces verified tests with cluster proof.What's Included
Prompt Template
llm/qualify/prompts/qualify.md— Main/qualifycommand that orchestrates all 4 phasesAgents
llm/qualify/agents/test-planner.md— Reads feature/bug docs → produces structured test plansllm/qualify/agents/cluster-verifier.md— Independently verifies OpenShift cluster state after test executionSkill
llm/qualify/skills/proof-generator/SKILL.md— Assembles proof.md reports with test results + cluster evidenceTemplates
llm/qualify/templates/test-plan-template.md— Test plan skeletonllm/qualify/templates/proof-template.md— Proof report skeletonDocumentation
llm/qualify/README.md— Full usage guide with setup instructions for pi, Claude Code, Cursor, and other AI CLIsllm/qualify/workflow-diagrams.md— Mermaid flowcharts (workflow, components, sequence diagram)Workflow Overview
Human Checkpoints
Do NOT auto-merge. This needs human review.
Summary by CodeRabbit
New Features
Documentation
Chores