Skip to content

feat(integration-tests): Add dataset-manager action and verification infra for clp-json.#2234

Open
quinntaylormitchell wants to merge 8 commits into
y-scope:mainfrom
quinntaylormitchell:testing-redesign-8
Open

feat(integration-tests): Add dataset-manager action and verification infra for clp-json.#2234
quinntaylormitchell wants to merge 8 commits into
y-scope:mainfrom
quinntaylormitchell:testing-redesign-8

Conversation

@quinntaylormitchell
Copy link
Copy Markdown
Collaborator

@quinntaylormitchell quinntaylormitchell commented Apr 29, 2026

Description

This PR adds the action and verification infrastructure for testing the package dataset-manager (only for clp-json).

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested on development branch; all tests pass.

Summary by CodeRabbit

  • Tests
    • Added integration test utilities for dataset-manager list and delete flows, including output parsing and automated verification of expected datasets and deletions.
  • Chores
    • Enhanced test infrastructure to carry structured verification arguments with actions, provide a combined stdout/stderr output helper, and return standardized pass/fail verification results for clearer outcomes.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a new test utility module to build and verify clp-json dataset-manager list/delete CLI actions and extends base test utilities: introduces an abstract CmdArgs, attaches structured args to ExternalAction with get_output(), and adds a frozen VerificationResult dataclass.

Changes

Dataset Manager Test Utilities

Layer / File(s) Summary
Data Shape / Args
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
Adds DatasetManagerArgs (fields: script_path, config, subcommand, del_all, datasets) and to_cmd() to render CLI invocation.
Core Commands
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
Adds ClpPackageDatasetManagerSubcommand enum and factories dataset_manager_list() and dataset_manager_del() producing ExternalAction instances.
Verification Logic
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
Adds verify_dataset_manager_list_action_clp_json() and verify_dataset_manager_del_action_clp_json() to validate outputs and deletion effects.
Helper / Parsing
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
Adds _extract_dataset_names_from_output() to parse list output and enforce reported counts; uses logging and pytest failures where appropriate.

Base Test Utilities

Layer / File(s) Summary
Abstract Args Type
integration-tests/tests/utils/classes.py
Adds CmdArgs(BaseModel, ABC) with abstract to_cmd() for structured CLI arg typing.
Action Shape
integration-tests/tests/utils/classes.py
Extends ExternalAction with `args: CmdArgs
Output Helper
integration-tests/tests/utils/classes.py
Adds ExternalAction.get_output() to concatenate stdout and stderr for verification consumers.
Verification Result
integration-tests/tests/utils/classes.py
Adds frozen VerificationResult dataclass with boolean truthiness and ok()/fail() constructors for verification APIs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: introducing dataset-manager action and verification infrastructure for clp-json integration tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@quinntaylormitchell quinntaylormitchell changed the title feat(integration-tests): Add dataset-manager infra. feat(integration-tests): Add dataset-manager test infrastructure for clp-json. Apr 29, 2026
@quinntaylormitchell quinntaylormitchell changed the title feat(integration-tests): Add dataset-manager test infrastructure for clp-json. feat(integration-tests): Add dataset-manager action and verification infra for clp-json. Apr 29, 2026
@quinntaylormitchell quinntaylormitchell marked this pull request as ready for review April 29, 2026 23:41
@quinntaylormitchell quinntaylormitchell requested a review from a team as a code owner April 29, 2026 23:41
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 87-92: The current logic treats an empty datasets_to_del list as
valid and proceeds to run dataset-manager del with no targets; update the branch
handling datasets_to_del so that if datasets_to_del is None or an empty list it
triggers pytest.fail (same message) unless del_all is true, and only assign
args.datasets = [dataset.dataset_name for dataset in datasets_to_del] when
datasets_to_del is a non-empty list; reference variables: datasets_to_del,
del_all, args.datasets and the dataset-manager del invocation.

In `@integration-tests/tests/utils/classes.py`:
- Around line 166-168: The get_output method currently concatenates
completed_proc.stdout and completed_proc.stderr directly, which can fuse the
last stdout line with the first stderr line if stdout lacks a trailing newline;
update get_output in the same class to insert a separator (e.g. '\n') between
stdout and stderr only when stdout does not already end with a newline (or use a
join that ensures exactly one newline), so the returned combined string
preserves line boundaries and avoids duplicating newlines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 910fb98f-4c3b-4506-8eb9-97d0e242beb2

📥 Commits

Reviewing files that changed from the base of the PR and between 298e2e2 and 62f4226.

📒 Files selected for processing (2)
  • integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
  • integration-tests/tests/utils/classes.py

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated
Comment thread integration-tests/tests/utils/classes.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 199-204: The loop over output_lines modifies the list during
iteration by calling output_lines.remove(line) after finding the regex match;
change this to avoid in-loop mutation: iterate with enumerate to capture the
index (for i, line in enumerate(output_lines)), set num_datasets =
int(match.group(1)) and then call output_lines.pop(i) (or slice to create a new
list without that element) before breaking; this keeps the logic around
re.search, match and num_datasets intact while preventing modifying output_lines
during iteration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dde9b259-bff6-44ba-bd24-45d4a41ff4ff

📥 Commits

Reviewing files that changed from the base of the PR and between 62f4226 and 7ceae0d.

📒 Files selected for processing (1)
  • integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 157-162: The supporting ExternalAction returned by
dataset_manager_list (assigned to list_action) is never executed before its
properties (list_action.completed_proc.returncode and list_action.log_file_path)
are inspected; call its execute method using the same executor used elsewhere
(e.g., the executor variable or function used to run dataset_manager_del/list
actions) or invoke list_action.run()/execute_with_executor(...) so the
subprocess runs and populates completed_proc and log_file_path, then check
list_action.completed_proc.returncode and use list_action.log_file_path in the
pytest.fail message if non-zero.
- Around line 203-218: The parsing currently treats a missing "Found N datasets"
marker or a count mismatch as an empty list; change dataset_manager parsing so
that if the regex r"Found (\d+) datasets" does not match any line, it
raises/returns a verification failure (e.g., raise ValueError) instead of
returning dataset_list, and after collecting matches for r"INFO
\[dataset_manager\] (.+)" validate that len(dataset_list) == num_datasets and
raise/return failure if they differ; update the code paths around the
num_datasets, output_lines and dataset_list variables (and the function that
calls this parsing) to propagate that error so del --all verification cannot
proceed on untrusted/ drifting output.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 408a842a-a13b-4a20-ae57-cc5433ba7d6b

📥 Commits

Reviewing files that changed from the base of the PR and between 7ceae0d and baa43c1.

📒 Files selected for processing (1)
  • integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 92-97: The code currently allows both datasets_to_del and del_all
to be set which produces args.datasets plus the --all flag; update the logic in
the dataset-manager routine to enforce mutual exclusivity by checking if both
datasets_to_del and del_all are truthy and calling pytest.fail with a clear
message (e.g., "Specify either datasets_to_del or del_all, not both") before
building args; otherwise keep the existing branches that set args.datasets from
datasets_to_del or require one of them (the existing pytest.fail for neither) so
that only one delete target path can be used.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cb0c3134-90f3-4456-bc30-15e3d127f077

📥 Commits

Reviewing files that changed from the base of the PR and between baa43c1 and 5077d6c.

📒 Files selected for processing (1)
  • integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant