feat(integration-tests): Add dataset-manager action and verification infra for `clp-json`. by quinntaylormitchell · Pull Request #2234 · y-scope/clp

quinntaylormitchell · 2026-04-29T20:57:51Z

Description

This PR adds the action and verification infrastructure for testing the package dataset-manager (only for clp-json).

Checklist

The PR satisfies the contribution guidelines.
This is a breaking change and that has been indicated in the PR title, OR this isn't a
breaking change.
Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested on development branch; all tests pass.

Summary by CodeRabbit

Tests
- Added integration test utilities for dataset-manager list and delete flows, including output parsing and automated verification of expected datasets and deletions.
Chores
- Enhanced test infrastructure to carry structured verification arguments with actions, provide a combined stdout/stderr output helper, and return standardized pass/fail verification results for clearer outcomes.

coderabbitai · 2026-04-29T20:58:00Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds a new test utility module to build and verify clp-json dataset-manager list/delete CLI actions and extends base test utilities: introduces an abstract CmdArgs, attaches structured args to ExternalAction with get_output(), and adds a frozen VerificationResult dataclass.

Changes

Dataset Manager Test Utilities

Layer / File(s)	Summary
Data Shape / Args `integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`	Adds `DatasetManagerArgs` (fields: `script_path`, `config`, `subcommand`, `del_all`, `datasets`) and `to_cmd()` to render CLI invocation.
Core Commands `integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`	Adds `ClpPackageDatasetManagerSubcommand` enum and factories `dataset_manager_list()` and `dataset_manager_del()` producing `ExternalAction` instances.
Verification Logic `integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`	Adds `verify_dataset_manager_list_action_clp_json()` and `verify_dataset_manager_del_action_clp_json()` to validate outputs and deletion effects.
Helper / Parsing `integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`	Adds `_extract_dataset_names_from_output()` to parse list output and enforce reported counts; uses logging and pytest failures where appropriate.

Base Test Utilities

Layer / File(s)	Summary
Abstract Args Type `integration-tests/tests/utils/classes.py`	Adds `CmdArgs(BaseModel, ABC)` with abstract `to_cmd()` for structured CLI arg typing.
Action Shape `integration-tests/tests/utils/classes.py`	Extends `ExternalAction` with `args: CmdArgs
Output Helper `integration-tests/tests/utils/classes.py`	Adds `ExternalAction.get_output()` to concatenate `stdout` and `stderr` for verification consumers.
Verification Result `integration-tests/tests/utils/classes.py`	Adds frozen `VerificationResult` dataclass with boolean truthiness and `ok()`/`fail()` constructors for verification APIs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: introducing dataset-manager action and verification infrastructure for clp-json integration tests.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 87-92: The current logic treats an empty datasets_to_del list as
valid and proceeds to run dataset-manager del with no targets; update the branch
handling datasets_to_del so that if datasets_to_del is None or an empty list it
triggers pytest.fail (same message) unless del_all is true, and only assign
args.datasets = [dataset.dataset_name for dataset in datasets_to_del] when
datasets_to_del is a non-empty list; reference variables: datasets_to_del,
del_all, args.datasets and the dataset-manager del invocation.

In `@integration-tests/tests/utils/classes.py`:
- Around line 166-168: The get_output method currently concatenates
completed_proc.stdout and completed_proc.stderr directly, which can fuse the
last stdout line with the first stderr line if stdout lacks a trailing newline;
update get_output in the same class to insert a separator (e.g. '\n') between
stdout and stderr only when stdout does not already end with a newline (or use a
join that ensures exactly one newline), so the returned combined string
preserves line boundaries and avoids duplicating newlines.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 910fb98f-4c3b-4506-8eb9-97d0e242beb2

📥 Commits

Reviewing files that changed from the base of the PR and between 298e2e2 and 62f4226.

📒 Files selected for processing (2)

integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
integration-tests/tests/utils/classes.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 199-204: The loop over output_lines modifies the list during
iteration by calling output_lines.remove(line) after finding the regex match;
change this to avoid in-loop mutation: iterate with enumerate to capture the
index (for i, line in enumerate(output_lines)), set num_datasets =
int(match.group(1)) and then call output_lines.pop(i) (or slice to create a new
list without that element) before breaking; this keeps the logic around
re.search, match and num_datasets intact while preventing modifying output_lines
during iteration.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dde9b259-bff6-44ba-bd24-45d4a41ff4ff

📥 Commits

Reviewing files that changed from the base of the PR and between 62f4226 and 7ceae0d.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 157-162: The supporting ExternalAction returned by
dataset_manager_list (assigned to list_action) is never executed before its
properties (list_action.completed_proc.returncode and list_action.log_file_path)
are inspected; call its execute method using the same executor used elsewhere
(e.g., the executor variable or function used to run dataset_manager_del/list
actions) or invoke list_action.run()/execute_with_executor(...) so the
subprocess runs and populates completed_proc and log_file_path, then check
list_action.completed_proc.returncode and use list_action.log_file_path in the
pytest.fail message if non-zero.
- Around line 203-218: The parsing currently treats a missing "Found N datasets"
marker or a count mismatch as an empty list; change dataset_manager parsing so
that if the regex r"Found (\d+) datasets" does not match any line, it
raises/returns a verification failure (e.g., raise ValueError) instead of
returning dataset_list, and after collecting matches for r"INFO
\[dataset_manager\] (.+)" validate that len(dataset_list) == num_datasets and
raise/return failure if they differ; update the code paths around the
num_datasets, output_lines and dataset_list variables (and the function that
calls this parsing) to propagate that error so del --all verification cannot
proceed on untrusted/ drifting output.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 408a842a-a13b-4a20-ae57-cc5433ba7d6b

📥 Commits

Reviewing files that changed from the base of the PR and between 7ceae0d and baa43c1.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 92-97: The code currently allows both datasets_to_del and del_all
to be set which produces args.datasets plus the --all flag; update the logic in
the dataset-manager routine to enforce mutual exclusivity by checking if both
datasets_to_del and del_all are truthy and calling pytest.fail with a clear
message (e.g., "Specify either datasets_to_del or del_all, not both") before
building args; otherwise keep the existing branches that set args.datasets from
datasets_to_del or require one of them (the existing pytest.fail for neither) so
that only one delete target path can be used.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cb0c3134-90f3-4456-bc30-15e3d127f077

📥 Commits

Reviewing files that changed from the base of the PR and between baa43c1 and 5077d6c.

📒 Files selected for processing (1)

integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Add dataset-manager infra.

9b020ed

Docstrings.

62f4226

quinntaylormitchell changed the title ~~feat(integration-tests): Add dataset-manager infra.~~ feat(integration-tests): Add dataset-manager test infrastructure for clp-json. Apr 29, 2026

quinntaylormitchell changed the title ~~feat(integration-tests): Add dataset-manager test infrastructure for clp-json.~~ feat(integration-tests): Add dataset-manager action and verification infra for clp-json. Apr 29, 2026

quinntaylormitchell marked this pull request as ready for review April 29, 2026 23:41

quinntaylormitchell requested a review from a team as a code owner April 29, 2026 23:41

quinntaylormitchell requested a review from Bill-hbrhbr April 29, 2026 23:42

coderabbitai Bot reviewed Apr 29, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated

Comment thread integration-tests/tests/utils/classes.py

Rabbit.

7ceae0d

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Rabbit.

baa43c1

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated

Rabbit.

5077d6c

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py Outdated

quinntaylormitchell added 2 commits May 5, 2026 20:38

Rabbit.

6b67e9c

Merge branch 'main' into feature branch

da58477

junhaoliao force-pushed the main branch from 4505f8c to 8977316 Compare May 13, 2026 22:48

Merge branch 'main' into feature branch

23bda40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(integration-tests): Add dataset-manager action and verification infra for `clp-json`.#2234

feat(integration-tests): Add dataset-manager action and verification infra for `clp-json`.#2234
quinntaylormitchell wants to merge 8 commits into
y-scope:mainfrom
quinntaylormitchell:testing-redesign-8

quinntaylormitchell commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading

Reviews paused

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

quinntaylormitchell commented Apr 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Validation performed

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

quinntaylormitchell commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading