feat(integration-tests): Add dataset-manager action and verification infra for clp-json.#2234
feat(integration-tests): Add dataset-manager action and verification infra for clp-json.#2234quinntaylormitchell wants to merge 8 commits into
clp-json.#2234Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new test utility module to build and verify clp-json ChangesDataset Manager Test Utilities
Base Test Utilities
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
clp-json.
clp-json.clp-json.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 87-92: The current logic treats an empty datasets_to_del list as
valid and proceeds to run dataset-manager del with no targets; update the branch
handling datasets_to_del so that if datasets_to_del is None or an empty list it
triggers pytest.fail (same message) unless del_all is true, and only assign
args.datasets = [dataset.dataset_name for dataset in datasets_to_del] when
datasets_to_del is a non-empty list; reference variables: datasets_to_del,
del_all, args.datasets and the dataset-manager del invocation.
In `@integration-tests/tests/utils/classes.py`:
- Around line 166-168: The get_output method currently concatenates
completed_proc.stdout and completed_proc.stderr directly, which can fuse the
last stdout line with the first stderr line if stdout lacks a trailing newline;
update get_output in the same class to insert a separator (e.g. '\n') between
stdout and stderr only when stdout does not already end with a newline (or use a
join that ensures exactly one newline), so the returned combined string
preserves line boundaries and avoids duplicating newlines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 910fb98f-4c3b-4506-8eb9-97d0e242beb2
📒 Files selected for processing (2)
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.pyintegration-tests/tests/utils/classes.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 199-204: The loop over output_lines modifies the list during
iteration by calling output_lines.remove(line) after finding the regex match;
change this to avoid in-loop mutation: iterate with enumerate to capture the
index (for i, line in enumerate(output_lines)), set num_datasets =
int(match.group(1)) and then call output_lines.pop(i) (or slice to create a new
list without that element) before breaking; this keeps the logic around
re.search, match and num_datasets intact while preventing modifying output_lines
during iteration.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: dde9b259-bff6-44ba-bd24-45d4a41ff4ff
📒 Files selected for processing (1)
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 157-162: The supporting ExternalAction returned by
dataset_manager_list (assigned to list_action) is never executed before its
properties (list_action.completed_proc.returncode and list_action.log_file_path)
are inspected; call its execute method using the same executor used elsewhere
(e.g., the executor variable or function used to run dataset_manager_del/list
actions) or invoke list_action.run()/execute_with_executor(...) so the
subprocess runs and populates completed_proc and log_file_path, then check
list_action.completed_proc.returncode and use list_action.log_file_path in the
pytest.fail message if non-zero.
- Around line 203-218: The parsing currently treats a missing "Found N datasets"
marker or a count mismatch as an empty list; change dataset_manager parsing so
that if the regex r"Found (\d+) datasets" does not match any line, it
raises/returns a verification failure (e.g., raise ValueError) instead of
returning dataset_list, and after collecting matches for r"INFO
\[dataset_manager\] (.+)" validate that len(dataset_list) == num_datasets and
raise/return failure if they differ; update the code paths around the
num_datasets, output_lines and dataset_list variables (and the function that
calls this parsing) to propagate that error so del --all verification cannot
proceed on untrusted/ drifting output.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 408a842a-a13b-4a20-ae57-cc5433ba7d6b
📒 Files selected for processing (1)
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py`:
- Around line 92-97: The code currently allows both datasets_to_del and del_all
to be set which produces args.datasets plus the --all flag; update the logic in
the dataset-manager routine to enforce mutual exclusivity by checking if both
datasets_to_del and del_all are truthy and calling pytest.fail with a clear
message (e.g., "Specify either datasets_to_del or del_all, not both") before
building args; otherwise keep the existing branches that set args.datasets from
datasets_to_del or require one of them (the existing pytest.fail for neither) so
that only one delete target path can be used.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cb0c3134-90f3-4456-bc30-15e3d127f077
📒 Files selected for processing (1)
integration-tests/tests/package_tests/clp_json/utils/dataset_manager.py
Description
This PR adds the action and verification infrastructure for testing the package dataset-manager (only for
clp-json).Checklist
breaking change.
Validation performed
Tested on development branch; all tests pass.
Summary by CodeRabbit