Skip to content

feat(integration-tests): Add search action and verification infra.#2235

Open
quinntaylormitchell wants to merge 6 commits into
y-scope:mainfrom
quinntaylormitchell:testing-redesign-9
Open

feat(integration-tests): Add search action and verification infra.#2235
quinntaylormitchell wants to merge 6 commits into
y-scope:mainfrom
quinntaylormitchell:testing-redesign-9

Conversation

@quinntaylormitchell
Copy link
Copy Markdown
Collaborator

@quinntaylormitchell quinntaylormitchell commented Apr 29, 2026

Description

This PR adds the action and verification infrastructure for testing package search.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested on development branch; all tests pass.

Summary by CodeRabbit

  • Tests
    • Expanded integration test utilities for package search with multiple search modes (case, file-targeting, counts, time ranges).
    • Added verification that compares search output to reference tool results and reports pass/fail with details.
    • Test actions now include structured/typed arguments for clearer intent and reporting.
    • Test dataset metadata includes explicit single-file match entries to stabilise single-match scenarios.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Walkthrough

Adds a typed SearchArgs builder, ClpPackageSearchType modes, functions to construct and run CLP package search actions, grep-based verification helpers that normalize results per mode, and a new dataset metadata field plus validation for a single-match file.

Changes

Search Test Utilities

Layer / File(s) Summary
Module header and SearchArgs builder
integration-tests/tests/package_tests/utils/search.py
Module imports, logger, default interval constant, and SearchArgs with to_cmd() building CLI flags for dataset/file targeting, ignore-case, count modes, time bounds, raw output, and wildcard query.
Search modes and action construction
integration-tests/tests/package_tests/utils/search.py
ClpPackageSearchType enum plus search_clp_package() and _construct_args() mapping dataset metadata to CLI flags and returning a ClpAction with typed SearchArgs.
Verification flow and helpers
integration-tests/tests/package_tests/utils/search.py
verify_search_action() validates exit code and SearchArgs, builds a mode-specific grep command, runs grep, and compares normalized grep output to normalized CLP output. Helper functions: _construct_grep_verification_cmd(), _get_grep_options_from_search_type(), _format_grep_result_for_search_type(), _format_search_result_for_search_type().

Dataset Metadata

Layer / File(s) Summary
single_match_file metadata and validation
integration-tests/tests/utils/classes.py, integration-tests/tests/data/json_multifile/metadata.json, integration-tests/tests/data/text_multifile/metadata.json
Adds single_match_file field to SampleDatasetMetadata and a SampleDataset.__post_init__ check that raises ValueError if the specified file is not present in metadata.file_names. Corresponding dataset metadata files include the new field.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Code
    participant Builder as SearchArgs / search_clp_package
    participant CLP as CLP subprocess
    participant Verifier as verify_search_action
    participant Grep as grep subprocess

    Test->>Builder: request ClpAction(dataset, type, query)
    Builder-->>Test: ClpAction (cmd list, SearchArgs)
    Test->>CLP: run ClpAction subprocess
    CLP-->>Test: exit code + stdout
    Test->>Verifier: verify_search_action(action, type, dataset)
    Verifier->>Verifier: check exit code & args
    Verifier->>Grep: run mode-specific grep on dataset logs or file path
    Grep-->>Verifier: grep stdout / exit code
    Verifier->>Verifier: normalize outputs and compare
    Verifier-->>Test: VerificationResult.ok() or fail(message)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding search action and verification infrastructure for integration tests, which aligns with the new SearchArgs class, search_clp_package function, and verify_search_action function introduced in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@quinntaylormitchell quinntaylormitchell changed the title feat(integration-tests): Add search infra. feat(integration-tests): Add search action and verification infra. Apr 29, 2026
@quinntaylormitchell quinntaylormitchell marked this pull request as ready for review April 29, 2026 23:41
@quinntaylormitchell quinntaylormitchell requested a review from a team as a code owner April 29, 2026 23:41
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/utils/search.py`:
- Line 134: When running in FILE_PATH mode, guard against empty metadata
file_names before indexing: check that dataset.metadata.file_names is non-empty
(e.g., len(dataset.metadata.file_names) > 0 or truthiness) before using
dataset.metadata.file_names[0]; if empty, raise a clear assertion or ValueError
with a useful message rather than letting IndexError bubble up. Update the
assignment to args.file_path to only occur after this guard; reference the
existing identifiers args.file_path, dataset.metadata.file_names, and FILE_PATH
mode to locate and modify the code.
- Around line 176-177: Replace the bare assert on "args = action.args" with
explicit error handling: check "isinstance(args, SearchArgs)" and if false
return a VerificationResult.fail(...) with a clear message (matching other
validations in this function). Use the same VerificationResult type used
elsewhere in this file so invalid action.args is handled consistently rather
than being stripped by Python -O; reference the variables "args" and
"action.args" and the classes "SearchArgs" and "VerificationResult".
- Around line 184-188: The check treating any non-zero grep exit code as failure
is too strict: GNU grep returns 1 when no matches are found. Update the
condition in the block that inspects grep_action.completed_proc.returncode to
only fail when the return code is neither 0 nor 1 (i.e., allow 0 and 1), and
keep the pytest.fail call that references grep_action.log_file_path for true
errors; ensure you still reference grep_action.completed_proc.returncode and
pytest.fail in the same location.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ef0fc03e-e41f-4ba2-b226-b05ab2ce3723

📥 Commits

Reviewing files that changed from the base of the PR and between 298e2e2 and 3b62e1f.

📒 Files selected for processing (2)
  • integration-tests/tests/package_tests/utils/search.py
  • integration-tests/tests/utils/classes.py

Comment thread integration-tests/tests/package_tests/utils/search.py Outdated
Comment thread integration-tests/tests/package_tests/utils/search.py
Comment thread integration-tests/tests/package_tests/utils/search.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration-tests/tests/package_tests/utils/search.py`:
- Around line 282-286: The COUNT_BY_TIME branch currently uses re.search which
only returns the first "count: <n>" match; change the logic in the case handling
for ClpPackageSearchType.COUNT_RESULTS | ClpPackageSearchType.COUNT_BY_TIME to
use re.findall(r"count: (\d+)", search_result) (or an iterator like re.finditer)
to collect all count buckets, return them joined with "\n" and a trailing
newline, and call pytest.fail if the resulting list is empty; keep the same
match formatting and error message behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 22395b96-f587-4100-b7c2-fa030589ad7f

📥 Commits

Reviewing files that changed from the base of the PR and between 3b62e1f and f9d3189.

📒 Files selected for processing (1)
  • integration-tests/tests/package_tests/utils/search.py

Comment thread integration-tests/tests/package_tests/utils/search.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
integration-tests/tests/package_tests/utils/search.py (2)

175-176: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace assert with explicit verification failure handling.

assert can be stripped with Python optimization flags, which can bypass this validation path. Use explicit runtime handling and return a typed verification failure instead.

Suggested fix
     args = action.args
-    assert isinstance(args, SearchArgs)
+    if not isinstance(args, SearchArgs):
+        return action.fail_verification(
+            "Search verification requires `action.args` to be a SearchArgs instance."
+        )
In Python, are assert statements removed when running with optimization flags like -O?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/tests/package_tests/utils/search.py` around lines 175 -
176, The code uses an assert to check the type of action.args which can be
bypassed with Python -O; replace the assert with an explicit runtime check on
action.args (e.g., inspect isinstance(action.args, SearchArgs)) and on failure
raise or return the appropriate verification failure value used by this test
harness (do not rely on AssertionError). Update the handling around
args/action.args and SearchArgs so the function returns or raises the typed
verification failure the test framework expects instead of using assert.

269-273: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

COUNT_BY_TIME parsing only reads the first bucket.

re.search only captures one count: value, so multi-bucket count-by-time output is verified incorrectly. Aggregate all buckets before comparing.

Suggested fix
         case ClpPackageSearchType.COUNT_RESULTS | ClpPackageSearchType.COUNT_BY_TIME:
-            match = re.search(r"count: (\d+)", search_result)
-            if match:
-                return match.group(1) + "\n"
+            counts = [int(value) for value in re.findall(r"count:\s*(\d+)", search_result)]
+            if counts:
+                return f"{sum(counts)}\n"
             pytest.fail(f"The search result '{search_result}' wasn't in the correct format.")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integration-tests/tests/package_tests/utils/search.py` around lines 269 -
273, The COUNT_BY_TIME branch currently uses re.search which only captures the
first "count: X" and misses additional buckets; update the block handling
ClpPackageSearchType.COUNT_RESULTS | ClpPackageSearchType.COUNT_BY_TIME to use
re.findall(r"count: (\d+)", search_result), convert all matches to integers, sum
them (or otherwise aggregate all bucket counts as required), and return the
aggregated value as a string with a trailing newline (e.g., str(total) + "\n");
keep the pytest.fail call if no matches are found.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@integration-tests/tests/package_tests/utils/search.py`:
- Around line 175-176: The code uses an assert to check the type of action.args
which can be bypassed with Python -O; replace the assert with an explicit
runtime check on action.args (e.g., inspect isinstance(action.args, SearchArgs))
and on failure raise or return the appropriate verification failure value used
by this test harness (do not rely on AssertionError). Update the handling around
args/action.args and SearchArgs so the function returns or raises the typed
verification failure the test framework expects instead of using assert.
- Around line 269-273: The COUNT_BY_TIME branch currently uses re.search which
only captures the first "count: X" and misses additional buckets; update the
block handling ClpPackageSearchType.COUNT_RESULTS |
ClpPackageSearchType.COUNT_BY_TIME to use re.findall(r"count: (\d+)",
search_result), convert all matches to integers, sum them (or otherwise
aggregate all bucket counts as required), and return the aggregated value as a
string with a trailing newline (e.g., str(total) + "\n"); keep the pytest.fail
call if no matches are found.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 07eaf020-0cb0-448f-80b7-6cc6a26a3283

📥 Commits

Reviewing files that changed from the base of the PR and between 89677f8 and 613925f.

📒 Files selected for processing (2)
  • integration-tests/tests/package_tests/utils/search.py
  • integration-tests/tests/utils/classes.py

Copy link
Copy Markdown
Contributor

@Bill-hbrhbr Bill-hbrhbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't actually include any search tests right? Let's include them instead of simply introducing a portion of the search test infra.
Also, most of my concerns are elaborated in the comments.

)

args: SearchArgs = _construct_args(clp_package, dataset, search_type, wildcard_query)
return ClpAction(cmd=args.to_cmd(), args=args)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return ClpAction(cmd=args.to_cmd(), args=args)
return ClpAction.from_args(args)

dataset.dataset_name,
)

args: SearchArgs = _construct_args(clp_package, dataset, search_type, wildcard_query)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, feels like there's no need to create a helper for what feels like the meat of search_clp_package.

Comment on lines +171 to +173
returncode_result = action.verify_returncode()
if not returncode_result:
return returncode_result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
returncode_result = action.verify_returncode()
if not returncode_result:
return returncode_result
result = action.verify_returncode()
if not result:
return result

Comment on lines +15 to +16
"single_match_wildcard_query": "\"detail\":\"Roll program complete, heads down attitude achieved for ascent\"",
"single_match_file": "sts-135-2011-07-08.jsonl"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to feel like these entries are more like test cases instead of metadata. Why don't we inline these queries and query results in test code directly?

Also each time you add something, you'd have to update the metadata format documentation.

Comment on lines +79 to +87
class ClpPackageSearchType(Enum):
"""Possible search types."""

BASIC = auto()
FILE_PATH = auto()
IGNORE_CASE = auto()
COUNT_RESULTS = auto()
COUNT_BY_TIME = auto()
TIME_RANGE = auto()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like different test case flavours hiding away as an Enum class inside a utils folder. Without looking at your existing infra, the clearest design to me would look something like:

  @pytest.mark.parametrize(
      "clp_package",
      [CLP_TEXT_MODE, CLP_JSON_MODE],
      indirect=True,
      ids=[CLP_TEXT_MODE.mode_name, CLP_JSON_MODE.mode_name],
  )
  class TestPackageSearch:
      def test_basic_search(self, clp_package: ClpPackage) -> None:
          ...

      def test_file_path_search(self, clp_package: ClpPackage) -> None:
          ...

      def test_ignore_case_search(self, clp_package: ClpPackage) -> None:
          ...

      def test_count_results_search(self, clp_package: ClpPackage) -> None:
          ...

      def test_count_by_time_search(self, clp_package: ClpPackage) -> None:
          ...

      def test_time_range_search(self, clp_package: ClpPackage) -> None:
          ...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, now you already have a split file design. The split file design is sound if the package behaviors are meaningfully different and you want each file to document a specific package flavor.

But it becomes weak if the tests end up looking like:

  • test_clp_text.py → calls a generic helper
  • test_clp_json.py → calls the same generic helper

with all meaningful behavior hidden behind shared utilities.

A sound split file design with search tests would look more like:

  • test_clp_text.py

    • Uses text_multifile
    • Shows CLP/text specific command expectations
    • Does not pass --dataset
    • Does not pass --timestamp-key
    • Verifies plain text search behavior
  • test_clp_json.py

    • Uses json_multifile
    • Shows CLP-S/JSON specific command expectations
    • Passes --dataset
    • Passes --timestamp-key
    • Verifies structured search behavior and unstructured + log-converter.

There is an abundance of over-abstracting the package specific behavior that I'm seeing in these PRs.

A rule of thumb for this test suite could be:

  • Keep package lifecycle fixtures shared.
  • Keep low-level action, logging, and result helpers shared.
  • Keep subprocess calling shared.

However, keep package-specific command construction and expectations visible in test_clp_text.py and test_clp_json.py. That preserves the value of having separate test modules.

Comment on lines +135 to +140
if self.metadata.single_match_file not in self.metadata.file_names:
err_msg = (
f"`single_match_file` '{self.metadata.single_match_file}' is not listed in"
" `file_names`."
)
raise ValueError(err_msg)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, there's no need to verify this, because if the single_match_file contains the wrong value, the search test will fail anyway.
This touches upon my previous comment on what should be metadata and what should merely be test case constants/inputs/outputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants