Skip to content

Conversation

@aalva500-prog
Copy link
Contributor

@aalva500-prog aalva500-prog commented Dec 18, 2025

Description

This PR continues the work done in PR #4816 to add frequent used queries to the big5 workload based on gap analysis between existing benchmarks and frequent used query patterns.

dedup query is added here: #4991

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

noCharger and others added 3 commits November 17, 2025 09:35
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 18, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Tests
    • Expanded test coverage for regex-based field extraction functionality.
    • Added test cases for LIKE pattern matching with aggregation operations.
    • Added test cases for LIKE pattern matching with sorting operations.
    • Updated test expectations for aggregation span handling and struct field operations.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

This pull request adds test infrastructure for validating PPL functionality around regex transformations and LIKE pattern matching. Three new test methods are added to the CalcitePPLBig5IT class with corresponding PPL query resource files. Test data in big5.json is expanded, and existing test expectations are updated to reflect new behavior.

Changes

Cohort / File(s) Summary
Test Methods
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Added three new test methods: rex_regex_transformation(), script_engine_like_pattern_with_aggregation(), and script_engine_like_pattern_with_sort() to validate regex extraction, LIKE pattern matching with aggregation, and LIKE pattern matching with sorting.
Test Expectation Updates
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java, integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
Updated expected result counts: "Span by Implicit Timestamp" test now expects 3 rows instead of 1; "testIsNotNullWithStruct" test now expects 3 rows instead of 1.
Test Data
integ-test/src/test/resources/big5/data/big5.json
Added two new log event entries (sshd and kernel events) with metadata to expand test dataset coverage.
PPL Query Resources
integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl, integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl, integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
Added three new PPL query scripts: regex-based field extraction with sorting; LIKE pattern matching with composite aggregation; and LIKE pattern matching with sort and result limiting.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

enhancement

Suggested reviewers

  • Swiddis
  • joshuali925
  • derek-ho
  • penghuo
  • anirudha
  • ps48
  • ykmr1224
  • yuancu

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add Frequently Used Big5 PPL Queries' accurately describes the main change: adding new PPL queries to the Big5 workload.
Description check ✅ Passed The description is directly related to the changeset, explaining that it adds frequently used PPL queries to the Big5 workload based on gap analysis.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6079b8 and c727124.

📒 Files selected for processing (1)
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)

Comment @coderabbitai help to get the list of available commands and usage tips.

@aalva500-prog aalva500-prog mentioned this pull request Dec 18, 2025
8 tasks
@aalva500-prog aalva500-prog changed the title Add Frequently Use Big5 PPL Queries Add Frequently Used Big5 PPL Queries Dec 18, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl (1)

5-5: Index name inconsistency across test resources.

This query uses custom-big5 as the default index, while other queries in this PR use big5 (e.g., script_engine_like_pattern_with_sort.ppl, rex_regex_transformation.ppl). Consider standardizing the index name across all test resources unless there's a specific reason for the difference.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7dfabce and 1f006db.

📒 Files selected for processing (8)
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (2 hunks)
  • integ-test/src/test/resources/big5/data/big5.json (1 hunks)
  • integ-test/src/test/resources/big5/queries/dedup_metrics_size_field.ppl (1 hunks)
  • integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl (1 hunks)
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl (1 hunks)
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml (1 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test new grammar rules with positive and negative cases for PPL parser changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
  • integ-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🪛 Biome (2.1.2)
integ-test/src/test/resources/big5/data/big5.json

[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (6)
integ-test/src/test/resources/big5/data/big5.json (1)

1-6: LGTM! Static analysis warnings are false positives.

The file uses NDJSON (newline-delimited JSON) format, which is the standard format for OpenSearch bulk indexing operations. The Biome linter doesn't recognize this format and reports false positives. The new log entries (sshd and kernel) provide appropriate test data for the like pattern queries.

integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)

35-38: LGTM! Query correctly demonstrates LIKE pattern with sorting.

The PPL query properly filters messages containing "sshd" and sorts results by timestamp, which aligns with the new test data added to big5.json.

integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl (1)

25-29: LGTM! Rex query correctly demonstrates regex extraction.

The query properly extracts logType and filename from log.file.path using a regex pattern and computes the filename length. The transformation logic is sound and aligns with the test objectives.

integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl (1)

45-47: LGTM! Query correctly demonstrates LIKE pattern with aggregation.

The PPL query properly combines pattern matching with aggregation to count occurrences grouped by metrics.size.

integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)

8-9: LGTM! Appropriate static import added.

The import of assertYamlEqualsIgnoreId is correctly added to support YAML plan assertions in the new tests.

integ-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml (1)

1-16: LGTM! Expected Calcite plan correctly models deduplication.

The plan properly shows:

  • Logical plan with ROW_NUMBER window function for deduplication by metrics.size
  • Physical plan with filter condition (<=($45, 1)) to keep only the first occurrence
  • Pushdown of EXISTS filter and field projections to OpenSearch

This validates the correct translation of PPL dedup operation to Calcite execution plan.

@aalva500-prog aalva500-prog marked this pull request as draft December 19, 2025 23:21
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
@aalva500-prog aalva500-prog marked this pull request as ready for review December 23, 2025 20:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (2)

60-68: Consider adding plan assertions for consistency.

For consistency with rex_regex_transformation() and to properly test Calcite's SQL generation and optimization paths, consider adding plan validation assertions. This would verify that the generated Calcite plan matches expected behavior for LIKE pattern matching with aggregation.

🔎 Example refactor to add assertions
  @Test
  public void script_engine_like_pattern_with_aggregation() throws IOException {
    String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_aggregation.ppl"));
    timing(summary, "script_engine_like_pattern_with_aggregation", ppl);
+   String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_aggregation.yaml");
+   assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl));
  }

Note: You would need to create the corresponding expected output YAML file at integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_aggregation.yaml.

Based on learnings, Calcite integration changes should test SQL generation and optimization paths.


70-78: Consider adding plan assertions for consistency.

Similar to the comment above, consider adding plan validation assertions to verify Calcite's handling of LIKE pattern matching with sorting. This would provide better test coverage and consistency with the rex_regex_transformation() test.

🔎 Example refactor to add assertions
  @Test
  public void script_engine_like_pattern_with_sort() throws IOException {
    String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_sort.ppl"));
    timing(summary, "script_engine_like_pattern_with_sort", ppl);
+   String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_sort.yaml");
+   assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl));
  }

Note: You would need to create the corresponding expected output YAML file at integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_sort.yaml.

Based on learnings, Calcite integration changes should test SQL generation and optimization paths.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 237478a and edef39c.

📒 Files selected for processing (4)
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/src/test/resources/**/*

⚙️ CodeRabbit configuration file

integ-test/src/test/resources/**/*: - Verify test data is realistic and representative

  • Check data format matches expected schema
  • Ensure test data covers edge cases and boundary conditions

Files:

  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test new grammar rules with positive and negative cases for PPL parser changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java
  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (4)
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml (1)

1-14: LGTM! Expected plan structure is correct.

The YAML properly defines both logical and physical Calcite plans for the rex regex transformation query. The regex pattern correctly extracts logType and filename from the log path, and the physical plan includes appropriate pushdown optimizations (PROJECT, SORT, LIMIT).

integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)

48-58: LGTM! Test follows best practices.

The test properly validates Calcite's handling of rex regex transformations by loading the expected plan and asserting the actual generated plan matches. The JavaDoc clearly documents the test's purpose.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java (1)

738-745: The test expectations are correct and match the big5 test data. The big5.json file contains exactly three documents with @timestamp values (2023-01-02, 2023-03-01, 2023-05-01) that normalize to the expected 1-day spans. No action required.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java (1)

96-100: Row count of 3 is correct for the isnotnull(aws) filter.

The big5.json test data contains exactly 3 documents with a non-null aws object field (which contains nested aws.cloudwatch properties). All 3 will be returned by the query, confirming the expected verifyNumOfRows(actual, 3) assertion is accurate.

@aalva500-prog
Copy link
Contributor Author

@Swiddis the SQL CLI Integration tests are failing, any recommendations on how to fix this?

@LantaoJin LantaoJin added the testing Related to improving software testing label Dec 24, 2025
Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you resolve the conflicts

aalva500-prog and others added 3 commits December 24, 2025 09:54
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)

70-78: Add plan assertions as previously requested.

The JavaDoc has been added, but the main issue from the previous review remains unaddressed: this test still only performs timing without validating the generated Calcite plan. As noted in the previous review comment (lines 70-78), add plan validation:

  @Test
  public void script_engine_like_pattern_with_sort() throws IOException {
    String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_sort.ppl"));
    timing(summary, "script_engine_like_pattern_with_sort", ppl);
+   String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_sort.yaml");
+   assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl));
  }

Create integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_sort.yaml with the expected Calcite query plan.

Based on learnings, test SQL generation and optimization paths for Calcite integration changes.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8dbb518 and 0bf2fb8.

📒 Files selected for processing (4)
  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
  • integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🚧 Files skipped from review as they are similar to previous changes (3)
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
  • integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)

48-58: Good implementation - JavaDoc and assertions added.

The JavaDoc has been added per previous review feedback, and the test now includes proper plan validation via assertYamlEqualsIgnoreId, which tests Calcite's SQL generation and optimization paths as expected for Calcite integration tests.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)

1-35: Consider clarifying the comment block's purpose.

The comment block describes an OpenSearch DSL query structure with script engine details, while the PPL query below (lines 36-39) is much simpler. Consider adding a note at the beginning indicating this represents the expected compiled/translated OpenSearch query format for documentation/testing purposes, to avoid confusion about why both representations exist.

💡 Example clarification
-/* Filter messages containing 'sshd', sort by timestamp, and return top 10 results */
+/* Filter messages containing 'sshd', sort by timestamp, and return top 10 results
+ * 
+ * Expected compiled OpenSearch DSL query structure (for documentation/validation):
+ */
 /*
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bf2fb8 and d6079b8.

📒 Files selected for processing (3)
  • integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🚧 Files skipped from review as they are similar to previous changes (2)
  • integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
🧰 Additional context used
📓 Path-based instructions (1)
integ-test/src/test/resources/**/*

⚙️ CodeRabbit configuration file

integ-test/src/test/resources/**/*: - Verify test data is realistic and representative

  • Check data format matches expected schema
  • Ensure test data covers edge cases and boundary conditions

Files:

  • integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)

36-39: LGTM! The PPL query syntax and logic are correct:

  • Filters messages containing 'sshd' using the LIKE predicate
  • Sorts by timestamp in descending order
  • Returns top 10 results

Test data verification confirms the big5 dataset contains a message record with "sshd" in both the message text and process.name field. The data format is valid NDJSON with proper Elasticsearch bulk indexing structure, and includes realistic log metadata (timestamps, agent info, cloud region). The query will execute successfully and return meaningful test results.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
@dai-chen dai-chen merged commit b66dc12 into opensearch-project:main Jan 6, 2026
38 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 8, 2026
* Add frequent used queries

Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>

* Add new queries to CalcitePPLBig5IT

Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>

* Add frequently used Big5 PPL queries

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>

* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>

* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>

* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>

* Addressing naming difference issues

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixing formatting issues

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixing integration tests

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Removing dedup

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Adding comment to rex ppl file to explain what the query does

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Adding comments to ppl queries to explain what they do

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Fixing comments

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Empty commit to trigger CI

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

* Removing correctness check for rex command

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

---------

Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Co-authored-by: Louis Chu <lingzhichu.clz@gmail.com>
Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
(cherry picked from commit b66dc12)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
LantaoJin pushed a commit that referenced this pull request Jan 8, 2026
* Add frequent used queries



* Add new queries to CalcitePPLBig5IT



* Add frequently used Big5 PPL queries



* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java




* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java




* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java




* Update integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java




* Addressing naming difference issues



* Fixing formatting issues



* Fixing integration tests



* Removing dedup



* Adding comment to rex ppl file to explain what the query does



* Adding comments to ppl queries to explain what they do



* Fixing comments



* Empty commit to trigger CI



* Removing correctness check for rex command



---------







(cherry picked from commit b66dc12)

Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Louis Chu <lingzhichu.clz@gmail.com>
Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev testing Related to improving software testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants