feat: Add CLP_GET_* UDFs with rewrites for schemaless querying.#42

Merged
wraymo merged 23 commits into y-scope:release-0.293-clp-connector from wraymo:clp_get_udf on Sep 6, 2025

@wraymo wraymo commented Jul 17, 2025

Description

This PR introduces new user-defined functions (UDFs) for the CLP connector to improve querying of semi-structured logs:

  • CLP_GET_* functions retrieve values from JSON paths, each targeting a specific CLP column type (ClpString, Integer, Float, etc.). These UDFs give direct access to dynamic fields and return the corresponding Presto native types.
  • The functions are integrated with the query rewriting layer. During query optimization, calls to them are rewritten into normal column references or KQL query column symbols, so execution stays efficient and incurs no additional parsing overhead.
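The rewrite described above can be illustrated with a toy sketch. This is not the connector's implementation (the PR performs the rewrite inside the Presto optimizer, not on query text); it only shows the idea of replacing a typed JSON-path call with a plain column reference while recording the requested type. The function names `CLP_GET_BIGINT` and `CLP_GET_STRING`, the helper `rewrite_clp_get_calls`, and the quoted-identifier output format are assumptions made for illustration.

```python
import re

# Assumption: calls look like CLP_GET_<TYPE>('<json.path>'). The PR only states
# the CLP_GET_* naming pattern, so the concrete names used here are hypothetical.
CLP_GET_CALL = re.compile(r"CLP_GET_(\w+)\(\s*'([^']+)'\s*\)", re.IGNORECASE)


def rewrite_clp_get_calls(expression):
    """Replace each CLP_GET_<TYPE>('<path>') call with a quoted column reference.

    Returns the rewritten expression and a dict mapping each JSON path to the
    type suffix it was requested with. The sketch assumes a given path is only
    ever requested with one type.
    """
    columns = {}

    def to_column_reference(match):
        type_name = match.group(1).upper()
        json_path = match.group(2)
        columns[json_path] = type_name
        # Emit the path as a double-quoted identifier so dots stay unambiguous.
        return '"' + json_path + '"'

    rewritten = CLP_GET_CALL.sub(to_column_reference, expression)
    return rewritten, columns


if __name__ == "__main__":
    predicate = "CLP_GET_BIGINT('user.id') = 42 AND CLP_GET_STRING('msg') LIKE '%err%'"
    rewritten, columns = rewrite_clp_get_calls(predicate)
    print(rewritten)
    print(columns)
```

After this step the predicate contains only ordinary column references, which is the property that lets a filter be pushed down without evaluating the UDF per row. A real optimizer rule would walk the expression tree rather than match text, which also avoids rewriting call-like strings that happen to appear inside literals.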

This PR does not update the documentation; that is left for a future PR.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

All unit tests passed. End-to-end tests also passed.

Summary by CodeRabbit

  • New Features

    • Added placeholder scalar CLP functions for typed JSON-path extraction and registered them with the CLP plugin.
    • Added a plan rewriter to translate CLP UDFs into query variables to enable optimization and pushdown.
  • Refactor

    • Reorganized optimizer and converter components; centralized filter pushdown and extended it to cover metadata filters.
    • Simplified converter APIs by encapsulating variable-to-column mappings.
  • Tests

    • Added integration tests for UDF rewriting and compute-pushdown; improved test setup.
  • Chores

    • Expanded test-scoped dependencies and updated native-execution submodule pointer.

2 participants