Skip to content

[GLUTEN] Route bitmap_or_agg to native Velox execution#12242

Open
minni31 wants to merge 5 commits into
apache:mainfrom
minni31:bitmap-or-agg
Open

[GLUTEN] Route bitmap_or_agg to native Velox execution#12242
minni31 wants to merge 5 commits into
apache:mainfrom
minni31:bitmap-or-agg

Conversation

@minni31
Copy link
Copy Markdown
Contributor

@minni31 minni31 commented Jun 4, 2026

Register bitmap_or_agg aggregate function for native Velox execution:

  • Add BITMAP_OR_AGG constant to ExpressionNames
  • Add bitmap_or_agg to C++ plan validator supportedAggFuncs
  • Register Sig[BitmapOrAgg] in Spark 3.5/4.0/4.1 shims
  • Add DefaultValidator() to CH_AGGREGATE_FUNC_BLACKLIST (CH fallback)
  • Add plan-shape assertion test (excluded until Velox function lands)
  • Add ClickHouse test exclusions for native-only test

Note: The native Velox bitmap_or_agg function is pending upstream (facebookincubator/velox). The test is excluded in VeloxTestSettings until that PR is merged and Gluten's Velox dependency is updated.

What changes are proposed in this pull request?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Register bitmap_or_agg aggregate function for native Velox execution:
- Add BITMAP_OR_AGG constant to ExpressionNames
- Add bitmap_or_agg to C++ plan validator supportedAggFuncs
- Register Sig[BitmapOrAgg] in Spark 3.5/4.0/4.1 shims
- Add DefaultValidator() to CH_AGGREGATE_FUNC_BLACKLIST (CH fallback)
- Add plan-shape assertion test (excluded until Velox function lands)
- Add ClickHouse test exclusions for native-only test

Note: The native Velox bitmap_or_agg function is pending upstream
(facebookincubator/velox). The test is excluded in VeloxTestSettings
until that PR is merged and Gluten's Velox dependency is updated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 16:43
@github-actions github-actions Bot added CORE works for Gluten Core VELOX CLICKHOUSE labels Jun 4, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support wiring for the bitmap_or_agg aggregate across Spark shims and backend validation, plus query-plan routing tests (with backend-specific exclusions where not yet supported).

Changes:

  • Register bitmap_or_agg in Spark 3.5/4.0/4.1 shims and add the expression name constant.
  • Add query suite coverage asserting bitmap_or_agg routes to native aggregation.
  • Update backend allow/deny lists (Velox validator + CH function validation / test exclusions).

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
shims/spark41/.../Spark41Shims.scala Registers bitmap_or_agg expression signature for Spark 4.1 shim.
shims/spark40/.../Spark40Shims.scala Registers bitmap_or_agg expression signature for Spark 4.0 shim.
shims/spark35/.../Spark35Shims.scala Registers bitmap_or_agg expression signature for Spark 3.5 shim.
shims/common/.../ExpressionNames.scala Adds BITMAP_OR_AGG SQL function name constant.
gluten-ut/spark41/.../GlutenBitmapExpressionsQuerySuite.scala Adds routing-to-native test for bitmap_or_agg.
gluten-ut/spark41/.../VeloxTestSettings.scala Excludes new bitmap_or_agg test for Velox backend pending support.
gluten-ut/spark41/.../ClickHouseTestSettings.scala Excludes new bitmap_or_agg test for CH backend.
gluten-ut/spark40/.../GlutenBitmapExpressionsQuerySuite.scala Adds routing-to-native test for bitmap_or_agg.
gluten-ut/spark40/.../VeloxTestSettings.scala Excludes new bitmap_or_agg test for Velox backend pending support.
gluten-ut/spark40/.../ClickHouseTestSettings.scala Excludes new bitmap_or_agg test for CH backend.
gluten-ut/spark35/.../GlutenBitmapExpressionsQuerySuite.scala Adds routing-to-native test for bitmap_or_agg.
gluten-ut/spark35/.../VeloxTestSettings.scala Excludes new bitmap_or_agg test for Velox backend pending support.
gluten-ut/spark35/.../ClickHouseTestSettings.scala Excludes new bitmap_or_agg test for CH backend.
cpp/velox/substrait/SubstraitToVeloxPlanValidator.cc Whitelists bitmap_or_agg as a supported aggregate in Velox plan validation.
backends-clickhouse/.../CHExpressionUtil.scala Adds bitmap_or_agg to CH expression validation map.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cpp/velox/substrait/SubstraitToVeloxPlanValidator.cc
The native bitmap_or_agg function has been merged upstream in Velox.
Remove the .exclude() entries from VeloxTestSettings so the plan-shape
assertion test now runs on CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Run Gluten Clickhouse CI on x86

Use a subquery to avoid nesting bitmap_construct_agg inside
bitmap_or_agg at the same aggregation level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 4, 2026 16:50
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Comment on lines +41 to +54
test("bitmap_or_agg routes to native") {
val df = spark.sql(
"SELECT bitmap_or_agg(bm) FROM (" +
"SELECT bitmap_construct_agg(bitmap_bit_position(col)) AS bm " +
"FROM values (1L), (2L), (3L) AS t(col)" +
") sub")
df.collect()
assert(
collectWithSubqueries(df.queryExecution.executedPlan) {
case h: HashAggregateExecBaseTransformer => h
}.nonEmpty,
"Expected native HashAggregateExecBaseTransformer in plan"
)
}
Comment on lines +41 to +54
test("bitmap_or_agg routes to native") {
val df = spark.sql(
"SELECT bitmap_or_agg(bm) FROM (" +
"SELECT bitmap_construct_agg(bitmap_bit_position(col)) AS bm " +
"FROM values (1L), (2L), (3L) AS t(col)" +
") sub")
df.collect()
assert(
collectWithSubqueries(df.queryExecution.executedPlan) {
case h: HashAggregateExecBaseTransformer => h
}.nonEmpty,
"Expected native HashAggregateExecBaseTransformer in plan"
)
}
Comment on lines +41 to +54
test("bitmap_or_agg routes to native") {
val df = spark.sql(
"SELECT bitmap_or_agg(bm) FROM (" +
"SELECT bitmap_construct_agg(bitmap_bit_position(col)) AS bm " +
"FROM values (1L), (2L), (3L) AS t(col)" +
") sub")
df.collect()
assert(
collectWithSubqueries(df.queryExecution.executedPlan) {
case h: HashAggregateExecBaseTransformer => h
}.nonEmpty,
"Expected native HashAggregateExecBaseTransformer in plan"
)
}
Comment on lines 91 to +94
enableSuite[GlutenBitmapExpressionsQuerySuite]
// bitmap_construct_agg is not supported natively in CH backend.
.excludeCH("bitmap_construct_agg routes to native")
.excludeCH("bitmap_or_agg routes to native")
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Run Gluten Clickhouse CI on x86

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 8, 2026

Run Gluten Clickhouse CI on x86

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLICKHOUSE CORE works for Gluten Core VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants