Skip to content

Simplify concat/concat_ws simplification path & support binary in concat simplification#23079

Open
Jefffrey wants to merge 11 commits into
apache:mainfrom
Jefffrey:refactor-concat
Open

Simplify concat/concat_ws simplification path & support binary in concat simplification#23079
Jefffrey wants to merge 11 commits into
apache:mainfrom
Jefffrey:refactor-concat

Conversation

@Jefffrey

@Jefffrey Jefffrey commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

  • N/A

Rationale for this change

Attempting to simplify the implementation of concat/concat_ws simplify & return type. Main motivator is that return_type had logic to find widest type; however the signatures already guaranteed that type coercion would coerce all input arguments to a single common type, meaning this return type calculation logic was unnecessary. Along the way simplify the simplify implementation.

What changes are included in this PR?

Refactor simplify for concat & concat_ws, along with return type; fix tests accordingly (tests incorrectly passed in disparate types instead of a common expected type which the output of type coercion would provide).

Are these changes tested?

Existing tests.

Are there any user-facing changes?

No.

Jefffrey added 11 commits June 21, 2026 22:27
concat_ws shouldn't call the simplify implementation of concat, as it
adds unnecessary complexity; it's enough to just rewrite to concat and
let the expression simplifier then run simplify of concat
Only reason this was a separate function was so concat_ws could also
call it; now that it's removed, lets inline for simplicity
Type coercion ensures we get inputs all of the same type; so we don't
need this logic to find the widest type again in the return_type
function. Some adjustments need to be made accordingly:

- Align simplify to ensure we get the same return type
- Fix Spark concat since it has a special path for no inputs
- Fix unit tests for concat since testing invoke should pass inputs all
of the same type
- Fix simplification tests (same reason as above)
- Move some of the unit tests to SLTs (for the Array path); keeping the
other unit tests since they test the scalar path without going through
simplification first
Coercion will convert it to LargeBinary
Mainly flattening the nested match
We were incorrectly filtering the args (which is a simplification) but
calling this the original form, instead of returning the simplified
variant
@github-actions github-actions Bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation spark labels Jun 22, 2026
@Jefffrey Jefffrey changed the title Refactor concat Simplify concat/concat_ws simplification path & support binary in concat simplification Jun 22, 2026
@Jefffrey

Copy link
Copy Markdown
Contributor Author

Recommend reviewing commit by commit as each commit breaks down cleanly as a unit of work, and has some explanatory commentary along the way.

@Jefffrey Jefffrey marked this pull request as ready for review June 22, 2026 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation spark sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant