Simplify concat/concat_ws simplification path & support binary in concat simplification#23079
Open
Jefffrey wants to merge 11 commits into
Open
Simplify concat/concat_ws simplification path & support binary in concat simplification#23079Jefffrey wants to merge 11 commits into
concat/concat_ws simplification path & support binary in concat simplification#23079Jefffrey wants to merge 11 commits into
Conversation
concat_ws shouldn't call the simplify implementation of concat, as it adds unnecessary complexity; it's enough to just rewrite to concat and let the expression simplifier then run simplify of concat
Only reason this was a separate function was so concat_ws could also call it; now that it's removed, lets inline for simplicity
Type coercion ensures we get inputs all of the same type; so we don't need this logic to find the widest type again in the return_type function. Some adjustments need to be made accordingly: - Align simplify to ensure we get the same return type - Fix Spark concat since it has a special path for no inputs - Fix unit tests for concat since testing invoke should pass inputs all of the same type - Fix simplification tests (same reason as above) - Move some of the unit tests to SLTs (for the Array path); keeping the other unit tests since they test the scalar path without going through simplification first
Coercion will convert it to LargeBinary
Mainly flattening the nested match
We were incorrectly filtering the args (which is a simplification) but calling this the original form, instead of returning the simplified variant
concat/concat_ws simplification path & support binary in concat simplification
Contributor
Author
|
Recommend reviewing commit by commit as each commit breaks down cleanly as a unit of work, and has some explanatory commentary along the way. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
Attempting to simplify the implementation of concat/concat_ws simplify & return type. Main motivator is that
return_typehad logic to find widest type; however the signatures already guaranteed that type coercion would coerce all input arguments to a single common type, meaning this return type calculation logic was unnecessary. Along the way simplify the simplify implementation.What changes are included in this PR?
Refactor simplify for concat & concat_ws, along with return type; fix tests accordingly (tests incorrectly passed in disparate types instead of a common expected type which the output of type coercion would provide).
Are these changes tested?
Existing tests.
Are there any user-facing changes?
No.