
[VL] Fix native union use column type name as column name lead to result error #11832

Open
lifulong wants to merge 1 commit into apache:main from lifulong:fix_native_union_column_name

Contributor

@lifulong lifulong commented Mar 26, 2026

What changes are proposed in this pull request?

Fix the native union result using the column type name as the column name. Because columns of the same data type then share a name, they all end up with the same data, which is a wrong result. For example, every string column has the same values as the first string column.

Before the fix, the resulting name is the column type name:
const auto name = outRowType->childAt(colIdx)->name();
After the fix, the resulting name is the column name:
const auto name = outRowType->nameOf(colIdx);
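To illustrate why the two accessors differ, here is a minimal, self-contained sketch. The `Type` and `RowType` classes below are simplified, hypothetical stand-ins written for this example (they are not Velox's real classes), and the helpers `mapByTypeName` and `mapByColumnName` are likewise assumptions that only mimic how a lookup keyed by the wrong name could collapse same-typed columns onto the first one. The actual mechanism inside the native union operator may differ.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column type; name() returns the TYPE name.
struct Type {
  std::string typeName;
  const std::string& name() const { return typeName; }
};
using TypePtr = std::shared_ptr<const Type>;

// Hypothetical stand-in for a row type holding column names and child types.
class RowType {
 public:
  RowType(std::vector<std::string> names, std::vector<TypePtr> children)
      : names_(std::move(names)), children_(std::move(children)) {}

  std::size_t size() const { return names_.size(); }
  // Returns the type of the column at index i.
  const TypePtr& childAt(std::size_t i) const { return children_.at(i); }
  // Returns the name of the column at index i.
  const std::string& nameOf(std::size_t i) const { return names_.at(i); }

 private:
  std::vector<std::string> names_;
  std::vector<TypePtr> children_;
};

// Buggy variant: keys each column by its type name, so columns sharing a
// type collide. std::map::emplace keeps the first entry for a given key.
std::map<std::string, std::size_t> mapByTypeName(const RowType& row) {
  std::map<std::string, std::size_t> out;
  for (std::size_t i = 0; i < row.size(); ++i) {
    out.emplace(row.childAt(i)->name(), i);
  }
  return out;
}

// Fixed variant: keys each column by its column name, which stays distinct
// as long as the column names themselves are distinct.
std::map<std::string, std::size_t> mapByColumnName(const RowType& row) {
  std::map<std::string, std::size_t> out;
  for (std::size_t i = 0; i < row.size(); ++i) {
    out.emplace(row.nameOf(i), i);
  }
  return out;
}
```

With a row of two string columns and one integer column, the type-name variant produces only two keys and resolves every string column to index 0, while the column-name variant keeps one entry per column.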

With the configuration spark.gluten.sql.native.union=true, run SQL such as:

with deduplicated_data as (
  select col1, col2, col3, col4, col6
  from (
    select
      u.col1,
      u.col2,
      u.col3,
      u.col4,
      u.col6,
      row_number() over (partition by u.col2 order by u.col5 desc) as rn
    from (
      select col1, col2, col3, col4, 98 as col5, '' as col6 from union_src_a
      union all
      select col1, col2, col3, col4, 100 as col5, '' as col6 from union_src_b
    ) u
  ) t
  where t.rn = 1
)
select col1, col2, col3, col4
from deduplicated_data
where col1 != 'valueC'
union all
select col1, col2, col3, col4
from deduplicated_data
where col1 = 'valueC'
This query reproduces the error.

How was this patch tested?

Tested in our production environment, and added a unit test.

Was this patch authored or co-authored using generative AI tooling?

Yes, co-authored by Cursor.

@github-actions github-actions bot added the VELOX label Mar 26, 2026
@lifulong lifulong force-pushed the fix_native_union_column_name branch 4 times, most recently from fd8d6bd to eeac8c7 Compare March 27, 2026 01:27
@lifulong
Contributor Author

The error is clear, but reproducing the wrong result requires a complex union SQL query, so a unit test may not be needed. I am not sure.

@lifulong lifulong force-pushed the fix_native_union_column_name branch from c67261f to 06b7341 Compare March 27, 2026 02:31
@lifulong lifulong changed the title from "fix native union use column type as name lead to result error" to "Fix native union use column type name as column name lead to result error" Mar 27, 2026
@zhouyuan zhouyuan changed the title from "Fix native union use column type name as column name lead to result error" to "[VL] Fix native union use column type name as column name lead to result error" Mar 27, 2026