[VL] Support config velox parquet writer option enableStoreDecimalAsInteger for write parquet#12213

Open
lifulong wants to merge 3 commits into apache:main from lifulong:support_config_parquet_store_decimal_as_integer_for_velox

Conversation

@lifulong

@lifulong lifulong commented Jun 1, 2026

Contributor

…compatible with spark conf spark.sql.parquet.writeLegacyFormat

What changes are proposed in this pull request?

Support the spark.sql.parquet.writeLegacyFormat config when using native write, for compatibility with vanilla Spark.
Velox does not expose a config to control how Parquet decimal columns are physically written.
I have added this option via PR facebookincubator/velox#16941.
This is useful when Spark or Flink reads Hive tables whose CREATE TABLE statements specify ParquetHiveSerDe, especially with older Hive versions such as 2.1.
Velox's current write logic chooses between an integer type and fixed_len_byte_array based on the decimal's precision.
Writing decimals as INT32/INT64 causes Spark and Flink to throw exceptions when reading those Hive tables.
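
For reviewers, the precision-based choice described above can be sketched roughly as follows. This is an illustration only, not the code in this PR or in Velox: `PhysicalType`, `storeDecimalAsInteger`, `decimalPhysicalType`, and `minBytesForPrecision` are hypothetical names, the 9 and 18 digit thresholds follow the Parquet DECIMAL rules for INT32 and INT64, and the mapping from the Spark flag to the Velox option is assumed to be a simple negation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names for illustration only; not Velox or Gluten types.
enum class PhysicalType { kInt32, kInt64, kFixedLenByteArray };

// Assumption: spark.sql.parquet.writeLegacyFormat=true corresponds to
// NOT storing decimals as integers (the inverse of the Velox option).
inline bool storeDecimalAsInteger(bool writeLegacyFormat) {
  return !writeLegacyFormat;
}

// Pick the Parquet physical type for a decimal column.
// INT32 holds up to 9 digits and INT64 up to 18; anything wider, or any
// decimal when integer storage is disabled, uses fixed_len_byte_array.
inline PhysicalType decimalPhysicalType(int precision, bool asInteger) {
  assert(precision >= 1 && precision <= 38);
  if (asInteger && precision <= 9) {
    return PhysicalType::kInt32;
  }
  if (asInteger && precision <= 18) {
    return PhysicalType::kInt64;
  }
  return PhysicalType::kFixedLenByteArray;
}

// Smallest byte width whose signed two's-complement range can hold every
// unscaled value with `precision` decimal digits.
inline int minBytesForPrecision(int precision) {
  assert(precision >= 1 && precision <= 38);
  int numBytes = 1;
  while (std::pow(2.0, 8 * numBytes - 1) < std::pow(10.0, precision)) {
    ++numBytes;
  }
  return numBytes;
}
```

Under these assumptions, a DECIMAL(9, 2) column written with the legacy flag enabled would be a 4-byte fixed_len_byte_array instead of an INT32, which is the layout older Hive readers expect.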

How was this patch tested?

Tested in our production environment.

Was this patch authored or co-authored using generative AI tooling?

Co-authored with Cursor.

@lifulong

lifulong commented Jun 1, 2026

Contributor Author

This duplicates the previous PR #11839, which was closed due to unresolved dependencies and long-term inactivity. I couldn't find a way to reopen it.

@github-actions

github-actions Bot commented Jun 1, 2026


Run Gluten Clickhouse CI on x86

@github-actions github-actions Bot added CORE works for Gluten Core VELOX labels Jun 1, 2026
@lifulong lifulong force-pushed the support_config_parquet_store_decimal_as_integer_for_velox branch from dd9d876 to 99dfebc Compare June 1, 2026 10:18
@lifulong lifulong force-pushed the support_config_parquet_store_decimal_as_integer_for_velox branch 2 times, most recently from ffc96c8 to 5d48667 Compare June 1, 2026 11:15
…compatible with spark conf spark.sql.parquet.writeLegacyFormat
@lifulong lifulong force-pushed the support_config_parquet_store_decimal_as_integer_for_velox branch from 5d48667 to e86171c Compare June 1, 2026 11:25
Comment thread cpp/velox/utils/VeloxWriterUtils.cc
@lifulong lifulong force-pushed the support_config_parquet_store_decimal_as_integer_for_velox branch from 9a9b95c to 9b073f5 Compare June 4, 2026 08:56