functions: Add dict support for get field #21115

Open
brancz wants to merge 1 commit into apache:main from polarsignals:get-field-dict

@brancz (Contributor) commented Mar 23, 2026

Which issue does this PR close?

Closes #21113

What changes are included in this PR?

Support dictionary-encoded structs in get_field.

Are these changes tested?

Yes, see the tests. I also swapped this into our code base, and it had the effect we wanted.

Are there any user-facing changes?

No, this just adds support to existing APIs.

@alamb

github-actions bot added the "functions" (Changes to functions implementation) label Mar 23, 2026
brancz force-pushed the get-field-dict branch 2 times, most recently from f836dd1 to 92d660d, March 23, 2026 22:00
Comment on lines +403 to +405
datafusion_common::DataFusionError::Execution(
"Field name must be a non-empty string".to_string(),
)

Suggested change
- datafusion_common::DataFusionError::Execution(
-     "Field name must be a non-empty string".to_string(),
- )
+ exec_err!("Field name must be a non-empty string")

Comment on lines +224 to +226
datafusion_common::DataFusionError::Execution(format!(
"Field {field_name} not found in dictionary struct"
))

Suggested change
- datafusion_common::DataFusionError::Execution(format!(
-     "Field {field_name} not found in dictionary struct"
- ))
+ exec_err!("Field {field_name} not found in dictionary struct")

Comment on lines +216 to +219
datafusion_common::DataFusionError::Internal(format!(
"Failed to downcast dictionary with key type {}",
key_type
))

Suggested change
- datafusion_common::DataFusionError::Internal(format!(
-     "Failed to downcast dictionary with key type {}",
-     key_type
- ))
+ internal_err!("Failed to downcast dictionary with key type {key_type}")

($key_ty:ty) => {{
let dict = array
.as_any()
.downcast_ref::<arrow::array::DictionaryArray<$key_ty>>()

nit: import DictionaryArray with use arrow::array::DictionaryArray; and shorten the path:

Suggested change
- .downcast_ref::<arrow::array::DictionaryArray<$key_ty>>()
+ .downcast_ref::<DictionaryArray<$key_ty>>()

))
})?;
// Rebuild dictionary: same keys, extracted field as values.
let new_dict = arrow::array::DictionaryArray::<$key_ty>::try_new(

Suggested change
- let new_dict = arrow::array::DictionaryArray::<$key_ty>::try_new(
+ let new_dict = DictionaryArray::<$key_ty>::try_new(
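
For readers following along, the rebuild step in the snippet above ("same keys, extracted field as values") can be sketched without any Arrow dependency. This is a simplified model, not the PR's code: `Dict`, `Point`, `map_values`, and `get_label` are hypothetical names invented for illustration, and nulls are modeled as `None` keys.

```rust
/// Simplified stand-in for a struct value with two fields (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i64,
    label: String,
}

/// Simplified stand-in for a dictionary-encoded array (hypothetical):
/// `keys[i]` is an index into `values`, or `None` for a null row.
#[derive(Debug, Clone, PartialEq)]
struct Dict<V> {
    keys: Vec<Option<usize>>,
    values: Vec<V>,
}

impl<V: Clone> Dict<V> {
    /// Expand the dictionary into one value per row.
    fn decode(&self) -> Vec<Option<V>> {
        self.keys
            .iter()
            .map(|&key| key.map(|i| self.values[i].clone()))
            .collect()
    }

    /// Keep the keys untouched and project each dictionary value.
    /// The projection runs once per distinct value, not once per row.
    fn map_values<U>(&self, project: impl Fn(&V) -> U) -> Dict<U> {
        Dict {
            keys: self.keys.clone(),
            values: self.values.iter().map(project).collect(),
        }
    }
}

/// Model of extracting the `label` field from a dictionary of structs.
fn get_label(dict: &Dict<Point>) -> Dict<String> {
    dict.map_values(|p| p.label.clone())
}
```

In this model, decoding the result of `get_label` gives the same rows as decoding the original dictionary first and then reading `label` from each row, while the output stays dictionary-encoded.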

}

match key_type.as_ref() {
DataType::Int8 => extract_dict_field!(arrow::datatypes::Int8Type),

Suggested change
- DataType::Int8 => extract_dict_field!(arrow::datatypes::Int8Type),
+ DataType::Int8 => extract_dict_field!(Int8Type),

and import them with use arrow::datatypes::...
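
The pattern under discussion, a local macro instantiated once per key type inside a match, can be sketched with the standard library alone. This is only an illustration under stated assumptions: `KeyType`, `keys_as_indices`, and `indices_for!` are hypothetical names, and `Vec<T>` behind `&dyn Any` stands in for a type-erased array that must be downcast.

```rust
use std::any::Any;

/// Simplified stand-in for the key data types matched on (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyType {
    Int8,
    Int16,
    Int32,
}

/// Convert type-erased keys into `usize` indices, dispatching on `key_type`.
fn keys_as_indices(keys: &dyn Any, key_type: KeyType) -> Result<Vec<usize>, String> {
    // One macro body, expanded once per concrete key type in the match below.
    macro_rules! indices_for {
        ($key_ty:ty) => {{
            let typed = keys.downcast_ref::<Vec<$key_ty>>().ok_or_else(|| {
                format!("Failed to downcast keys with key type {:?}", key_type)
            })?;
            typed
                .iter()
                .map(|&k| {
                    if k < 0 {
                        Err(format!("negative key {}", k))
                    } else {
                        Ok(k as usize)
                    }
                })
                .collect::<Result<Vec<usize>, String>>()
        }};
    }

    match key_type {
        KeyType::Int8 => indices_for!(i8),
        KeyType::Int16 => indices_for!(i16),
        KeyType::Int32 => indices_for!(i32),
    }
}
```

Importing the type names once and passing short names to the macro, as the suggestion proposes, keeps each match arm on one short line without changing what the macro expands to.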

Box::new(child_field.data_type().clone()),
);
current_field =
Arc::new(Field::new(child_field.name(), dict_type, nullable));

Wouldn't it be better to clone the child_field, as the Map/Struct arms do?
That would preserve any metadata on the field.
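
To make the metadata concern concrete, here is a minimal dependency-free sketch. The `Field` and `DataType` types below are simplified stand-ins written for this example, not Arrow's types, and `rebuild_with_new` / `rebuild_with_clone` are hypothetical helper names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a data type (hypothetical, not Arrow's enum).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    Dictionary(Box<DataType>, Box<DataType>),
}

/// Simplified stand-in for a schema field that carries metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: HashMap<String, String>,
}

impl Field {
    /// A fresh field always starts with empty metadata.
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type,
            nullable,
            metadata: HashMap::new(),
        }
    }

    fn with_data_type(mut self, data_type: DataType) -> Self {
        self.data_type = data_type;
        self
    }

    fn with_nullable(mut self, nullable: bool) -> Self {
        self.nullable = nullable;
        self
    }
}

/// Builds the dictionary field from scratch: the child's metadata is dropped.
fn rebuild_with_new(child: &Field, key: DataType, nullable: bool) -> Field {
    let dict_type = DataType::Dictionary(Box::new(key), Box::new(child.data_type.clone()));
    Field::new(&child.name, dict_type, nullable)
}

/// Clones the child and overrides only what changes: metadata is kept.
fn rebuild_with_clone(child: &Field, key: DataType, nullable: bool) -> Field {
    let dict_type = DataType::Dictionary(Box::new(key), Box::new(child.data_type.clone()));
    child.clone().with_data_type(dict_type).with_nullable(nullable)
}
```

Both helpers produce the same name, type, and nullability; only the cloned variant carries the child's metadata through.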

Development

Successfully merging this pull request may close these issues.

Support dict encoded structs in get_field

2 participants