Description
Is your feature request related to a problem or challenge?
Now that it seems like we can merge #20312 soon, we should implement the full range of Arrow's canonical extension types. Currently, the PR supports only UUID.
This issue tracks adding the remaining canonical extension types:
- Fixed shape tensor
- Variable shape tensor
- JSON
- UUID
- Opaque
- 8-bit Boolean
- Parquet Variant
- Timestamp With Offset
Describe the solution you'd like
Implement DFExtensionType for each of these types, similar to the existing UUID implementation.
The question that remains is how we implement pretty-printing for these types.
- Do we try to pretty-print tensors?
- Do we pretty-print JSON using newlines?
- I guess Parquet Variant would benefit from a nice representation in tests/CLIs. @friendlymatthew maybe you have two cents to share here?
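
To make the tensor question concrete, here is a minimal sketch of what pretty-printing a single fixed shape tensor value could look like. It assumes the value is available as a flat row-major buffer plus its shape. The function `format_tensor` is hypothetical and is not part of DataFusion or arrow-rs; it only illustrates one possible output format (nested lists on a single line).

```rust
/// Hypothetical helper (not a DataFusion or arrow-rs API): renders one
/// fixed shape tensor value, given as a flat row-major buffer, as nested lists.
///
/// Assumes `values.len()` equals the product of `shape`; a mismatched
/// length panics on slicing, which a real implementation would have to handle.
fn format_tensor(values: &[i64], shape: &[usize]) -> String {
    // With no dimensions left, the slice holds exactly one scalar.
    if shape.is_empty() {
        return values.first().map(|v| v.to_string()).unwrap_or_default();
    }
    let (dim, rest) = (shape[0], &shape[1..]);
    // Number of elements in each sub-tensor along the first dimension
    // (the product of an empty shape is 1).
    let stride: usize = rest.iter().product();
    let parts: Vec<String> = (0..dim)
        .map(|i| format_tensor(&values[i * stride..(i + 1) * stride], rest))
        .collect();
    format!("[{}]", parts.join(", "))
}
```

Under these assumptions, `format_tensor(&[1, 2, 3, 4, 5, 6], &[2, 3])` yields `[[1, 2, 3], [4, 5, 6]]`. Whether large tensors should be truncated or split across lines is exactly the open question above.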
Describe alternatives you've considered
We could implement the formatters within arrow-rs and just use them in DataFusion, but I am unsure where they fit best.
Maybe starting in DataFusion and migrating them to arrow-rs sometime in the future (depending on the use case) is a good choice.
Additional context
Some (maybe) related issues I've found: