Skip to content

Core, Data: Implementation of AvroFormatModel#15254

Merged
pvary merged 3 commits intoapache:mainfrom
pvary:avro_model
Feb 13, 2026
Merged

Core, Data: Implementation of AvroFormatModel#15254
pvary merged 3 commits intoapache:mainfrom
pvary:avro_model

Conversation

@pvary
Copy link
Copy Markdown
Contributor

@pvary pvary commented Feb 7, 2026

Part of: #12298
Implementation of the new API: #12774

AvroFormatModel and related tests

DataTestHelpers.assertEquals(
TestBase.SCHEMA.asStruct(), TEST_RECORDS.get(0), readRecords.get(0));
DataTestHelpers.assertEquals(
TestBase.SCHEMA.asStruct(), TEST_RECORDS.get(1), readRecords.get(1));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I'd prefer a loop here so that we could add more cases or pass a random data generator.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added random generated data and moved to looped check.
Please check the changes in DataTestHelpers

extends BaseFormatModel<D, S, DatumWriter<D>, DatumReader<D>, Schema> {

public static <D> AvroFormatModel<PositionDelete<D>, Object> forPositionDeletes() {
return new AvroFormatModel<>(PositionDelete.deleteClass(), null, null, null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't catch this earlier. Shouldn't this pass Void.class instead of null? There's no need to have a null class that could cause issues.

Types.StructType struct, List<Record> expected, List<Record> actual) {
assertThat(actual).hasSize(expected.size());
for (int i = 0; i < expected.size(); i += 1) {
assertEquals(struct, expected.get(i), actual.get(i), null, -1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call assertEquals(StructType, Record, Record) below? That seems like the correct choice, even though it just makes this same call.


@ParameterizedTest
@FieldSource("FILE_FORMATS")
public void testDataWriter(FileFormat fileFormat) throws IOException {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This tests the read and write paths.

FormatModelRegistry.readBuilder(fileFormat, Record.class, inputFile)
.project(TestBase.SCHEMA)
.build()) {
readRecords = ImmutableList.copyOf(reader);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use reuseContainers and copy each record? It would be good to ensure that reuseContainers works for the object model, even if it is ignored (I think ORC ignores it).

Copy link
Copy Markdown
Contributor

@rdblue rdblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. It would be good to fix a couple minor issues, like using null instead of Void.class and also exercise reuseContainers in the read path test.

Those aren't blockers so let's get this in to unblock the other commits.

@pvary pvary merged commit 6d565b2 into apache:main Feb 13, 2026
32 checks passed
@pvary
Copy link
Copy Markdown
Contributor Author

pvary commented Feb 13, 2026

Merged to main.
Thanks @rdblue for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants