Core, Data: Implementation of AvroFormatModel#15254
Conversation
core/src/main/java/org/apache/iceberg/avro/AvroFormatModel.java
Outdated
Show resolved
Hide resolved
data/src/main/java/org/apache/iceberg/data/GenericFormatModels.java
Outdated
Show resolved
Hide resolved
data/src/main/java/org/apache/iceberg/data/GenericFormatModels.java
Outdated
Show resolved
Hide resolved
data/src/test/java/org/apache/iceberg/data/TestGenericFormatModels.java
Outdated
Show resolved
Hide resolved
| DataTestHelpers.assertEquals( | ||
| TestBase.SCHEMA.asStruct(), TEST_RECORDS.get(0), readRecords.get(0)); | ||
| DataTestHelpers.assertEquals( | ||
| TestBase.SCHEMA.asStruct(), TEST_RECORDS.get(1), readRecords.get(1)); |
There was a problem hiding this comment.
Minor: I'd prefer a loop here so that we could add more cases or pass a random data generator.
There was a problem hiding this comment.
Added random generated data and moved to looped check.
Please check the changes in DataTestHelpers
data/src/test/java/org/apache/iceberg/data/TestGenericFormatModels.java
Outdated
Show resolved
Hide resolved
data/src/test/java/org/apache/iceberg/data/TestGenericFormatModels.java
Outdated
Show resolved
Hide resolved
data/src/test/java/org/apache/iceberg/data/TestGenericFormatModels.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/avro/AvroFormatModel.java
Outdated
Show resolved
Hide resolved
| extends BaseFormatModel<D, S, DatumWriter<D>, DatumReader<D>, Schema> { | ||
|
|
||
| public static <D> AvroFormatModel<PositionDelete<D>, Object> forPositionDeletes() { | ||
| return new AvroFormatModel<>(PositionDelete.deleteClass(), null, null, null); |
There was a problem hiding this comment.
Sorry, I didn't catch this earlier. Shouldn't this pass Void.class instead of null? There's no need to have a null class that could cause issues.
| Types.StructType struct, List<Record> expected, List<Record> actual) { | ||
| assertThat(actual).hasSize(expected.size()); | ||
| for (int i = 0; i < expected.size(); i += 1) { | ||
| assertEquals(struct, expected.get(i), actual.get(i), null, -1); |
There was a problem hiding this comment.
Why not call assertEquals(StructType, Record, Record) below? That seems like the correct choice, even though it just makes this same call.
|
|
||
| @ParameterizedTest | ||
| @FieldSource("FILE_FORMATS") | ||
| public void testDataWriter(FileFormat fileFormat) throws IOException { |
There was a problem hiding this comment.
Nit: This tests the read and write paths.
| FormatModelRegistry.readBuilder(fileFormat, Record.class, inputFile) | ||
| .project(TestBase.SCHEMA) | ||
| .build()) { | ||
| readRecords = ImmutableList.copyOf(reader); |
There was a problem hiding this comment.
Should this use reuseContainers and copy each record? It would be good to ensure that reuseContainers works for the object model, even if it is ignored (I think ORC ignores it).
rdblue
left a comment
There was a problem hiding this comment.
Overall looks good. It would be good to fix a couple minor issues, like using null instead of Void.class and also exercise reuseContainers in the read path test.
Those aren't blockers so let's get this in to unblock the other commits.
|
Merged to main. |
Part of: #12298
Implementation of the new API: #12774
AvroFormatModel and related tests