@hashmapybx

Purpose

Linked issue:

This PR enhances the test coverage for the Lance file format implementation in Paimon. The changes include:

  1. Fix compilation errors: Fixed the generic type parameters used with the FileWriter interface

  2. Comprehensive test coverage: Added 14 new test methods covering:

    • All supported numeric types (TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL)
    • All string and binary types (CHAR, VARCHAR, BINARY, VARBINARY, BYTES)
    • Time-related types (DATE, TIME, TIMESTAMP with various precisions)
    • Complex types (ARRAY, MULTISET, VARIANT)
    • Nested RowType structures
    • Unsupported types validation (MAP, TIMESTAMP_WITH_LOCAL_TIME_ZONE)
    • Configuration tests for batch size and memory settings
    • Projection scenarios for column pruning
    • Edge cases (empty RowType, single field types, mixed array types)

  3. Documentation improvements: Added descriptive comments to all test methods to improve readability and maintainability

Tests

  • LanceFileFormatTest unit tests in paimon-lance/src/test/java/org/apache/paimon/format/lance/LanceFileFormatTest.java

The test class contains the following 17 test methods:

  • testCreateReaderFactory - Basic reader factory creation
  • testCreateWriterFactory - Basic writer factory creation
  • testValidateDataFields_UnsupportedType_Map - Validation rejects MAP type
  • testValidateDataFields_UnsupportedType_LocalZonedTimestamp - Validation rejects TIMESTAMP_WITH_LOCAL_TIME_ZONE
  • testValidateDataFields_SupportedTypes_Basic - Basic supported types validation
  • testValidateDataFields_AllNumericTypes - All numeric types validation
  • testValidateDataFields_AllStringTypes - All string/binary types validation
  • testValidateDataFields_TimeTypes - Time types with different precisions
  • testValidateDataFields_ComplexTypes - Arrays, multisets, and variants
  • testValidateDataFields_NestedRowType - Nested structures
  • testReaderFactory_WithProjectedTypes - Column pruning scenarios
  • testReaderFactory_BatchSizeConfiguration - Batch size configuration
  • testWriterFactory_BatchSizeConfiguration - Writer batch and memory configuration
  • testValidateDataFields_EmptyRowType - Edge case: empty RowType
  • testValidateDataFields_SingleFieldTypes - Single field type scenarios
  • testValidateDataFields_MixedArrayTypes - Mixed array element types
  • testValidateDataFields_VariantType - VARIANT type support
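
For readers unfamiliar with the validation these tests exercise, here is a minimal, self-contained sketch of the general pattern: walk a schema and reject the two types this PR lists as unsupported (MAP and TIMESTAMP_WITH_LOCAL_TIME_ZONE). The `Kind` enum, the `Type` class, and the `validate` and `isRejected` helpers are hypothetical stand-ins invented for illustration; they are not Paimon's `DataType` classes or the actual `LanceFileFormat` implementation. Whether the real validation recurses into nested types is also an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: none of these types come from Paimon.
final class LanceValidationSketch {

    enum Kind { INT, BIGINT, VARCHAR, TIMESTAMP, TIMESTAMP_WITH_LOCAL_TIME_ZONE, ARRAY, MAP, ROW }

    // A type node: its kind plus child types (element type for ARRAY, fields for ROW).
    static final class Type {
        final Kind kind;
        final List<Type> children;

        private Type(Kind kind, List<Type> children) {
            this.kind = kind;
            this.children = children;
        }

        static Type of(Kind kind, Type... children) {
            return new Type(kind, Arrays.asList(children));
        }
    }

    // The two kinds the PR description lists as unsupported.
    private static final Set<Kind> UNSUPPORTED =
            EnumSet.of(Kind.MAP, Kind.TIMESTAMP_WITH_LOCAL_TIME_ZONE);

    // Throws if the type, or any type nested inside it, is unsupported.
    // Recursing into children is an assumption of this sketch.
    static void validate(Type type) {
        if (UNSUPPORTED.contains(type.kind)) {
            throw new UnsupportedOperationException("Unsupported type: " + type.kind);
        }
        for (Type child : type.children) {
            validate(child);
        }
    }

    // Convenience wrapper so callers can assert on a boolean.
    static boolean isRejected(Type type) {
        try {
            validate(type);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Type supportedRow = Type.of(Kind.ROW,
                Type.of(Kind.INT),
                Type.of(Kind.ARRAY, Type.of(Kind.VARCHAR)));
        Type nestedMap = Type.of(Kind.ROW,
                Type.of(Kind.ARRAY, Type.of(Kind.MAP)));

        System.out.println(isRejected(Type.of(Kind.MAP))); // prints "true"
        System.out.println(isRejected(supportedRow));      // prints "false"
        System.out.println(isRejected(nestedMap));         // prints "true"
    }
}
```

In a JUnit test, the rejecting cases would typically be written with an exception assertion and the supported cases as a plain call that must not throw.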

API and Format

No. This change only affects test code and does not modify any public APIs or storage formats.

Documentation

No. This is a test enhancement with no new features introduced.

JingsongLi and others added 30 commits September 24, 2025 16:22
Zouxxyy and others added 30 commits October 30, 2025 17:43