Skip to content

Provide Struct data type support for oracle plugin#659

Open
vanshikaagupta22 wants to merge 1 commit intodata-integrations:developfrom
cloudsufi:oracle-structsupport
Open

Provide Struct data type support for oracle plugin#659
vanshikaagupta22 wants to merge 1 commit intodata-integrations:developfrom
cloudsufi:oracle-structsupport

Conversation

@vanshikaagupta22
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for resolving Oracle STRUCT types into CDAP RECORD schemas by querying the ALL_TYPE_ATTRS metadata table. The implementation includes recursive resolution for nested structures and an updated mapping for primitive Oracle types. Feedback highlights several critical issues: the metadata query lacks an OWNER filter which could lead to incorrect schema resolution in multi-schema environments; fully qualified type names containing dots will cause IllegalArgumentException when creating CDAP records; and there are logic errors in the primitive type mapping, specifically regarding Oracle-specific type naming conventions and an invalid comparison between Java class names and SQL type strings.

Comment on lines +177 to +180
String sql = "SELECT * FROM ALL_TYPE_ATTRS WHERE TYPE_NAME = ? ORDER BY ATTR_NO";

try (PreparedStatement stmt = connection.prepareStatement(sql)) {
stmt.setString(1, typeName.substring(typeName.lastIndexOf('.') + 1));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The query to ALL_TYPE_ATTRS should include a filter on the OWNER column. Without it, if multiple schemas define a type with the same name, the query will return attributes for all of them, leading to an incorrect or corrupted schema.

Additionally, the typeName provided by the JDBC driver is often fully qualified (e.g., OWNER.TYPE_NAME). You should parse the owner from the typeName string if present, and fallback to the provided schemaName otherwise.

if (attrSchema != null) {
fields.add(Schema.Field.of(attrName, attrSchema));
} else {
Schema nestedSchema = getStructSchema(connection, schemaName, attrTypeName);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When recursing for nested STRUCT types, the schemaName should be updated to the owner of the attribute's type to ensure the subsequent query filters correctly. This owner can be retrieved from the ATTR_TYPE_OWNER column in the ALL_TYPE_ATTRS result set.

schemaName, typeName));
}

return Schema.recordOf(typeName, fields);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

CDAP Schema.recordOf requires the record name to be a valid identifier (matching [A-Za-z_][A-Za-z0-9_]*). If typeName is fully qualified (e.g., MY_SCHEMA.MY_TYPE), this call will throw an IllegalArgumentException due to the dot. Consider using only the simple type name part for the record name.

    String recordName = typeName.contains(".") ? typeName.substring(typeName.lastIndexOf('.') + 1) : typeName;
    return Schema.recordOf(recordName, fields);

Comment on lines +211 to +244
case "TIMESTAMP WITH TZ":
return isTimestampOldBehavior ? Schema.of(Schema.Type.STRING) : Schema.of(Schema.LogicalType.TIMESTAMP_MICROS);
case "TIMESTAMP WITH LTZ":
return getTimestampLtzSchema();
case "TIMESTAMP":
return Schema.of(Schema.LogicalType.DATETIME);
case "DATE" :
return Schema.of(Schema.LogicalType.DATE);
case "BINARY FLOAT":
case "FLOAT":
return Schema.of(Schema.Type.FLOAT);
case "BINARY DOUBLE":
case "DOUBLE":
return Schema.of(Schema.Type.DOUBLE);
case "BFILE":
case "RAW":
case "LONG RAW":
return Schema.of(Schema.Type.BYTES);
case "INTERVAL DAY TO SECOND":
case "INTERVAL YEAR TO MONTH":
case "VARCHAR2":
case "VARCHAR":
case "CHAR":
case "CLOB":
case "BLOB":
case "LONG":
return Schema.of(Schema.Type.STRING);
case "INTEGER":
return Schema.of(Schema.Type.INT);
case "NUMBER":
case "DECIMAL":
// FLOAT and REAL are returned as java.sql.Types.NUMERIC but with value that is a java.lang.Double
if (Double.class.getTypeName().equals(typeName)) {
return Schema.of(Schema.Type.DOUBLE);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There are two issues in mapPrimitiveOracleType:

  1. Type Name Mismatches: The strings used in the switch (e.g., "TIMESTAMP WITH TZ", "BINARY FLOAT") likely do not match the values returned by Oracle in ALL_TYPE_ATTRS.ATTR_TYPE_NAME. For example, Oracle typically uses "TIMESTAMP WITH TIME ZONE", "BINARY_FLOAT", and "BINARY_DOUBLE". Please verify these against the Oracle metadata documentation.
  2. Logic Error: At line 243, Double.class.getTypeName().equals(typeName) compares the Java class name string ("java.lang.Double") with the SQL type name (e.g., "NUMBER"), which will always be false. Since metadata is not available here to check the column class, this check should probably be removed or replaced with a check on the SQL type name if applicable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant