IcebergCatalog- updateNamespaceProperties on single commit#4013
IcebergCatalog- updateNamespaceProperties on single commit#4013rambleraptor wants to merge 2 commits intoapache:mainfrom
Conversation
| public boolean updateProperties( | ||
| Namespace namespace, Set<String> removals, Map<String, String> updates) | ||
| throws NoSuchNamespaceException { | ||
| PolarisResolvedPathWrapper resolvedEntities = resolvedEntityView.getResolvedPath(namespace); |
There was a problem hiding this comment.
Why not using resolvedEntityView.getResolvedPath(ResolvedPathKey.ofNamespace(namespace)); like in other methods here ?
| if (!updates.isEmpty()) { | ||
| catalog.setProperties(namespace, updates); | ||
| } | ||
| if (catalog instanceof IcebergCatalog polarisCatalog) { |
There was a problem hiding this comment.
Due to that we have a direct dependency with IcebergCatalog.
As CatalogHandlerUtils implements the SupportNamespace, it should not be "linked" to any catalog specific implementation.
I think it should be cleaner to isolate this in a specific method (for clarity).
There was a problem hiding this comment.
Yes, having instanceof checks in this case does not look nice from the design POV.
If we have to have custom handling for the internal catalog (which is ok), perhaps it is time to refactor this code to use proper polymorphic catalog objects and allow implementations to decide how to best handle updates.
Federated catalogs will have to delegate to the appropriate API.
| catalog.setProperties(namespace, updates); | ||
| } | ||
| if (catalog instanceof IcebergCatalog polarisCatalog) { | ||
| polarisCatalog.updateProperties(namespace, removals, updates); |
There was a problem hiding this comment.
nit: I think if both updates and removals are empty, there's no need to call updateProperties no?
| * @param updates the map of properties to update | ||
| * @return true if the properties were updated, false otherwise | ||
| */ | ||
| public boolean updateProperties( |
There was a problem hiding this comment.
The returned boolean is never used / checked.
| if (!realmConfig.getConfig(FeatureConfiguration.ALLOW_NAMESPACE_LOCATION_OVERLAP)) { | ||
| LOGGER.debug("Validating no overlap with sibling tables or namespaces"); | ||
| validateNoLocationOverlap( | ||
| NamespaceEntity.of(updatedEntity), resolvedEntities.getRawParentPath()); | ||
| } else { | ||
| LOGGER.debug("Skipping location overlap validation for namespace '{}'", namespace); | ||
| } | ||
| if (!realmConfig.getConfig( | ||
| BehaviorChangeConfiguration.ALLOW_NAMESPACE_CUSTOM_LOCATION, catalogEntity)) { | ||
| if (updates.containsKey(PolarisEntityConstants.ENTITY_BASE_LOCATION)) { | ||
| validateNamespaceLocation(NamespaceEntity.of(entity), resolvedEntities); | ||
| } | ||
| } |
There was a problem hiding this comment.
These checks appear twice, can we extract a common utility method?
|
|
||
| catalog.updateProperties( | ||
| NS, | ||
| ImmutableSet.of("prop1", "prop2", "missing_prop"), |
There was a problem hiding this comment.
Nit: let's keep one old prop to assert that it is preserved after the update:
| ImmutableSet.of("prop1", "prop2", "missing_prop"), | |
| ImmutableSet.of("prop1", "missing_prop"), |
| @@ -2614,4 +2615,30 @@ public void testPaginatedListNamespaces() { | |||
| } | |||
| } | |||
| } | |||
|
|
|||
| @Test | |||
| public void testUpdateNamespacePropertiesAtomic() { | |||
There was a problem hiding this comment.
It might be preferable to promote this test to PolarisRestCatalogIntegrationBase, which acts almost like a TCK for Polaris. WDYT?
| @@ -2614,4 +2615,30 @@ public void testPaginatedListNamespaces() { | |||
| } | |||
| } | |||
| } | |||
|
|
|||
| @Test | |||
| public void testUpdateNamespacePropertiesAtomic() { | |||
There was a problem hiding this comment.
What in this test ensures that the operation is "atomic"? 🤔
| Set<String> missing = Sets.difference(removals, startProperties.keySet()); | ||
|
|
||
| if (!updates.isEmpty()) { | ||
| catalog.setProperties(namespace, updates); |
There was a problem hiding this comment.
Does this mean Polaris propagates namespace property updates to Federated catalogs even if CatalogEntity.isStaticFacade() is true? @dennishuo WDYT?
| if (!updates.isEmpty()) { | ||
| catalog.setProperties(namespace, updates); | ||
| } | ||
| if (catalog instanceof IcebergCatalog polarisCatalog) { |
There was a problem hiding this comment.
Yes, having instanceof checks in this case does not look nice from the design POV.
If we have to have custom handling for the internal catalog (which is ok), perhaps it is time to refactor this code to use proper polymorphic catalog objects and allow implementations to decide how to best handle updates.
Federated catalogs will have to delegate to the appropriate API.
|
Thanks for you contribution, @rambleraptor 👍 |
First time Polaris contributor here! I've been working mostly on the Iceberg side.
We ran the PyIceberg Catalog Tests against Polaris (apache/iceberg-python#3106) and discovered that Polaris can't handle adding / removing namespace properties at the same time.
This is because Polaris attempts to handle this as two commits (which run into concurrency issues), when the Iceberg spec supports a single commit.
This PR adds support for creating this single commit for Iceberg Catalogs.
Checklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)