feat: add OpenMetadataEntityTag CRD #5

Open
berimbolo13 wants to merge 5 commits into master from feat/entity-tag-crd

Conversation

berimbolo13 commented May 5, 2026

Description

Adds a new OpenMetadataEntityTag CRD for declaratively tagging OpenMetadata entities the operator does not create directly (tables, topics, databaseSchemas, databases, dashboards, mlmodels, pipelines, containers, searchIndexes), i.e. entities discovered by OM's ingestion pipelines.

Each CR specifies:

  • spec.match.entityType: which OM entity type to target
  • spec.match.includes / excludes: Lucene-style FQN patterns (* and ? wildcards) selecting the assets in scope
  • spec.tag.tagFQN: the tag to apply (e.g. Tier.Tier3)
  • spec.openMetadataConnectionRef: the cluster-scoped connection
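
As a rough illustration of how these fields fit together, here is a minimal Go sketch of the spec shape. The type names, the `Name` field on the connection reference, and the sample patterns are assumptions made for the example; the actual definitions live in api/v1alpha1 and may differ.

```go
package main

import "fmt"

// Hypothetical shapes: field names mirror the spec paths listed above,
// but the real types in api/v1alpha1 may be structured differently.
type EntityMatch struct {
	EntityType string   `json:"entityType"`
	Includes   []string `json:"includes"`
	Excludes   []string `json:"excludes,omitempty"`
}

type TagRef struct {
	TagFQN string `json:"tagFQN"`
}

// ConnectionRef is assumed to reference the cluster-scoped connection by name.
type ConnectionRef struct {
	Name string `json:"name"`
}

type EntityTagSpec struct {
	Match                     EntityMatch   `json:"match"`
	Tag                       TagRef        `json:"tag"`
	OpenMetadataConnectionRef ConnectionRef `json:"openMetadataConnectionRef"`
}

// exampleSpec tags every table under a made-up "warehouse.prod" prefix with
// Tier.Tier3, except tables whose FQN matches a made-up temporary pattern.
func exampleSpec() EntityTagSpec {
	return EntityTagSpec{
		Match: EntityMatch{
			EntityType: "table",
			Includes:   []string{"warehouse.prod.*"},
			Excludes:   []string{"warehouse.prod.tmp_*"},
		},
		Tag:                       TagRef{TagFQN: "Tier.Tier3"},
		OpenMetadataConnectionRef: ConnectionRef{Name: "om-prod"},
	}
}

func main() {
	fmt.Printf("%+v\n", exampleSpec())
}
```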

The reconciler:

  1. Resolves the OM connection and looks up the tag's UUID
  2. Queries OM's /v1/search/query endpoint with the include/exclude patterns to find currently-matching entities
  3. Compares against status.TagAssignments (what we applied last reconcile) and either:
    • Steady state: bulk-adds the tag to newly-matched entities and bulk-removes it from entities that fell out of scope
    • Rename (status records a different tag than spec): bulk-adds the new tag to all matched entities, then bulk-removes the old tag from every entity recorded under it
  4. Persists the new state to status.TagAssignments, sorted by FQN
  5. On CR deletion, the finalizer removes the tag from all recorded assets before releasing the CR
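
The steady-state comparison in step 3 can be sketched as a set difference between the FQNs matched now and the FQNs recorded on the previous reconcile. This is only an illustration: `Assignment` and `diffAssignments` are hypothetical names, not necessarily the ones used in internal/handler.

```go
package main

import (
	"fmt"
	"sort"
)

// Assignment is a hypothetical stand-in for one entry of status.TagAssignments.
type Assignment struct {
	FQN    string
	TagFQN string
}

// diffAssignments returns the FQNs that need the tag added (matched now but
// not recorded) and the FQNs that need it removed (recorded but no longer
// matched). Both slices are sorted so the result is deterministic.
func diffAssignments(matched []string, recorded []Assignment) (toAdd, toRemove []string) {
	matchedSet := make(map[string]struct{}, len(matched))
	for _, fqn := range matched {
		matchedSet[fqn] = struct{}{}
	}

	recordedSet := make(map[string]struct{}, len(recorded))
	for _, a := range recorded {
		recordedSet[a.FQN] = struct{}{}
		if _, ok := matchedSet[a.FQN]; !ok {
			toRemove = append(toRemove, a.FQN)
		}
	}

	for fqn := range matchedSet {
		if _, ok := recordedSet[fqn]; !ok {
			toAdd = append(toAdd, fqn)
		}
	}

	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	recorded := []Assignment{
		{FQN: "db.s.a", TagFQN: "Tier.Tier3"},
		{FQN: "db.s.b", TagFQN: "Tier.Tier3"},
	}
	toAdd, toRemove := diffAssignments([]string{"db.s.b", "db.s.c"}, recorded)
	fmt.Println("add:", toAdd, "remove:", toRemove)
}
```

Entities present on both sides need no API call, which is what keeps a no-op reconcile cheap.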

Why

OM's ingestion pipelines write entities directly to the OM API, bypassing Kubernetes, so we have no native way to declaratively tag them.

Design notes

  • Bulk endpoints: tagging uses OM's PUT /v1/tags/{id}/assets/add and .../remove, which take a tag UUID + list of asset references. Each bulk add or remove is one HTTP call per CR per reconcile, regardless of how many entities match.
  • Status invariant: every entry in status.TagAssignments shares the same tagFQN. This simplifies rename detection (one comparison) and deletion (one bulk-remove).
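
To show why the status invariant reduces rename detection to one comparison, here is a small sketch. It assumes the recorded assignments are available as a slice; `Assignment`, `recordedTag`, and `isRename` are hypothetical names for illustration only.

```go
package main

import "fmt"

// Assignment is a hypothetical stand-in for one entry of status.TagAssignments.
type Assignment struct {
	FQN    string
	TagFQN string
}

// recordedTag relies on the invariant that every recorded entry carries the
// same tagFQN, so looking at the first entry is enough. It returns "" when
// nothing has been recorded yet.
func recordedTag(recorded []Assignment) string {
	if len(recorded) == 0 {
		return ""
	}
	return recorded[0].TagFQN
}

// isRename reports whether status records a different tag than the spec asks
// for. An empty status is a first apply, not a rename.
func isRename(specTagFQN string, recorded []Assignment) bool {
	old := recordedTag(recorded)
	return old != "" && old != specTagFQN
}

func main() {
	recorded := []Assignment{{FQN: "db.s.a", TagFQN: "Tier.Tier3"}}
	fmt.Println(isRename("Tier.Tier2", recorded))
	fmt.Println(isRename("Tier.Tier3", recorded))
}
```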

Introduces a new CRD for declaratively tagging OpenMetadata entities
discovered by ingestion (tables, topics, schemas, etc.). The reconciler
queries OM's search endpoint for entities matching FQN patterns and
applies a tag via the bulk tag-asset endpoints, recording assignments
in status to drive drift detection and rename cleanup.

berimbolo13 marked this pull request as ready for review May 6, 2026 08:44
Copilot AI review requested due to automatic review settings May 6, 2026 08:44

Copilot AI left a comment

Pull request overview

Adds a new OpenMetadataEntityTag custom resource + controller path to declaratively apply/remove OpenMetadata tags on entities discovered via ingestion (i.e., not created by this operator), using OpenMetadata search + bulk tag-asset endpoints.

Changes:

  • Introduces the OpenMetadataEntityTag CRD/API types (match entity type + FQN patterns, tag FQN, connection ref) and wires the controller into cmd/main.go.
  • Implements reconciler/handler logic to search matched entities, diff vs. status.tagAssignments, and bulk add/remove tags (including rename + finalizer cleanup paths).
  • Extends the OpenMetadata client package with search + bulk tag-asset operations and adds unit/controller tests plus RBAC/CRD manifests.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| internal/omclient/types.go | Adds search/bulk-tag request/response types used by entity-tagging. |
| internal/omclient/interface.go | Introduces EntityTagClient interface for tagging/search operations. |
| internal/omclient/entitytag.go | Implements search pagination/query building and bulk add/remove tag-asset calls. |
| internal/handler/entitytag_index.go | Maps supported entity types to OpenMetadata search index names. |
| internal/handler/entitytag_handler.go | Core Observe→Compare→Converge logic, rename handling, and deletion finalizer cleanup. |
| internal/handler/entitytag_handler_test.go | Unit tests for diffing, recorded tag lookup, and index resolution. |
| internal/controller/openmetadataentitytag_controller.go | New controller wiring finalizer + delegation to handler. |
| internal/controller/openmetadataentitytag_controller_test.go | Envtest coverage for controller flow (finalizer, apply, delete cleanup). |
| config/rbac/role.yaml | Grants manager-role access to the new CR and its status/finalizers. |
| config/rbac/openmetadataentitytag_viewer_role.yaml | Adds read-only ClusterRole for the new CR. |
| config/rbac/openmetadataentitytag_editor_role.yaml | Adds editor ClusterRole for the new CR. |
| config/rbac/openmetadataentitytag_admin_role.yaml | Adds admin ClusterRole for the new CR. |
| config/rbac/kustomization.yaml | Includes the new RBAC role manifests. |
| config/crd/kustomization.yaml | Includes the new CRD base. |
| config/crd/bases/openmetadata.vortexa.com_openmetadataentitytags.yaml | New generated CRD manifest for OpenMetadataEntityTag. |
| cmd/main.go | Registers the new controller with the manager. |
| api/v1alpha1/zz_generated.deepcopy.go | Generated deepcopy updates for new API types. |
| api/v1alpha1/tag_types.go | Adds TagRef API type. |
| api/v1alpha1/openmetadataentitytag_types.go | Adds OpenMetadataEntityTag API types/spec/status. |
| api/v1alpha1/conditions.go | Adds new condition reasons for entity-tagging reconcile outcomes. |

Comment thread internal/handler/entitytag_handler.go Outdated
Comment thread internal/controller/openmetadataentitytag_controller.go
Comment on lines +45 to +55
// SearchEntities returns every entity in the named search index whose
// fullyQualifiedName matches any include pattern and no exclude pattern.
// Wildcards '*' (zero or more chars) and '?' (one char) in patterns are
// passed through to OpenMetadata's search endpoint unchanged. Pagination is
// handled internally.
func (c *Client) SearchEntities(ctx context.Context, searchIndex string, includes, excludes []string) ([]EntitySummary, error) {
	if len(includes) == 0 {
		return nil, nil
	}
	q := buildSearchQuery(includes, excludes)
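
For readers who want to reason about which FQNs a given include/exclude set selects, the semantics in the doc comment above ("matches any include pattern and no exclude pattern", with '*' as zero or more characters and '?' as exactly one) can be approximated locally. The sketch below is not how the PR evaluates patterns: the PR passes them to OpenMetadata's search endpoint unchanged, so server-side behavior may differ in edge cases. `wildcardToRegexp`, `matchesAny`, and `inScope` are hypothetical helpers, and escape sequences in patterns are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wildcardToRegexp converts a pattern using '*' and '?' into an anchored
// regular expression. All other characters are matched literally.
func wildcardToRegexp(pattern string) (*regexp.Regexp, error) {
	quoted := regexp.QuoteMeta(pattern)
	quoted = strings.ReplaceAll(quoted, `\*`, ".*")
	quoted = strings.ReplaceAll(quoted, `\?`, ".")
	return regexp.Compile("(?s)^" + quoted + "$")
}

// matchesAny reports whether fqn matches at least one of the patterns.
func matchesAny(fqn string, patterns []string) (bool, error) {
	for _, p := range patterns {
		re, err := wildcardToRegexp(p)
		if err != nil {
			return false, err
		}
		if re.MatchString(fqn) {
			return true, nil
		}
	}
	return false, nil
}

// inScope applies the "any include, no exclude" rule. With no include
// patterns nothing is in scope, mirroring the early return in SearchEntities.
func inScope(fqn string, includes, excludes []string) (bool, error) {
	included, err := matchesAny(fqn, includes)
	if err != nil || !included {
		return false, err
	}
	excluded, err := matchesAny(fqn, excludes)
	if err != nil {
		return false, err
	}
	return !excluded, nil
}

func main() {
	includes := []string{"prod.*"}
	excludes := []string{"prod.tmp.*"}
	for _, fqn := range []string{"prod.sales.orders", "prod.tmp.scratch", "dev.sales.orders"} {
		ok, err := inScope(fqn, includes, excludes)
		fmt.Println(fqn, ok, err)
	}
}
```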

Comment on lines +123 to +126
msg := fmt.Sprintf("Applied %s to %d %s entities", tagFQN, len(matched), et.Spec.Match.EntityType)
h.setConditionAndPersist(ctx, et, metav1.ConditionTrue, omv1alpha1.ReasonInSync, msg)
h.emitEvent(et, corev1.EventTypeNormal, omv1alpha1.ReasonInSync, msg)
