Example setup for a Postgres database: service, metadata extraction, profiling, and data quality checks.
- The operator is running (see Getting started)
- A Kubernetes Secret with your OpenMetadata JWT token
- (Optional) Kubernetes Secrets with your database credentials, if using valueFrom.secretKeyRef instead of a plain value
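
The Secrets listed above might look like the following. This is a sketch, not a required layout: the Secret names, namespaces, and keys are taken from the references used in the manifests of this example (openmetadata-api-secret with key token; my-db-credentials with keys endpoint, username, and password), while every value is a placeholder to replace. It assumes my-db-credentials lives in the default namespace alongside the resources that reference it, and that endpoint holds a host:port pair because it feeds hostPort.

```yaml
# Secret holding the OpenMetadata JWT token (value is a placeholder).
apiVersion: v1
kind: Secret
metadata:
  name: openmetadata-api-secret
  namespace: openmetadata
type: Opaque
stringData:
  token: "<your-openmetadata-jwt-token>"
---
# Secret holding the Postgres credentials (all values are placeholders).
apiVersion: v1
kind: Secret
metadata:
  name: my-db-credentials
  namespace: default
type: Opaque
stringData:
  endpoint: "<db-host>:5432"
  username: "<db-username>"
  password: "<db-password>"
```

Using stringData lets you write the values unencoded; Kubernetes stores them base64-encoded under data.
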
An OpenMetadataConnection, shared by all resources below:

apiVersion: openmetadata.vortexa.com/v1alpha1
kind: OpenMetadataConnection
metadata:
  name: om-connection
spec:
  url: "http://openmetadata.openmetadata:8585/api"
  authSecretRef:
    name: openmetadata-api-secret
    namespace: openmetadata
    key: token
---
apiVersion: openmetadata.vortexa.com/v1alpha1
kind: OpenMetadataService
metadata:
  name: backend-db
  namespace: default
spec:
  forOpenMetadata:
    serviceType: Postgres
    displayName: "Backend Database"
    connection:
      config:
        type:
          value: Postgres
        hostPort:
          valueFrom:
            secretKeyRef:
              name: my-db-credentials
              key: endpoint
        database:
          value: backend
        username:
          valueFrom:
            secretKeyRef:
              name: my-db-credentials
              key: username
        authType:
          password:
            valueFrom:
              secretKeyRef:
                name: my-db-credentials
                key: password
        sslMode:
          value: prefer
        supportsMetadataExtraction:
          value: true
  openMetadataConnectionRef: om-connection

Extracts tables from the public schema every 6 hours.

apiVersion: openmetadata.vortexa.com/v1alpha1
kind: IngestionPipeline
metadata:
  name: backend-db-metadata
  namespace: default
spec:
  forOpenMetadata:
    pipelineType: metadata
    service:
      fullyQualifiedName: backend-db
      type: databaseService
    sourceConfig:
      config:
        type: DatabaseMetadata
        markDeletedTables: true
        schemaFilterPattern:
          includes:
            - public
    airflowConfig:
      scheduleInterval: "0 */6 * * *"
  openMetadataConnectionRef: om-connection

Computes table and column metrics every 12 hours.

apiVersion: openmetadata.vortexa.com/v1alpha1
kind: IngestionPipeline
metadata:
  name: backend-db-profiler
  namespace: default
spec:
  forOpenMetadata:
    pipelineType: profiler
    service:
      fullyQualifiedName: backend-db
      type: databaseService
    sourceConfig:
      config:
        type: Profiler
        profileSample: 100
        computeTableMetrics: true
        computeColumnMetrics: true
    airflowConfig:
      scheduleInterval: "0 */12 * * *"
  openMetadataConnectionRef: om-connection

A not-null check (no params), a range check (with params), and a custom SQL query on the orders table.

apiVersion: openmetadata.vortexa.com/v1alpha1
kind: OpenMetadataTestCase
metadata:
  name: orders-customer-id-not-null
  namespace: default
spec:
  forOpenMetadata:
    testDefinition: columnValuesToBeNotNull
    entityLink: "<#E::table::backend-db.backend.public.orders::columns::customer_id>"
  openMetadataConnectionRef: om-connection
---
apiVersion: openmetadata.vortexa.com/v1alpha1
kind: OpenMetadataTestCase
metadata:
  name: orders-total-amount-range
  namespace: default
spec:
  forOpenMetadata:
    testDefinition: columnValuesToBeBetween
    entityLink: "<#E::table::backend-db.backend.public.orders::columns::total_amount>"
    parameterValues:
      - name: minValue
        value: "0"
      - name: maxValue
        value: "100000"
  openMetadataConnectionRef: om-connection
---
apiVersion: openmetadata.vortexa.com/v1alpha1
kind: OpenMetadataTestCase
metadata:
  name: orders-no-negative-totals
  namespace: default
spec:
  forOpenMetadata:
    testDefinition: tableCustomSQLQuery
    entityLink: "<#E::table::backend-db.backend.public.orders>"
    parameterValues:
      - name: sqlExpression
        value: "SELECT COUNT(*) FROM orders WHERE total_amount < 0"
      - name: strategy
        value: "COUNT"
      - name: threshold
        value: "0"
    displayName: "No Negative Order Totals"
  openMetadataConnectionRef: om-connection

Schedule the test cases to run. Each table you want to test needs its own test suite pipeline.

apiVersion: openmetadata.vortexa.com/v1alpha1
kind: IngestionPipeline
metadata:
  name: orders-test-suite
  namespace: default
spec:
  forOpenMetadata:
    pipelineType: TestSuite
    service:
      fullyQualifiedName: backend-db.backend.public.orders.testSuite
      type: testSuite
    sourceConfig:
      config:
        type: TestSuite
        entityFullyQualifiedName: backend-db.backend.public.orders.testSuite
    airflowConfig:
      scheduleInterval: "0 */12 * * *"
  openMetadataConnectionRef: om-connection

kubectl get openmetadataservices,ingestionpipelines,openmetadatatestcases

Each resource's READY condition shows whether it has been successfully reconciled with OpenMetadata.
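
Because each tested table needs its own test suite pipeline, covering another table means repeating the orders-test-suite resource with a different table name. A minimal sketch, assuming a hypothetical customers table in the same backend-db.backend.public schema; the resource name and schedule are illustrative, and the field layout simply mirrors the orders pipeline rather than a confirmed schema.

```yaml
# Hypothetical: test suite pipeline for a second table (customers).
# Only the resource name and the table segment of the FQN differ
# from the orders-test-suite pipeline.
apiVersion: openmetadata.vortexa.com/v1alpha1
kind: IngestionPipeline
metadata:
  name: customers-test-suite
  namespace: default
spec:
  forOpenMetadata:
    pipelineType: TestSuite
    service:
      fullyQualifiedName: backend-db.backend.public.customers.testSuite
      type: testSuite
    sourceConfig:
      config:
        type: TestSuite
        entityFullyQualifiedName: backend-db.backend.public.customers.testSuite
    airflowConfig:
      scheduleInterval: "0 */12 * * *"
  openMetadataConnectionRef: om-connection
```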