The open standard for defining Data Product Interfaces.
OpenDPI is to Data Products what OpenAPI is to REST APIs.
Data products are everywhere, but how do you describe them?
- Inconsistent documentation - Every team describes their data differently
- Discovery is hard - No standard way to find what data a product exposes
- Integration friction - Connecting to data products requires tribal knowledge
- Tooling fragmentation - Each platform has its own metadata format
OpenDPI provides a simple, vendor-neutral specification to:
- Describe what data your product exposes
- Document how to connect and access that data
- Define the exact shape of your data with JSON Schema
- Enable tooling, catalogs, and automation
```yaml
opendpi: "1.0.0"
info:
  title: Customer Analytics
  version: "2.1.0"
  description: Aggregated customer behavior metrics

connections:
  analytics_db:
    type: postgresql
    host: analytics.db.example.com
    variables:
      database: analytics
      schema: public

ports:
  daily_metrics:
    description: Daily aggregated customer metrics
    connections:
      - connection: "#/connections/analytics_db"
        location: customer_daily_metrics
    schema:
      type: object
      properties:
        customer_id:
          type: string
        date:
          type: string
          format: date
        total_orders:
          type: integer
        revenue:
          type: number
```

| Concept | Description |
|---|---|
| Connections | Where your data lives - any type, any infrastructure |
| Ports | What data you expose - the interface to your data product |
| Components | Reusable schema definitions for DRY documents |
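
As a rough illustration of how connections and ports relate, the sketch below walks a port's connection references and resolves each one to its connection definition. It assumes the document has already been parsed into a plain Python dict (for example by a YAML parser) and that references are local JSON Pointer fragments such as `#/connections/analytics_db`, as in the example document. `resolve_pointer` and `describe_port` are hypothetical helper names used for illustration, not part of any OpenDPI tooling.

```python
# A subset of the example document, written out as a plain dict.
doc = {
    "connections": {
        "analytics_db": {
            "type": "postgresql",
            "host": "analytics.db.example.com",
            "variables": {"database": "analytics", "schema": "public"},
        }
    },
    "ports": {
        "daily_metrics": {
            "description": "Daily aggregated customer metrics",
            "connections": [
                {
                    "connection": "#/connections/analytics_db",
                    "location": "customer_daily_metrics",
                }
            ],
        }
    },
}


def resolve_pointer(document, pointer):
    """Resolve a local JSON Pointer fragment like '#/connections/analytics_db'."""
    if not pointer.startswith("#/"):
        raise ValueError(f"only local references are supported: {pointer!r}")
    node = document
    for token in pointer[2:].split("/"):
        # Undo JSON Pointer escaping (RFC 6901): '~1' is '/', '~0' is '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


def describe_port(document, port_name):
    """Return (connection type, host, location) for each connection of a port."""
    port = document["ports"][port_name]
    result = []
    for entry in port["connections"]:
        conn = resolve_pointer(document, entry["connection"])
        result.append((conn["type"], conn.get("host"), entry["location"]))
    return result


print(describe_port(doc, "daily_metrics"))
# [('postgresql', 'analytics.db.example.com', 'customer_daily_metrics')]
```

The port itself carries only the reference and the location; everything needed to reach the data (type, host, variables) stays in the connection definition, so several ports can share one connection.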
- Type agnostic - Works with any data infrastructure (databases, streams, object storage, APIs)
- JSON Schema based - Familiar, well-tooled, widely supported
- Reference support - Use `$ref` for DRY, reusable definitions
- Lightweight - Minimal required fields, easy to get started
- Extensible - Custom variables per connection for any type
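
To make the `$ref` point concrete, here is a minimal sketch of how a tool might inline local references before handing a port schema to a JSON Schema validator. The `#/components/schemas/CustomerId` layout is an assumption modeled on OpenAPI conventions, not taken from the OpenDPI specification, and `expand_refs` is a hypothetical helper that handles local references only.

```python
def expand_refs(node, root, seen=()):
    """Return a copy of `node` with local `$ref` objects replaced by their targets."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            if not ref.startswith("#/"):
                raise ValueError(f"only local references are supported: {ref!r}")
            if ref in seen:
                raise ValueError(f"circular reference: {ref!r}")
            # Walk the pointer from the document root to find the target.
            target = root
            for token in ref[2:].split("/"):
                target = target[token]
            # The target may itself contain references, so expand it too.
            return expand_refs(target, root, seen + (ref,))
        return {key: expand_refs(value, root, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_refs(item, root, seen) for item in node]
    return node


# Assumed layout: reusable schemas live under components/schemas.
doc = {
    "components": {"schemas": {"CustomerId": {"type": "string"}}},
    "ports": {
        "daily_metrics": {
            "schema": {
                "type": "object",
                "properties": {
                    "customer_id": {"$ref": "#/components/schemas/CustomerId"},
                    "total_orders": {"type": "integer"},
                },
            }
        }
    },
}

expanded = expand_refs(doc["ports"]["daily_metrics"]["schema"], doc)
print(expanded["properties"]["customer_id"])
# {'type': 'string'}
```

The original document is left untouched; the function builds a new structure, which keeps the authored file DRY while giving downstream tools a fully inlined schema.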
| Version | Status | Docs |
|---|---|---|
| v1 | Current | Getting Started · Specification · Schema |
See CONTRIBUTING.md for guidelines on proposing changes to the specification.