Skip to content

Latest commit

 

History

History
507 lines (394 loc) · 20.7 KB

File metadata and controls

507 lines (394 loc) · 20.7 KB

SQL interface specification for SAP ecosystem

Version 1.0.2, 2024-02-02

Authors: Alexander Böhm, Timm Falter, Arne Harren

Introduction

Scope

With the SQL interface specification for SAP ecosystem, we define a standard document format for describing and publishing metadata of database-level API resources of a (database) server to allow consumers to discover API resources and to interact with a server via SQL client interfaces and SQL query or data manipulation statements.

A SQL interface document focuses on the metadata of exposed database objects like database views and database procedures, e.g. fields of database views, data types, and parameters. In addition, it contains details for establishing data connections to the server via standard SQL client interfaces, i.e. ODBC or JDBC and corresponding drivers, or product-specific call-level interfaces.

Database objects exposed via SQL interfaces are considered stable with well-defined signatures, data semantics, and behavior – although the metadata description discussed here focuses on signatures. SQL interfaces are considered APIs and API evolution is expected to either happen in a backwards compatible way or via additional objects in a new API version and a deprecation of old APIs.

A SQL API is not exposed on the level of a conceptual or abstract model – instead, it is based on a specific database management system (DBMS) product, its features, and its capabilities for describing data structures, including the product-specific data types, i.e. the specification does not define a standardized set of data types. We consider the API to be positioned on the logical database design layer, describing the API objects’ data structures for SQL usage incl. uniqueness constraints and references between objects, but hiding physical details like index types, table partitioning, etc., which we consider implementation and tuning details which might change over time without violating the API contract.

SQL interface documents are represented in YAML or JSON. It is recommended that the corresponding file suffix .yaml or .json is used. While a consumer of SQL interface documents must support both representations, it is not required that providers can deliver both representations, i.e. it is sufficient that a provider returns SQL interface documents as JSON documents only.

Although SQL interface documents are mainly processed by machines, documents should still be human-readable. To support this, providers should create documents with a stable ordering of elements, e.g. list schemas and objects in alphabetical order, and should avoid unnecessary diffs between document versions.

A SQL API could be consumed externally in data federation, data integration, or (cross-system) analytics scenarios, or it could be used for database-level integration between two applications in the same database.

Specification-by-Examples

To illustrate the specification, we build aspects of it along a set of examples.

We follow ideas from the OpenAPI specification (cf. [1], [2]) and use the ABAP flight reference scenario (cf. [3]) as the basis for the exposed SQL views and data structures. The original flight scenario objects were adapted to illustrate extended features.

Examples are written in YAML.

Document Structure

A SQL interface document has the following structure:

  • sqlapi version header
  • info object with a description of the API and general metadata
  • servers array with information about (database) servers and connection details
  • objects object with metadata about the exposed objects like database views and their relationships, structured by logical object types and optionally structured along schemas or databases
  • components object with re-use definitions, e.g. common data types

Example:

sqlapi: "1.0.2"
info:
  version: 1.0.0
  title: Flight Scenario
  ...
servers:
  - ...
objects:
  schemas:
    FLIGHT:
      tableOriented:
        AGENCY:
          ...
        AIRPORT:
          ...
        BOOKING:
          ...
        CARRIER:
          ...
        TRAVEL:
          ...
        ...
      procedures:
        CREATE_TRAVEL:
          ...
        ...
components:
  types:
    atomic:
      DATE:
        atomic:
          name: DATE
      DECIMAL:
        atomic:
          name: DECIMAL
      INTEGER:
        atomic:
          name: INTEGER
      NVARCHAR(1):
        atomic:
          name: NVARCHAR
          length: 1
      NVARCHAR(40):
        atomic:
          name: NVARCHAR
          length: 40
      TIME:
        atomic:
          name: TIME
      ...
    table:
      ...
    ...

General API Information

The info object contains general information about the exposed SQL API, i.e. the API version, the title, a description, contact information, etc., and mandatory metadata of the underlying DBMS, i.e. DBMS kind and (relevant part of) version.

Example:

info:
  version: 1.0.0
  title: Flight Scenario
  description: Flight Scenario SQL API description
  contact:
    name: Flight Scenario contributors
    url: https://flight.example.com/
  license:
    name: Apache License, Version 2.0
    url: https://flight.example.com/license
  dbms:
    kind: Fictional DBMS
    version: 3.14.15

The version field specifies the version of the described SQL API. It is recommended to use Semantic Versioning (semver, cf. [4]) and to follow the rules of the semver specification.

The dbms section defines the kind and version of the underyling DBMS. As the specification does not define an abstraction for data types, the DBMS kind and version need to be used to resolve supported data types, data type semantics, DBMS features, and the SQL dialect.

API Servers and Connection Details

Like in OpenAPI (cf. [1]), a server object provides connection details for a (database) server which serves the API. In the context of SQL APIs, connection details can use different protocols like JDBC, ODBC, etc. with their parameters as key-value pairs.

The servers array contains a list of available API servers:

servers:
  - description: Main Flight Scenario SQL API server
    purposes:
      - federation
    connections:
      - odbc:
          host: api.flight.example.com
          port: 443
          encrypt: true
          parameter1: value1
          parameter2: value2
      - jdbc:
          url: jdbc:fictional-dbms://api.flight.example.com:443/?encrypt=true&parameter=value&other_parameter=other-value

The required database driver and related names of shared libraries (e.g. for ODBC) can be derived from the dbms kind from the info object.

The servers array itself is optional in a SQL interface document. This allows to specify APIs or API snippets without servers for re-use in other interface documents, i.e. referenced via a $ref reference.

The optional purposes array in a server object indicates the supported purpose types of a server, i.e. whether the server can be used for federated access or as a source for replication. The same physical server may expose a federation endpoint for access by business users and another endpoint which serves for federated access by technical users and for replication, e.g. with the same host and port, but different connection properties:

servers:
  - description: Server for federated data access with business users
    purposes:
      - federation
    connections:
      - odbc:
          host: api.flight.example.com
          port: 443
          endpoint: endpoint-value-1
  - description: Server for technical users and replication
    purposes:
      - federation
      - replication
    connections:
      - odbc:
          host: api.flight.example.com
          port: 443
          endpoint: endpoint-value-2

Currently, the defined purpose values are federation and replication.

description and connections are mandatory fields of a servers element.

Exposed Objects

An API contains a set of explicitly exposed SQL objects like views or procedures.

An object has a name, a description, a (logical) type, and further type-specific properties like columns or parameters. Columns and parameters have names and data types, e.g. atomic data types, structured types, or table types.

An atomic data type consists of a type name and optional length, precision, and scale values, e.g. a data type NVARCHAR(256) has the type name NVARCHAR and the length 256, and a DECIMAL(12, 2) has the type name DECIMAL and a precision of 12 and a scale of 2. Note that the type names and type semantics are DBMS-specific.

The following example contains the flight scenarios AIRPORT table as a simple view:

objects:
  schemas:
    FLIGHT:                              # the FLIGHT schema
      tableOriented:                     # set of table-oriented objects in the schema
        AIRPORT:                         # the AIRPORT view
          kind: view
          operations:                    # read (select) & modify (insert, update, delete) access
            - select
            - insert
            - update
            - delete
          columns:                       # ordered list of view columns
            - name: AIRPORT_ID           # column name
              type:                      # column type
                atomic:                  # an atomic type
                  name: NVARCHAR         # DBMS-specific atomic type NVARCHAR
                  length: 3              # with length 3
              notNull: true
            - name: NAME
              type:                      # column type defined as a type reference
                $ref: "#/components/types/atomic/NVARCHAR(40)"
            - name: CITY
              type:
                $ref: "#/components/types/atomic/NVARCHAR(40)"
            - name: COUNTRY_CODE
              type:
                $ref: "#/components/types/atomic/NVARCHAR(3)"
          constraints:
            - kind: unique               # a unique constraint
              columns:                   # on column AIRPORT_ID
                - AIRPORT_ID

Enumeration of API Objects

To enumerate the set of exposed API objects, we use a hierarchical structure with a mixture of logical type keys and names to reflect that e.g. a table-oriented object with a specific name exists inside a schema with a specific name.

The logical type key indicates an object's type from a consumer perspective, i.e. we use the type key tableOriented for tables, views, and table functions, which are different implementation variants of table-oriented API objects which are selectable or updateable. The type key procedures is used for procedures, which are callable API objects.

Example:

objects:
  schemas:
    A_SCHEMA:
      tableOriented:
        A_VIEW
          ...
        ANOTHER_VIEW:
          ...
        ...
      procedures:
        A_PROCEDURE:
          ...
        ...
      ...

This way, each object can be addressed with a path which allows internal and external references to point to individual objects inside an API specification, e.g. $ref "#/objects/schemas/FLIGHT/tableOriented/AIRPORT" which points to the AIRPORT view of the FLIGHT schema. The value of a $ref JSON reference is a JSON pointer [5]. Note that escaping rules apply for the special characters ~ and /, cf. [5] for details.

The path stays stable even if an API object's implementation changes, e.g. a table is replaced by a view, or a table function is replaced by a view with parameters or vice versa. Note that for some DBMS, a view with parameters, where all parameters are optional, might require a different query syntax than a view without parameters, e.g. SELECT ... FROM A_SCHEMA.A_VIEW() vs. SELECT ... FROM A_SCHEMA.A_VIEW. Therefore, such a change is considered an incompatbile API change.

As some (database) servers might require a schema context or even a database context, we allow optional schema and database sub-objects for structuring, i.e.

objects:
  tableOriented:
    A_VIEW:
      ...
    ...
  procedures:
    ...
objects:
  schemas:
    A_SCHEMA:
      tableOriented:
        A_VIEW:
          ...
        ...
      procedures:
        ...
objects:
  databases:
    A_DATABASE:
      schemas:
        A_SCHEMA:
          tableOriented:
            A_VIEW:
              ...
            ...
          procedures:
            ...

Types of API Objects

Table-oriented Objects

Tables, views, views with parameters, and table functions are unified as table-oriented objects.

Table-oriented objects are listed in the API objects enumeration as a tableOriented map from string to table-oriented object, where the map key defines the object's name.

Fields of a table-oriented object:

Field Name Type Description
name string DERIVED. Name of the object; derived from the element's key in the tableOriented map.
description string An optional description of the object.
kind string An optional indicator of the tableOriented object's implementation kind, e.g. table, view, parameterized-view, table-function; a change of the kind is not considered incompatible unless it is visible to the API consumer due to a different access syntax.
operations [Operation string] An optional array with the set of supported operations, e.g. select, insert, update, delete, replicate.
columns [Field object] REQUIRED. Non-empty array with a definition of the object's columns.
constraints [Constraint object] An optional array with column constraints of the object, e.g. columns with unique values.
parameters [Parameter object] An optional array of the object's (input) parameters.
accessPaths [Access Path object] An optional array of access path specifications.

Procedures

Procedures are listed in the API objects enumeration as a procedures map from string to Procedure object, where the map key defines the procedure's name.

Fields of a Procedure object:

Field Name Type Description
name string DERIVED. Name of the object; derived from the element's key in the procedures map.
description string An optional description of the procedure.
operations [Operation string] An optional array with the set of supported operations, e.g. execute.
parameters [Parameter object] REQUIRED. Array of procedure's parameters.

Additional Data Structures

Field

A Field object describes a column of a view or a table type, or a field in a structured type, etc., i.e. it is used inside a columns array.

Fields of a Field object:

Field Name Type Description
name string REQUIRED. Name of the field.
description string An optional description of the field.
type Type object REQUIRED. Type of the field.
notNull boolean Optional NOT NULL property of the field; default false.

Parameter

A Parameter object is used to describe a single parameters inside a parameters array of a tabluar objects, e.g. a view or table function, or a procedure.

Fields of a Parameter object:

Field Name Type Description
name string REQUIRED. Name of the parameter.
description string An optional description of the parameter.
mode string REQUIRED. Indicator of the parameter mode: in, out, inout.
type Type object REQUIRED. Type of the parameter.
optional boolean An optional indicator that the parameter is optional; default is false.

Constraint

A constraint defines a restriction of the values of some columns, e.g. uniqueness for the values of a certain column or the value pairs of two specific columns.

Fields of a Constraint object:

Field Name Type Description
kind string REQUIRED. The kind of constraint, e.g. unique.
columns [string] REQUIRED. Non-empty array with the constraint's column names.

Access Paths

By using query predicates on the attributes that are contained in an access path, query operations on a SQL data source can be executed in an efficient way. Currently, we envision two kinds of access paths to be provided. Indexes allow for efficient point queries targeting only a few records, while partitions can also be used to accelerate complex analytical queries but provide less benefit for lookup operations.

Fields of an Access Path object:

Field Name Type Description
kind string REQUIRED. The kind of access path that is provided, i.e. index or partition.
columns [string] REQUIRED. Non-empty array with the column names the access path is provided for.

Type

The construction of types is generic, i.e. it is possible to construct a table type with columns of tables or arrays of tables. As a SQL interface description always exists in the context of a real DBMS, the real type construction is limited by this system's capabilities. The metadata about the DBMS is contained in the info section of the SQL interface document.

Fields of a Type object:

Field Name Type Description
$ref Reference Optional reference to a type definition.
atomic Atomic Type object Optional atomic type definition.
table Table Type object Optional table type definition.
array Array Type object Optional array type definition.
structure Structure Type object Optional structure type definition.

Exactly one of the optional fields must be set.

Example:

A_VIEW:
  columns:
    - name: COL1
      type:
        atomic:
          name: NVARCHAR
          length: 40
    - name: COL2
      type:
        array:
          type:
            atomic:
              name: INTEGER
    - name: COL3
      type:
        structure:
          fields:
            - name: FIELD1
              type:
                atomic:
                  name: INTEGER
            - name: FIELD2
              type:
                atomic:
                  name: INTEGER

Atomic Type

Field Name Type Description
name string REQUIRED. Name of the atomic type.
length number Optional length of the type; type-dependent setting.
precision number Optional precision of the type; type-dependent setting.
scale number Optional scale of the type; type-dependent setting.

Table Type

Field Name Type Description
columns [Field object] REQUIRED. Non-empty array with a definition of the type's columns.

Array Type

Field Name Type Description
type Type object REQUIRED. Data type of the array's elements.
length number Optional max. length of the array.

Structure Type

Field Name Type Description
fields [Field object] REQUIRED. Non-empty array with a definition of the type's fields.

Re-usable Aspects

OpenAPI defines a components object which contains re-use aspects for the API specification. For a SQL interface, such a section could be used to e.g. define symbolic names for common data types:

components:
  types:
    atomic:
      NVARCHAR(3):
        atomic:
          name: NVARCHAR
          length: 3
      ...

A data type from the re-use aspects can be referenced with its path instead of defining the type inline, e.g.

  - name: AIRPORT_ID
    type:
      atomic:
        name: NVARCHAR
        length: 3

can be written as

  - name: AIRPORT_ID
    type:
      $ref: "#/components/types/atomic/NVARCHAR(3)"

Note that the components/types object contains dedicated sub-objects for array, atomic, structure, and table type definitions to have a readable path like #/components/types/atomic/NVARCHAR(3). Each type definition still needs to be a complete type definition including its type keyword, i.e. array, atomic, structure, and table appear pair-wise.

Extensions

The specification reserves property names starting with x- for custom extensions at certain locations of a SQL interface document. This way, a provider can add and transport additional metadata which is not covered by the specification (yet).

References

[1] The OpenAPI Specification, Version 3.1.0, 2021. https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.1.0.md

[2] The OpenAPI Specification, Petstore example, Version 3.0, 2020. https://github.com/OAI/OpenAPI-Specification/blob/master/examples/v3.0/petstore-expanded.yaml

[3] ABAP Flight Reference Scenario https://help.sap.com/docs/ABAP_PLATFORM_NEW/fc4c71aa50014fd1b43721701471913d/a9d7c7c140a0408dbc5966c52d156b49.html, https://github.com/SAP-samples/abap-platform-refscen-flight

[4] Semantic Versioning, Version 2.0.0, 2013. https://semver.org/spec/v2.0.0.html

[5] RFC 6901, JavaScript Object Notation (JSON) Pointer, 2013. https://www.rfc-editor.org/rfc/rfc6901.txt