
Feature request: first-class hook to attach BigQuery job labels to generated queries #11054

Description

@darrenjl

Problem

When Cube executes queries against BigQuery, there is currently no supported way to attach BigQuery job labels to the underlying jobs. Job labels appear in INFORMATION_SCHEMA.JOBS.labels and are the standard mechanism for attributing query cost and usage back to an application, user, or request.

Without labels, the only way to correlate a Cube request with a BigQuery job is via timestamp matching — fragile and imprecise under concurrent load.

This was raised previously in #7753, where the reporter found a workaround by subclassing BigQueryDriver and reading requestId from the options parameter passed to query(). That workaround works but requires maintaining a custom driver subclass and is not documented or officially supported.
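
For context, the shape of that workaround can be sketched roughly as follows. Only the read of `options.requestId` comes from #7753. `StubDriver`, `runJob`, `buildJobConfig`, and the `labels` field on the job config are hypothetical names used so the snippet runs on its own, not the real driver's internals.

```javascript
// Stand-in for BigQueryDriver so this sketch runs on its own (hypothetical);
// the real class ships in @cubejs-backend/bigquery-driver and differs internally.
class StubDriver {
  async query(sql, values, options) {
    return this.runJob({ query: sql, params: values });
  }

  // Hypothetical submission point; a real driver would create a BigQuery job here.
  async runJob(jobConfig) {
    return jobConfig;
  }
}

class LabelledDriver extends StubDriver {
  // Build the job config, adding a request_id label when options carries one.
  buildJobConfig(sql, values, options = {}) {
    const labels = options.requestId
      ? { request_id: String(options.requestId).toLowerCase() }
      : {};
    return { query: sql, params: values, labels };
  }

  // Override query() so every job goes out with the labels attached.
  async query(sql, values, options) {
    return this.runJob(this.buildJobConfig(sql, values, options));
  }
}

const config = new LabelledDriver().buildJobConfig('SELECT 1', [], {
  requestId: 'abc123-span-1',
});
console.log(config.labels); // { request_id: 'abc123-span-1' }
```

Even in this reduced form, the subclass has to know where the job config is assembled, which is the maintenance burden described above.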

Use Case

Teams running Cube on BigQuery often want to attribute query cost and usage back to individual users, tenants, or upstream requests for observability and chargeback purposes. BigQuery job labels are the standard way to do this — they appear in INFORMATION_SCHEMA.JOBS.labels and can be used to slice cost and usage data.

This requires labels to be dynamic per request, derived from context available at query time (such as the security context, request ID, or custom headers) rather than statically configured at server startup. With such labels in place, cost and usage can be queried per user or tenant, for example:

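-- labels is an array of key/value structs, so match entries via UNNEST
SELECT * FROM `region-eu.INFORMATION_SCHEMA.JOBS`
WHERE EXISTS (SELECT 1 FROM UNNEST(labels) AS l WHERE l.key = 'user_id' AND l.value = '123')
  AND EXISTS (SELECT 1 FROM UNNEST(labels) AS l WHERE l.key = 'tenant' AND l.value = 'acme')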

Proposed Solution

A new configuration hook — queryLabels or similar — that receives per-request context and returns a flat key-value map to attach as labels on the outgoing BigQuery job:

// cube.js
module.exports = {
  queryLabels: ({ securityContext, requestId }) => ({
    user_id: securityContext.userId,
    tenant: securityContext.tenantId,
    request_id: requestId,
  }),
};

The hook should receive at minimum:

  • securityContext — the resolved security context for the request
  • requestId — Cube's internal request ID
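
One way Cube could consume such a hook is sketched below, assuming a hypothetical internal helper named `resolveQueryLabels`. It calls the hook, drops missing values, and normalises keys and values to BigQuery's label rules (lowercase letters, digits, underscores and dashes; at most 63 characters; keys starting with a letter). BigQuery also accepts international characters, which this stricter sketch replaces with underscores.

```javascript
// Hypothetical helper: call the user-supplied hook and normalise its result
// so the map is acceptable as BigQuery job labels.
function resolveQueryLabels(queryLabels, context) {
  if (typeof queryLabels !== 'function') return {};
  const raw = queryLabels(context) || {};

  // Lowercase, replace disallowed characters, and cap the length at 63.
  const clean = (s) =>
    String(s).toLowerCase().replace(/[^a-z0-9_-]/g, '_').slice(0, 63);

  const labels = {};
  for (const [key, value] of Object.entries(raw)) {
    if (value === undefined || value === null) continue; // skip missing context fields
    const k = clean(key);
    if (!/^[a-z]/.test(k)) continue; // keys must start with a lowercase letter
    labels[k] = clean(value);
  }
  return labels;
}

const labels = resolveQueryLabels(
  ({ securityContext, requestId }) => ({
    user_id: securityContext.userId,
    tenant: securityContext.tenantId,
    request_id: requestId,
  }),
  { securityContext: { userId: 123, tenantId: 'Acme Corp' }, requestId: 'abc-span-1' }
);
console.log(labels);
// { user_id: '123', tenant: 'acme_corp', request_id: 'abc-span-1' }
```

Normalising inside Cube rather than in user code would keep a malformed label from failing the whole BigQuery job, though rejecting invalid labels outright is an equally reasonable design.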

This would be BigQuery-specific initially but could generalise to other drivers that support query tagging (e.g. Snowflake query tags, Redshift query groups).
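
As a rough illustration of that generalisation, the same label map could be translated per driver. The function name `toQueryTag`, its return shape, and the `key=value` encoding are assumptions for this sketch. The Snowflake `QUERY_TAG` session parameter and the Redshift `query_group` setting are real features, but how Cube's drivers would issue these statements is not specified here.

```javascript
// Hypothetical adapter: turn one label map into a per-warehouse tagging action.
// Keys and values must already be limited to [a-z0-9_-]; anything else is
// rejected, so no SQL escaping is attempted.
function toQueryTag(driverType, labels) {
  const safe = /^[a-z0-9_-]*$/;
  for (const [key, value] of Object.entries(labels)) {
    if (!safe.test(key) || !safe.test(value)) {
      throw new Error(`Unsafe label: ${key}`);
    }
  }
  const encoded = Object.entries(labels)
    .map(([key, value]) => `${key}=${value}`)
    .join(',');

  switch (driverType) {
    case 'bigquery':
      return { jobLabels: labels }; // attached to the job itself
    case 'snowflake':
      return { preStatement: `ALTER SESSION SET QUERY_TAG = '${encoded}'` };
    case 'redshift':
      return { preStatement: `SET query_group TO '${encoded}'` };
    default:
      return {}; // driver has no known tagging mechanism
  }
}

console.log(toQueryTag('snowflake', { tenant: 'acme', request_id: 'r-1' }));
// { preStatement: "ALTER SESSION SET QUERY_TAG = 'tenant=acme,request_id=r-1'" }
```

Session-level settings persist on a connection, so with pooled connections a driver would also need to reset or overwrite the tag before the next request.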

Alternatives Considered

  • Subclassing BigQueryDriver (current workaround from #7753): works but requires custom driver maintenance and deep knowledge of driver internals
  • SQL comments: queryRewrite operates at the Cube DSL level, not raw SQL, so injecting /* request_id=... */ comments is not possible without patching the driver
  • Timestamp correlation: works but is imprecise under concurrent load
