Skip to content

Data track schema metadata#1994

Draft
ladvoc wants to merge 7 commits into
mainfrom
ladvoc/schema-metadata
Draft

Data track schema metadata#1994
ladvoc wants to merge 7 commits into
mainfrom
ladvoc/schema-metadata

Conversation

@ladvoc

@ladvoc ladvoc commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Adds support for attaching schema metadata to data tracks mirroring the implementation in livekit/rust-sdks#1159.

Usage:

// Publish
const track = await room.localParticipant.publishDataTrack({
  name: 'my_sensor_data',
  schema: { name: 'rgb', encoding: 'protobuf' },
  frameEncoding: 'protobuf',
});

// Subscribe
room.on(RoomEvent.DataTrackPublished, (track) => {
  console.log(track.info.schema);         // { name: 'rgb', encoding: 'protobuf' }
  console.log(track.info.frameEncoding);  // 'protobuf'
});

Note: the scope of this PR has been limited to exposing schema metadata fields; schema storage will be added in a follow-up.

Areas for reviewers to scrutinize:

  • Convention for Protobuf conversion
  • API shape of DataTrackSchemaId

@changeset-bot

changeset-bot Bot commented Jul 1, 2026

Copy link
Copy Markdown

⚠️ No Changeset found

Latest commit: 8c5df6a

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

size-limit report 📦

Path Size
dist/livekit-client.esm.mjs 101.66 KB (+0.92% 🔺)
dist/livekit-client.umd.js 110.56 KB (+0.78% 🔺)

@1egoman 1egoman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me!

Comment thread src/room/data-track/outgoing/OutgoingDataTrackManager.test.ts
Comment thread src/room/data-track/schema.test.ts Outdated
Comment on lines +135 to +139

export const DataTrackSchemaEncoding = {
from(protocol: ProtocolDataTrackSchemaEncoding): DataTrackSchemaEncoding {
switch (protocol.value.case) {
case 'wellKnown':

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I'm a big fan of this pattern where you have a type and value which share the same name, and typescript merges them into one identifier which can be used in either context. However, @lukasIO has raised in the past this can be somewhat confusing if you don't understand what is going on.

I'll let him weigh in but I'm fine with this as is especially since it's not directly part of the external interface.

Comment on lines +77 to +90
/**
* Identifier for a data track schema.
*
* A compound identifier with two components: {@link name} and {@link encoding}.
*
* Two IDs are equal only if both components match; the same name with a different
* encoding refers to a distinct schema.
*/
export type DataTrackSchemaId = {
/** Name component of the identifier. Must be non-empty and no longer than 256 characters. */
name: string;
/** Encoding component of the identifier. */
encoding: DataTrackSchemaEncoding;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: It is weird to me to have a type which ends in -Id that is a non primative type. Should this be called something maybe like DataTrackSchemaReference or even just DataTrackSchema?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open to changing this, but my thinking here was this acts like a compound ID or key in a database:

  • When publishing a track, you provide the ID of the schema it uses, not the schema itself.
  • When storing/retrieving the definition for the schema, it is referenced by this ID.

Maybe DataTrackSchemaKey would make more sense here since this isn't a primitive? Also worth noting, not that we necessarily need to align with it, but it is called -Id at the protocol level.

Comment thread src/room/data-track/schema.ts Outdated
Comment on lines +102 to +112
const WELL_KNOWN_TO_SCHEMA_ENCODING: Partial<
Record<ProtocolWellKnownSchemaEncoding, DataTrackSchemaEncoding>
> = {
[ProtocolWellKnownSchemaEncoding.PROTOBUF]: 'protobuf',
[ProtocolWellKnownSchemaEncoding.FLATBUFFER]: 'flatbuffer',
[ProtocolWellKnownSchemaEncoding.ROS1_MSG]: 'ros1Msg',
[ProtocolWellKnownSchemaEncoding.ROS2_MSG]: 'ros2Msg',
[ProtocolWellKnownSchemaEncoding.ROS2_IDL]: 'ros2Idl',
[ProtocolWellKnownSchemaEncoding.OMG_IDL]: 'omgIdl',
[ProtocolWellKnownSchemaEncoding.JSON_SCHEMA]: 'jsonSchema',
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I'm assuming this always is a 1:1 mapping? If so, you (I'll leave it up to you to decide) could avoid the duplication with something like:

Suggested change
const WELL_KNOWN_TO_SCHEMA_ENCODING: Partial<
Record<ProtocolWellKnownSchemaEncoding, DataTrackSchemaEncoding>
> = {
[ProtocolWellKnownSchemaEncoding.PROTOBUF]: 'protobuf',
[ProtocolWellKnownSchemaEncoding.FLATBUFFER]: 'flatbuffer',
[ProtocolWellKnownSchemaEncoding.ROS1_MSG]: 'ros1Msg',
[ProtocolWellKnownSchemaEncoding.ROS2_MSG]: 'ros2Msg',
[ProtocolWellKnownSchemaEncoding.ROS2_IDL]: 'ros2Idl',
[ProtocolWellKnownSchemaEncoding.OMG_IDL]: 'omgIdl',
[ProtocolWellKnownSchemaEncoding.JSON_SCHEMA]: 'jsonSchema',
};
const WELL_KNOWN_TO_SCHEMA_ENCODING = Object.fromEntries(
Object.entries(SCHEMA_ENCODING_TO_WELL_KNOWN)
.map(([key, value]) => [value, key])
) as Record<ProtocolWellKnownSchemaEncoding, string>;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to add a Partial to get the types to work. However, I am still getting this error in CI:

error TS2550: Property 'fromEntries' does not exist on type 'ObjectConstructor'.

Probably missing something obvious.

Comment thread src/room/data-track/types.ts Outdated
Comment on lines 11 to 20
/** Options for publishing a data track. */
export type DataTrackOptions = {
name: string;

/** Schema describing frames sent on the track. */
schema?: DataTrackSchemaId;

/** Encoding of frames sent on the track. */
frameEncoding?: DataTrackFrameEncoding;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: To confirm - is there a relationship between schema encoding and frame encoding? Or can a user specify any type of frame encoding for any time of schema encoding?

If there is a relationship and there are some combinations that aren't allowed, I think it would be good at least in the public type (DataTrackOptions) to make sure these non allowed combinations will cause a type error. A way you could accomplish this could be add some generics to DataTrackOptions (with defaults so it is backwards compatible) and use those to key into WELL_KNOWN_TO_SCHEMA_ENCODING / WELL_KNOWN_TO_FRAME_ENCODING (or a similar type of structure which represents the allowed combinations) within the type.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great callout! Yes, the valid combinations are documented in the doc comments for the enum, however, I would like to have some validation of this. I'm not sure if it is feasible to make this validated at the type level (at least not in all client SDKs), but we could at least validate at publish time. Another option is to implement this on the SFU side and have it return an error during track publication if an invalid combination is used (possible advantage: no need to update this logic in every client if new schema/frame encoding cases are added in the future). What are your thoughts here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is feasible to make this validated at the type level (at least not in all client SDKs)

It's probably not something that would need to be done in all client sdks, but if this isn't something that is likely to change and in the future new relationships will be purely additive (ie, a -> b will always be invalid, but a -> (a newly introduced) c may be no longer added in the future) then I think at least in a web context, it would be typical for typescript to assert this. So IMO if it's not too hard it's worth doing.

Definitely though implement this as a "denylist" and not an "allowlist" so that future newly introduced combos will work in old sdks.

Another option is to implement this on the SFU side and have it return an error during track publication if an invalid combination is used

IMO it's worth doing this in both places, just like you'd have both client side and server side form validation.

@1egoman

1egoman commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Longer term, I want to discuss how to get this using the Serializer type but I think that can wait until schema related serialization / deserialization is being done in other sdks. And I don't think the migration path to that would be too crazy.

ladvoc and others added 3 commits July 2, 2026 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants