
Telemetry for time to first token when streaming #605

@airhorns

Time to first token (TTFT) is a key user experience metric that I'd like to monitor across the various LLM providers I use. It'd be great to have a uniform way of measuring TTFT across all of these instrumentations. I can see a couple of ways of doing it:

  • emitting a span event when the first token is received
  • measuring the time until the first token and setting a new genai.time_to_first_token attribute (or similar) on the span.

This measurement doesn't really apply to non-streaming use cases, but for streams that take tens of seconds, my janky version of it has proven really useful for separating TTFT from actual streaming time in my own use case. See a WIP implementation here: 53b6bb4
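
To make the two options above concrete, here is a minimal sketch (not the WIP commit, and not any instrumentation's actual API) of a wrapper around a streamed response. It assumes a span object exposing `add_event` and `set_attribute`, which OpenTelemetry's Python `Span` does; `RecordingSpan`, `with_ttft`, the `first_token` event name, and the choice of seconds as the unit are all hypothetical and only for illustration. The attribute name is the one floated above.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class RecordingSpan:
    """Hypothetical in-memory stand-in for a real span, used only for this demo."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, Any]]] = []
        self.attributes: Dict[str, Any] = {}

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        self.events.append((name, dict(attributes or {})))

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def with_ttft(
    stream: Iterable[T],
    span: Any,  # anything with add_event / set_attribute
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Pass chunks through unchanged, recording TTFT when the first one arrives."""
    # Start the timer when the wrapper is created (ideally right where the
    # request is issued), not when the caller begins iterating.
    start = clock()

    def _iterate() -> Iterator[T]:
        seen_first = False
        for chunk in stream:
            if not seen_first:
                seen_first = True
                ttft_seconds = clock() - start
                # Option 1: a span event marking the first token (name is an assumption).
                span.add_event("first_token")
                # Option 2: a span attribute holding the duration, in seconds (unit is an assumption).
                span.set_attribute("genai.time_to_first_token", ttft_seconds)
            yield chunk

    return _iterate()


if __name__ == "__main__":
    def fake_stream() -> Iterator[str]:
        time.sleep(0.05)  # simulated wait before the first token
        yield "Hello"
        yield " world"

    span = RecordingSpan()
    text = "".join(with_ttft(fake_stream(), span))
    print(text, span.attributes, span.events)
```

If the stream never yields anything, neither the event nor the attribute is set, which keeps "no first token" distinguishable from a TTFT of zero. Recording both the event and the attribute is redundant; the sketch does so only to show the two options side by side.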
