Feature Type
Nice to have
Feature Description
Hello there! This is a follow-up design question to PR #6258, which fixes
the formatting gap (started_at/duration not being emitted
by to_dict()) and includes a small cleanup. This issue covers the part I left
out of that PR because it changes the meaning of an existing field and should
be agreed on first.
Context: JobContext.make_session_report, SessionReport.
duration is gated on audio recording
In make_session_report, duration is only computed when
audio_recording_started_at is set:
    if sr.audio_recording_started_at:
        sr.duration = sr.timestamp - sr.audio_recording_started_at
(-> New code from PR #6258)
This holds in both the original code and the adjusted code in PR #6258, and
boils down to whether audio is being recorded in the session.
So a session with no audio recording gets duration = None, even though
started_at (session start) and timestamp (report build time) are both
available and would yield a perfectly valid duration. A session's duration
arguably shouldn't depend on whether audio happened to be recorded.
Additionally, the start point used here is the audio start
(recording_started_at, set on the first audio frame) while the end point is
the report build time (timestamp). So the two ends of duration come from
two different clocks/concepts.
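To make the difference concrete, here is a minimal sketch that contrasts the gated computation quoted earlier with a session-based one. ReportSketch is a hypothetical stand-in, not the real SessionReport, and it assumes all three timestamps are epoch seconds stored as floats.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReportSketch:
    """Hypothetical stand-in for SessionReport; timing fields only."""

    started_at: float  # session start (assumed epoch seconds)
    timestamp: float  # report build time (assumed epoch seconds)
    audio_recording_started_at: Optional[float] = None


def gated_duration(r: ReportSketch) -> Optional[float]:
    # Same shape as the current logic: no audio recording means no duration.
    if r.audio_recording_started_at:
        return r.timestamp - r.audio_recording_started_at
    return None


def session_duration(r: ReportSketch) -> float:
    # Alternative under discussion: both ends are session-level timestamps.
    return r.timestamp - r.started_at


no_audio = ReportSketch(started_at=100.0, timestamp=160.0)
with_audio = ReportSketch(
    started_at=100.0, timestamp=160.0, audio_recording_started_at=112.5
)
```

With these inputs the gated version yields None for no_audio and 47.5 for with_audio, while the session-based version yields 60.0 for both.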
What should duration mean semantically?
There seem to be two distinct concepts being conflated:
- Session duration: started_at (session start) -> session end. Probably
  what a thing called SessionReport should report.
- Audio/call duration: first audio frame -> last audio frame. A property of
  the recording, not the session.
The current duration mixes the two (audio start + report end). Note also that
timestamp is when the report is built, not when the last audio frame arrived,
so it isn't a precise "end of audio" either.
Open questions
- Should duration be the session duration (based on started_at and an
  explicit session end), with audio timing exposed separately?
- If audio/call timing is wanted, "recording length" and "session
duration" are different metrics, not both answers to one
question:
  - Recording length could be derived from the encoded audio.ogg itself
    (total samples / sample rate, i.e. the sum of written frame durations). This
    is the true length of the artifact and can't drift from the file.
  - A wall-clock audio_recording_ended_at - audio_recording_started_at is a
    separate measurement that can differ from the file's real length depending
    on how gaps etc. are handled, and on exactly when the end timestamp is sampled.
- Is there a case for an explicit session-end timestamp (e.g. set in
  AgentSession.aclose) instead of relying on report.timestamp?
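As an illustration of why the two audio measurements can disagree, the sketch below computes a recording length from sample counts and compares it with a wall-clock span. The helper names, the fixed 48 kHz sample rate, and the timestamps are all assumptions made for the example; nothing here reads a real audio.ogg.

```python
from typing import Iterable


def recording_length(samples_per_frame: Iterable[int], sample_rate: int) -> float:
    """Length of the written audio: total samples / sample rate."""
    return sum(samples_per_frame) / sample_rate


def wall_clock_span(started_at: float, ended_at: float) -> float:
    """Elapsed time between two sampled timestamps (hypothetical fields)."""
    return ended_at - started_at


# 100 frames of 480 samples at 48 kHz -> 1.0 s of written audio.
written = recording_length([480] * 100, 48_000)

# Hypothetical timestamps that also cover 0.5 s in which no frame was written.
elapsed = wall_clock_span(started_at=10.0, ended_at=11.5)

gap = elapsed - written
```

Here gap is the time that the wall-clock measurement counts but the file does not contain, which is the discrepancy the second sub-point describes.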
Backward-compatibility note
SessionReport.to_dict() feeds observability, so redefining the meaning of
duration (audio-based -> session-based) is a behavior change for any consumer
that relies on the current value. It might be safer to keep an explicit audio
duration and add session duration as its own clearly-named field rather than
silently redefining duration.
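One possible shape for the additive option, sketched under assumptions: the duration key keeps its current audio-based meaning, and a new session_duration key is added next to it. The class, the ended_at field, and the session_duration key are hypothetical names for illustration, not existing API.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TimingSketch:
    """Hypothetical timing subset of a report; not the real SessionReport."""

    started_at: float  # session start
    ended_at: float  # hypothetical explicit session end
    timestamp: float  # report build time
    audio_recording_started_at: Optional[float] = None

    def to_dict(self) -> Dict[str, Any]:
        audio_duration = None
        if self.audio_recording_started_at is not None:
            audio_duration = self.timestamp - self.audio_recording_started_at
        return {
            # Unchanged meaning, so existing consumers see the same value.
            "duration": audio_duration,
            # Hypothetical new key; always present, independent of recording.
            "session_duration": self.ended_at - self.started_at,
        }


without_audio = TimingSketch(started_at=100.0, ended_at=158.0, timestamp=160.0).to_dict()
```

The trade-off is a slightly wider payload in exchange for not changing what an existing key means.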
Proposed direction (open to feedback)
- Stop gating duration on recording and compute it from session lifecycle
  timestamps (started_at -> session end).
- Decide whether to introduce clearly-named audio/session timing fields rather
than overloading duration.
Happy to open a PR once the semantics here are agreed.
Workarounds / Alternatives
No response
Additional Context
No response