Node WarmTransferTask post-merge cleanup calls getJobContext after job context is gone #1895

Description

@dtran26

In the Node agents SDK, beta.WarmTransferTask registers a post-merge cleanup listener on the caller room after connect_to_caller succeeds. When either SIP participant disconnects after the bridge, the listener fires but throws before it can delete the caller room, because the late event callback calls getJobContext() after the job/agent context is no longer available.

So the SDK does detect the exact event needed for cleanup, but its built-in cleanup path fails with an unhandled rejection.

Environment

Scenario

  1. Original caller is in the caller room with the agent.
  2. WarmTransferTask dials the transfer destination into the human-agent room.
  3. The transfer destination accepts, and connect_to_caller bridges/moves the transfer destination into the caller room.
  4. The application closes the agent session with reason transfer_completed after the bridge, so the AI agent is no longer participating in the human-to-human call.
  5. Later, either human SIP participant hangs up.

Expected behavior

When the SDK-registered ParticipantDisconnected handler fires after the bridge, it should either:

  • delete the caller room / disconnect the remaining SIP participant successfully, or
  • no-op safely if cleanup is unavailable.

It should not throw an unhandled rejection from getJobContext().
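
As a rough illustration of the "no-op safely" expectation, the sketch below wraps an async event handler so a rejection is routed to an error callback instead of becoming an unhandled rejection. `safeListener` and `cleanupCallerRoom` are hypothetical names made up for this example; they are not SDK APIs, and the SDK may prefer a different mechanism.

```typescript
// Hypothetical helper, not part of the SDK: wraps an event handler so that a
// thrown error or rejected promise is passed to `onError` instead of escaping.
type Handler<A extends unknown[]> = (...args: A) => Promise<void> | void;

function safeListener<A extends unknown[]>(
  handler: Handler<A>,
  onError: (err: unknown) => void,
): (...args: A) => void {
  return (...args: A): void => {
    try {
      // Promise.resolve normalizes sync and async handlers to a single promise.
      Promise.resolve(handler(...args)).catch(onError);
    } catch (err) {
      // Covers handlers that throw synchronously before returning a promise.
      onError(err);
    }
  };
}

// Stand-in cleanup that fails the way the reported handler does.
async function cleanupCallerRoom(identity: string): Promise<void> {
  throw new Error(`no job context found (participant ${identity})`);
}

const reported: string[] = [];
const listener = safeListener(cleanupCallerRoom, (err) => {
  reported.push(err instanceof Error ? err.message : String(err));
});

// Calling the wrapped listener never throws; the failure is recorded in
// `reported` once the rejected promise settles.
listener("human-agent-sip");
```

Registering the wrapped function instead of the raw handler keeps a late failure visible in logs without crashing the process or tripping an unhandled-rejection hook.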

Actual behavior

The SDK logs that a participant disconnected from the caller room, then throws because the handler calls getJobContext() after the job context is gone.

This reproduces when either side disconnects after the bridge:

participant disconnected from caller room, closing
    participantIdentity: "human-agent-sip"

Unhandled promise rejection
Error: no job context found, are you running this code inside a job entrypoint?
    at getJobContext (.../@livekit/agents/src/job.ts:49:11)
    at Room.onCallerParticipantDisconnected (.../@livekit/agents/src/beta/workflows/warm_transfer.ts:341:41)
    at Room.emit (node:events:520:35)
    at Room.Room.processFfiEvent (.../@livekit/rtc-node/src/room.ts:554:14)
    at FfiClient.Room.onFfiEvent (.../@livekit/rtc-node/src/room.ts:509:18)

And similarly for the original caller:

participant disconnected from caller room, closing
    participantIdentity: "sip_+1XXXXXXXXXX"

Unhandled promise rejection
Error: no job context found, are you running this code inside a job entrypoint?
    at getJobContext (.../@livekit/agents/src/job.ts:49:11)
    at Room.onCallerParticipantDisconnected (.../@livekit/agents/src/beta/workflows/warm_transfer.ts:341:41)

Source-level diagnosis

From the current implementation:

  • onEnter() captures the caller room while job context exists:
    • this._callerRoom = getJobContext().room
  • After connect_to_caller succeeds, the task registers a late room event handler:
    • this._callerRoom.on(RoomEvent.ParticipantDisconnected, this.onCallerParticipantDisconnected)
  • But the later callback constructs the room service client using getJobContext() again:
    • const rooms = new RoomServiceClient(getJobContext().info.url)

That late callback can run after the job entrypoint/agent session has completed, so getJobContext() throws before deleteRoom() is attempted.
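
The sequence above can be modeled in a few lines. This is a self-contained stand-in, not the SDK's code: `TinyRoom`, `TransferTaskModel`, the module-level `current` variable, and the example URL are all assumptions made for illustration. The only point is that the room reference captured in `onEnter()` outlives the context, while the second lookup in the callback does not.

```typescript
// Stand-in types for illustration only; the real SDK types differ.
type Listener = (identity: string) => void;

class TinyRoom {
  private listeners: Listener[] = [];
  on(listener: Listener): void {
    this.listeners.push(listener);
  }
  emitDisconnected(identity: string): void {
    for (const l of this.listeners) l(identity);
  }
}

interface JobContext {
  room: TinyRoom;
  info: { url: string };
}

// Models "a job context exists only while the job is running".
let current: JobContext | undefined;

function getJobContext(): JobContext {
  if (!current) {
    throw new Error("no job context found, are you running this code inside a job entrypoint?");
  }
  return current;
}

class TransferTaskModel {
  private callerRoom?: TinyRoom;

  onEnter(): void {
    // First lookup: succeeds because the job is still running.
    this.callerRoom = getJobContext().room;
  }

  registerCleanup(): void {
    this.callerRoom?.on(this.onCallerParticipantDisconnected);
  }

  private onCallerParticipantDisconnected = (identity: string): void => {
    // Second lookup: throws if the context has gone away in the meantime.
    const url = getJobContext().info.url;
    console.log(`would delete the caller room via ${url} after ${identity} left`);
  };
}

const room = new TinyRoom();
current = { room, info: { url: "wss://example.invalid" } };
const task = new TransferTaskModel();
task.onEnter();
task.registerCleanup();
current = undefined; // the job entrypoint / agent session has finished

let failure = "";
try {
  room.emitDisconnected("human-agent-sip");
} catch (err) {
  failure = err instanceof Error ? err.message : String(err);
}
console.log(failure);
```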

Workaround attempted

We tried adding an application-level participant_left webhook reaper that lists participants and deletes the room when a post-transfer SIP room has one remaining SIP participant. That did not reliably tear down the original caller room in live testing, and in any case the SDK's own in-process listener is still throwing when it sees the disconnect.
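
For reference, the decision the reaper makes can be sketched as a pure function. The `ParticipantSummary` shape, the `kind` values, and `decideReap` are hypothetical simplifications; the actual webhook payload and participant model may differ, and the "nobody else in the room" condition is an assumption added here.

```typescript
// Hypothetical, simplified participant model for illustration only.
type ParticipantKind = "sip" | "agent" | "standard";

interface ParticipantSummary {
  identity: string;
  kind: ParticipantKind;
}

type ReapDecision = "delete_room" | "keep_room";

// Decide what to do after a participant_left event, given who is still in the room.
function decideReap(
  remaining: ParticipantSummary[],
  transferCompleted: boolean,
): ReapDecision {
  // Only rooms that went through a completed transfer are candidates.
  if (!transferCompleted) return "keep_room";

  const sipCount = remaining.filter((p) => p.kind === "sip").length;
  const otherCount = remaining.length - sipCount;

  // Exactly one SIP leg left and nobody else: the human-to-human call is over.
  return sipCount === 1 && otherCount === 0 ? "delete_room" : "keep_room";
}

const afterHangup: ParticipantSummary[] = [
  { identity: "human-agent-sip", kind: "sip" },
];
console.log(decideReap(afterHangup, true));
```

Keeping the decision separate from the room-service call makes it testable without a live room, although it does not by itself explain why the teardown was unreliable in live testing.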

Suggested fix

A few possible SDK-side fixes seem viable:

  1. Capture the JobContext, RoomServiceClient, or required connection/config values while still inside onEnter() / mergeCalls(), then use the captured values in onCallerParticipantDisconnected.
  2. Use getJobContext(false) in the late callback and no-op/log if the context is gone, avoiding the unhandled rejection.
  3. Expose an option to disable the built-in post-merge caller-room cleanup so applications can fully own teardown without racing the SDK handler.
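
Options 1 and 2 could be combined roughly as follows. This is a minimal sketch against stand-in types, not a patch for warm_transfer.ts: `RoomDeleter`, `CapturingTask`, and the local `getJobContext(required)` are assumptions that only mirror the shape described in this issue.

```typescript
// Stand-ins for illustration; the real SDK and server SDK types differ.
interface RoomDeleter {
  deleteRoom(name: string): Promise<void>;
}

interface JobContext {
  roomName: string;
  rooms: RoomDeleter;
}

let current: JobContext | undefined;

// Local model of a lookup that can either throw or return undefined.
function getJobContext(required = true): JobContext | undefined {
  if (!current && required) throw new Error("no job context found");
  return current;
}

class CapturingTask {
  private captured?: { roomName: string; rooms: RoomDeleter };

  onEnter(): void {
    // Option 1: capture what cleanup needs while the context still exists.
    const ctx = getJobContext();
    if (ctx) this.captured = { roomName: ctx.roomName, rooms: ctx.rooms };
  }

  onCallerParticipantDisconnected = async (): Promise<"deleted" | "skipped"> => {
    // Option 2: fall back to a non-throwing lookup and no-op if nothing is available.
    const target = this.captured ?? getJobContext(false);
    if (!target) return "skipped";
    await target.rooms.deleteRoom(target.roomName);
    return "deleted";
  };
}

const deleted: string[] = [];
current = {
  roomName: "caller-room",
  rooms: {
    deleteRoom: async (name: string): Promise<void> => {
      deleted.push(name);
    },
  },
};

const task = new CapturingTask();
task.onEnter();
current = undefined; // the job context is gone before the disconnect arrives

// Cleanup still reaches deleteRoom because it uses the captured values.
void task.onCallerParticipantDisconnected();
```

A task that never captured anything resolves to "skipped" instead of rejecting, which matches the safe no-op described under expected behavior.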

Related community thread with the same live-test behavior: https://community.livekit.io/t/warmtransfertask-how-to-tear-down-a-2-party-sip-room-when-one-party-hangs-up-after-the-agent-has-left/1499
