openai-v2: BaseStreamWrapper.close() never awaits AsyncStream.close() — async streaming leaks connections/memory (OOM) #4710

@yulin-li

Describe your environment

  • opentelemetry-instrumentation-openai-v2==2.4b0 (latest release on PyPI; also applies to the earlier 2.x betas that ship the single sync-close StreamWrapper)
  • openai async client (AsyncOpenAI / AsyncAzureOpenAI), streaming chat completions
  • Python 3.13

What happened?

BaseStreamWrapper.close() is a synchronous method that calls self.stream.close() without awaiting it:

# src/opentelemetry/instrumentation/openai_v2/patch.py  (2.4b0, line 655)
def close(self):
    self.stream.close()
    self.cleanup()

For the async streaming path, async_chat_completions_create returns ChatStreamWrapper(BaseStreamWrapper) (patch.py:243), so self.stream is an openai.AsyncStream, whose close() is a coroutine:

# openai/_streaming.py
class AsyncStream(Generic[_T]):
    async def close(self) -> None:   # coroutine
        await self.response.aclose()
        ...

Calling self.stream.close() without await therefore:

  1. emits RuntimeWarning: coroutine 'AsyncStream.close' was never awaited, and
  2. never closes the underlying httpx response/connection, so the buffered response + connection are leaked on every streamed async turn.

This also defeats application-side cleanup. Code that does await wrapper.close() runs the sync close() first, which creates and drops the coroutine and returns None. The await is then applied to None (which raises TypeError in Python) rather than to AsyncStream.close(), so even a correct caller cannot close the real stream. Over a long-lived async process this accumulates as a connection/memory leak.
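The mechanism can be reproduced offline, without the OpenAI client or network access. The sketch below is a minimal illustration only: FakeAsyncStream and SyncCloseWrapper are hypothetical stand-ins (not the real openai.AsyncStream or BaseStreamWrapper), written to mirror the sync close() quoted above.

```python
import asyncio
import gc
import warnings


class FakeAsyncStream:
    """Hypothetical stand-in for openai.AsyncStream: close() is a coroutine."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class SyncCloseWrapper:
    """Hypothetical wrapper that mirrors the sync close() shown in this issue."""

    def __init__(self, stream):
        self.stream = stream
        self.cleaned_up = False

    def cleanup(self):
        self.cleaned_up = True

    def close(self):
        self.stream.close()  # creates a coroutine object and drops it unawaited
        self.cleanup()


async def demo():
    stream = FakeAsyncStream()
    wrapper = SyncCloseWrapper(stream)
    await_failed = False
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = wrapper.close()  # plain sync call
        try:
            await wrapper.close()  # sync close() runs, then `await None` fails
        except TypeError:
            await_failed = True
        gc.collect()  # make sure any dropped coroutine has been finalized
    return {
        "result": result,
        "stream_closed": stream.closed,
        "cleaned_up": wrapper.cleaned_up,
        "await_failed": await_failed,
        "warnings": [str(w.message) for w in caught],
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the wrapper's own cleanup runs, but the stand-in stream is never marked closed, which is the same shape as the leak described above.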

Steps to Reproduce

import asyncio
from openai import AsyncOpenAI
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

OpenAIInstrumentor().instrument()
client = AsyncOpenAI()

async def main():
    for _ in range(1000):
        stream = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "hi"}],
            stream=True,
        )
        # consume a little, then close early (mirrors cancellation / early-exit)
        async for _ in stream:
            break
        await stream.close()  # wrapper.close() is sync -> AsyncStream.close() never awaited

asyncio.run(main())
# observe: RuntimeWarning: coroutine 'AsyncStream.close' was never awaited
# and growing httpx connections / RSS

Expected Result

Closing an async streaming wrapper should close the underlying AsyncStream (by awaiting AsyncStream.close()), releasing the httpx response/connection, with no "coroutine ... was never awaited" warning and no leak.
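One way to check the "no warning" half of this expectation in a test is to record warnings while a coroutine runs. The helper below is a sketch under assumptions: unawaited_warnings, FakeAsyncStream, closes_correctly and drops_coroutine are hypothetical names introduced for illustration and are not part of the instrumentation or the openai package.

```python
import asyncio
import gc
import warnings


def unawaited_warnings(make_coro):
    """Run make_coro() to completion and return 'never awaited' warning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        asyncio.run(make_coro())
        gc.collect()  # finalize any coroutine objects that were dropped
    return [
        str(w.message)
        for w in caught
        if "was never awaited" in str(w.message)
    ]


class FakeAsyncStream:
    """Hypothetical stand-in for a stream whose close() is a coroutine."""

    async def close(self):
        pass


async def closes_correctly():
    await FakeAsyncStream().close()


async def drops_coroutine():
    FakeAsyncStream().close()  # missing await, as in the reported bug


if __name__ == "__main__":
    print(unawaited_warnings(closes_correctly))
    print(unawaited_warnings(drops_coroutine))
```

A regression test for the wrapper could pass a coroutine that opens and closes a wrapped stream and assert that the returned list is empty.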

Actual Result

The AsyncStream.close() coroutine is created but never awaited; the underlying connection/response is leaked.

Additional context

This appears to already be fixed on main by #4500 (merged 2026-05-12), which introduced AsyncChatStreamWrapper extending util-genai's AsyncStreamWrapper, whose close() is now async and awaits the underlying stream (util/opentelemetry-util-genai/.../stream.py, async def close(self): ... await self._self_stream.close()).
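For readers who want the shape of an async-aware close without reading the linked change, here is a rough sketch of the pattern. It is not the util-genai AsyncStreamWrapper: AsyncCloseWrapper, FakeAsyncStream and the _cleanup hook are hypothetical names, and the try/finally ordering is an assumption about a reasonable design rather than a description of the merged code.

```python
import asyncio


class FakeAsyncStream:
    """Hypothetical stand-in for openai.AsyncStream: close() is a coroutine."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class AsyncCloseWrapper:
    """Hypothetical async wrapper whose close() awaits the wrapped stream."""

    def __init__(self, stream):
        self._stream = stream
        self.finalized = False

    def _cleanup(self):
        # Stand-in for telemetry finalization (for example, ending a span).
        self.finalized = True

    async def close(self):
        try:
            await self._stream.close()  # actually releases the underlying stream
        finally:
            self._cleanup()  # runs even if closing the stream raises

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.close()
        return False  # do not swallow exceptions from the with-body


async def demo():
    stream = FakeAsyncStream()
    async with AsyncCloseWrapper(stream) as wrapper:
        pass  # a real caller would iterate the stream here
    return stream.closed, wrapper.finalized


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the sync and async wrappers as separate classes avoids a single close() that has to guess whether the wrapped stream's close() returns a coroutine.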

However, no released version contains the fix: 2.4b0 is the latest on PyPI (published 2026-05-01) and predates #4500. Could a release be cut that includes the async-close fix? Until then, every released openai-v2 instrumentation version leaks AsyncStreams for async streaming callers.
