Skip to content

[EventHub] Fix aiohttp ClientSession leak in WebSocketTransportAsync close/connect#46829

Open
MuhammadAliShahzad wants to merge 4 commits into
Azure:mainfrom
MuhammadAliShahzad:fix/eventhub-websocket-session-leak
Open

[EventHub] Fix aiohttp ClientSession leak in WebSocketTransportAsync close/connect#46829
MuhammadAliShahzad wants to merge 4 commits into
Azure:mainfrom
MuhammadAliShahzad:fix/eventhub-websocket-session-leak

Conversation

@MuhammadAliShahzad

Copy link
Copy Markdown

Summary

Fixes #46828.

WebSocketTransportAsync.close() calls self.sock.close() and self.session.close() sequentially without try/except. When sock.close() raises (e.g. the websocket is already disconnected or timed out), session.close() never runs and the aiohttp.ClientSession leaks. The sibling AsyncTransport.close() in the same file already handles this correctly — this PR brings WebSocketTransportAsync in line with that pattern.

WebSocketTransportAsync.connect() had two related leaks:

  • On reconnect, a previously-assigned self.session was not closed before being overwritten.
  • If ws_connect() raised anything other than ClientConnectorError, the newly-created session was not cleaned up.

Changes

  • azure/eventhub/_pyamqp/aio/_transport_async.py
    • close(): wrap sock.close() and session.close() independently in try/except, log via _LOGGER.debug(..., extra=self.network_trace_params) (same pattern as AsyncTransport.close() at line 370–385), and null state after close. self.connected = False moved outside the async with block to match the sibling.
    • connect(): close any previous self.session before assigning a new one; add a broad except Exception: clause that cleans up the new session before re-raising; refactor session cleanup into a small _close_session_safely() helper used by all three call sites.
  • tests/pyamqp_tests/asynctests/test_websocket_transport_async_unit.py (new)
    • 6 unit tests that mock aiohttp.ClientSession and assert the cleanup invariants — no live Event Hubs needed.
  • CHANGELOG.md: bullet under a new 5.15.2 (Unreleased) ### Bugs Fixed section.

Why the sibling sync transport is not affected

The sync WebSocketTransport in _transport.py uses the websocket-client library, not aiohttp, and does not maintain a separate ClientSession. Its close() (line 743) and connect() (line 655) already perform proper cleanup. So this fix is async-only.

Test plan

  • New unit tests fail against the previous (unfixed) code, for the right reasons (mocked sock.close() raises → session.close() never called)
  • All 6 new unit tests pass against the fixed code
  • mypy reports zero new errors in the modified file
  • pyright reports zero new errors in the modified file (pre-existing reportOptionalMemberAccess in unchanged _read/_write remain)
  • pylint clean for the modified file (only unrelated unknown-option-value warnings from the pylint-guidelines-checker plugin in my local env)
  • CI (azpysdk checks) — to be validated on PR run

Reproduction of the original bug

Run an EventHub consumer/producer over WebSocket transport (TransportType.AmqpOverWebsocket) for an extended period with intermittent network disruptions. The Unclosed client session errors appear in logs on each reconnection cycle and on shutdown.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 11, 2026 20:43
@github-actions github-actions Bot added Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Event Hubs labels May 11, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Thank you for your contribution @MuhammadAliShahzad! We will review the pull request and get back to you soon.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes aiohttp.ClientSession resource leaks in the async WebSocket transport for azure-eventhub by making close()/connect() exception-safe and adding unit coverage to prevent regressions.

Changes:

  • Made WebSocketTransportAsync.connect() close any prior session before reconnect and ensure new sessions are closed on unexpected ws_connect() failures.
  • Made WebSocketTransportAsync.close() resilient to exceptions from websocket/session close so cleanup still completes and state is reset.
  • Added focused unit tests for cleanup invariants and documented the fix in the changelog.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
sdk/eventhub/azure-eventhub/azure/eventhub/_pyamqp/aio/_transport_async.py Adds safe session cleanup helper; hardens connect()/close() against exceptions to prevent ClientSession leaks.
sdk/eventhub/azure-eventhub/tests/pyamqp_tests/asynctests/test_websocket_transport_async_unit.py New unit tests validating session/socket cleanup behavior across failure paths.
sdk/eventhub/azure-eventhub/CHANGELOG.md Notes the leak fix under a new 5.15.2 (Unreleased) entry.

Comment on lines +557 to +565
try:
if self.session is not None:
await self.session.close()
except Exception as e: # pylint: disable=broad-except
_LOGGER.debug(
"Error closing aiohttp session: %r", e, extra=self.network_trace_params
)
self.sock = None
self.session = None
@MuhammadAliShahzad

Copy link
Copy Markdown
Author

Thanks for the review. Applied the suggestion in 5394f07close() now calls _close_session_safely() instead of duplicating the session-close try/except. All 6 unit tests still pass.

@MuhammadAliShahzad

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

@MuhammadAliShahzad

Copy link
Copy Markdown
Author

Pushed 7e8fbc3 to address Build Analyze failures:

  • Verify ChangeLogEntries: bumped VERSION in _version.py from 5.15.1 → 5.15.2 (so the verifier matches my new top-of-file entry) and trimmed the empty ### Features Added / ### Breaking Changes / ### Other Changes subsections.
  • Run MyPy: the failures are all pre-existing on main in files not touched by this PR (_common.py, _producer.py, _consumer.py, _consumer_async.py). Verified by running the same mypy command locally against upstream/main. Happy to defer to maintainers on whether those should be fixed in a separate PR or whether the CI mypy check needs to be relaxed.

@j7nw4r

j7nw4r commented May 22, 2026

Copy link
Copy Markdown
Member

Mind rebasing on current main? _common.py (one of your "pre-existing mypy" files) got a commit since your merge base, so a rebase will either clear Build Analyze or validate the claim against today's main. No conflicts; _transport_async.py itself hasn't moved.

MuhammadAliShahzad and others added 3 commits June 17, 2026 22:42
WebSocketTransportAsync.close() called self.sock.close() and
self.session.close() sequentially without try/except. If sock.close()
raised (e.g. the websocket was already disconnected or timed out),
session.close() never ran and the aiohttp ClientSession leaked.

WebSocketTransportAsync.connect() had two related leaks:
- on reconnect, a previously assigned self.session was not closed before
  being overwritten;
- if ws_connect() raised anything other than ClientConnectorError, the
  newly created session was not closed.

This change makes both methods exception-safe, matching the cleanup
pattern already used by AsyncTransport.close() in the same file
(log via _LOGGER.debug with extra=network_trace_params, broad-except
pragma, null state after close).

Adds unit tests under tests/pyamqp_tests/asynctests/ that exercise the
cleanup paths via mocked aiohttp ClientSession.

Fixes Azure#46828

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses Copilot review feedback: close() had inline session-cleanup
logic that duplicated _close_session_safely(). Replacing it with the
helper keeps the cleanup behavior and log message consistent across
close paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Build Analyze (Verify ChangeLogEntries) requires that the entry matching
VERSION in _version.py be the topmost entry in CHANGELOG.md, and that
no subsections under it are empty. Bump VERSION to 5.15.2 so the
verifier matches the new top-of-file entry, and remove the empty
Features Added / Breaking Changes / Other Changes subsections.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@MuhammadAliShahzad MuhammadAliShahzad force-pushed the fix/eventhub-websocket-session-leak branch from 7e8fbc3 to 031701e Compare June 17, 2026 20:42
@MuhammadAliShahzad

MuhammadAliShahzad commented Jun 17, 2026

Copy link
Copy Markdown
Author

Rebased onto current main. Now that #46988 is merged, the previously-failing mypy errors in _common.py (and the other untouched files) are fixed on main, so Build Analyze should come back green on this run. _transport_async.py itself was unchanged by the rebase.
@j7nw4r

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Event Hubs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WebSocketTransportAsync.close() leaks aiohttp ClientSession when sock.close() raises

3 participants