[Storage] Fixed IndexError in azure-storage-blob-changefeed when listing directory marker blobs#47555

Open
weirongw23-msft wants to merge 1 commit into
Azure:mainfrom
weirongw23-msft:weirongw23/fix-changefeed-directory-marker-paths

Conversation

@weirongw23-msft

No description provided.

@weirongw23-msft weirongw23-msft marked this pull request as ready for review June 18, 2026 11:09
Copilot AI review requested due to automatic review settings June 18, 2026 11:09
@github-actions github-actions Bot added the Storage Storage Service (Queues, Blobs, Files) label Jun 18, 2026

Copilot AI left a comment

Pull request overview

This PR fixes an IndexError in azure-storage-blob-changefeed that is raised when listing change feed segments in accounts whose idx/segments/ hierarchy contains directory marker blobs. The fix adds validation that skips any path that is not a segment path.

Changes:

  • Added segment-path validation to skip directory marker blobs when enumerating segment blobs.
  • Added a CHANGELOG entry documenting the bug fix.
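
To make the failure mode concrete, here is a minimal sketch of how a directory marker path can trigger an IndexError. The function `parse_segment_time` is hypothetical and is not taken from the package; it only assumes that some downstream code reads the date tokens of a segment path by position. The value of `PATH_DELIMITER` is likewise an assumption, since the PR references the constant only by name.

```python
from datetime import datetime

PATH_DELIMITER = "/"  # assumed value; the PR only shows the constant's name


def parse_segment_time(segment_path: str) -> datetime:
    """Hypothetical unguarded parser that reads YYYY/MM/DD/HHMM by position."""
    tokens = segment_path.split(PATH_DELIMITER)
    # Expected layout: ["idx", "segments", YYYY, MM, DD, HHMM, <file>]
    hhmm = tokens[5]  # raises IndexError when the path stops at the day level
    return datetime(
        int(tokens[2]), int(tokens[3]), int(tokens[4]),
        int(hhmm[:2]), int(hhmm[2:]),
    )


# A full segment path parses cleanly.
print(parse_segment_time("idx/segments/2026/02/20/1200/meta.json"))

# A directory marker has only five tokens, so positional access fails.
try:
    parse_segment_time("idx/segments/2026/02/20")
except IndexError as exc:
    print("IndexError:", exc)
```

Checking the token count before parsing, as this PR does, avoids the exception without changing how valid paths are handled.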

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| sdk/storage/azure-storage-blob-changefeed/azure/storage/blob/changefeed/_models.py | Adds _is_valid_segment_path and uses it to filter out non-segment blob paths during segment listing. |
| sdk/storage/azure-storage-blob-changefeed/CHANGELOG.md | Documents the fix under the unreleased version’s “Bugs Fixed” section. |

Comment on lines +299 to +304
# A valid segment path is of the form "idx/segments/YYYY/MM/DD/HHMM/<file>".
# Directory marker blobs (e.g. "idx/segments/2026/02/20") have too few tokens to
# represent a segment and must be skipped to avoid an IndexError while parsing.
path_tokens = segment_path.split(PATH_DELIMITER)
if len(path_tokens) < 6:
    return False
Comment on lines 282 to +288
  while not start_year or start_year <= cur_year:
      paths = self.client.list_blobs(name_starts_with=SEGMENT_COMMON_PATH + str(start_year))
      for path in paths:
-         yield path.name
+         # Skip directory marker blobs that do not conform to the expected segment path shape.
+         # Azure Storage can return zero-length directory markers that are not real segment files.
+         if self._is_valid_segment_path(path.name):
+             yield path.name
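
The filtering in this hunk can be tried without a storage account. The sketch below is standalone and makes several assumptions: `is_valid_segment_path` reproduces only the token-count check visible in the diff (the real `_is_valid_segment_path` may perform further checks), `iter_segment_paths` is a hypothetical stand-in for the loop body, and the listing is faked with `SimpleNamespace` objects that expose only a `name` attribute.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator

PATH_DELIMITER = "/"  # assumed value; the PR only shows the constant's name
MIN_SEGMENT_TOKENS = 6  # mirrors the `len(path_tokens) < 6` check in the diff


def is_valid_segment_path(segment_path: str) -> bool:
    """Token-count check only; the real helper may validate more than this."""
    return len(segment_path.split(PATH_DELIMITER)) >= MIN_SEGMENT_TOKENS


def iter_segment_paths(blobs: Iterable) -> Iterator[str]:
    """Yield blob names, skipping directory markers that are too short."""
    for blob in blobs:
        if is_valid_segment_path(blob.name):
            yield blob.name


# Fake listing: three directory markers followed by one real segment file.
listing = [
    SimpleNamespace(name="idx/segments/2026"),
    SimpleNamespace(name="idx/segments/2026/02"),
    SimpleNamespace(name="idx/segments/2026/02/20"),
    SimpleNamespace(name="idx/segments/2026/02/20/1200/meta.json"),
]

segments = list(iter_segment_paths(listing))
print(segments)
```

Only the last entry survives the filter, because each marker splits into five or fewer tokens.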
Labels

Storage Storage Service (Queues, Blobs, Files)

Development

Successfully merging this pull request may close these issues.

Bug Report: ChangeFeedClient crashes with IndexError when encountering directory marker paths

2 participants