Conversation
Example connection string:

```text
DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net
```
```toml
container_name = "logs"

[sources.azure_logs.queue]
queue_name = "eventgrid"
```
Pull request overview
This pull request introduces a new azure_blob source for Vector that enables reading logs from Azure Blob Storage via Event Grid notifications delivered through Azure Storage Queues.
Changes:
- New Azure Blob Storage source with queue-based event processing
- Support for compression (gzip, zstd), multiple codecs, and multiline aggregation
- Comprehensive unit and integration tests
- Documentation files and configuration examples
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/sources/azure_blob/mod.rs | Main source implementation with streaming and event processing |
| src/sources/azure_blob/queue.rs | Queue integration, blob retrieval, and Event Grid message processing |
| src/sources/azure_blob/test.rs | Unit tests for compression detection and blob processing |
| src/sources/azure_blob/integration_tests.rs | Integration tests covering various scenarios |
| src/internal_events/azure_queue.rs | Internal event definitions for metrics and logging |
| website/cue/reference/components/sources/azure_blob.cue | Component documentation and metadata |
| Cargo.toml | Dependency and feature flag additions |
| tests/integration/azure/config/*.yaml | Integration test configuration |
| testing/github-XXXXX/* | Test artifacts and documentation (should be removed) |
@copilot review
Pull request overview
Copilot reviewed 258 out of 292 changed files in this pull request and generated 8 comments.
scripts/environment/prepare.sh
Outdated
```diff
 if contains_module cargo-deny; then
-  if ! cargo-deny --version 2>/dev/null | grep -q '^cargo-deny 0.18.9'; then
+  if ! cargo-deny --version 2>/dev/null | grep -q '^cargo-deny 0.19.0'; then
     cargo "${install[@]}" cargo-deny --version 0.18.9 --force --locked
```
The version gate checks for cargo-deny 0.19.0, but the install command still pins 0.18.9. This will keep reinstalling the wrong version (or fail policy expectations). Update the install line to install 0.19.0 so it matches the check.
Suggested change:

```diff
-cargo "${install[@]}" cargo-deny --version 0.18.9 --force --locked
+cargo "${install[@]}" cargo-deny --version 0.19.0 --force --locked
```
```rust
impl<T, D> Stream for LenientFramedRead<T, D>
where
    T: AsyncRead + Unpin,
    D: Decoder,
{
    type Item = Result<D::Item, D::Error>;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let mut this = self.project();
        match this.inner.as_mut().poll_next(cx) {
            Poll::Ready(Some(Err(DecoderError::IO(e))))
                if e.kind() == io::ErrorKind::ConnectionReset =>
            {
                let buffer = this.inner.read_buffer();

                if buffer.is_empty() {
                    // Clean RST - no partial data, treat as EOF
                    Poll::Ready(None)
                } else {
                    // Partial frame in buffer
                    Poll::Ready(Some(Err(D::Error::from(e))))
                }
            }
            other => other.map_err(|e| e.into_inner()),
        }
    }
}
```
This poll_next implementation won’t compile as written: (1) Poll<...> doesn’t provide map_err here, and (2) D::Error::from(e) requires D::Error: From<io::Error>, but that bound is not present on this Stream impl. A concrete fix is to add D::Error: From<io::Error> to the where clause and replace the last arm with an explicit mapping that converts DecoderError<D::Error> into D::Error (e.g., map over Poll/Option/Result and call into_inner() on the DecoderError).
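The shape of the explicit mapping can be sketched with stand-in types (a minimal, self-contained illustration, not the crate's actual `LenientFramedRead`/`DecoderError` definitions):

```rust
use std::io;
use std::task::Poll;

// Hypothetical stand-in: assume `DecoderError` wraps either an I/O error or
// the inner decoder's own error, as the review comment implies.
enum DecoderError<E> {
    Io(io::Error),
    Inner(E),
}

impl<E: From<io::Error>> DecoderError<E> {
    // Convert into the inner decoder's error type; this is where the
    // `D::Error: From<io::Error>` bound becomes necessary.
    fn into_inner(self) -> E {
        match self {
            DecoderError::Io(e) => E::from(e),
            DecoderError::Inner(e) => e,
        }
    }
}

// `Poll` has no `map_err`, so map explicitly over Poll -> Option -> Result.
fn map_decoder_poll<T, E: From<io::Error>>(
    poll: Poll<Option<Result<T, DecoderError<E>>>>,
) -> Poll<Option<Result<T, E>>> {
    poll.map(|opt| opt.map(|res| res.map_err(DecoderError::into_inner)))
}
```

The fallthrough arm of `poll_next` would then call this helper instead of `other.map_err(...)`.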
```rust
let mut starting_state = State::new(BTreeMap::new());
starting_state.sort_state.column = opts.sort_field;
starting_state.sort_state.reverse = opts.sort_desc;
starting_state.filter_state.column = opts.filter_field;
starting_state.filter_state.pattern = opts
    .filter_value
    .as_deref()
    .map(Regex::new)
    .and_then(Result::ok);
```
Invalid regex patterns are silently ignored (Result::ok()), which can make vector top --filter-value ... appear to do nothing without any feedback. Since this is user-provided CLI input, it should fail fast with a clear error (e.g., return a non-zero exit code and print the regex parse error), rather than disabling filtering implicitly.
Suggested change:

```rust
// Validate the filter regex (if provided) and fail fast on errors.
let filter_pattern = match opts.filter_value.as_deref() {
    Some(pattern) => match Regex::new(pattern) {
        Ok(regex) => Some(regex),
        Err(err) => {
            #[allow(clippy::print_stderr)]
            {
                eprintln!(
                    "Invalid regular expression for --filter-value: {pattern}\nError: {err}"
                );
            }
            return exitcode::USAGE;
        }
    },
    None => None,
};
let mut starting_state = State::new(BTreeMap::new());
starting_state.sort_state.column = opts.sort_field;
starting_state.sort_state.reverse = opts.sort_desc;
starting_state.filter_state.column = opts.filter_field;
starting_state.filter_state.pattern = filter_pattern;
```
```diff
 }

-fn increment(&mut self, group_by_key: &Option<String>, value: &Option<Cow<'_, str>>) -> bool {
+fn increment(&mut self, group_by_key: Option<String>, value: Option<&Value>) -> bool {
```
Calling value.to_string_lossy() inside increment can allocate a temporary string for every sampled event and then immediately discard it. Since this code runs in the hot path for sampling, consider hashing a borrowed representation when possible (e.g., pass a Cow<'_, str>/&str into increment, or add a fast path for string/bytes Value variants) to avoid repeated allocations.
```diff
 };
 if let Some(value) = value {
-    self.hash_within_ratio(value.as_bytes())
+    self.hash_within_ratio(value.to_string_lossy().as_bytes())
```
Suggested change:

```diff
-self.hash_within_ratio(value.to_string_lossy().as_bytes())
+if let Some(bytes) = value.as_bytes() {
+    self.hash_within_ratio(bytes)
+} else {
+    self.hash_within_ratio(value.to_string_lossy().as_bytes())
+}
```
```rust
impl FunctionTransform for NoopTransform {
    fn transform(&mut self, output: &mut OutputBuffer, event: Event) {
        if let Some(delay) = self.delay {
            std::thread::sleep(delay);
        }
        output.push(event);
    }
}
```
Using std::thread::sleep inside a FunctionTransform can block the Tokio runtime worker thread if the transform executes on it, causing unrelated async tasks/tests to stall and increasing flakiness. Since this delay is for deterministic latency tests, consider limiting delays to the TaskTransform path (which already uses tokio::time::sleep), or run the blocking sleep in a dedicated blocking context (e.g., tokio::task::block_in_place) with a comment explaining why blocking is safe here.
```rust
let result = blob_client
    .upload(
        RequestContent::from(request.blob_data.to_vec()),
        false,
        byte_size as u64,
        Some(upload_options),
    )
```
This converts request.blob_data into a new Vec<u8> (to_vec()), which forces an extra allocation + full copy of the payload for every request. If the Azure SDK supports building RequestContent from the existing buffer type (e.g., Bytes/Vec<u8> directly), prefer that to avoid the copy.
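The general copy-vs-move distinction can be shown with illustrative types (a sketch only — `RequestContent` here is a hypothetical stand-in, not the Azure SDK's actual API):

```rust
// Hypothetical stand-in for a request body wrapper that owns its bytes.
struct RequestContent(Vec<u8>);

// `to_vec()` on a borrowed slice allocates a fresh buffer and copies the
// whole payload into it.
fn build_copying(data: &[u8]) -> RequestContent {
    RequestContent(data.to_vec())
}

// Passing ownership of the existing buffer hands it over without a copy.
fn build_moving(data: Vec<u8>) -> RequestContent {
    RequestContent(data)
}
```

Whether the real SDK accepts an owned `Vec<u8>`/`Bytes` directly would need to be checked against its `RequestContent` constructors.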
```yaml
image: docker.io/amazon/amazon-ecs-local-container-endpoints:latest
environment:
  # https://github.com/vectordotdev/vector/issues/24687
  - DOCKER_API_VERSION=1.44
```
There are now two different Docker API version workarounds in the repo (DOCKER_API_VERSION=1.44 here, vs DOCKER_API_VERSION=1.43 in scripts/test-e2e-kubernetes.sh). Consider standardizing on a single version (and referencing a single upstream issue) to reduce confusion and avoid drifting behavior between CI paths.
Suggested change:

```diff
-  - DOCKER_API_VERSION=1.44
+  - DOCKER_API_VERSION=1.43
```
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 8 comments.
```rust
let connection_string = ConnectionString::new(&connection_string)?;
let account_name = connection_string
    .account_name
    .ok_or("Account name missing in connection string")?;
```
The code unconditionally requires account_name from the connection string (ok_or(...)), but the comment below states it is not required when a custom blob_endpoint is provided. Either update the comment to match the actual requirement, or change the logic to only require account_name when it’s actually needed.
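If the logic is changed rather than the comment, the conditional requirement could look roughly like this (a sketch with illustrative names, not the PR's actual code):

```rust
// Only require the account name when no custom blob endpoint is configured.
fn require_account_name(
    account_name: Option<String>,
    custom_blob_endpoint: Option<&str>,
) -> Result<Option<String>, String> {
    match (account_name, custom_blob_endpoint) {
        // With a custom endpoint, the account name is optional.
        (name, Some(_)) => Ok(name),
        (Some(name), None) => Ok(Some(name)),
        (None, None) => Err("Account name missing in connection string".to_string()),
    }
}
```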
```rust
select! {
    _ = shutdown.clone() => {
        info!("Shutdown signal received, stopping Azure Blob queue polling.");
        break;
    },
```
Shutdown is only awaited in the messages.messages.is_empty() branch. If the queue is continuously non-empty, this loop will keep processing and never observe shutdown, preventing timely shutdown. Consider selecting on shutdown in the main loop (or at least between messages) so shutdown is responsive under load.
Addressed in a70d22f. Added a non-blocking shutdown check between message batches using shutdown.clone().now_or_never().
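The control flow of that fix can be sketched in simplified form (an illustration only — the shutdown signal is modeled as an `AtomicBool` here, whereas the PR polls the real `ShutdownSignal` future with `now_or_never()`):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Process batches, but check for shutdown between each one.
fn drain_batches(shutdown: &AtomicBool, batches: &[Vec<&str>]) -> usize {
    let mut processed = 0;
    for batch in batches {
        processed += batch.len();
        // Non-blocking check between batches: a continuously non-empty
        // queue can no longer starve the shutdown path.
        if shutdown.load(Ordering::Relaxed) {
            break;
        }
    }
    processed
}
```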
```rust
.map(move |res| {
    res.inspect_err(|err| {
        error!("Framing error for blob '{}': {}", blob_for_error, err);
    })
    .ok()
```
Framing errors are logged and then dropped via .ok(), which can lead to partially processed blobs being treated as successful and the queue message being deleted, causing data loss. The AWS S3 source records a read/framing error and avoids ack/deletion in that case; mirroring that approach here would preserve at-least-once semantics.
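The ack gating this comment describes can be sketched with simplified bookkeeping (assumed names; the real source would thread this state through its completion handler):

```rust
// Records whether any framing/read error occurred while decoding a blob.
struct BlobReadState {
    framing_error: Option<String>,
}

// Only delete (ack) the queue message when the blob decoded cleanly;
// retaining the message on error preserves at-least-once delivery.
fn should_delete_queue_message(state: &BlobReadState) -> bool {
    state.framing_error.is_none()
}
```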
```cue
source_type: {
    description: "The name of the source type."
    required: true
    type: string: {
        examples: ["azure_blob"]
    }
}
```
The documented output fields don’t include container and blob, but the source always enriches events with these (Legacy namespace overwrites top-level fields; Vector namespace adds source metadata). Add them to the documented output schema so users can rely on these fields like bucket/object in aws_s3.
```cue
ingest_timestamp: {
    description: "The timestamp of when the blob was ingested by Vector."
    required: true
    type: string: {
        examples: ["2020-10-26T12:34:56.789Z"]
    }
}
```
ingest_timestamp is documented here as a required top-level string field, but in the implementation it’s written as a timestamp under standard Vector metadata (.metadata.vector.ingest_timestamp). Either remove this from the top-level output fields (and rely on standard metadata docs), or document it in the correct namespace/type to avoid misleading users.
Suggested change (remove the block):

```diff
-ingest_timestamp: {
-    description: "The timestamp of when the blob was ingested by Vector."
-    required: true
-    type: string: {
-        examples: ["2020-10-26T12:34:56.789Z"]
-    }
-}
```
```text
The source automatically handles:
- Blob downloads with streaming to handle large files efficiently
- 404 errors for blobs that no longer exist
- Queue message visibility timeouts and retries
```
This section claims the source “automatically handles … queue message visibility timeouts and retries”, but the implementation does not set/extend message visibility (and there’s no retry/renewal logic). Please adjust the documentation to match the actual behavior or implement visibility timeout management.
Suggested change:

```diff
-- Queue message visibility timeouts and retries
```
Addressed in a70d22f. Updated CUE docs to accurately describe behavior: visibility timeout is set once at receive time via the new visibility_timeout_secs config (default 300s) and is NOT extended during processing. Removed misleading claims about automatic retry management. Also documented the new max_number_of_messages and delete_failed_message config options.
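Put together, the queue options mentioned in this reply would look roughly like this in a Vector config (a sketch based on the names and defaults stated above; not verified against the final schema):

```toml
[sources.azure_logs.queue]
queue_name = "eventgrid"
# Set once at receive time; NOT extended during processing.
visibility_timeout_secs = 300
max_number_of_messages = 10
# Delete the queue message even when the batch is rejected (S3 parity).
delete_failed_message = true
```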
```text
# Testing Plan - Azure Blob Storage Source (#XXXXX)
```
The title still contains a #XXXXX placeholder. Update it to reference the actual issue/PR number (e.g. #13882) so the testing plan is traceable.
Suggested change:

```diff
-# Testing Plan - Azure Blob Storage Source (#XXXXX)
+# Testing Plan - Azure Blob Storage Source (#13882)
```
Cargo.toml
Outdated
```toml
azure_storage_queues = { version = "0.21", default-features = false, optional = true }

# Needed to bridge with outdated version of azure_core used in azure_storage*
azure_core_for_storage = { package = "azure_core", version = "0.21.0", default-features = false, features = ["enable_reqwest", "hmac_openssl"] }
```
azure_core_for_storage is declared as a non-optional dependency, which means the older azure_core (0.21.x) and its transitive deps will be pulled into every build even when Azure features are disabled. Make this dependency optional = true and gate it behind the existing azure / sources-azure_blob features (and update src/azure/mod.rs to compile accordingly).
Suggested change:

```diff
-azure_core_for_storage = { package = "azure_core", version = "0.21.0", default-features = false, features = ["enable_reqwest", "hmac_openssl"] }
+azure_core_for_storage = { package = "azure_core", version = "0.21.0", default-features = false, features = ["enable_reqwest", "hmac_openssl"], optional = true }
```
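Beyond marking the dependency optional, the feature gate itself would also need to reference it. A sketch of the Cargo.toml wiring, with feature names assumed from the review comment rather than taken from the final diff:

```toml
[features]
# Pull in the bridge crate only when Azure support is enabled.
azure = ["dep:azure_core_for_storage"]
sources-azure_blob = ["azure"]
```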
Hi @ldrozdz93, thanks for the work on adding the Azure Blob Storage source. This feature would be very useful for Azure-based pipelines, so I'm keen to see it move forward. Let me know if there's anything I can help with from the community side.
A new `azure_blob` source for reading logs from Azure Blob Storage containers via Azure Storage Queue notifications (Event Grid). Designed for feature parity with the existing `aws_s3` source.

Key features:
- Event-driven architecture using Azure Event Grid via Storage Queue
- Connection string authentication
- Configurable compression (gzip, zstd) with auto-detection
- Configurable framing (newline-delimited, character-delimited, etc.)
- Multiline aggregation for stack traces and multi-line logs
- Event metadata enrichment (container, blob, timestamp)
- Acknowledgement support
…r pattern

Encapsulate the Arc<Mutex<Option<String>>> read_error flag as an implementation detail inside queue.rs, replacing the separate success_handler + read_error fields on BlobWithAck with a single completion_handler(StreamResult) that handles both queue message deletion and read-error checking internally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Make azure_core_for_storage dependency optional and gate it behind the azure feature flag
- Reword CloudLocation::Custom comment for clarity
- Add container and blob metadata fields to website docs
- Remove inaccurate docs line about queue message visibility timeouts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove redundant default settings from config.toml to keep it minimal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 8 comments.
```toml
azure_storage_blobs = { version = "0.21", default-features = false, optional = true }
azure_storage_queues = { version = "0.21", default-features = false, optional = true }

# Needed to bridge with outdated version of azure_core used in azure_storage*
```
```rust
        (blob.completion_handler)(StreamResult::Errored).await;
        emit!(QueueMessageProcessingErrored {});
    }
    BatchStatus::Rejected => {
```
Addressed in a70d22f. Added delete_failed_message config option (default: true, matching S3 parity). Added StreamResult::Rejected variant. When delete_failed_message is true and batch is rejected, the completion handler is called with Rejected which deletes the queue message. When false, the message is retained. Added tests for both behaviors.
```rust
// TODO: consider selecting with a shutdown
while let Some(chunk) = data_stream.next().await {
    bytes_received.emit(ByteSize(chunk.len()));
    let (events, _) = match decoder.deserializer_parse(chunk) {
        Ok(result) => result,
        Err(_error) => {
            // Error is handled by codecs::Decoder, no further handling needed
            continue;
        }
    };
    for mut event in events {
        event = event.with_batch_notifier_option(&batch);
        match event {
            Event::Log(ref mut log_event) => {
                log_namespace.insert_source_metadata(
                    AzureBlobConfig::NAME,
                    log_event,
                    Some(LegacyKey::Overwrite(path!("container"))),
                    path!("container"),
                    container.clone(),
                );
                log_namespace.insert_source_metadata(
                    AzureBlobConfig::NAME,
                    log_event,
                    Some(LegacyKey::Overwrite(path!("blob"))),
                    path!("blob"),
                    blob.clone(),
                );

                // Insert timestamp metadata following AWS S3 pattern
                let timestamp = chrono::Utc::now();
                match log_namespace {
                    LogNamespace::Vector => {
                        let ts_path = metadata_path!(AzureBlobConfig::NAME, "timestamp");
                        log_event.insert(ts_path, timestamp);
                        let ingest_path = metadata_path!("vector", "ingest_timestamp");
                        log_event.insert(ingest_path, timestamp);
                    }
                    LogNamespace::Legacy => {
                        if let Some(timestamp_key) = log_schema().timestamp_key() {
                            log_event.try_insert((PathPrefix::Event, timestamp_key), timestamp);
                        }
                    }
                }

                events_received.emit(CountByteSize(1, event.estimated_json_encoded_size_of()));
                yield event
            }
            _ => {
                emit!(InvalidRowEventType{event: &event})
            }
```
```text
# Testing Plan - Azure Blob Storage Source (#XXXXX)

## Context

This PR introduces a new Vector source that reads logs from Azure Blob Storage by processing events from an Azure Storage Queue. The implementation aims to support the same features as the AWS S3 source, providing Event Grid integration for real-time blob processing.
```
```rust
loop {
    let messages = match queue_client.get_messages().number_of_messages(num_messages()).await {
        Ok(messages) => messages,
        Err(e) => {
            emit!(QueueMessageReceiveError{error: &e});
            continue;
        }
    };
    if !messages.messages.is_empty() {
        for message in messages.messages {
            let msg_id = message.message_id.clone();
            match process_event_grid_message(
                message,
                &container_client,
                &queue_client,
                compression,
                framer.clone(),
                multiline_config.clone(),
            ).await {
                Ok(Some(bp)) => yield bp,
                Ok(None) => trace!("Message {msg_id} processed, no blob stream produced (event may have been ignored or blob unavailable)."),
                Err(e) => {
                    emit!(QueueMessageProcessingError{
                        error: &e,
                        message_id: &msg_id
                    });
                }
            }
        }
    } else {
```
Addressed in a70d22f. Added a non-blocking shutdown check (shutdown.clone().now_or_never()) after processing each batch of messages, so the source observes shutdown even when messages keep arriving continuously.
```rust
fn parse_subject(subject: String) -> Option<(String, String)> {
    let parts: Vec<&str> = subject.split('/').collect();
    if parts.len() < 7 {
        warn!(
            "Ignoring event: subject has invalid format (expected /blobServices/default/containers/{{container}}/blobs/{{blob}}), got: '{}'",
            subject
        );
        return None;
    }
    let container = parts[4];
    let blob = parts[6..].join("/");
    Some((container.to_string(), blob))
}
```
Addressed in a70d22f. parse_subject now validates the fixed path segments: parts[1] == "blobServices", parts[3] == "containers", parts[5] == "blobs". Added 4 new test cases covering invalid fixed segments.
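The stricter parser described in this reply can be sketched as follows (an approximation of the described change; the committed code may differ in details such as the `String` vs `&str` parameter):

```rust
// Parse an Event Grid subject of the form
// /blobServices/default/containers/{container}/blobs/{blob...},
// rejecting subjects whose fixed segments do not match.
fn parse_subject(subject: &str) -> Option<(String, String)> {
    let parts: Vec<&str> = subject.split('/').collect();
    if parts.len() < 7
        || parts[1] != "blobServices"
        || parts[3] != "containers"
        || parts[5] != "blobs"
    {
        return None;
    }
    Some((parts[4].to_string(), parts[6..].join("/")))
}
```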
```rust
/// Converts Content-Type header value to Compression enum.
fn content_type_to_compression(content_type: &str) -> Option<Compression> {
    match content_type {
        "application/gzip" | "application/x-gzip" => Some(Compression::Gzip),
        "application/zstd" => Some(Compression::Zstd),
        _ => None,
    }
}
```
… rejected batch, subject validation

- Add shutdown check between message batches to prevent indefinite processing
- Add configurable max_number_of_messages (default 10, was hardcoded 32) and visibility_timeout_secs (default 300s) to queue config
- Add delete_failed_message config (default true) with StreamResult::Rejected variant so rejected messages are properly deleted from the queue
- Validate fixed path segments in parse_subject (blobServices/containers/blobs)
- Update CUE docs to accurately describe visibility timeout and batch behavior
- Split rejected test into delete/retain variants for new config option

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 6 comments.
```toml
sources-aws_kinesis_firehose = ["dep:base64"]
sources-aws_s3 = ["aws-core", "dep:aws-sdk-sqs", "dep:aws-sdk-s3", "dep:async-compression", "sources-aws_sqs", "tokio-util/io"]
sources-aws_sqs = ["aws-core", "dep:aws-sdk-sqs"]
sources-azure_blob = ["azure", "dep:azure_core", "dep:azure_storage_queues", "dep:base64", "dep:async-compression", "tokio-util/io"]
```
```cue
component_errors_total:               components.sources.internal_metrics.output.metrics.component_errors_total
component_received_bytes_total:       components.sources.internal_metrics.output.metrics.component_received_bytes_total
component_received_event_bytes_total: components.sources.internal_metrics.output.metrics.component_received_event_bytes_total
component_received_events_total:      components.sources.internal_metrics.output.metrics.component_received_events_total
```
```cue
"""
The Azure Blob Storage source requires an Azure Storage Queue configured to
receive Event Grid notifications for the desired Azure Blob Storage container.
The queue should be subscribed to BlobCreated events from the storage account.
```
```rust
    container: container_client.container_name(),
    subject: &body.subject,
    event_type: &body.event_type,
});
```
```rust
    configured_container: container_client.container_name(),
    container: container.as_str(),
});
```
```rust
error!(
    message = "Failed reading messages",
    event = format!("{:?}", self.error),
    error_code = "failed_fetching_azure_queue_events",
    error_type = error_type::REQUEST_FAILED,
    stage = error_stage::RECEIVING,
);
```
Summary
This PR adds the Azure Blob Storage source. From the user's perspective, it's intended to work in a similar manner to the AWS S3 source.
Vector configuration
How did you test this PR?
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.

References
Notes
`make build-licenses` was run to regenerate the license inventory.