From ac2af2c715f4c683a837c3c9818ec9ffe4bf0a2a Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Tue, 23 Jun 2026 14:22:44 -0400 Subject: [PATCH 1/7] [SVLS-8979] Add CloudFormation template for Lambda Durable Function event forwarder Captures AWS Lambda Durable Function execution status change events from EventBridge and forwards them to the Datadog HTTP intake via Amazon Data Firehose. Records arrive at Datadog as the raw EventBridge envelope; reshaping is handled by the Datadog-side logs pipeline. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 91 ++++ .../durable_function_event_forwarder.yaml | 437 ++++++++++++++++++ .../release.sh | 62 +++ 3 files changed, 590 insertions(+) create mode 100644 aws_durable_function_event_forwarder/README.md create mode 100644 aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml create mode 100755 aws_durable_function_event_forwarder/release.sh diff --git a/aws_durable_function_event_forwarder/README.md b/aws_durable_function_event_forwarder/README.md new file mode 100644 index 0000000..94606d5 --- /dev/null +++ b/aws_durable_function_event_forwarder/README.md @@ -0,0 +1,91 @@ +# Datadog Lambda Durable Function Event Forwarder + +A self-contained CloudFormation template that captures AWS Lambda Durable +Function execution status change events and delivers them to the Datadog +HTTP intake via Amazon Data Firehose. Records arrive at Datadog as the +raw EventBridge envelope; any reshaping (field renaming, ARN qualifier +stripping, timestamp parsing) is configured on the Datadog side via a +logs processing pipeline. + +## Architecture + +``` +EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) + \ + -> S3 backup bucket (failed records only) +``` + +- The EventBridge rule subscribes to `aws.lambda` source events with + detail-type `Durable Execution Status Change` and routes them to + Firehose. 
+- Firehose forwards each record unchanged to + `https://aws-kinesis-http-intake.logs.${DdSite}/v1/input` using the + Datadog API key as the `X-Amz-Firehose-Access-Key` header. The stack + does **not** attach any custom metadata to Firehose's outbound + requests; tagging and reshaping are handled on the Datadog side. +- Records the endpoint rejects are written to the S3 backup bucket + (`S3BackupMode: FailedDataOnly`); under normal operation the bucket + stays empty. + +## Parameters + +| Parameter | Required | Default | Description | +| --- | --- | --- | --- | +| `DdApiKey` | one of three | "" | Plaintext Datadog API key (`NoEcho`). | +| `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | +| `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | +| `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | +| `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). Empty (the default) forwards **all** statuses. | +| `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. | +| `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | + +`Rules.ApiKeyRequired` asserts at least one of the three API key parameters +is set and fails the stack action with a clear message otherwise. + +## Outputs + +| Output | Description | +| --- | --- | +| `DeliveryStreamArn` | Firehose delivery stream ARN. | +| `BackupBucketName` | S3 bucket name for failed records. 
| +| `EventRuleArn` | EventBridge rule ARN. | +| `ForwarderVersion` | Template version (from `Mappings.Constants`). | + +## Forwarded log shape + +The stack does **no transformation in AWS**. Firehose forwards each +EventBridge record to Datadog verbatim, so Datadog receives the raw +envelope. See AWS's +[Monitoring durable functions](https://docs.aws.amazon.com/lambda/latest/dg/durable-monitoring.html#durable-monitoring-eventbridge) +for the full event schema and the five `status` values (`RUNNING`, +`SUCCEEDED`, `FAILED`, `TIMED_OUT`, `STOPPED`): + +```json +{ + "version": "0", + "id": "d019b03c-a8a3-9d58-85de-241e96206538", + "detail-type": "Durable Execution Status Change", + "source": "aws.lambda", + "account": "123456789012", + "time": "2025-11-20T13:08:22Z", + "region": "us-east-1", + "resources": [], + "detail": { + "durableExecutionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-function:$LATEST/durable-execution/090c4189-b18b-4296-9d0c-cfd01dc3a122/9f7d84c9-ea3d-3ffc-b3e5-5ec51c34ffc9", + "durableExecutionName": "order-123", + "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-function:2", + "status": "RUNNING", + "startTimestamp": "2025-11-20T13:08:22.345Z" + } +} +``` + +Terminal states (`SUCCEEDED`, `STOPPED`, `FAILED`, `TIMED_OUT`) also include +an `endTimestamp`. + +### Datadog-side processing pipeline + +Install the **AWS Lambda** integration in Datadog; its out-of-the-box logs +pipeline is provisioned automatically and reshapes these events (field +renaming, ARN qualifier stripping, timestamp parsing, human-readable +message). No manual pipeline setup is required. 
diff --git a/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml b/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml new file mode 100644 index 0000000..5630c45 --- /dev/null +++ b/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml @@ -0,0 +1,437 @@ +AWSTemplateFormatVersion: "2010-09-09" +Description: >- + Captures AWS Lambda Durable Function execution status change events from + EventBridge, transforms them into Datadog log documents, and forwards them + to the Datadog HTTP intake via Amazon Data Firehose. + +Metadata: + AWS::CloudFormation::Interface: + ParameterGroups: + - Label: + default: Datadog API key (one required) + Parameters: + - DdApiKey + - DdApiKeySecretArn + - DdApiKeySsmParameterName + - Label: + default: Datadog routing + Parameters: + - DdSite + - Label: + default: Event filters (Optional) + Parameters: + - Statuses + - FunctionArnFilter1 + - FunctionArnFilter2 + - FunctionArnFilter3 + - FunctionArnFilter4 + - FunctionArnFilter5 + - Label: + default: Tuning + Parameters: + - BufferIntervalSeconds + ParameterLabels: + DdApiKey: { default: API key (plaintext) } + DdApiKeySecretArn: { default: Secrets Manager secret ARN } + DdApiKeySsmParameterName: { default: SSM SecureString parameter name } + DdSite: { default: Datadog site } + Statuses: { default: Statuses to forward (optional) } + FunctionArnFilter1: { default: Function ARN filter 1 (optional) } + FunctionArnFilter2: { default: Function ARN filter 2 (optional) } + FunctionArnFilter3: { default: Function ARN filter 3 (optional) } + FunctionArnFilter4: { default: Function ARN filter 4 (optional) } + FunctionArnFilter5: { default: Function ARN filter 5 (optional) } + BufferIntervalSeconds: { default: Firehose buffer interval (seconds) } + +Mappings: + Constants: + DdDurableEventForwarder: + Version: "0.1.0" + +Parameters: + # ---- Datadog API key (exactly one of the three is required) ---- + DdApiKey: + Type: String + NoEcho: 
true + Default: "" + Description: >- + Datadog API key. Provide a plaintext value here OR set DdApiKeySecretArn + OR DdApiKeySsmParameterName instead. + DdApiKeySecretArn: + Type: String + Default: "" + AllowedPattern: "^$|^arn:.*:secretsmanager:.*" + Description: >- + ARN of a Secrets Manager secret whose SecretString is the Datadog API + key. + DdApiKeySsmParameterName: + Type: String + Default: "" + AllowedPattern: "^$|^/[a-zA-Z0-9/_.-]+$" + Description: >- + Name (not ARN) of an SSM Parameter Store SecureString parameter that + holds the Datadog API key. + + # ---- Routing ---- + DdSite: + Type: String + Default: datadoghq.com + AllowedPattern: .+ + Description: Datadog site to deliver events to. + + # ---- Event filters ---- + Statuses: + Type: CommaDelimitedList + Default: "" + Description: >- + Comma-separated list of execution status values to forward. Valid values + are RUNNING, SUCCEEDED, FAILED, TIMED_OUT, and STOPPED. Leave empty (the + default) to forward every status. + # Up to 5 independent function-ARN filters. CloudFormation has no + # native iteration that fits AWS::Events::Rule.EventPattern (a Json blob, + # not a schema-typed list), so each slot is exposed as its own optional + # parameter. Each populated slot emits one wildcard matcher: the supplied + # UNqualified ARN with ":*" appended, since the event's functionArn always + # carries a version/alias qualifier. The AllowedPattern rejects a trailing + # qualifier so a pasted qualified ARN fails at deploy time instead of + # silently matching nothing. Slots left empty are removed from the + # EventPattern via AWS::NoValue, so they have no effect on the rendered rule. 
+ FunctionArnFilter1: + Type: String + Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: >- + Optional UNqualified Lambda function ARN, or an EventBridge wildcard + pattern over one (for example + "arn:aws:lambda:us-east-2:123456789012:function:my-durable-*"), used to + restrict which functions' events are captured. Do not include a version + or alias suffix - ":*" is appended automatically to match any qualifier. + Scope by region and account by including them in the pattern. If all + five FunctionArnFilterN parameters are empty, the rule matches every + function in this region. + FunctionArnFilter2: + Type: String + Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter3: + Type: String + Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter4: + Type: String + Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter5: + Type: String + Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + + # ---- Tuning ---- + BufferIntervalSeconds: + Type: Number + Default: 60 + MinValue: 60 + MaxValue: 900 + Description: >- + Firehose buffer interval in seconds. Increasing this trades freshness + for fewer outbound requests; the maximum (900) is fine for low-volume + durable-execution streams. 
+ + +Conditions: + UseApiKey: !Not [!Equals [!Ref DdApiKey, ""]] + UseApiKeySecret: !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + UseApiKeySsm: !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + # Statuses is a CommaDelimitedList; an empty default joins to "" so this is + # false, which drops the status key from the EventPattern (forward all). + HasStatusFilter: !Not [!Equals [!Join ["", !Ref Statuses], ""]] + HasFilter1: !Not [!Equals [!Ref FunctionArnFilter1, ""]] + HasFilter2: !Not [!Equals [!Ref FunctionArnFilter2, ""]] + HasFilter3: !Not [!Equals [!Ref FunctionArnFilter3, ""]] + HasFilter4: !Not [!Equals [!Ref FunctionArnFilter4, ""]] + HasFilter5: !Not [!Equals [!Ref FunctionArnFilter5, ""]] + HasFunctionFilter: !Or + - !Condition HasFilter1 + - !Condition HasFilter2 + - !Condition HasFilter3 + - !Condition HasFilter4 + - !Condition HasFilter5 + # When neither a status nor a function filter is set, the detail block is + # omitted entirely - an empty "detail: {}" is rejected by EventBridge. + HasDetailFilter: !Or + - !Condition HasStatusFilter + - !Condition HasFunctionFilter + +Rules: + ApiKeyRequired: + Assertions: + - Assert: !Or + - !Not [!Equals [!Ref DdApiKey, ""]] + - !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + - !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + AssertDescription: >- + One of DdApiKey, DdApiKeySecretArn, or DdApiKeySsmParameterName + must be set. + +Resources: + # --------------------------------------------------------------------------- + # Firehose backup bucket. Receives only records that fail to deliver to the + # Datadog endpoint (S3BackupMode: FailedDataOnly), so it stays empty under + # normal operation. Retained on stack deletion to preserve any failed + # records the operator may need to inspect or replay. 
+ # --------------------------------------------------------------------------- + BackupBucket: + Type: AWS::S3::Bucket + DeletionPolicy: Retain + UpdateReplacePolicy: Retain + Properties: + BucketEncryption: + ServerSideEncryptionConfiguration: + - ServerSideEncryptionByDefault: + SSEAlgorithm: AES256 + PublicAccessBlockConfiguration: + BlockPublicAcls: true + BlockPublicPolicy: true + IgnorePublicAcls: true + RestrictPublicBuckets: true + OwnershipControls: + Rules: + - ObjectOwnership: BucketOwnerEnforced + + BackupBucketPolicy: + Type: AWS::S3::BucketPolicy + Properties: + Bucket: !Ref BackupBucket + PolicyDocument: + Version: "2012-10-17" + Statement: + - Sid: EnforceSSL + Effect: Deny + Principal: "*" + Action: s3:* + Resource: + - !GetAtt BackupBucket.Arn + - !Sub "${BackupBucket.Arn}/*" + Condition: + Bool: + aws:SecureTransport: "false" + + # --------------------------------------------------------------------------- + # Firehose delivery stream. HTTP endpoint destination targets the Datadog + # Firehose-specific intake (which speaks the Firehose protocol - do not use + # the standard /api/v2/logs endpoint here). Backup mode is FailedDataOnly so + # the bucket only receives records the endpoint rejected. 
+ # --------------------------------------------------------------------------- + FirehoseLogGroup: + Type: AWS::Logs::LogGroup + Properties: + LogGroupName: !Sub "/aws/kinesisfirehose/${AWS::StackName}" + RetentionInDays: 7 + + FirehoseHttpLogStream: + Type: AWS::Logs::LogStream + Properties: + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: HttpEndpointDelivery + + FirehoseS3LogStream: + Type: AWS::Logs::LogStream + Properties: + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: S3Backup + + FirehoseRole: + Type: AWS::IAM::Role + Properties: + AssumeRolePolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Principal: + Service: firehose.amazonaws.com + Action: sts:AssumeRole + Policies: + - PolicyName: FirehoseDelivery + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: + - s3:AbortMultipartUpload + - s3:GetBucketLocation + - s3:GetObject + - s3:ListBucket + - s3:ListBucketMultipartUploads + - s3:PutObject + Resource: + - !GetAtt BackupBucket.Arn + - !Sub "${BackupBucket.Arn}/*" + - Effect: Allow + Action: + - logs:PutLogEvents + Resource: + - !GetAtt FirehoseLogGroup.Arn + + DeliveryStream: + Type: AWS::KinesisFirehose::DeliveryStream + Properties: + DeliveryStreamType: DirectPut + HttpEndpointDestinationConfiguration: + EndpointConfiguration: + Name: Datadog + # Firehose's Url field accepts only an HTTPS URL with an optional + # path, no query string. No static metadata is attached here: + # CommonAttributes is set to an empty list below, because Datadog's + # Firehose intake does not parse those header keys into log + # metadata; tagging is handled on the Datadog side. + Url: !Sub "https://aws-kinesis-http-intake.logs.${DdSite}/v1/input" + # The API key becomes the X-Amz-Firehose-Access-Key header on each + # request and is stored opaquely by Firehose. 
The two dynamic- + # reference paths resolve the value straight into this resource at + # deploy time, so the plaintext never appears in the template source, + # the stack parameters, or stack events. + AccessKey: !If + - UseApiKey + - !Ref DdApiKey + - !If + - UseApiKeySecret + - !Sub "{{resolve:secretsmanager:${DdApiKeySecretArn}:SecretString}}" + - !If + - UseApiKeySsm + - !Sub "{{resolve:ssm-secure:${DdApiKeySsmParameterName}}}" + - !Ref AWS::NoValue + BufferingHints: + IntervalInSeconds: !Ref BufferIntervalSeconds + SizeInMBs: 4 + RetryOptions: + DurationInSeconds: 60 + # Datadog's Firehose intake does not interpret common-attributes + # header keys (dd-service / dd-source / dd-tags) as log metadata - + # it surfaces each as a raw tag with the literal key, and tag + # values can't contain commas so a joined dd-tags value would be + # mangled. We explicitly set CommonAttributes: [] (instead of + # omitting RequestConfiguration entirely) because CloudFormation + # does not push outright property removals to Firehose - omission + # would leave previously-configured attributes live on the stream. + # Datadog's AWS integration auto-tags service/source/region/ + # aws_account from the raw envelope's source field and the + # Firehose ARN, so we get those for free. Any extra metadata + # (service override, env, version, custom tags) is set by a + # Datadog log processing pipeline against these events. 
+ RequestConfiguration: + CommonAttributes: [] + CloudWatchLoggingOptions: + Enabled: true + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: !Ref FirehoseHttpLogStream + RoleARN: !GetAtt FirehoseRole.Arn + S3BackupMode: FailedDataOnly + S3Configuration: + BucketARN: !GetAtt BackupBucket.Arn + RoleARN: !GetAtt FirehoseRole.Arn + BufferingHints: + IntervalInSeconds: 300 + SizeInMBs: 5 + CompressionFormat: GZIP + CloudWatchLoggingOptions: + Enabled: true + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: !Ref FirehoseS3LogStream + # Firehose forwards each EventBridge envelope to Datadog unchanged; + # all reshaping (function ARN qualifier stripping, detail.* + # flattening, ISO timestamp parsing) is configured on the Datadog + # side via a logs processing pipeline. We explicitly set + # Enabled: false instead of omitting ProcessingConfiguration - + # CloudFormation does not push outright property removals to + # Firehose, so omitting it would leave a previously-attached Lambda + # processor live on the stream. + ProcessingConfiguration: + Enabled: false + + # --------------------------------------------------------------------------- + # EventBridge rule. Captures aws.lambda "Durable Execution Status Change" + # events and routes them to Firehose. Each filter is an unqualified + # function ARN with ":*" appended, because the event's detail.functionArn + # always carries a version/alias qualifier. + # --------------------------------------------------------------------------- + EventRule: + Type: AWS::Events::Rule + Properties: + Description: >- + Routes Lambda Durable Function execution status-change events to the + Datadog Firehose delivery stream. + State: ENABLED + EventPattern: + source: + - aws.lambda + detail-type: + - Durable Execution Status Change + # detail is omitted entirely when neither a status nor a function + # filter is set (an empty "detail: {}" is rejected by EventBridge), + # so the default rule matches on source + detail-type alone. 
+ detail: !If + - HasDetailFilter + - # Status key omitted when Statuses is empty, so the rule forwards + # every status by default. + status: !If [HasStatusFilter, !Ref Statuses, !Ref AWS::NoValue] + # One wildcard matcher per populated filter slot. The user supplies + # an UNqualified function ARN (or a wildcard over one) and we append + # ":*" - the durable-execution detail.functionArn is always + # version/alias-qualified (see AWS "Monitoring durable functions" + # docs), so the ":*" is what actually matches and a bare-ARN matcher + # would never fire. Empty slots resolve to AWS::NoValue and are + # stripped from the rendered list by CloudFormation, so the + # EventPattern ends up with exactly N matchers where N is the count + # of populated FunctionArnFilterN slots. + functionArn: !If + - HasFunctionFilter + - - !If [HasFilter1, {wildcard: !Sub "${FunctionArnFilter1}:*"}, !Ref AWS::NoValue] + - !If [HasFilter2, {wildcard: !Sub "${FunctionArnFilter2}:*"}, !Ref AWS::NoValue] + - !If [HasFilter3, {wildcard: !Sub "${FunctionArnFilter3}:*"}, !Ref AWS::NoValue] + - !If [HasFilter4, {wildcard: !Sub "${FunctionArnFilter4}:*"}, !Ref AWS::NoValue] + - !If [HasFilter5, {wildcard: !Sub "${FunctionArnFilter5}:*"}, !Ref AWS::NoValue] + - !Ref AWS::NoValue + - !Ref AWS::NoValue + Targets: + - Id: FirehoseTarget + Arn: !GetAtt DeliveryStream.Arn + RoleArn: !GetAtt EventBridgeRole.Arn + + EventBridgeRole: + Type: AWS::IAM::Role + Properties: + AssumeRolePolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Principal: + Service: events.amazonaws.com + Action: sts:AssumeRole + Policies: + - PolicyName: PutToFirehose + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: + - firehose:PutRecord + - firehose:PutRecordBatch + Resource: !GetAtt DeliveryStream.Arn + +Outputs: + DeliveryStreamArn: + Description: ARN of the Firehose delivery stream. 
+ Value: !GetAtt DeliveryStream.Arn + BackupBucketName: + Description: S3 bucket that captures records the Datadog intake rejected. + Value: !Ref BackupBucket + EventRuleArn: + Description: ARN of the EventBridge rule capturing durable execution events. + Value: !GetAtt EventRule.Arn + ForwarderVersion: + Description: Version of this forwarder template. + Value: !FindInMap [Constants, DdDurableEventForwarder, Version] diff --git a/aws_durable_function_event_forwarder/release.sh b/aws_durable_function_event_forwarder/release.sh new file mode 100755 index 0000000..3db3d63 --- /dev/null +++ b/aws_durable_function_event_forwarder/release.sh @@ -0,0 +1,62 @@ +#!/usr/bin/env bash + +# Usage: ./release.sh [--private] [--yes] + +set -e + +# Read the S3 bucket +if [ -z "$1" ]; then + echo "Must specify a S3 bucket to publish the template" + exit 1 +else + BUCKET=$1 +fi + +# Parse optional flags +PRIVATE_TEMPLATE=false +AUTO_YES=false +shift +while [[ $# -gt 0 ]]; do + case "$1" in + --private) + PRIVATE_TEMPLATE=true + shift + ;; + --yes) + AUTO_YES=true + shift + ;; + *) + echo "Unknown option: $1" + echo "Usage: ./release.sh [--private] [--yes]" + exit 1 + ;; + esac +done + +# Confirm to proceed +for i in *.yaml; do + [ -f "$i" ] || break + echo "About to upload $i to s3://${BUCKET}/aws/$i" +done + +if [ "$AUTO_YES" = false ]; then + read -p "Continue (y/n)?" CONT + if [ "$CONT" != "y" ]; then + echo "Exiting" + exit 1 + fi +else + echo "Proceeding with upload (--yes flag provided)" +fi + +# Upload +if [ "$PRIVATE_TEMPLATE" = true ] ; then + aws s3 cp . s3://${BUCKET}/aws --recursive --exclude "*" --include "*.yaml" +else + aws s3 cp . 
s3://${BUCKET}/aws --recursive --exclude "*" --include "*.yaml" +fi +echo "Done uploading the template, and here is the CloudFormation quick launch URL" +echo "https://console.aws.amazon.com/cloudformation/home#/stacks/create/review?stackName=datadog-durable-function-event-forwarder&templateURL=https://${BUCKET}.s3.amazonaws.com/aws/durable_function_event_forwarder.yaml" + +echo "Done!" From 812c224fdcfdf21ba4517081c7536df68c4a69e9 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Tue, 23 Jun 2026 20:38:32 -0400 Subject: [PATCH 2/7] Remove release.sh; release will be handled in a separate repo Co-Authored-By: Claude Opus 4.8 (1M context) --- .../release.sh | 62 ------------------- 1 file changed, 62 deletions(-) delete mode 100755 aws_durable_function_event_forwarder/release.sh diff --git a/aws_durable_function_event_forwarder/release.sh b/aws_durable_function_event_forwarder/release.sh deleted file mode 100755 index 3db3d63..0000000 --- a/aws_durable_function_event_forwarder/release.sh +++ /dev/null @@ -1,62 +0,0 @@ -#!/usr/bin/env bash - -# Usage: ./release.sh [--private] [--yes] - -set -e - -# Read the S3 bucket -if [ -z "$1" ]; then - echo "Must specify a S3 bucket to publish the template" - exit 1 -else - BUCKET=$1 -fi - -# Parse optional flags -PRIVATE_TEMPLATE=false -AUTO_YES=false -shift -while [[ $# -gt 0 ]]; do - case "$1" in - --private) - PRIVATE_TEMPLATE=true - shift - ;; - --yes) - AUTO_YES=true - shift - ;; - *) - echo "Unknown option: $1" - echo "Usage: ./release.sh [--private] [--yes]" - exit 1 - ;; - esac -done - -# Confirm to proceed -for i in *.yaml; do - [ -f "$i" ] || break - echo "About to upload $i to s3://${BUCKET}/aws/$i" -done - -if [ "$AUTO_YES" = false ]; then - read -p "Continue (y/n)?" CONT - if [ "$CONT" != "y" ]; then - echo "Exiting" - exit 1 - fi -else - echo "Proceeding with upload (--yes flag provided)" -fi - -# Upload -if [ "$PRIVATE_TEMPLATE" = true ] ; then - aws s3 cp . 
s3://${BUCKET}/aws --recursive --exclude "*" --include "*.yaml" -else - aws s3 cp . s3://${BUCKET}/aws --recursive --exclude "*" --include "*.yaml" -fi -echo "Done uploading the template, and here is the CloudFormation quick launch URL" -echo "https://console.aws.amazon.com/cloudformation/home#/stacks/create/review?stackName=datadog-durable-function-event-forwarder&templateURL=https://${BUCKET}.s3.amazonaws.com/aws/durable_function_event_forwarder.yaml" - -echo "Done!" From 67144be4615c07a529a366adbc984d44053adfac Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Wed, 24 Jun 2026 13:43:47 -0400 Subject: [PATCH 3/7] Trim implementation details from durable forwarder README Keep the README user-facing: drop CloudFormation-internal notes (NoEcho, {{resolve:...}} dynamic references, Rules.ApiKeyRequired, Mappings.Constants) and the Firehose-URL aside from the parameter and output tables. Co-Authored-By: Claude Opus 4.8 (1M context) --- aws_durable_function_event_forwarder/README.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/aws_durable_function_event_forwarder/README.md b/aws_durable_function_event_forwarder/README.md index 94606d5..d848e56 100644 --- a/aws_durable_function_event_forwarder/README.md +++ b/aws_durable_function_event_forwarder/README.md @@ -31,17 +31,14 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | Parameter | Required | Default | Description | | --- | --- | --- | --- | -| `DdApiKey` | one of three | "" | Plaintext Datadog API key (`NoEcho`). | -| `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | -| `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. 
| -| `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | +| `DdApiKey` | one of three | "" | Plaintext Datadog API key. | +| `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. | +| `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. | +| `DdSite` | no | `datadoghq.com` | Datadog site to deliver events to. | | `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). Empty (the default) forwards **all** statuses. | | `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. | | `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | -`Rules.ApiKeyRequired` asserts at least one of the three API key parameters -is set and fails the stack action with a clear message otherwise. - ## Outputs | Output | Description | @@ -49,7 +46,7 @@ is set and fails the stack action with a clear message otherwise. | `DeliveryStreamArn` | Firehose delivery stream ARN. | | `BackupBucketName` | S3 bucket name for failed records. | | `EventRuleArn` | EventBridge rule ARN. | -| `ForwarderVersion` | Template version (from `Mappings.Constants`). | +| `ForwarderVersion` | Template version. 
| ## Forwarded log shape From 4ceb21adae2127c662aab3c4b5e08966f3fc9263 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Wed, 24 Jun 2026 22:02:03 -0400 Subject: [PATCH 4/7] Address PR review feedback on durable forwarder MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Description: drop "transforms into Datadog log documents" — the stack forwards raw EventBridge envelopes unchanged. - Add Rules.ApiKeyExclusive so exactly one API key source is enforced (previously multiple could be set, with plaintext silently winning). - Expire S3 backup records after 30 days instead of retaining forever. - README: minor grammar fix. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 2 +- .../durable_function_event_forwarder.yaml | 38 ++++++++++++++++++- 2 files changed, 37 insertions(+), 3 deletions(-) diff --git a/aws_durable_function_event_forwarder/README.md b/aws_durable_function_event_forwarder/README.md index d848e56..55b13c3 100644 --- a/aws_durable_function_event_forwarder/README.md +++ b/aws_durable_function_event_forwarder/README.md @@ -23,7 +23,7 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) Datadog API key as the `X-Amz-Firehose-Access-Key` header. The stack does **not** attach any custom metadata to Firehose's outbound requests; tagging and reshaping are handled on the Datadog side. -- Records the endpoint rejects are written to the S3 backup bucket +- Records that the endpoint rejects are written to the S3 backup bucket (`S3BackupMode: FailedDataOnly`); under normal operation the bucket stays empty. 
diff --git a/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml b/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml index 5630c45..ccfdd8b 100644 --- a/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml +++ b/aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml @@ -1,8 +1,8 @@ AWSTemplateFormatVersion: "2010-09-09" Description: >- Captures AWS Lambda Durable Function execution status change events from - EventBridge, transforms them into Datadog log documents, and forwards them - to the Datadog HTTP intake via Amazon Data Firehose. + EventBridge and forwards them to the Datadog HTTP intake via Amazon Data + Firehose. Metadata: AWS::CloudFormation::Interface: @@ -176,6 +176,33 @@ Rules: AssertDescription: >- One of DdApiKey, DdApiKeySecretArn, or DdApiKeySsmParameterName must be set. + # Enforce that exactly one key source is provided. Without this, setting + # more than one is silently resolved by the AccessKey !If chain + # (plaintext > secret > ssm), so the extra values would be ignored without + # warning. Each assertion rejects one pair being set simultaneously. + ApiKeyExclusive: + Assertions: + - Assert: !Not + - !And + - !Not [!Equals [!Ref DdApiKey, ""]] + - !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + AssertDescription: >- + Provide only one Datadog API key source: DdApiKey and + DdApiKeySecretArn cannot both be set. + - Assert: !Not + - !And + - !Not [!Equals [!Ref DdApiKey, ""]] + - !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + AssertDescription: >- + Provide only one Datadog API key source: DdApiKey and + DdApiKeySsmParameterName cannot both be set. + - Assert: !Not + - !And + - !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + - !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + AssertDescription: >- + Provide only one Datadog API key source: DdApiKeySecretArn and + DdApiKeySsmParameterName cannot both be set. 
Resources: # --------------------------------------------------------------------------- @@ -201,6 +228,13 @@ Resources: OwnershipControls: Rules: - ObjectOwnership: BucketOwnerEnforced + # Failed records are kept only for inspection/replay, so expire them + # after 30 days to bound storage cost rather than retaining forever. + LifecycleConfiguration: + Rules: + - Id: ExpireFailedRecords + Status: Enabled + ExpirationInDays: 30 BackupBucketPolicy: Type: AWS::S3::BucketPolicy From fb6b129e6fbd1ac248dff7389d0821b9348f5fa1 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Thu, 25 Jun 2026 11:15:49 -0400 Subject: [PATCH 5/7] Add CODEOWNERS for durable function event forwarder template Make serverless-aws and serverless-onboarding-enablement owners of aws_durable_function_event_forwarder/. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/CODEOWNERS | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index a9ed16a..a0ca443 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -12,3 +12,6 @@ # Storage Management templatess /aws_storage_management* @DataDog/storage-management + +# Lambda Durable Function event forwarder template +/aws_durable_function_event_forwarder/ @DataDog/serverless-aws @DataDog/serverless-onboarding-enablement From 1bf693a54f66c035734553129e79ef5390c379c9 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Thu, 25 Jun 2026 11:35:10 -0400 Subject: [PATCH 6/7] Re-trigger devflow/mergegate re-evaluation Co-Authored-By: Claude Opus 4.8 (1M context) From fc7caffc992b18c542185d5e7491d96b18de957e Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Thu, 25 Jun 2026 11:54:31 -0400 Subject: [PATCH 7/7] Drop serverless-onboarding-enablement from durable forwarder CODEOWNERS Leave serverless-aws as the sole owner of aws_durable_function_event_forwarder/. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/CODEOWNERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index a0ca443..d140fe3 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -14,4 +14,4 @@ /aws_storage_management* @DataDog/storage-management # Lambda Durable Function event forwarder template -/aws_durable_function_event_forwarder/ @DataDog/serverless-aws @DataDog/serverless-onboarding-enablement +/aws_durable_function_event_forwarder/ @DataDog/serverless-aws
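
Reviewer note on the `EventRule` filter logic in `durable_function_event_forwarder.yaml`: the nested `!If` / `AWS::NoValue` handling is easier to check against a concrete rendering. The fragment below is a sketch of the `EventPattern` the template should produce, assuming the stack is deployed with `Statuses=FAILED,TIMED_OUT` and only `FunctionArnFilter1` set to the README's example pattern. These parameter values are illustrative assumptions, not defaults from the patch.

```json
{
  "source": ["aws.lambda"],
  "detail-type": ["Durable Execution Status Change"],
  "detail": {
    "status": ["FAILED", "TIMED_OUT"],
    "functionArn": [
      { "wildcard": "arn:aws:lambda:us-east-2:123456789012:function:my-durable-*:*" }
    ]
  }
}
```

The trailing `:*` is appended by the template to match the version/alias qualifier on `detail.functionArn`. With every filter parameter left empty, `HasDetailFilter` is false and the `detail` key is dropped, leaving a pattern that matches on `source` and `detail-type` alone.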