Commit a8beecd

Merge pull request #2 from aws-samples/feat-register-event
feat: Update solution to register models using CloudWatch events
2 parents 90dd84a + f670eaf commit a8beecd

22 files changed

+209
-217
lines changed

OPERATIONS.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,7 @@ Having created the A/B Testing Deployment Pipeline, this operations manual provi
44

55
## A/B Testing for Machine Learning models
66

7-
Successful A/B Testing for machine learning models requires measuring how effective predictions are against end users.
8-
It is important to be able to identify users consistently and be able to attribute success actions against the model predictions back to users.
7+
Successful A/B Testing for machine learning models requires measuring how effective predictions are against end users. It is important to be able to identify users consistently and be able to attribute success actions against the model predictions back to users.
98

109
### Conversion Metrics
1110

@@ -46,7 +45,11 @@ The configuration is stored in the CodeCommit source repository by stage name eg
4645
* `epsilon` - The epsilon parameter used by the `EpsilonGreedy` strategy.
4746
* `warmup` - The number of invocations to warm up before applying the strategy.
4847

49-
In addition to the above, you must specify the `champion` and `challenger` model variants for the deployment.
48+
In addition to the above, you must specify the `champion` and `challenger` model variants for the deployment.
49+
50+
These will be loaded from the two Model Package Groups in the registry whose names include the project name and are suffixed with `champion` or `challenger`. For example, for the project name `ab-testing-pipeline`, the sample notebook creates these model package groups:
51+
52+
![\[Model Registry\]](docs/ab-testing-pipeline-model-registry.png)
5053

5154
**Latest Approved Versions**
5255

README.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -146,7 +146,7 @@ Follow are a list of context values that are provided in the `cdk.json`, which c
146146
|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------|
147147
| `api_name` | The API Gateway Name | "ab-testing" |
148148
| `stage_name` | The stage namespace for resource and API Gateway path | "dev" |
149-
| `endpoint_filter` | A prefix to filter which Amazon SageMaker endpoints the API can invoke | "*" |
149+
| `endpoint_prefix` | A prefix to filter which Amazon SageMaker endpoints the API can invoke. | "" |
150150
| `api_lambda_memory` | The [lambda memory](https://docs.aws.amazon.com/lambda/latest/dg/configuration-memory.html) allocation for API endpoint. | 768 |
151151
| `api_lambda_timeout` | The lambda timeout for the API endpoint. | 10 |
152152
| `metrics_lambda_memory` | The [lambda memory](https://docs.aws.amazon.com/lambda/latest/dg/configuration-memory.html) allocated for metrics processing Lambda | 768 |
@@ -164,9 +164,7 @@ Run the following command to deploy the API and testing infrastructure, optional
164164
cdk deploy ab-testing-api
165165
```
166166

167-
This stack will ask you to confirm any changes, and output the `RegisterLambda` which you will provide to the MLOps Project, and the `ApiEndpoint` which you will provide to the A/B Testing sample notebook.
168-
169-
Amazon SageMaker Studio projects will be granted access to invoke the Register Lambda, so if you are seeing errors running the above command ensure you have [Enable SageMaker project templates for Studio users](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-studio-updates.html).
167+
This stack will ask you to confirm any changes, and output the `ApiEndpoint` which you will provide to the A/B Testing sample notebook.
170168

171169
## Create the SageMaker MLOps Project Template
172170

@@ -276,7 +274,6 @@ On the Create project page, SageMaker templates is chosen by default. This optio
276274
- The project name must have 32 characters or fewer.
277275
10. In the Project template parameters, provide the **Repository Name** you created previously, e.g.:
278276
- For **StageName**, enter `dev`
279-
- For **RegisterLambda**, enter the `RegisterLambda` output from the `ab-testing-api` stack
280277
- For **CodeCommitSeedBucket**, enter the `CodeCommitSeedBucket` output from the `ab-testiing-service-catalog` stack
281278
- For **CodeCommitSeedKey**, enter the `CodeCommitSeedKey` output from the `ab-testiing-service-catalog` stack
282279
11. Choose Create project.

ab-testing-pipeline.yml

Lines changed: 0 additions & 102 deletions
Original file line numberDiff line numberDiff line change
@@ -23,10 +23,6 @@ Parameters:
2323
Type: String
2424
Description: The optional s3 seed key
2525
MinLength: 1
26-
RegisterLambda:
27-
Type: String
28-
Description: The AWS Lambda to invoke when registering this model
29-
MinLength: 1
3026
Resources:
3127
CodeRepo:
3228
Type: AWS::CodeCommit::Repository
@@ -55,8 +51,6 @@ Resources:
5551
- Key: sagemaker:project-name
5652
Value:
5753
Ref: SageMakerProjectName
58-
Metadata:
59-
aws:cdk:path: ab-testing-pipeline/CodeRepo
6054
CdkBuild455F642E:
6155
Type: AWS::CodeBuild::Project
6256
Properties:
@@ -131,76 +125,6 @@ Resources:
131125
- Ref: SageMakerProjectName
132126
- -cdk-
133127
- Ref: StageName
134-
Metadata:
135-
aws:cdk:path: ab-testing-pipeline/CdkBuild/Resource
136-
RegisterBuild8D2EA4DC:
137-
Type: AWS::CodeBuild::Project
138-
Properties:
139-
Artifacts:
140-
Type: CODEPIPELINE
141-
Environment:
142-
ComputeType: BUILD_GENERAL1_SMALL
143-
EnvironmentVariables:
144-
- Name: SAGEMAKER_PROJECT_NAME
145-
Type: PLAINTEXT
146-
Value:
147-
Ref: SageMakerProjectName
148-
- Name: STAGE_NAME
149-
Type: PLAINTEXT
150-
Value:
151-
Ref: StageName
152-
- Name: REGISTER_LAMBDA
153-
Type: PLAINTEXT
154-
Value:
155-
Ref: RegisterLambda
156-
Image: aws/codebuild/standard:1.0
157-
ImagePullCredentialsType: CODEBUILD
158-
PrivilegedMode: false
159-
Type: LINUX_CONTAINER
160-
ServiceRole:
161-
Fn::Join:
162-
- ""
163-
- - "arn:"
164-
- Ref: AWS::Partition
165-
- ":iam::"
166-
- Ref: AWS::AccountId
167-
- :role/service-role/AmazonSageMakerServiceCatalogProductsUseRole
168-
Source:
169-
BuildSpec: >-
170-
{
171-
"version": "0.2",
172-
"phases": {
173-
"build": {
174-
"commands": [
175-
"python register.py > output.txt"
176-
]
177-
}
178-
},
179-
"artifacts": {
180-
"files": [
181-
"output.txt"
182-
]
183-
},
184-
"environment": {
185-
"buildImage": {
186-
"type": "LINUX_CONTAINER",
187-
"defaultComputeType": "BUILD_GENERAL1_SMALL",
188-
"imageId": "aws/codebuild/amazonlinux2-x86_64-standard:3.0",
189-
"imagePullPrincipalType": "CODEBUILD"
190-
}
191-
}
192-
}
193-
Type: CODEPIPELINE
194-
EncryptionKey: alias/aws/s3
195-
Name:
196-
Fn::Join:
197-
- ""
198-
- - sagemaker-
199-
- Ref: SageMakerProjectName
200-
- -register-
201-
- Ref: StageName
202-
Metadata:
203-
aws:cdk:path: ab-testing-pipeline/RegisterBuild/Resource
204128
S3Artifact80610462:
205129
Type: AWS::S3::Bucket
206130
Properties:
@@ -215,8 +139,6 @@ Resources:
215139
- Ref: AWS::Region
216140
UpdateReplacePolicy: Delete
217141
DeletionPolicy: Delete
218-
Metadata:
219-
aws:cdk:path: ab-testing-pipeline/S3Artifact/Resource
220142
PipelineC660917D:
221143
Type: AWS::CodePipeline::Pipeline
222144
Properties:
@@ -292,22 +214,6 @@ Resources:
292214
Name: SageMaker_CFN_Deploy
293215
RunOrder: 1
294216
Name: Deploy
295-
- Actions:
296-
- ActionTypeId:
297-
Category: Build
298-
Owner: AWS
299-
Provider: CodeBuild
300-
Version: "1"
301-
Configuration:
302-
ProjectName:
303-
Ref: RegisterBuild8D2EA4DC
304-
InputArtifacts:
305-
- Name: Artifact_Source_CodeCommit_Source
306-
Name: Register_Build
307-
OutputArtifacts:
308-
- Name: Artifact_Register_Register_Build
309-
RunOrder: 1
310-
Name: Register
311217
ArtifactStore:
312218
Location:
313219
Ref: S3Artifact80610462
@@ -319,8 +225,6 @@ Resources:
319225
- Ref: SageMakerProjectName
320226
- -pipeline-
321227
- Ref: StageName
322-
Metadata:
323-
aws:cdk:path: ab-testing-pipeline/Pipeline/Resource
324228
DeployRule0F8E909D:
325229
Type: AWS::Events::Rule
326230
Properties:
@@ -369,8 +273,6 @@ Resources:
369273
- ":iam::"
370274
- Ref: AWS::AccountId
371275
- :role/service-role/AmazonSageMakerServiceCatalogProductsUseRole
372-
Metadata:
373-
aws:cdk:path: ab-testing-pipeline/DeployRule/Resource
374276
CodeRule663E3DC0:
375277
Type: AWS::Events::Rule
376278
Properties:
@@ -430,14 +332,10 @@ Resources:
430332
- ":iam::"
431333
- Ref: AWS::AccountId
432334
- :role/service-role/AmazonSageMakerServiceCatalogProductsUseRole
433-
Metadata:
434-
aws:cdk:path: ab-testing-pipeline/CodeRule/Resource
435335
CDKMetadata:
436336
Type: AWS::CDK::Metadata
437337
Properties:
438338
Analytics: v2:deflate64:H4sIAAAAAAAAE0WN0YrCMBBFv8X3OCIVln0T/YFSvyBOR3ZskynJRCkh/742Lfh0L3M5Z47we4Lj7mzfcY/9cMgogSDf1OJgrg/f2mAdKQXTUZQUkMxVfNSQUJf9e334z9CzsvhiFl1m6yC3MjLOVVVbMSg9oTjHClUwSWSVMNfhnnjsPxRPNLKnNsiT1k9bLSY2kC8JB6rnta3WaaO+fAW3Xgy9yGuE3KWxLkuWUkw765/4QwM/0OyekXkfkld2BN2a//2ekvEmAQAA
439-
Metadata:
440-
aws:cdk:path: ab-testing-pipeline/CDKMetadata/Default
441339
Condition: CDKMetadataAvailable
442340
Conditions:
443341
CDKMetadataAvailable:

cdk.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
     "log_level": "INFO",
     "api_name": "ab-testing",
     "stage_name": "dev",
-    "endpoint_filter": "*",
+    "endpoint_prefix": "",
     "api_lambda_memory": 768,
     "api_lambda_timeout": 60,
     "metrics_lambda_memory": 768,

deployment_pipeline/README.md

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -10,16 +10,18 @@ This deployment pipeline contains a few stages.
1010
1. **Source**: Pull the latest deployment configuration from AWS CodeCommit repository.
1111
1. **Build**: AWS CodeBuild job to create the AWS CloudFormation template for deploying the endpoint.
1212
- Query the Amazon SageMaker project to get the top approved models.
13-
- Use the AWS CDK to create a CFN stack with multiple endpoint variants.
14-
- Create a `register.json` file that contains the target SageMaker endpoint and A/B testing strategy.
15-
2. **Deploy**: Run the AWS CloudFormation stack to create/update the SageMaker endpoint.
16-
3. **Register**: Call the RegisterAPI for the endpoint to create/clear the A/B testing metrics.
13+
- Use the AWS CDK to create a CFN stack that deploys a multi-variant SageMaker endpoint.
14+
2. **Deploy**: Run the AWS CloudFormation stack to create/update the SageMaker endpoint, tagged with properties based on configuration:
15+
- `ab-testing:enabled` equals `true`
16+
- `ab-testing:strategy` is one of `WeightedSampling`, `EpsilonGreedy`, `UCB1` or `ThompsonSampling`.
17+
- `ab-testing:epsilon` is the epsilon parameter for the `EpsilonGreedy` strategy, defaults to `0.1`.
18+
- `ab-testing:warmup` is the number of invocations to warm up with the `WeightedSampling` strategy, defaults to `0`.
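
As an illustration of how the `ab-testing:epsilon` and `ab-testing:warmup` tags could drive variant selection, here is a minimal sketch of the textbook epsilon-greedy rule with a weighted warm-up phase. It is not the solution's actual implementation, which this commit does not show. The function name `choose_variant`, the `stats` and `weights` structures, and the choice to count warm-up over total invocations are all assumptions made for the example.

```python
import random
from typing import Dict, Tuple


def choose_variant(
    stats: Dict[str, Tuple[int, int]],
    weights: Dict[str, float],
    tags: Dict[str, str],
    rng: random.Random,
) -> str:
    """Pick a variant name (hypothetical helper, not part of the repository).

    stats maps variant name -> (invocations, conversions).
    weights maps variant name -> sampling weight used during warm-up.
    tags holds the endpoint tags as strings, as they would appear on the endpoint.
    """
    epsilon = float(tags.get("ab-testing:epsilon", "0.1"))
    warmup = int(tags.get("ab-testing:warmup", "0"))
    variants = sorted(stats)
    total_invocations = sum(stats[v][0] for v in variants)

    # Warm-up phase (assumed to count total invocations): weighted sampling.
    if total_invocations < warmup:
        return rng.choices(variants, weights=[weights[v] for v in variants])[0]

    # Explore: with probability epsilon pick a variant uniformly at random.
    if rng.random() < epsilon:
        return rng.choice(variants)

    # Exploit: pick the variant with the best observed conversion rate.
    def conversion_rate(variant: str) -> float:
        invocations, conversions = stats[variant]
        return conversions / invocations if invocations else 0.0

    return max(variants, key=conversion_rate)
```

With `ab-testing:epsilon` set to `0` and no warm-up, the sketch always exploits, so `choose_variant({"champion": (100, 10), "challenger": (100, 20)}, weights, {"ab-testing:epsilon": "0"}, random.Random(0))` returns `"challenger"`.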
1719

1820
![\[AWS CodePipeline\]](../docs/ab-testing-pipeline-code-pipeline.png)
1921

2022
## Testing
2123

22-
Once you have created a SageMaker Project, you can test the **Build** and **Register** stages locally by setting some environment variables, and running the commands found in the `buildspec` defined in the pipeline.
24+
Once you have created a SageMaker Project, you can test the **Build** stage and **Register** events locally by setting some environment variables.
2325

2426
### Build Stage
2527

@@ -32,12 +34,11 @@ export STAGE_NAME="dev"
3234
cdk synth
3335
```
3436

35-
### Register Stage
37+
### Register
3638

3739
Export the environment variable for the `REGISTER_LAMBDA` created as part of the `ab-testing-api` stack, then run the `register.py` file.
3840

3941
```
4042
export REGISTER_LAMBDA="<<register_lambda>>"
4143
python register.py
4244
```
43-

deployment_pipeline/app.py

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,13 @@
4242
with open(f"{stage_name}-config.json", "r") as f:
4343
j = json.load(f)
4444
deployment_config = DeploymentConfig(**j)
45+
# Append tags for ab-testing
46+
tags += [
47+
core.CfnTag(key="ab-testing:enabled", value="true"),
48+
core.CfnTag(key="ab-testing:strategy", value=deployment_config.strategy),
49+
core.CfnTag(key="ab-testing:epsilon", value=str(deployment_config.epsilon)),
50+
core.CfnTag(key="ab-testing:warmup", value=str(deployment_config.warmup)),
51+
]
4552

4653
sagemaker = SageMakerStack(
4754
app,

deployment_pipeline/infra/deployment_config.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ def __init__(
4242
instance_type: str = "ml.t2.medium",
4343
strategy: str = "ThompsonSampling",
4444
warmup: int = 0,
45+
epsilon: float = 0.1,
4546
):
4647
self.stage_name = stage_name
4748
# Provide either the challenger variant count, or specific champion/challenger config
@@ -74,4 +75,5 @@ def __init__(
7475
self.challenger_variant_config = None
7576
self.strategy = strategy
7677
self.warmup = warmup
78+
self.epsilon = epsilon
7779
super().__init__(instance_count, instance_type)

deployment_pipeline/infra/model_registry.py

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@ def get_latest_approved_packages(
5959
# Return error if no packages found
6060
if len(model_packages) == 0 and creation_time_after is None:
6161
error_message = (
62-
f"No latest packages found for: {model_package_group_name}"
62+
f"No approved packages found for: {model_package_group_name}"
6363
)
6464
logger.error(error_message)
6565
raise Exception(error_message)
@@ -121,9 +121,7 @@ def get_versioned_approved_packages(
121121

122122
# Return error if no packages found
123123
if len(model_packages) == 0:
124-
error_message = (
125-
f"No versioned packages found for: {model_package_group_name}"
126-
)
124+
error_message = f"No approved packages found for: {model_package_group_name} and versions: {model_package_versions}"
127125
logger.error(error_message)
128126
raise Exception(error_message)
129127

deployment_pipeline/prod-config.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
 {
     "stage_name": "prod",
-    "strategy": "ThompsonSampling",
+    "strategy": "EpsilonGreedy",
     "warmup": 100,
+    "epsilon": 0.1,
     "instance_count": 2,
     "instance_type": "ml.c5.large",
     "champion_variant_config": {

deployment_pipeline/register.py

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,20 @@
2424
# Get the config and include with endpoint to register this model
2525
with open(f"{stage_name}-config.json", "r") as f:
2626
j = json.load(f)
27-
event = json.dumps({"endpoint_name": endpoint_name, **j})
27+
event = json.dumps({
28+
'source': 'aws.sagemaker',
29+
'detail-type': 'SageMaker Endpoint State Change',
30+
'detail': {
31+
'EndpointName': endpoint_name,
32+
'EndpointStatus': 'IN_SERVICE',
33+
'Tags': {
34+
'ab-testing:enabled': 'true',
35+
'ab-testing:strategy': j.get('strategy', 'ThompsonSampling'),
36+
'ab-testing:epsilon': str(j.get('epsilon', 0.1)),
37+
'ab-testing:warmup': str(j.get('warmup', 0)),
38+
}
39+
}
40+
})
2841
response = lambda_client.invoke(
2942
FunctionName=register_lambda,
3043
InvocationType="RequestResponse",
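
To make the new event shape concrete, the sketch below shows how a receiving function might read the fields that `register.py` now sends. The register Lambda's handler is not part of this diff, so `parse_ab_testing_event` and its return structure are hypothetical. Only the keys (`source`, `detail`, `EndpointName`, `EndpointStatus`, `Tags`, and the `ab-testing:*` tag names) and the defaults are taken from the change above.

```python
from typing import Any, Dict, Optional


def parse_ab_testing_event(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Extract A/B testing settings from an endpoint state-change event.

    Hypothetical helper: returns None when the event is not an in-service,
    A/B-testing-enabled SageMaker endpoint event.
    """
    detail = event.get("detail", {})
    tags = detail.get("Tags", {})

    if event.get("source") != "aws.sagemaker":
        return None
    if detail.get("EndpointStatus") != "IN_SERVICE":
        return None
    if tags.get("ab-testing:enabled") != "true":
        return None

    # Tag values arrive as strings, so convert the numeric ones.
    return {
        "endpoint_name": detail["EndpointName"],
        "strategy": tags.get("ab-testing:strategy", "ThompsonSampling"),
        "epsilon": float(tags.get("ab-testing:epsilon", "0.1")),
        "warmup": int(tags.get("ab-testing:warmup", "0")),
    }
```

Keeping the defaults identical to those in `register.py` (`ThompsonSampling`, `0.1`, `0`) means an endpoint tagged only with `ab-testing:enabled` would still resolve to a complete configuration.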
