This repository was archived by the owner on Dec 30, 2024. It is now read-only.

Commit a26f05f

Update to version v1.2.0
1 parent a280c46 commit a26f05f

133 files changed (+7522 -2159 lines)

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -17,9 +17,9 @@ Steps to reproduce the behavior.
1717
A clear and concise description of what you expected to happen.
1818

1919
**Please complete the following information about the solution:**
20-
- [ ] Version: [e.g. v1.0.0]
20+
- [ ] Version: [e.g. v1.1.0]
2121

22-
To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0122) - Discovering Hot Topics using Machine Learning. Version v1.0.0".
22+
To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0122) - Discovering Hot Topics using Machine Learning. Version v1.1.0".
2323

2424
- [ ] Region: [e.g. us-east-1]
2525
- [ ] Was the solution modified from the version published on this repository?

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -42,6 +42,7 @@ pids
4242
lib-cov
4343

4444
# Coverage directory used by tools like istanbul
45+
coverage-reports/
4546
coverage
4647
*.lcov
4748

@@ -186,6 +187,9 @@ coverage.xml
186187
.hypothesis/
187188
.pytest_cache/
188189

190+
# linting, scanning configurations, sonarqube
191+
.scannerwork/
192+
189193
# Translations
190194
*.mo
191195
*.pot

CHANGELOG.md

Lines changed: 10 additions & 2 deletions
@@ -5,11 +5,19 @@
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [1.2.0] - 2020-10-29
9+
### Added
10+
- New and simplified interactive Amazon QuickSight dashboard that is now automatically generated through an AWS CloudFormation deployment and that customers can extend to suit their business case
11+
12+
### Updated
13+
- Updated to AWS CDK v1.69.0
14+
- Consolidated the Amazon S3 access log bucket across the solution. All access log files have a prefix that corresponds to the bucket for which they are generated
15+
816
## [1.1.0] - 2020-09-29
917
### Updated
1018
- S3 storage for inference outputs to use Apache Parquet
11-
- Add paritioning to AWS Glue tables
12-
- Update to CDK v1.63.0
19+
- Add partitioning to AWS Glue tables
20+
- Update to AWS CDK v1.63.0
1321
- Update to AWS SDK v2.755.0
1422

1523
## [1.0.0] - 2020-08-28

NOTICE.txt

Lines changed: 1 addition & 0 deletions
@@ -34,6 +34,7 @@ pytest-cov - MIT license
3434
pytest - MIT license
3535
requests - Apache-2.0
3636
sinon - BSD license
37+
tenacity under the Apache License Version 2.0
3738
ts-jest - MIT license
3839
ts-node - MIT license
3940
twitter-lite - MIT License

README.md

Lines changed: 23 additions & 9 deletions
@@ -1,10 +1,14 @@
11
## Discovering Hot Topics using Machine Learning
22

3-
With so much customer sentiment available for analysis today, understanding the contextualization of the most relevant topics can be difficult at scale. Separating the signal from the noise requires analysis that goes beyond basic aggregation of sentiment analysis. Diving deeper and truly understanding the conversation at scale can help organizations to succeed in the market, identify new opportunities and react quickly.
3+
The Discovering Hot Topics Using Machine Learning solution helps you identify the most dominant topics associated with your products, policies, events, and brands. Implementing this solution helps you react quickly to new growth opportunities, address negative brand associations, and deliver higher levels of customer satisfaction.
44

5-
The Discovering Hot Topics Using Machine Learning solution addresses the problem of organizing large-scale customer feedback analytics by automating ingesting digital assets and performing near real-time analysis using machine learning algorithms. Organizations can gain insight about new product launches, service announcements, public relations, crisis management, and changes to company policies that impact their customers.
5+
The solution uses machine learning algorithms to automate digital asset (text and image) ingestion and perform near real-time topic modeling, sentiment analysis, and image detection. The solution then visualizes these large-scale customer analyses using an Amazon QuickSight dashboard. This guide provides step-by-step instructions for building a dashboard that provides you with the context and insights necessary to identify trends that help or harm your brand.
66

7-
This solution ingests text and images from online discourse and performs topic and sentiment analyses, and detect unsafe content in images. The default input data source for the solution is Twitter, but it can be extended to ingest other social media platforms in addition to any stream from an enterprise’s internal systems. The output of this inference is organized and visualized in a dashboard for users to consult and analyze.
7+
The solution provides the following key features:
8+
* **Performs topic modeling to detect dominant topics**: identifies the terms that collectively form a topic from within customer feedback
9+
* **Identifies the sentiment of what customers are saying**: uses contextual semantic search to understand the nature of online discussions
10+
* **Determines if images associated with your brand contain unsafe content**: detects unsafe and negative imagery in content
11+
* **Helps customers identify insights in near real-time**: you can use a visualization dashboard to better understand context, threats, and opportunities almost instantly
812

913
For an overview and solution deployment guide, please visit [Discovering Hot Topics using Machine Learning](https://aws.amazon.com/solutions/implementations/discovering-hot-topics-using-machine-learning)
1014

@@ -22,14 +26,15 @@ Deploying this solution with the default parameters builds the following environ
2226
* Application Integration – Event-based architecture approach through the use of Amazon EventBridge
2327
* Storage and Visualization – A combination of Kinesis Data Firehose, S3 Buckets, Glue, Athena and QuickSight
2428

25-
Once the solution is deployed, use QuickSight to create a dashboard like the one below.
29+
After you deploy the solution, use the included Amazon QuickSight dashboard to visualize the solution's machine learning inferences. The image to the right is an example
30+
visualization dashboard featuring a dominant topic list, donut charts, weekly and monthly trend graphs, a word cloud, a tweet table, and a heat map.
2631

2732
<p align="center">
2833
<img src="source/images/dashboard.png">
2934
<br/>
3035
</p>
3136

32-
This is an example Amazon QuickSight dashboard built by the solution. The first row of visuals in the dashboard shows the aggregation of all the dominant topics detected, and the second row drills down to the most dominant topic '000'. The bottom left corner of the image demonstrates that selecting a specific phrase (in this example, machine learning) in the word cloud filters the data for the related donut chart and table.
37+
The first row of visuals in the dashboard shows the aggregation of all the dominant topics detected, and the second row drills down to the most dominant topic '000'. The bottom left corner of the image demonstrates that selecting a specific phrase (in this example, machine learning) in the word cloud filters the data for the related donut chart and table.
3338

3439
## 1. Build the solution
3540

@@ -48,17 +53,26 @@ chmod +x ./run-all-tests.sh
4853

4954
* Configure the bucket name of your target Amazon S3 distribution bucket
5055
```
51-
export DIST_OUTPUT_BUCKET=my-bucket-name # bucket where customized code will reside
52-
export VERSION=my-version # version number for the customized code
56+
export DIST_OUTPUT_BUCKET=my-bucket-name
57+
export VERSION=my-version
5358
```
54-
_Note:_ You would have to create an S3 bucket with the prefix 'my-bucket-name-<aws_region>'; aws_region is where you are testing the customized solution. Also, the assets in bucket should be publicly accessible.
5559

5660
* Now build the distributable:
5761
```
5862
cd <rootDir>/deployment
5963
chmod +x ./build-s3-dist.sh
60-
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $$VERSION
64+
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION $CF_TEMPLATE_BUCKET_NAME $QS_TEMPLATE_ACCOUNT
65+
66+
```
67+
* Parameter details
6168
```
69+
$DIST_OUTPUT_BUCKET - This is the global name of the distribution. For the bucket name, the AWS Region is added to the global name (example: 'my-bucket-name-us-east-1') to create a regional bucket. The lambda artifact should be uploaded to the regional buckets for the CloudFormation template to pick it up for deployment.
70+
$SOLUTION_NAME - The name of this solution (example: discovering-hot-topics-using-machine-learning)
71+
$VERSION - The version number of the change
72+
$CF_TEMPLATE_BUCKET_NAME - The name of the S3 bucket where the CloudFormation templates should be uploaded
73+
$QS_TEMPLATE_ACCOUNT - The account from which the Amazon QuickSight templates should be sourced for Amazon QuickSight Analysis and Dashboard creation
74+
```
75+
6276

6377
* Deploy the distributable to an Amazon S3 bucket in your account. _Note:_ you must have the AWS Command Line Interface installed.
6478
```

deployment/build-s3-dist.sh

Lines changed: 20 additions & 3 deletions
@@ -26,18 +26,23 @@
2626
set -e
2727

2828
# Important: CDK global version number
29-
cdk_version=1.63.0
29+
cdk_version=1.69.0
3030

3131
# Check to see if input has been provided:
32-
if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
32+
if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ] || [ -z "$5" ] || [ -z "$6" ]; then
3333
echo "Please provide all required parameters for the build script"
34-
echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.0.0"
34+
echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.2.0 template-bucket-name template_account_id solutions"
3535
exit 1
3636
fi
3737

3838
bucket_name="$1"
3939
solution_name="$2"
4040
solution_version="$3"
41+
template_bucket_name="$4"
42+
template_account_id="$5"
43+
dist_quicksight_namespace="$6"
44+
45+
dashed_version="${solution_version//./$'_'}"
4146

4247
# Get reference for all important folders
4348
template_dir="$PWD"
@@ -126,6 +131,18 @@ do
126131

127132
replace="s/%%VERSION%%/$solution_version/g"
128133
sed -i -e $replace $file
134+
135+
replace="s/%%TEMPLATE_BUCKET_NAME%%/$template_bucket_name/g"
136+
sed -i -e $replace $file
137+
138+
replace="s/%%TEMPLATE_ACCOUNT_ID%%/$template_account_id/g"
139+
sed -i -e $replace $file
140+
141+
replace="s/%%DIST_QUICKSIGHT_NAMESPACE%%/$dist_quicksight_namespace/g"
142+
sed -i -e $replace $file
143+
144+
replace="s/%%DASHED_VERSION%%/$dashed_version/g"
145+
sed -i -e $replace $file
129146
done
130147

131148
echo "------------------------------------------------------------------------------"

deployment/cdk-solution-helper/index.js

Lines changed: 46 additions & 2 deletions
@@ -34,22 +34,66 @@ fs.readdirSync(global_s3_assets).forEach(file => {
3434
if (fn.Properties.Code.hasOwnProperty('S3Bucket')) {
3535
// Set the S3 key reference
3636
let artifactHash = Object.assign(fn.Properties.Code.S3Bucket.Ref);
37-
artifactHash = artifactHash.replace('AssetParameters', '');
37+
// console.debug(`Old artifactHash is ${artifactHash}`);
38+
artifactHash = artifactHash.replace(/[\w]*AssetParameters/g, '');
3839
artifactHash = artifactHash.substring(0, artifactHash.indexOf('S3Bucket'));
40+
// console.debug(`New artifactHash is ${artifactHash}`);
3941
const assetPath = `asset${artifactHash}`;
4042
fn.Properties.Code.S3Key = `%%SOLUTION_NAME%%/%%VERSION%%/${assetPath}.zip`;
4143

4244
// Set the S3 bucket reference
4345
fn.Properties.Code.S3Bucket = {
4446
'Fn::Sub': '%%BUCKET_NAME%%-${AWS::Region}'
4547
};
48+
} else {
49+
// console.debug(`Here is the fn dump ${JSON.stringify(fn)}`);
4650
}
4751
});
4852

53+
// Clean-up nested template stack dependencies
54+
const nestedStacks = Object.keys(resources).filter(function(key) {
55+
return resources[key].Type === 'AWS::CloudFormation::Stack'
56+
});
57+
58+
nestedStacks.forEach(function(f) {
59+
const fn = template.Resources[f];
60+
fn.Properties.TemplateURL = {
61+
'Fn::Join': [
62+
'',
63+
[
64+
'https://s3.',
65+
{
66+
'Ref' : 'AWS::URLSuffix'
67+
},
68+
'/',
69+
`%%TEMPLATE_BUCKET_NAME%%/%%SOLUTION_NAME%%/%%VERSION%%/${fn.Metadata.nestedStackFileName}`
70+
]
71+
]
72+
};
73+
74+
const params = fn.Properties.Parameters ? fn.Properties.Parameters : {};
75+
const nestedStackParameters = Object.keys(params).filter(function(key) {
76+
if (key.search(/[\w]*AssetParameters/g) > -1) {
77+
return true;
78+
}
79+
return false;
80+
});
81+
82+
nestedStackParameters.forEach(function(stkParam) {
83+
fn.Properties.Parameters[stkParam] = undefined;
84+
});
85+
});
86+
4987
// Clean-up parameters section
5088
const parameters = (template.Parameters) ? template.Parameters : {};
5189
const assetParameters = Object.keys(parameters).filter(function (key) {
52-
return key.includes('AssetParameters');
90+
console.debug(`key to analyze ${key}`);
91+
if (key.search(/[\w]*AssetParameters/g) > -1) {
92+
// console.debug('Pattern match');
93+
return true;
94+
}
95+
// console.debug('Pattern did not match');
96+
return false;
5397
});
5498
assetParameters.forEach(function (a) {
5599
template.Parameters[a] = undefined;
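
The switch from a literal `replace('AssetParameters', '')` to the pattern `/[\w]*AssetParameters/g` matters when a parameter name carries a prefix in front of `AssetParameters`, which is presumably what the nested stacks introduce. The sketch below compares the two behaviours. The parameter names and the `KeepMe` key are hypothetical values chosen for illustration, and the final step assumes the template is later serialized with `JSON.stringify`, which this diff does not show.

```javascript
// Same pattern the commit introduces in index.js.
const ASSET_PARAM_PATTERN = /[\w]*AssetParameters/g;

// New behaviour: drop everything up to and including 'AssetParameters',
// then keep the text that precedes 'S3Bucket'.
function extractArtifactHash(refName) {
  const stripped = refName.replace(ASSET_PARAM_PATTERN, '');
  return stripped.substring(0, stripped.indexOf('S3Bucket'));
}

// Old behaviour, for comparison: only the literal token is removed,
// so any prefix in front of it leaks into the result.
function extractArtifactHashOld(refName) {
  const stripped = refName.replace('AssetParameters', '');
  return stripped.substring(0, stripped.indexOf('S3Bucket'));
}

// Set every matching parameter to undefined. JSON.stringify omits
// properties whose value is undefined, so they vanish from the output.
function dropAssetParameters(parameters) {
  Object.keys(parameters)
    .filter((key) => key.search(ASSET_PARAM_PATTERN) > -1)
    .forEach((key) => { parameters[key] = undefined; });
  return parameters;
}

// Hypothetical logical IDs, not taken from the real templates.
const topLevel = 'AssetParameters1a2b3cS3Bucket4D5E6F';
const prefixed = 'referencetoParentAssetParameters1a2b3cS3Bucket4D5E6FRef';

console.log(extractArtifactHash(topLevel));    // 1a2b3c
console.log(extractArtifactHash(prefixed));    // 1a2b3c
console.log(extractArtifactHashOld(prefixed)); // referencetoParent1a2b3c

const cleaned = dropAssetParameters({
  [prefixed]: { Type: 'String' },
  KeepMe: { Type: 'String' }
});
console.log(JSON.stringify(cleaned)); // {"KeepMe":{"Type":"String"}}
```

With the old literal replacement, the prefixed name would yield an `S3Key` such as `assetreferencetoParent1a2b3c.zip`, which would not match the packaged asset file name.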

source/bin/discovering-hot-topics-app.ts

Lines changed: 2 additions & 1 deletion
@@ -19,5 +19,6 @@ import { DiscoveringHotTopicsStack } from '../lib/discovering-hot-topics-stack';
1919
const app = new cdk.App();
2020
new DiscoveringHotTopicsStack(app, 'discovering-hot-topics-using-machine-learning', {
2121
description: '(SO0122) - Discovering Hot Topics using Machine Learning. Version %%VERSION%%',
22-
solutionID: 'SO0122'
22+
solutionID: 'SO0122',
23+
solutionName: 'discovering-hot-topics-using-machine-learning'
2324
});

source/cdk.json

Lines changed: 4 additions & 1 deletion
@@ -1,3 +1,6 @@
11
{
2-
"app": "npx ts-node bin/discovering-hot-topics-app.ts"
2+
"app": "npx ts-node bin/discovering-hot-topics-app.ts",
3+
"context": {
4+
"quicksight_source_template_arn": "arn:aws:quicksight:us-east-1:%%TEMPLATE_ACCOUNT_ID%%:template/%%DIST_QUICKSIGHT_NAMESPACE%%_%%SOLUTION_NAME%%_%%DASHED_VERSION%%"
5+
}
36
}
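
To see what `quicksight_source_template_arn` looks like after `deployment/build-s3-dist.sh` has run its `sed` substitutions, the replacement logic can be sketched in JavaScript. This is an illustration rather than the script itself. The account ID `111122223333` is a placeholder, the namespace and version come from the script's own usage message, and the `%%SOLUTION_NAME%%` substitution is assumed to happen in a part of the script outside the hunks shown in this commit.

```javascript
// Mirrors dashed_version="${solution_version//./$'_'}":
// every dot in the version becomes an underscore.
function toDashedVersion(version) {
  return version.split('.').join('_');
}

// Replace each %%NAME%% token with its value, similar to the script's
// repeated `sed -e "s/%%NAME%%/value/g"` calls. split/join is used so the
// values are treated as plain text rather than as replacement patterns.
function fillPlaceholders(text, values) {
  return Object.entries(values).reduce(
    (result, [name, value]) => result.split(`%%${name}%%`).join(value),
    text
  );
}

// The context value from source/cdk.json in this commit.
const arnPattern =
  'arn:aws:quicksight:us-east-1:%%TEMPLATE_ACCOUNT_ID%%:template/' +
  '%%DIST_QUICKSIGHT_NAMESPACE%%_%%SOLUTION_NAME%%_%%DASHED_VERSION%%';

const arn = fillPlaceholders(arnPattern, {
  TEMPLATE_ACCOUNT_ID: '111122223333', // placeholder account ID
  DIST_QUICKSIGHT_NAMESPACE: 'solutions',
  SOLUTION_NAME: 'discovering-hot-topics-using-machine-learning',
  DASHED_VERSION: toDashedVersion('v1.2.0')
});

console.log(arn);
// arn:aws:quicksight:us-east-1:111122223333:template/solutions_discovering-hot-topics-using-machine-learning_v1_2_0
```

The dashed form presumably exists because the version becomes part of a QuickSight template ID, where dots may not be accepted. The commit itself does not state the reason.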

source/images/dashboard.png

-392 KB