This repository was archived by the owner on Dec 30, 2024. It is now read-only.
All notable changes to this project will be documented in this file.
3
+
All notable changes to this project will be documented in this file.
4
4
5
-
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
-
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
5
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
8
-
## [1.2.0] - 2020-10-29
9
-
### Added
10
-
- New and simplified interactive Amazon QuickSight dashboard that is now automatically generated through an AWS CloudFormation deployment and that customers can extend to suit their business case
8
+
## [1.3.0] - 2020-11-24
11
9
12
-
### Updated
13
-
- Updated to AWS CDK v1.69.0
14
-
- Consolidate Amazon S3 access log bucket across the solution. All access log files have a prefix that corresponds to the bucket for which they are generated
10
+
### Changed
15
11
16
-
## [1.1.0] - 2020-09-29
17
-
### Updated
18
-
- S3 storage for inference outputs to use Apache Parquet
19
-
- Add partitioning to AWS Glue tables
20
-
- Update to AWS CDK v1.63.0
21
-
- Update to AWS SDK v2.755.0
12
+
- Refactored the implementation to reuse the following architecture patterns from [AWS Solutions Constructs](https://aws.amazon.com/solutions/constructs/)
13
+
- aws-kinesisfirehose-s3
14
+
- aws-kinesisstreams-lambda
15
+
- aws-lambda-step-function
22
16
23
-
## [1.0.0] - 2020-08-28
24
-
### Added
25
-
- Initial release
17
+
### Updated
18
+
19
+
- The join condition for Topic Modeling in Amazon QuickSight dataset to provide accurate topic identification for a specific run
20
+
- ID and name generation for Amazon QuickSight resources to use dynamic values based on the stack name
21
+
- AWS CDK version to 1.73.0
22
+
- AWS SDK version to 2.790.0
23
+
24
+
## [1.2.0] - 2020-10-29
25
+
26
+
### Added
27
+
28
+
- New and simplified interactive Amazon QuickSight dashboard that is now automatically generated through an AWS CloudFormation deployment and that customers can extend to suit their business case
29
+
30
+
### Updated
31
+
32
+
- Updated to AWS CDK v1.69.0
33
+
- Consolidate Amazon S3 access log bucket across the solution. All access log files have a prefix that corresponds to the bucket for which they are generated
34
+
35
+
## [1.1.0] - 2020-09-29
36
+
37
+
### Updated
38
+
39
+
- S3 storage for inference outputs to use Apache Parquet
README.md
Lines changed: 84 additions & 48 deletions
@@ -5,26 +5,35 @@ The Discovering Hot Topics Using Machine Learning solution helps you identify th
5
5
The solution uses machine learning algorithms to automate digital asset (text and image) ingestion and perform near real-time topic modeling, sentiment analysis, and image detection. The solution then visualizes these large-scale customer analyses using an Amazon QuickSight dashboard. This guide provides step-by-step instructions for building a dashboard that gives you the context and insights necessary to identify trends that help or harm your brand.
6
6
7
7
The solution performs the following key features:
8
-
* **Performs topic modeling to detect dominant topics**: identifies the terms that collectively form a topic from within customer feedback
9
-
* **Identifies the sentiment of what customers are saying**: uses contextual semantic search to understand the nature of online discussions
10
-
* **Determines if images associated with your brand contain unsafe content**: detects unsafe and negative imagery in content
11
-
* **Helps customers identify insights in near real-time**: you can use a visualization dashboard to better understand context, threats, and opportunities almost instantly
8
+
9
+
- **Performs topic modeling to detect dominant topics**: identifies the terms that collectively form a topic from within customer feedback
10
+
- **Identifies the sentiment of what customers are saying**: uses contextual semantic search to understand the nature of online discussions
11
+
- **Determines if images associated with your brand contain unsafe content**: detects unsafe and negative imagery in content
12
+
- **Helps customers identify insights in near real-time**: you can use a visualization dashboard to better understand context, threats, and opportunities almost instantly
12
13
13
14
For an overview and solution deployment guide, please visit [Discovering Hot Topics using Machine Learning](https://aws.amazon.com/solutions/implementations/discovering-hot-topics-using-machine-learning)
14
15
15
-
## Architecture Diagram
16
+
## On this Page
17
+
18
+
- [Architecture Overview](#architecture-overview)
19
+
- [Deployment](#deployment)
20
+
- [Source Code](#source-code)
21
+
- [Creating a custom build](#creating-a-custom-build)
22
+
23
+
## Architecture Overview
16
24
17
25
Deploying this solution with the default parameters builds the following environment in the AWS Cloud. The overall architecture of the solution has the following key components. Note that the below diagram represents Twitter as the ingestion feed - there are plans to add other social media platforms in future releases.
26
+
18
27
<p align="center">
19
28
<img src="source/images/architecture.png">
20
29
<br/>
21
30
</p>
22
31
23
-
* Ingestion – Social media feed ingestion using a combination of Lambda functions, Kinesis Data Stream and DynamoDB to manage state
24
-
* Workflow – An AWS Step Function based workflow to orchestrate various services
25
-
* Inference – AWS Cloud’s machine learning capabilities through Amazon Translate, Amazon Comprehend, and Amazon Rekognition
26
-
* Application Integration – Event-based architecture approach through the use of Amazon EventBridge
27
-
* Storage and Visualization – A combination of Kinesis Data Firehose, S3 Buckets, Glue, Athena and QuickSight
32
+
- Ingestion – Social media feed ingestion using a combination of Lambda functions, Kinesis Data Stream and DynamoDB to manage state
33
+
- Workflow – An AWS Step Function based workflow to orchestrate various services
34
+
- Inference – AWS Cloud’s machine learning capabilities through Amazon Translate, Amazon Comprehend, and Amazon Rekognition
35
+
- Application Integration – Event-based architecture approach through the use of Amazon EventBridge
36
+
- Storage and Visualization – A combination of Kinesis Data Firehose, S3 Buckets, Glue, Athena and QuickSight
28
37
29
38
<p align="center">
30
39
<img src="source/images/dashboard.png">
@@ -33,35 +42,92 @@ Deploying this solution with the default parameters builds the following environ
33
42
34
43
After you deploy the solution, use the included Amazon QuickSight dashboard to visualize the solution's machine learning inferences. The image above is an example visualization dashboard featuring a dominant topic list, donut charts, weekly and monthly trend graphs, a word cloud, a tweet table, and a heat map.
35
44
36
-
## 1. Build the solution
45
+
## AWS CDK Constructs
46
+
47
+
[AWS CDK Solutions Constructs](https://aws.amazon.com/solutions/constructs/) make it easier to consistently create well-architected applications. All AWS Solutions Constructs are reviewed by AWS and use best practices established by the AWS Well-Architected Framework. This solution uses the following AWS CDK Constructs:
48
+
49
+
- aws-events-rule-lambda
50
+
- aws-kinesisfirehose-s3
51
+
- aws-kinesisstreams-lambda
52
+
- aws-lambda-dynamodb
53
+
- aws-lambda-s3
54
+
- aws-lambda-step-function
55
+
56
+
## Deployment
57
+
58
+
The solution is deployed using a CloudFormation template with a Lambda-backed custom resource that builds the Amazon QuickSight Analysis and Dashboards. For details on deploying the solution, see the solution home page: [Discovering Hot Topics Using Machine Learning](https://aws.amazon.com/solutions/implementations/discovering-hot-topics-using-machine-learning/)
59
+
60
+
## Source Code
61
+
62
+
### Project directory structure
63
+
64
+
```
65
+
├── deployment [folder containing build scripts]
66
+
│ ├── cdk-solution-helper [A helper function to help deploy lambda function code through S3 buckets]
67
+
└── source [source code containing CDK App and lambda functions]
68
+
├── bin [entrypoint of the CDK application]
69
+
├── lambda [folder containing source code for the lambda functions]
70
+
│ ├── firehose-text-proxy [lambda function to write text analysis output to Amazon Kinesis Firehose]
71
+
│ ├── firehose_topic_proxy [lambda function to write topic analysis output to Amazon Kinesis Firehose]
72
+
│ ├── ingestion-consumer [lambda function that consumes messages from Amazon Kinesis Data Stream]
73
+
│ ├── ingestion-producer [lambda function that makes Twitter API call and pushes data to Amazon Kinesis Data Stream]
74
+
│ ├── integration [lambda function that publishes inference outputs to Amazon EventBridge]
75
+
│ ├── storage-firehose-processor [lambda function that writes data to S3 buckets to build a relational model]
76
+
│ ├── wf-analyze-text [lambda function to detect sentiments, key phrases and entities using Amazon Comprehend]
77
+
│ ├── wf-check-topic-model [lambda function to check status of topic modeling jobs on Amazon Comprehend]
78
+
│ ├── wf-detect-moderation-labels [lambda function to detect content moderation using Amazon Rekognition]
79
+
│ ├── wf-extract-text-in-image [lambda function to extract text content from images using Amazon Rekognition]
80
+
│ ├── wf-publish-text-inference [lambda function to publish Amazon Comprehend inferences]
81
+
│ ├── wf-submit-topic-model [lambda function to submit topic modeling job]
82
+
│ ├── wf-translate-text [lambda function to translate non-English text using Amazon Translate]
83
+
│ └── wf_publish_topic_model [lambda function to publish topic modeling inferences from Amazon Comprehend]
84
+
├── lib
85
+
│ ├── ingestion [CDK constructs for data ingestion]
86
+
│ ├── integration [CDK constructs for Amazon EventBridge]
87
+
│ ├── storage [CDK constructs that define storage of the inference events]
88
+
│ ├── text-analysis-workflow [CDK constructs for text analysis of ingested data]
89
+
│ ├── topic-analysis-workflow [CDK constructs for topic visualization of ingested data]
90
+
│ └── visualization [CDK constructs to build a relational database model for visualization]
91
+
```
92
+
93
+
## Creating a custom build
94
+
95
+
The solution can be deployed through the CloudFormation template available on the solution home page: [Discovering Hot Topics Using Machine Learning](https://aws.amazon.com/solutions/implementations/discovering-hot-topics-using-machine-learning/). To make changes to the solution, follow the steps below: download or clone this repo, update the source code, and then run the deployment/build-s3-dist.sh script to deploy the updated Lambda code to an Amazon S3 bucket in your account.
$DIST_OUTPUT_BUCKET - This is the global name of the distribution. For the bucket name, the AWS Region is added to the global name (example: 'my-bucket-name-us-east-1') to create a regional bucket. The lambda artifact should be uploaded to the regional buckets for the CloudFormation template to pick it up for deployment.
67
133
$SOLUTION_NAME - The name of this solution (example: discovering-hot-topics-using-machine-learning)
@@ -70,44 +136,14 @@ $CF_TEMPLATE_BUCKET_NAME - The name of the S3 bucket where the CloudFormation te
70
136
$QS_TEMPLATE_ACCOUNT - The account from which the Amazon QuickSight templates should be sourced for Amazon QuickSight Analysis and Dashboard creation
71
137
```
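
As a rough illustration of the bucket naming described for `$DIST_OUTPUT_BUCKET`, the regional bucket name can be formed by appending the AWS Region to the global name. This is a minimal sketch: the values are placeholders, and the `AWS_REGION` and `REGIONAL_BUCKET` variable names are assumptions for illustration, not names confirmed from the build script.

```shell
#!/usr/bin/env bash
# Hypothetical values; replace with your own.
DIST_OUTPUT_BUCKET="my-bucket-name"   # global name of the distribution
AWS_REGION="us-east-1"                # Region you deploy into (assumed variable name)

# The regional bucket is the global name with the Region appended.
REGIONAL_BUCKET="${DIST_OUTPUT_BUCKET}-${AWS_REGION}"
echo "$REGIONAL_BUCKET"               # prints my-bucket-name-us-east-1
```

The regional bucket must already exist in your account before the Lambda artifacts are uploaded to it.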
72
138
139
+
- Deploy the distributable to an Amazon S3 bucket in your account. _Note:_ you must have the AWS Command Line Interface installed.
73
140
74
-
* Deploy the distributable to an Amazon S3 bucket in your account. _Note:_ you must have the AWS Command Line Interface installed.
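
The upload step could look roughly like the sketch below. It only prints an AWS CLI `aws s3 sync` command instead of running it, so the target can be checked before anything is copied. The local folder `./regional-s3-assets`, the key layout under the bucket, and the `VERSION` value are assumptions for illustration; check the output of your own build for the actual paths.

```shell
#!/usr/bin/env bash
# Hypothetical values; replace with your own.
DIST_OUTPUT_BUCKET="my-bucket-name"
AWS_REGION="us-east-1"
SOLUTION_NAME="discovering-hot-topics-using-machine-learning"
VERSION="v1.3.0"                       # assumed version label
LOCAL_ASSETS="./regional-s3-assets"    # assumed build output folder

# Target prefix in the regional bucket (assumed key layout).
TARGET="s3://${DIST_OUTPUT_BUCKET}-${AWS_REGION}/${SOLUTION_NAME}/${VERSION}/"

# Dry run: print the command rather than executing it.
UPLOAD_CMD="aws s3 sync ${LOCAL_ASSETS} ${TARGET}"
echo "$UPLOAD_CMD"
```

Once the printed command looks right, it can be run directly in a shell where the AWS Command Line Interface is installed and configured.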