# SOFTWARE REQUIREMENTS SPECIFICATION (SRS)
## FOR SENTIMENT ANALYSIS PLATFORM

**Version:** 1.0
**Date:** October 26, 2023
**Prepared by:** Jules (AI Software Engineer)

---

# 1. INTRODUCTION

## 1.1 PURPOSE
The purpose of this Software Requirements Specification (SRS) is to document the full scope, functional and non-functional requirements, and design constraints for the **Sentiment Analysis Platform**. This platform is a multimodal AI system designed to analyze human sentiment and emotion through text, audio, and video processing. It aims to bridge the gap in traditional sentiment analysis by incorporating non-verbal cues (facial expressions, tone of voice) alongside textual content.

The intended audience for this document includes the development team, project stakeholders, quality assurance testers, and future maintainers of the software.

## 1.2 SCOPE
The software is named **"Sentiment Analysis Platform"**.
This is a SaaS (Software as a Service) application that provides:
1. **Offline Video/Audio Analysis**: Users upload video or audio files, which are processed to detect emotions (e.g., joy, sadness, anger) and sentiment (positive, negative, neutral) using advanced AI models deployed on cloud infrastructure.
2. **Live Emotion Detection**: A real-time feature that captures video and audio streams from the user's device (via WebRTC) and provides instantaneous feedback on the detected emotions.
3. **Document Analysis**: Users upload PDF or Excel documents containing feedback or text. The system extracts text, performs sentiment analysis, abstractive summarization, Named Entity Recognition (NER), and generates word clouds.
4. **Dashboard & Analytics**: A central hub for users to view their analysis history, manage API keys, track quota usage, and manage subscriptions.
5. **API Access**: A developer-friendly REST API to integrate the sentiment analysis capabilities into third-party applications.

The system utilizes a modern tech stack including **Next.js** for the frontend/backend, **Python (FastAPI)** for the document analysis service, **AWS SageMaker** for scalable video inference, **AWS S3** for storage, and **Stripe** for payment processing.

## 1.3 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
* **SRS**: Software Requirements Specification
* **SaaS**: Software as a Service
* **API**: Application Programming Interface
* **UI/UX**: User Interface / User Experience
* **AWS**: Amazon Web Services
* **S3**: Simple Storage Service (AWS)
* **SageMaker**: AWS service for building, training, and deploying ML models
* **JWT**: JSON Web Token
* **ORM**: Object-Relational Mapping (Prisma)
* **NER**: Named Entity Recognition
* **NLP**: Natural Language Processing
* **WebRTC**: Web Real-Time Communication
* **WebSocket**: A computer communications protocol, providing full-duplex communication channels over a single TCP connection.
* **BERT**: Bidirectional Encoder Representations from Transformers (Language Model)
* **CNN**: Convolutional Neural Network
* **ResNet**: Residual Neural Network

## 1.4 REFERENCES
1. Project Repository: GitHub - `UtkarsHMer05/sentiment-analysis`
2. IEEE Std 830-1998: IEEE Recommended Practice for Software Requirements Specifications
3. Next.js Documentation: https://nextjs.org/docs
4. AWS SageMaker Documentation: https://docs.aws.amazon.com/sagemaker/

## 1.5 OVERVIEW
The remainder of this document is organized as follows:
* **Section 2** describes the general factors that affect the product and its requirements, including user characteristics and constraints.
* **Section 3** details the specific functional and non-functional requirements of the system, including interfaces, performance, and security.
* **Section 4** outlines the system models and architecture.
* **Section 5** discusses future requirements and roadmap items.

---

# 2. OVERALL DESCRIPTION

## 2.1 PRODUCT PERSPECTIVE
The **Sentiment Analysis Platform** is an independent, self-contained web application that interacts with several external services:
* **Cloud Infrastructure**: Relies on AWS S3 for secure file storage and AWS SageMaker for hosting and invoking the heavy-weight video sentiment analysis models.
* **Database**: Uses a relational database (SQLite for dev, PostgreSQL for prod) managed via Prisma ORM to store user data, analysis results, and quotas.
* **Microservices**: Includes a dedicated Python-based FastAPI service (`pdf-analyzer-service`) for processing text-heavy documents (PDFs/Excel), keeping the main application lightweight.
* **Payment Gateway**: Integrates with Stripe to handle subscriptions (Basic, Professional, Premium) and billing events via webhooks.
* **Real-time Server**: Utilizes a WebSocket server for low-latency communication required by the Live Detection feature.

## 2.2 PRODUCT FUNCTIONALITY
The major functions of the system include:

### 2.2.1 User Management & Authentication
* User Registration and Login (Email/Password).
* Secure Session Management using NextAuth.js.
* Role-based access (implied by API quotas and subscription tiers).

### 2.2.2 Dashboard & Quota System
* Visual dashboard displaying recent activities and usage statistics.
* API Key generation, revocation, and management.
* Real-time tracking of API quota usage (requests remaining).
* Visual indicators for subscription plans.

### 2.2.3 Multimodal Video Analysis (Offline)
* Secure generation of pre-signed upload URLs for large video files.
* Direct upload to AWS S3.
* Asynchronous triggering of AWS SageMaker endpoints for inference.
* Analysis of video (facial expressions), audio (tone/prosody), and text (transcription).
* Detailed result visualization: Emotion distribution, Sentiment score, Confidence levels.

### 2.2.4 Live Emotion Detection
* Access to user's webcam and microphone via WebRTC.
* Real-time streaming of video frames and audio chunks to the WebSocket server.
* Instantaneous feedback on detected emotions (Joy, Sadness, Anger, Fear, Surprise, Disgust, Neutral).
* Live transcription and keyword-based sentiment analysis.

### 2.2.5 Document Sentiment Analysis
* Upload support for PDF and Excel (`.xlsx`, `.xls`) files.
* Text extraction from documents (line-by-line or row-by-row).
* **Sentiment Analysis**: Classification of text into Positive, Negative, or Neutral using `legal-bert-base-uncased`.
* **Summarization**: Generation of concise summaries using `distilbart-cnn-12-6`.
* **NER**: Identification of key entities (Persons, Orgs, Locations) using `spaCy`.
* **Visualization**: Generation of Word Clouds representing most frequent terms.
* Support for both "Individual Line" analysis and "Combined" document analysis.

### 2.2.6 Billing & Subscriptions
* Integration with Stripe Checkout for plan upgrades.
* Handling of Stripe Webhooks to automatically update user quotas and subscription status.
* Tiered access (Basic, Professional, Premium) with different quota limits.
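The tier-to-quota relationship above can be modeled as a simple lookup. The sketch below is illustrative only: the SRS names the three tiers but does not fix the point values, so the numbers here are placeholder assumptions.

```typescript
// Hypothetical monthly quota limits per plan. The SRS defines the tiers
// (Basic, Professional, Premium) but not the exact numbers; these values
// are placeholders for illustration.
type Plan = "basic" | "professional" | "premium";

const MONTHLY_QUOTA: Record<Plan, number> = {
  basic: 50,
  professional: 500,
  premium: 2000,
};

// Returns the quota a user's ApiQuota record would be reset to
// after a billing event for the given plan.
function quotaForPlan(plan: Plan): number {
  return MONTHLY_QUOTA[plan];
}
```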

## 2.3 USER CHARACTERISTICS
* **General Users/Content Creators**: Individuals looking to analyze the sentiment of their content or videos. No technical expertise required.
* **Developers**: Users who wish to integrate the sentiment analysis API into their own applications. Requires working knowledge of REST APIs and API keys.
* **Data Analysts**: Users processing batches of documents (PDFs/Excel) for feedback analysis or market research.
* **Researchers**: Users interested in the multimodal emotion detection capabilities.

## 2.4 CONSTRAINTS
* **Hardware**: The Live Detection feature requires a device with a functional webcam and microphone. The server requires sufficient RAM (8GB+) for running Python ML models if self-hosted.
* **Network**: High-speed internet connection is required for uploading large video files and for low-latency live streaming.
* **Browser**: Modern web browsers (Chrome, Firefox, Safari, Edge) supporting WebRTC and WebSocket.
* **Cost**: Dependence on AWS SageMaker and S3 implies operational costs that scale with usage.
* **Regulatory**: Compliance with data privacy laws (e.g., GDPR) when handling user-uploaded videos and documents.

## 2.5 ASSUMPTIONS AND DEPENDENCIES
* **AWS Account**: An active AWS account with S3 and SageMaker permissions is required for video analysis.
* **Stripe Account**: A Stripe account is required for processing payments.
* **Python Environment**: The PDF Analyzer Service assumes a Python 3.10+ environment with specific ML libraries (PyTorch, Transformers, spaCy) installed.
* **Model Availability**: It is assumed that the required Hugging Face models (`nlpaueb/legal-bert-base-uncased`, `sshleifer/distilbart-cnn-12-6`) remain publicly available.
* **Video Model**: It is assumed the video sentiment model (Modified 3D ResNet) is trained and deployed to the specified SageMaker endpoint.

---

# 3. SPECIFIC REQUIREMENTS

## 3.1 EXTERNAL INTERFACE REQUIREMENTS

### 3.1.1 User Interfaces
* **Landing Page**: Informative page with "Quick Start", "Features", and "Pricing" sections. Responsive design using Tailwind CSS.
* **Auth Screens**: Clean forms for Login and Signup with validation feedback.
* **Dashboard**:
* Sidebar navigation (Overview, Analyze Video, Live Detection, PDF Analysis, API Keys, Settings).
* Usage cards showing "Requests Used", "Plan Details".
* Recent history table.
* **Analysis Results**:
* **Video**: Charts/Graphs showing emotion probabilities over time.
* **PDF**: Interactive tables for line-by-line results, word cloud images, and summary text blocks.
* **Live**: Overlay of detected emotion on the video feed.

### 3.1.2 Hardware Interfaces
* **Input**: Webcam and Microphone for live data capture.
* **Server**: CPU/GPU resources for running the Python `pdf-analyzer-service` and Node.js server.

### 3.1.3 Software Interfaces
* **Database**: Prisma Client connecting to SQLite (Dev) or PostgreSQL (Prod).
* **OS**: Linux/Unix-based environment recommended for deployment (Docker support).
* **Libraries**:
* Frontend: React, Next.js, Framer Motion, Radix UI.
* Backend: NextAuth.js, Stripe SDK, AWS SDK v3.
* ML/Python: PyTorch, Transformers, spaCy, pdfplumber, pandas.

### 3.1.4 Communication Interfaces
* **HTTPS**: All REST API traffic must be encrypted over HTTPS (port 3000 in development, 443 in production).
* **WebSocket**: Secure WebSocket (WSS) for live analysis (Port 8080).
* **Internal HTTP**: Communication between Next.js app and Python PDF service (Port 8001).

## 3.2 FUNCTIONAL REQUIREMENTS

### 3.2.1 Authentication Module
* **REQ-AUTH-01**: System shall allow users to sign up with Email and Password.
* **REQ-AUTH-02**: System shall encrypt passwords using `bcrypt` before storage.
* **REQ-AUTH-03**: System shall generate a session token upon successful login.
* **REQ-AUTH-04**: System shall protect private routes (`/dashboard`, `/api/*`) requiring a valid session.
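A minimal sketch of the salt-then-hash pattern behind REQ-AUTH-02. Note that the SRS specifies `bcrypt`; this example substitutes Node's built-in `scrypt` purely to stay dependency-free, and the function names are illustrative, not the project's actual helpers.

```typescript
import { scryptSync, randomBytes, timingSafeEqual } from "node:crypto";

// REQ-AUTH-02 names bcrypt; scrypt is used here only to illustrate
// the same salt-then-hash contract without external dependencies.
function hashPassword(password: string): string {
  const salt = randomBytes(16).toString("hex");
  const hash = scryptSync(password, salt, 64).toString("hex");
  return `${salt}:${hash}`; // store salt alongside the digest
}

function verifyPassword(password: string, stored: string): boolean {
  const [salt, hash] = stored.split(":");
  const candidate = scryptSync(password, salt, 64).toString("hex");
  // Constant-time comparison to avoid timing side channels.
  return timingSafeEqual(
    Buffer.from(hash, "hex"),
    Buffer.from(candidate, "hex"),
  );
}
```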

### 3.2.2 Video Sentiment Analysis (Offline)
* **REQ-VID-01**: System shall provide an endpoint `/api/upload-url` to grant S3 upload permission.
* **REQ-VID-02**: System shall validate file types (e.g., `.mp4`, `.mov`) and size limits.
* **REQ-VID-03**: System shall provide an endpoint `/api/sentiment-inference` that triggers the AWS SageMaker endpoint.
* **REQ-VID-04**: System shall deduct API quota (e.g., 2 points) upon successful analysis initiation.
* **REQ-VID-05**: System shall handle SageMaker errors (timeout, unavailable) and automatically refund the user's quota.
* **REQ-VID-06**: System shall store analysis results (JSON) and update the `VideoFile` record in the database.
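The deduct-then-refund contract of REQ-VID-04 and REQ-VID-05 can be sketched as two pure functions. In the real system this would be a transactional database update via Prisma; the shapes and the 2-point cost come from the requirements above, while the function names are assumptions.

```typescript
interface ApiQuota {
  userId: string;
  remaining: number;
}

const VIDEO_ANALYSIS_COST = 2; // per REQ-VID-04

// Deducts quota before triggering SageMaker inference;
// rejects the request if the balance is insufficient (REQ-BILL-02).
function deductQuota(quota: ApiQuota, cost = VIDEO_ANALYSIS_COST): ApiQuota {
  if (quota.remaining < cost) throw new Error("Quota Exceeded");
  return { ...quota, remaining: quota.remaining - cost };
}

// Refunds the cost when SageMaker times out or is unavailable (REQ-VID-05).
function refundQuota(quota: ApiQuota, cost = VIDEO_ANALYSIS_COST): ApiQuota {
  return { ...quota, remaining: quota.remaining + cost };
}
```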

### 3.2.3 Live Analysis (Real-time)
* **REQ-LIVE-01**: System shall establish a WebSocket connection authenticated via JWT.
* **REQ-LIVE-02**: System shall accept `video_frame`, `audio_chunk`, and `text_input` messages.
* **REQ-LIVE-03**: System shall process text input using keyword matching to determine sentiment (Positive/Negative/Neutral) and Emotion.
* **REQ-LIVE-04**: System shall broadcast analysis results back to the client with low latency (< 1 sec).
* **REQ-LIVE-05**: System shall support `start_analysis` and `stop_analysis` control messages.
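REQ-LIVE-03's keyword-matching classifier can be sketched as a simple lexicon lookup. The word lists below are illustrative stand-ins, not the server's actual lexicon.

```typescript
// Minimal keyword-matching sentiment in the spirit of REQ-LIVE-03.
// The lexicons are illustrative; the real service's word lists differ.
const POSITIVE = new Set(["happy", "great", "love", "excellent", "good"]);
const NEGATIVE = new Set(["sad", "angry", "terrible", "hate", "bad"]);

type Sentiment = "positive" | "negative" | "neutral";

function keywordSentiment(text: string): Sentiment {
  let score = 0;
  // Tokenize on letters/apostrophes and tally lexicon hits.
  for (const word of text.toLowerCase().match(/[a-z']+/g) ?? []) {
    if (POSITIVE.has(word)) score++;
    if (NEGATIVE.has(word)) score--;
  }
  return score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
}
```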

### 3.2.4 Document Analysis
* **REQ-DOC-01**: System shall accept PDF and Excel file uploads via `/api/pdf-analysis`.
* **REQ-DOC-02**: System shall forward the file to the local Python service running on port 8001.
* **REQ-DOC-03**: The Python service shall extract text from PDF pages or Excel rows.
* **REQ-DOC-04**: The Python service shall perform Zero-Shot Classification for sentiment (labels: positive, negative, neutral).
* **REQ-DOC-05**: The Python service shall perform Abstractive Summarization on the text.
* **REQ-DOC-06**: The Python service shall extract Named Entities (NER) and return a frequency count.
* **REQ-DOC-07**: The Python service shall generate a Word Cloud image (Base64 encoded).
* **REQ-DOC-08**: System shall return a combined JSON response with individual line analysis and overall document statistics.
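The combined response of REQ-DOC-08 implies aggregating per-line results into document-level statistics. A sketch of that aggregation follows; the field names are assumptions, since the SRS does not fix the JSON schema.

```typescript
type Sentiment = "positive" | "negative" | "neutral";

// Hypothetical per-line result shape; the SRS leaves the schema open.
interface LineResult {
  text: string;
  sentiment: Sentiment;
  confidence: number;
}

// Rolls individual line analyses up into overall document statistics,
// as REQ-DOC-08's combined JSON response requires.
function overallStats(lines: LineResult[]) {
  const counts: Record<Sentiment, number> = { positive: 0, negative: 0, neutral: 0 };
  for (const line of lines) counts[line.sentiment]++;
  const dominant = (Object.keys(counts) as Sentiment[]).reduce((a, b) =>
    counts[b] > counts[a] ? b : a,
  );
  return { totalLines: lines.length, counts, dominant };
}
```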

### 3.2.5 Quota & Billing
* **REQ-BILL-01**: System shall maintain an `ApiQuota` record for each user.
* **REQ-BILL-02**: System shall block API requests if the user has insufficient quota.
* **REQ-BILL-03**: System shall create Stripe Checkout sessions for subscription upgrades.
* **REQ-BILL-04**: System shall listen for Stripe Webhooks (`customer.subscription.created`, `invoice.payment_succeeded`) to reset or upgrade quotas.
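The webhook handling in REQ-BILL-04 reduces to dispatching on the Stripe event type. This sketch omits signature verification (in practice done with Stripe's `webhooks.constructEvent`) and abstracts the database behind a store interface; only the two event names come from the SRS.

```typescript
// Minimal persistence abstraction standing in for the Prisma layer.
interface QuotaStore {
  set(userId: string, remaining: number): void;
}

// Dispatch on the Stripe event types listed in REQ-BILL-04.
// Returns true when the event caused a quota update.
function handleStripeEvent(
  type: string,
  userId: string,
  planQuota: number,
  store: QuotaStore,
): boolean {
  switch (type) {
    case "customer.subscription.created": // plan upgrade: grant new quota
    case "invoice.payment_succeeded": // renewal: reset the quota
      store.set(userId, planQuota);
      return true;
    default:
      return false; // other events are acknowledged but ignored
  }
}
```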

### 3.2.6 API Key Management
* **REQ-KEY-01**: System shall allow users to generate a unique API Key (starting with `sa_live_`).
* **REQ-KEY-02**: System shall allow users to revoke/delete their API Key.
* **REQ-KEY-03**: API endpoints must validate the `Authorization: Bearer <KEY>` header against the database.
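Key generation (REQ-KEY-01) and Bearer-header extraction (REQ-KEY-03) can be sketched as follows. The `sa_live_` prefix is from the SRS; the 24-byte random payload length and function names are assumptions.

```typescript
import { randomBytes } from "node:crypto";

// REQ-KEY-01: keys start with `sa_live_`; the 24-byte payload
// length is an illustrative assumption.
function generateApiKey(): string {
  return `sa_live_${randomBytes(24).toString("hex")}`;
}

// REQ-KEY-03: pull the key out of an `Authorization: Bearer <KEY>` header.
// Returns null for a missing or malformed header; the caller then
// checks the extracted key against the database.
function extractBearerKey(header: string | null): string | null {
  const match = header?.match(/^Bearer\s+(sa_live_[A-Za-z0-9]+)$/);
  return match ? match[1] : null;
}
```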

## 3.3 NON-FUNCTIONAL REQUIREMENTS

### 3.3.1 Performance Requirements
* **Latency**: Live analysis feedback must appear within 1 second of the event.
* **Throughput**: The PDF service should handle files up to 50MB within reasonable time (e.g., < 30s for 10 pages).
* **Scalability**: The video analysis must leverage AWS SageMaker's auto-scaling capabilities to handle concurrent requests.

### 3.3.2 Security Requirements
* **Data Protection**: API Keys must be stored securely (hashed/encrypted where possible, though often displayed once).
* **Transmission**: All data in transit must be encrypted via TLS 1.2+.
* **Access Control**: Resources (videos, analysis results) must be accessible only by the owner (UserId check).
* **Input Validation**: All file uploads must be validated for type and malicious content (basic extension/MIME check).

### 3.3.3 Reliability & Availability
* **Uptime**: The platform should target 99.9% availability.
* **Failover**: If the Python service is down, the main app should gracefully degrade (disable PDF features) and notify the user.
* **Error Handling**: The system must provide meaningful error messages to the user (e.g., "Quota Exceeded", "File too large") rather than generic 500 errors.

### 3.3.4 Maintainability
* **Code Style**: The codebase adheres to strict ESLint and Prettier configurations.
* **Modularity**: Separation of concerns between Next.js (Web/API), Python (ML Service), and WebSocket (Real-time).
* **Type Safety**: Full use of TypeScript for the frontend and backend to prevent runtime type errors.

---

# 4. SYSTEM MODELS AND ARCHITECTURE

## 4.1 ARCHITECTURE DIAGRAM DESCRIPTION
The system follows a **Microservices-based Hybrid Architecture**:

1. **Frontend Client**: Browser-based React application (Next.js App Router). Handles UI, WebRTC capture, and state management.
2. **API Gateway / Backend**: Next.js API Routes serve as the primary entry point. It handles Authentication, Database operations, and orchestrates calls to other services.
3. **ML Service (Text)**: A dedicated FastAPI (Python) service. It loads the BERT and BART models into memory to provide low-latency inference for documents. It exposes HTTP endpoints consumed by the Next.js backend.
4. **ML Service (Video)**: Hosted on AWS SageMaker. The Next.js backend invokes the endpoint using the AWS SDK. The model itself (3D ResNet) runs in the cloud.
5. **Real-time Service**: A Node.js WebSocket server. It maintains persistent connections for live data streaming. Currently utilizes lightweight logic but is architected to forward frames to an ML inference engine.
6. **Data Layer**:
* **Prisma + DB**: Stores relational data (Users, Sessions, Quotas).
* **AWS S3**: Stores large binary assets (User Videos).

## 4.2 DATA FLOW
1. **Video Analysis**: User -> Next.js (Get Upload URL) -> S3 (Upload Video) -> Next.js (Trigger Inference) -> AWS SageMaker (Process) -> Next.js (Receive Result) -> Database (Save).
2. **PDF Analysis**: User -> Next.js (Upload) -> Python Service (Analyze) -> Next.js (Receive Result) -> User (Display).
3. **Live Detection**: User (Webcam) -> WebSocket -> Server (Analyze) -> WebSocket -> User (Display).

---

# 5. FUTURE REQUIREMENTS

## 5.1 MOBILE APPLICATION
* Development of a React Native mobile app to allow users to record and analyze videos on the go.

## 5.2 ADVANCED ANALYTICS
* Implementation of "Aggregate Analytics" to show sentiment trends over time for a user.
* Comparative analysis between different video files.

## 5.3 GRAPHQL API
* Transition from REST API to GraphQL to allow clients to request specific data fields, reducing over-fetching.

## 5.4 MULTI-LANGUAGE SUPPORT
* Upgrade the NLP models to support multilingual sentiment analysis (e.g., Spanish, French, German).

## 5.5 WAITLIST & SCHEDULING
* Implementation of a waitlist system for high-demand processing times or enterprise consulting slots.

## 5.6 NOTIFICATION SYSTEM
* Email or SMS notifications when a long-running video analysis is complete.

---
**End of SRS**