diff --git a/SRS.md b/SRS.md
new file mode 100644
index 0000000..84d0e10
--- /dev/null
+++ b/SRS.md
@@ -0,0 +1,279 @@
+# SOFTWARE REQUIREMENTS SPECIFICATION (SRS)
+## FOR SENTIMENT ANALYSIS PLATFORM
+
+**Version:** 1.0
+**Date:** October 26, 2023
+**Prepared by:** Jules (AI Software Engineer)
+
+---
+
+# 1. INTRODUCTION
+
+## 1.1 PURPOSE
+The purpose of this Software Requirements Specification (SRS) is to document the full scope, functional and non-functional requirements, and design constraints for the **Sentiment Analysis Platform**. This platform is a multimodal AI system designed to analyze human sentiment and emotion through text, audio, and video processing. It aims to bridge the gap in traditional sentiment analysis by incorporating non-verbal cues (facial expressions, tone of voice) alongside textual content.
+
+The intended audience for this document includes the development team, project stakeholders, quality assurance testers, and future maintainers of the software.
+
+## 1.2 SCOPE
+The software is named **"Sentiment Analysis Platform"**.
+This is a SaaS (Software as a Service) application that provides:
+1. **Offline Video/Audio Analysis**: Users upload video or audio files, which are processed to detect emotions (e.g., joy, sadness, anger) and sentiment (positive, negative, neutral) using advanced AI models deployed on cloud infrastructure.
+2. **Live Emotion Detection**: A real-time feature that captures video and audio streams from the user's device (via WebRTC) and provides instantaneous feedback on the detected emotions.
+3. **Document Analysis**: Users upload PDF or Excel documents containing feedback or text. The system extracts text, performs sentiment analysis, abstractive summarization, and Named Entity Recognition (NER), and generates word clouds.
+4. **Dashboard & Analytics**: A central hub for users to view their analysis history, manage API keys, track quota usage, and manage subscriptions.
+5. **API Access**: A developer-friendly REST API to integrate the sentiment analysis capabilities into third-party applications.
+
+The system utilizes a modern tech stack including **Next.js** for the frontend/backend, **Python (FastAPI)** for the document analysis service, **AWS SageMaker** for scalable video inference, **AWS S3** for storage, and **Stripe** for payment processing.
+
+## 1.3 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
+* **SRS**: Software Requirements Specification
+* **SaaS**: Software as a Service
+* **API**: Application Programming Interface
+* **UI/UX**: User Interface / User Experience
+* **AWS**: Amazon Web Services
+* **S3**: Simple Storage Service (AWS)
+* **SageMaker**: AWS service for building, training, and deploying ML models
+* **JWT**: JSON Web Token
+* **ORM**: Object-Relational Mapping (Prisma)
+* **NER**: Named Entity Recognition
+* **NLP**: Natural Language Processing
+* **WebRTC**: Web Real-Time Communication
+* **WebSocket**: A computer communications protocol, providing full-duplex communication channels over a single TCP connection.
+* **BERT**: Bidirectional Encoder Representations from Transformers (Language Model)
+* **CNN**: Convolutional Neural Network
+* **ResNet**: Residual Neural Network
+
+## 1.4 REFERENCES
+1. Project Repository: GitHub - `UtkarsHMer05/sentiment-analysis`
+2. IEEE Std 830-1998: IEEE Recommended Practice for Software Requirements Specifications
+3. Next.js Documentation: https://nextjs.org/docs
+4. AWS SageMaker Documentation: https://docs.aws.amazon.com/sagemaker/
+
+## 1.5 OVERVIEW
+The remainder of this document is organized as follows:
+* **Section 2** describes the general factors that affect the product and its requirements, including user characteristics and constraints.
+* **Section 3** details the specific functional and non-functional requirements of the system, including interfaces, performance, and security.
+* **Section 4** outlines the system models and architecture.
+* **Section 5** discusses future requirements and roadmap items.
+
+---
+
+# 2. OVERALL DESCRIPTION
+
+## 2.1 PRODUCT PERSPECTIVE
+The **Sentiment Analysis Platform** is an independent, self-contained web application that interacts with several external services:
+* **Cloud Infrastructure**: Relies on AWS S3 for secure file storage and AWS SageMaker for hosting and invoking the heavy-weight video sentiment analysis models.
+* **Database**: Uses a relational database (SQLite for dev, PostgreSQL for prod) managed via Prisma ORM to store user data, analysis results, and quotas.
+* **Microservices**: Includes a dedicated Python-based FastAPI service (`pdf-analyzer-service`) for processing text-heavy documents (PDFs/Excel), keeping the main application lightweight.
+* **Payment Gateway**: Integrates with Stripe to handle subscriptions (Basic, Professional, Premium) and billing events via webhooks.
+* **Real-time Server**: Utilizes a WebSocket server for low-latency communication required by the Live Detection feature.
+
+## 2.2 PRODUCT FUNCTIONALITY
+The major functions of the system include:
+
+### 2.2.1 User Management & Authentication
+* User Registration and Login (Email/Password).
+* Secure Session Management using NextAuth.js.
+* Role-based access (implied by API quotas and subscription tiers).
+
+### 2.2.2 Dashboard & Quota System
+* Visual dashboard displaying recent activities and usage statistics.
+* API Key generation, revocation, and management.
+* Real-time tracking of API quota usage (requests remaining).
+* Visual indicators for subscription plans.
+
+### 2.2.3 Multimodal Video Analysis (Offline)
+* Secure generation of pre-signed upload URLs for large video files.
+* Direct upload to AWS S3.
+* Asynchronous triggering of AWS SageMaker endpoints for inference.
+* Analysis of video (facial expressions), audio (tone/prosody), and text (transcription).
+* Detailed result visualization: Emotion distribution, Sentiment score, Confidence levels.
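The result figures named in 2.2.3 (emotion distribution, sentiment score, confidence levels) could be derived from per-segment model output roughly as sketched below. This is an illustration only: the `SegmentPrediction` shape, the `summarize` helper, and the mapping of sentiment labels onto a score in [-1, 1] are assumptions, not part of the specified SageMaker response format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SegmentPrediction:
    """One prediction for a slice of the video (hypothetical shape)."""
    emotion: str       # e.g. "joy", "sadness", "anger"
    sentiment: str     # "positive", "negative", or "neutral"
    confidence: float  # model confidence in [0, 1]


# Assumed mapping of sentiment labels onto a numeric polarity.
POLARITY = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def summarize(predictions: list[SegmentPrediction]) -> dict:
    """Collapse per-segment predictions into dashboard-level figures."""
    if not predictions:
        return {
            "emotion_distribution": {},
            "sentiment_score": 0.0,
            "mean_confidence": 0.0,
        }

    total = len(predictions)

    # Share of segments assigned to each emotion label.
    counts = Counter(p.emotion for p in predictions)
    distribution = {emotion: n / total for emotion, n in counts.items()}

    # Average polarity across segments, in [-1, 1].
    score = sum(POLARITY[p.sentiment] for p in predictions) / total

    # Plain mean of the per-segment confidences.
    mean_confidence = sum(p.confidence for p in predictions) / total

    return {
        "emotion_distribution": distribution,
        "sentiment_score": score,
        "mean_confidence": mean_confidence,
    }
```

Weighting each segment equally keeps the sketch simple; a real implementation might instead weight by segment duration or by confidence.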
+
+### 2.2.4 Live Emotion Detection
+* Access to the user's webcam and microphone via WebRTC.
+* Real-time streaming of video frames and audio chunks to the WebSocket server.
+* Instantaneous feedback on detected emotions (Joy, Sadness, Anger, Fear, Surprise, Disgust, Neutral).
+* Live transcription and keyword-based sentiment analysis.
+
+### 2.2.5 Document Sentiment Analysis
+* Upload support for PDF and Excel (`.xlsx`, `.xls`) files.
+* Text extraction from documents (line-by-line or row-by-row).
+* **Sentiment Analysis**: Classification of text into Positive, Negative, or Neutral using `legal-bert-base-uncased`.
+* **Summarization**: Generation of concise summaries using `distilbart-cnn-12-6`.
+* **NER**: Identification of key entities (Persons, Orgs, Locations) using `spaCy`.
+* **Visualization**: Generation of Word Clouds representing the most frequent terms.
+* Support for both "Individual Line" analysis and "Combined" document analysis.
+
+### 2.2.6 Billing & Subscriptions
+* Integration with Stripe Checkout for plan upgrades.
+* Handling of Stripe Webhooks to automatically update user quotas and subscription status.
+* Tiered access (Basic, Professional, Premium) with different quota limits.
+
+## 2.3 USER CHARACTERISTICS
+* **General Users/Content Creators**: Individuals looking to analyze the sentiment of their content or videos. No technical expertise required.
+* **Developers**: Users who wish to integrate the sentiment analysis API into their own applications. Requires knowledge of REST APIs and API keys.
+* **Data Analysts**: Users processing batches of documents (PDFs/Excel) for feedback analysis or market research.
+* **Researchers**: Users interested in the multimodal emotion detection capabilities.
+
+## 2.4 CONSTRAINTS
+* **Hardware**: The Live Detection feature requires a device with a functional webcam and microphone. The server requires sufficient RAM (8GB+) for running Python ML models if self-hosted.
+* **Network**: A high-speed internet connection is required for uploading large video files and for low-latency live streaming.
+* **Browser**: Modern web browsers (Chrome, Firefox, Safari, Edge) supporting WebRTC and WebSocket.
+* **Cost**: Dependence on AWS SageMaker and S3 implies operational costs that scale with usage.
+* **Regulatory**: Compliance with data privacy laws (e.g., GDPR) when handling user-uploaded videos and documents.
+
+## 2.5 ASSUMPTIONS AND DEPENDENCIES
+* **AWS Account**: An active AWS account with S3 and SageMaker permissions is required for video analysis.
+* **Stripe Account**: A Stripe account is required for processing payments.
+* **Python Environment**: The PDF Analyzer Service assumes a Python 3.10+ environment with specific ML libraries (PyTorch, Transformers, spaCy) installed.
+* **Model Availability**: It is assumed that the required Hugging Face models (`nlpaueb/legal-bert-base-uncased`, `sshleifer/distilbart-cnn-12-6`) remain publicly available.
+* **Video Model**: It is assumed the video sentiment model (Modified 3D ResNet) is trained and deployed to the specified SageMaker endpoint.
+
+---
+
+# 3. SPECIFIC REQUIREMENTS
+
+## 3.1 EXTERNAL INTERFACE REQUIREMENTS
+
+### 3.1.1 User Interfaces
+* **Landing Page**: Informative page with "Quick Start", "Features", and "Pricing" sections. Responsive design using Tailwind CSS.
+* **Auth Screens**: Clean forms for Login and Signup with validation feedback.
+* **Dashboard**:
+    * Sidebar navigation (Overview, Analyze Video, Live Detection, PDF Analysis, API Keys, Settings).
+    * Usage cards showing "Requests Used", "Plan Details".
+    * Recent history table.
+* **Analysis Results**:
+    * **Video**: Charts/Graphs showing emotion probabilities over time.
+    * **PDF**: Interactive tables for line-by-line results, word cloud images, and summary text blocks.
+    * **Live**: Overlay of detected emotion on the video feed.
+
+### 3.1.2 Hardware Interfaces
+* **Input**: Webcam and Microphone for live data capture.
+* **Server**: CPU/GPU resources for running the Python `pdf-analyzer-service` and Node.js server.
+
+### 3.1.3 Software Interfaces
+* **Database**: Prisma Client connecting to SQLite (Dev) or PostgreSQL (Prod).
+* **OS**: Linux/Unix-based environment recommended for deployment (Docker support).
+* **Libraries**:
+    * Frontend: React, Next.js, Framer Motion, Radix UI.
+    * Backend: NextAuth.js, Stripe SDK, AWS SDK v3.
+    * ML/Python: PyTorch, Transformers, spaCy, pdfplumber, pandas.
+
+### 3.1.4 Communication Interfaces
+* **HTTPS**: All REST API traffic must be encrypted over HTTPS (Port 3000/443).
+* **WebSocket**: Secure WebSocket (WSS) for live analysis (Port 8080).
+* **Internal HTTP**: Communication between Next.js app and Python PDF service (Port 8001).
+
+## 3.2 FUNCTIONAL REQUIREMENTS
+
+### 3.2.1 Authentication Module
+* **REQ-AUTH-01**: System shall allow users to sign up with Email and Password.
+* **REQ-AUTH-02**: System shall hash passwords using `bcrypt` before storage.
+* **REQ-AUTH-03**: System shall generate a session token upon successful login.
+* **REQ-AUTH-04**: System shall protect private routes (`/dashboard`, `/api/*`) by requiring a valid session.
+
+### 3.2.2 Video Sentiment Analysis (Offline)
+* **REQ-VID-01**: System shall provide an endpoint `/api/upload-url` to grant S3 upload permission.
+* **REQ-VID-02**: System shall validate file types (e.g., `.mp4`, `.mov`) and size limits.
+* **REQ-VID-03**: System shall provide an endpoint `/api/sentiment-inference` that triggers the AWS SageMaker endpoint.
+* **REQ-VID-04**: System shall deduct API quota (e.g., 2 points) upon successful analysis initiation.
+* **REQ-VID-05**: System shall handle SageMaker errors (timeout, unavailable) and automatically refund the user's quota.
+* **REQ-VID-06**: System shall store analysis results (JSON) and update the `VideoFile` record in the database.
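The interaction between REQ-VID-04 (deduct quota on initiation) and REQ-VID-05 (refund on SageMaker failure) can be sketched as a deduct-then-compensate pattern. The sketch below is in Python for brevity and uses an in-memory account; `QuotaAccount`, `InsufficientQuota`, `run_video_analysis`, and the `invoke_endpoint` callable are hypothetical names, and the cost of 2 points is the example value from REQ-VID-04.

```python
from typing import Any, Callable

# Example cost taken from REQ-VID-04 ("e.g., 2 points").
VIDEO_ANALYSIS_COST = 2


class InsufficientQuota(Exception):
    """Raised when the account cannot cover the requested points."""


class QuotaAccount:
    """In-memory stand-in for a per-user quota record (hypothetical)."""

    def __init__(self, remaining: int) -> None:
        self.remaining = remaining

    def deduct(self, points: int) -> None:
        if points > self.remaining:
            raise InsufficientQuota(
                f"need {points} points, only {self.remaining} left"
            )
        self.remaining -= points

    def refund(self, points: int) -> None:
        self.remaining += points


def run_video_analysis(
    account: QuotaAccount,
    invoke_endpoint: Callable[[str], Any],
    video_key: str,
) -> Any:
    """Charge the account, call the inference endpoint, refund on failure."""
    # Deduct first so a request without enough quota never reaches the endpoint.
    account.deduct(VIDEO_ANALYSIS_COST)
    try:
        return invoke_endpoint(video_key)
    except Exception:
        # Any endpoint failure (timeout, unavailable, ...) returns the points.
        account.refund(VIDEO_ANALYSIS_COST)
        raise
```

In a database-backed implementation the deduction and the refund would each need to be atomic updates so that concurrent requests cannot overspend the same quota.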
+
+### 3.2.3 Live Analysis (Real-time)
+* **REQ-LIVE-01**: System shall establish a WebSocket connection authenticated via JWT.
+* **REQ-LIVE-02**: System shall accept `video_frame`, `audio_chunk`, and `text_input` messages.
+* **REQ-LIVE-03**: System shall process text input using keyword matching to determine sentiment (Positive/Negative/Neutral) and Emotion.
+* **REQ-LIVE-04**: System shall broadcast analysis results back to the client with low latency (< 1 sec).
+* **REQ-LIVE-05**: System shall support `start_analysis` and `stop_analysis` control messages.
+
+### 3.2.4 Document Analysis
+* **REQ-DOC-01**: System shall accept PDF and Excel file uploads via `/api/pdf-analysis`.
+* **REQ-DOC-02**: System shall forward the file to the local Python service running on port 8001.
+* **REQ-DOC-03**: The Python service shall extract text from PDF pages or Excel rows.
+* **REQ-DOC-04**: The Python service shall perform Zero-Shot Classification for sentiment (labels: positive, negative, neutral).
+* **REQ-DOC-05**: The Python service shall perform Abstractive Summarization on the text.
+* **REQ-DOC-06**: The Python service shall extract Named Entities (NER) and return a frequency count.
+* **REQ-DOC-07**: The Python service shall generate a Word Cloud image (Base64 encoded).
+* **REQ-DOC-08**: System shall return a combined JSON response with individual line analysis and overall document statistics.
+
+### 3.2.5 Quota & Billing
+* **REQ-BILL-01**: System shall maintain an `ApiQuota` record for each user.
+* **REQ-BILL-02**: System shall block API requests if the user has insufficient quota.
+* **REQ-BILL-03**: System shall create Stripe Checkout sessions for subscription upgrades.
+* **REQ-BILL-04**: System shall listen for Stripe Webhooks (`customer.subscription.created`, `invoice.payment_succeeded`) to reset or upgrade quotas.
+
+### 3.2.6 API Key Management
+* **REQ-KEY-01**: System shall allow users to generate a unique API Key (starting with `sa_live_`).
+* **REQ-KEY-02**: System shall allow users to revoke/delete their API Key.
+* **REQ-KEY-03**: API endpoints must validate the `Authorization: Bearer ` header against the database.
+
+## 3.3 NON-FUNCTIONAL REQUIREMENTS
+
+### 3.3.1 Performance Requirements
+* **Latency**: Live analysis feedback must appear within 1 second of the event.
+* **Throughput**: The PDF service should handle files up to 50MB within a reasonable time (e.g., < 30s for 10 pages).
+* **Scalability**: The video analysis must leverage AWS SageMaker's auto-scaling capabilities to handle concurrent requests.
+
+### 3.3.2 Security Requirements
+* **Data Protection**: API Keys must be stored securely (hashed or encrypted where possible; the plaintext key is typically displayed only once).
+* **Transmission**: All data in transit must be encrypted via TLS 1.2+.
+* **Access Control**: Resources (videos, analysis results) must be accessible only by the owner (UserId check).
+* **Input Validation**: All file uploads must be validated for type and malicious content (basic extension/MIME check).
+
+### 3.3.3 Reliability & Availability
+* **Uptime**: The platform should target 99.9% availability.
+* **Failover**: If the Python service is down, the main app should gracefully degrade (disable PDF features) and notify the user.
+* **Error Handling**: The system must provide meaningful error messages to the user (e.g., "Quota Exceeded", "File too large") rather than generic 500 errors.
+
+### 3.3.4 Maintainability
+* **Code Style**: The codebase adheres to strict ESLint and Prettier configurations.
+* **Modularity**: Separation of concerns between Next.js (Web/API), Python (ML Service), and WebSocket (Real-time).
+* **Type Safety**: Full use of TypeScript for the frontend and backend to prevent runtime type errors.
+
+---
+
+# 4. SYSTEM MODELS AND ARCHITECTURE
+
+## 4.1 ARCHITECTURE DIAGRAM DESCRIPTION
+The system follows a **Microservices-based Hybrid Architecture**:
+
+1. **Frontend Client**: Browser-based React application (Next.js App Router). Handles UI, WebRTC capture, and state management.
+2. **API Gateway / Backend**: Next.js API Routes serve as the primary entry point. They handle Authentication and Database operations, and orchestrate calls to other services.
+3. **ML Service (Text)**: A dedicated FastAPI (Python) service. It loads the BERT and BART models into memory to provide low-latency inference for documents. It exposes HTTP endpoints consumed by the Next.js backend.
+4. **ML Service (Video)**: Hosted on AWS SageMaker. The Next.js backend invokes the endpoint using the AWS SDK. The model itself (3D ResNet) runs in the cloud.
+5. **Real-time Service**: A Node.js WebSocket server. It maintains persistent connections for live data streaming. It currently uses lightweight logic but is architected to forward frames to an ML inference engine.
+6. **Data Layer**:
+    * **Prisma + DB**: Stores relational data (Users, Sessions, Quotas).
+    * **AWS S3**: Stores large binary assets (User Videos).
+
+## 4.2 DATA FLOW
+1. **Video Analysis**: User -> Next.js (Get Upload URL) -> S3 (Upload Video) -> Next.js (Trigger Inference) -> AWS SageMaker (Process) -> Next.js (Receive Result) -> Database (Save).
+2. **PDF Analysis**: User -> Next.js (Upload) -> Python Service (Analyze) -> Next.js (Receive Result) -> User (Display).
+3. **Live Detection**: User (Webcam) -> WebSocket -> Server (Analyze) -> WebSocket -> User (Display).
+
+---
+
+# 5. FUTURE REQUIREMENTS
+
+## 5.1 MOBILE APPLICATION
+* Development of a React Native mobile app to allow users to record and analyze videos on the go.
+
+## 5.2 ADVANCED ANALYTICS
+* Implementation of "Aggregate Analytics" to show sentiment trends over time for a user.
+* Comparative analysis between different video files.
+
+## 5.3 GRAPHQL API
+* Transition from REST API to GraphQL to allow clients to request specific data fields, reducing over-fetching.
+
+## 5.4 MULTI-LANGUAGE SUPPORT
+* Upgrade the NLP models to support multilingual sentiment analysis (e.g., Spanish, French, German).
+
+## 5.5 WAITLIST & SCHEDULING
+* Implementation of a waitlist system for high-demand processing times or enterprise consulting slots.
+
+## 5.6 NOTIFICATION SYSTEM
+* Email or SMS notifications when a long-running video analysis is complete.
+
+---
+**End of SRS**