Twitter Sentiment Analysis Tool

A comprehensive tool for scraping tweets from Nitter.net and analyzing their sentiment using multiple machine learning models.

Features

Tweet Scraping - Scrape tweets from Nitter.net using Selenium without requiring a Twitter API key
Multiple Sentiment Models - Compare results from three different sentiment analysis approaches:
- Pre-trained BERT model (state-of-the-art transformer)
- Fine-tuned BERT model (customized for Twitter sentiment)
- Logistic Regression model (trained on a dataset of 1.6M tweets)
Comprehensive Visualizations - Generate detailed visual analytics including sentiment distributions, word clouds, and comparison charts

Requirements

Python 3.6+
Chrome/Chromium browser
Required Python packages (see requirements.txt)

Installation

Clone the repository:

git clone https://github.com/ShauryaDusht/selenium-twitter-sentiment-analysis.git
cd selenium-twitter-sentiment-analysis

Install the required dependencies:
```
pip install -r requirements.txt
```
Set up Selenium WebDriver:
- Download Chrome WebDriver that matches your Chrome browser version
- Extract the WebDriver executable and add it to your PATH, or place it in your project directory
- For detailed instructions, see the Selenium WebDriver installation guide
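
Before running the scraper, it can help to confirm that the driver is actually discoverable in one of the two locations mentioned above. The following is a minimal sketch using only the standard library. `find_chromedriver` and its parameters are hypothetical names for illustration and are not part of this repository.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_chromedriver(project_dir: str = ".",
                      search_path: Optional[str] = None) -> Optional[str]:
    """Return the path to a chromedriver executable, or None if not found.

    Hypothetical helper: checks PATH first (or `search_path`, if given),
    then falls back to a file with that name inside `project_dir`.
    """
    for name in ("chromedriver", "chromedriver.exe"):
        # shutil.which searches the given path string, or PATH when it is None
        found = shutil.which(name, path=search_path)
        if found:
            return found
        candidate = Path(project_dir) / name
        if candidate.is_file():
            return str(candidate)
    return None


if __name__ == "__main__":
    location = find_chromedriver()
    print(location or "chromedriver not found on PATH or in the current directory")
```

If the function returns `None`, the driver is in neither place and Selenium is unlikely to be able to start Chrome from this project.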

Usage

Run the main script to start the complete pipeline:

python main.py

The script will:

Download the BERT model if needed
Prompt for a keyword to search
Ask for the number of tweets to scrape
Scrape tweets from Nitter.net
Analyze the tweets with all three models
Generate visualizations and save results
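
As an illustration of the two prompts in the steps above, the sketch below validates the tweet count and builds a search URL from the keyword. The helper names (`parse_tweet_count`, `build_search_url`) are hypothetical, and the `/search?f=tweets&q=...` URL layout is an assumption about the Nitter front-end rather than something taken from scrapper.py.

```python
from urllib.parse import urlencode

# Assumed instance; matches the site named in this README.
NITTER_BASE = "https://nitter.net"


def parse_tweet_count(raw: str) -> int:
    """Convert user input to a positive integer, raising ValueError otherwise."""
    count = int(raw.strip())  # int() raises ValueError for non-numeric input
    if count <= 0:
        raise ValueError("number of tweets must be positive")
    return count


def build_search_url(keyword: str) -> str:
    """Build a tweet-search URL for a keyword (query layout is an assumption)."""
    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword must not be empty")
    query = urlencode({"f": "tweets", "q": keyword})  # URL-encodes spaces etc.
    return f"{NITTER_BASE}/search?{query}"


if __name__ == "__main__":
    print(build_search_url("climate change"))
    print(parse_tweet_count("50"))
```

Rejecting bad input before the browser starts avoids launching Selenium for a request that cannot succeed.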

Individual Components

You can also run components separately:

For scraping only:

python scrapper.py

For analysis only (after scraping):

python analyser.py

Project Structure

main.py - Main script that runs the entire pipeline
download_bert.py - Downloads and extracts the fine-tuned BERT model, which is hosted on Drive
scrapper.py - Scrapes tweets from Nitter.net using Selenium
analyser.py - Performs sentiment analysis and generates visualizations

Output

The tool generates a directory named after your search query containing:

CSV files with sentiment scores for each model
Visualization images showing sentiment distributions, word clouds, and model comparisons
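
A rough sketch of how such an output directory could be produced is shown below. It is not the code from analyser.py: the function names, the `<model>_results.csv` file naming, and the column headers are all assumptions made for illustration.

```python
import csv
import re
from pathlib import Path
from typing import Dict, List, Tuple

# One row per tweet: (tweet text, predicted label, score). Assumed layout.
Row = Tuple[str, str, float]


def query_to_dirname(query: str) -> str:
    """Turn a search query into a filesystem-friendly directory name."""
    cleaned = re.sub(r"[^\w\-]+", "_", query.strip())
    return cleaned.strip("_") or "results"


def save_model_results(base_dir: str, query: str,
                       results: Dict[str, List[Row]]) -> Path:
    """Write one CSV per model into a directory named after the query."""
    out_dir = Path(base_dir) / query_to_dirname(query)
    out_dir.mkdir(parents=True, exist_ok=True)
    for model_name, rows in results.items():
        csv_path = out_dir / f"{model_name}_results.csv"  # hypothetical naming
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(["tweet", "sentiment", "score"])
            writer.writerows(rows)
    return out_dir
```

Sanitizing the query first keeps characters such as spaces, `#`, or `/` out of the directory name.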

Repository Layout

selenium-twitter-sentiment-analysis/
├── finetuned_bert_sentiment_model/
│   ├── bert_test.py
│   ├── config.json
│   ├── model.safetensors
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   ├── tokenizer.json
│   └── vocab.txt
├── lr_sentiment_model/
│   ├── lr_test.py
│   ├── sentiment_lr_model.pkl
│   └── sentiment_tfidf_vectorizer.pkl
├── Notebooks/
│   └── sentiment_analysis_model.ipynb
├── analyser.py
├── download_bert.py
├── finetuned_bert_sentiment_model.zip
├── LICENSE
├── main.py
├── README.md
├── requirements.txt
└── scrapper.py

The repository contains:

Main Python scripts for scraping and analysis
Pre-trained model files in their respective directories
Jupyter notebooks used for model development
Configuration and documentation files

Notes

The first run will download the fine-tuned BERT model (approx. 500MB)
No Twitter API key is required
If you prefer using the Twitter API with tweepy instead of web scraping, see this repository

Model Accuracies

The sentiment analysis is performed using three different models, each with its own performance characteristics:

Logistic Regression Model: 76% accuracy on the training dataset
- Traditional machine learning approach trained on a dataset of 1.6M tweets
- Fastest inference of the three models
Fine-tuned BERT Model: 84% accuracy on the training dataset
- Base BERT model fine-tuned specifically for Twitter sentiment
- Good balance between inference speed and accuracy
Pre-trained BERT Model: 85% accuracy
- Hugging Face's implementation of BERT for sequence classification
- Highest accuracy of the three, but slower inference

These models provide a comprehensive comparison between traditional machine learning approaches and modern transformer-based methods.
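
One simple way to compare the three models on the same tweets is to measure how often they agree and to take a majority vote per tweet. The sketch below is only an illustration of that idea under assumed inputs (one list of string labels per model, all the same length). The function names and model keys are hypothetical and do not reflect how analyser.py builds its comparison charts.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def pairwise_agreement(predictions: Dict[str, List[str]]) -> Dict[Tuple[str, str], float]:
    """Fraction of tweets on which each pair of models gives the same label."""
    rates = {}
    for a, b in combinations(predictions, 2):
        labels_a, labels_b = predictions[a], predictions[b]
        if not labels_a:
            rates[(a, b)] = 0.0  # no tweets, so no measurable agreement
            continue
        matches = sum(x == y for x, y in zip(labels_a, labels_b))
        rates[(a, b)] = matches / len(labels_a)
    return rates


def majority_vote(predictions: Dict[str, List[str]]) -> List[str]:
    """Most common label per tweet; ties go to the first model's label seen."""
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*predictions.values())]


if __name__ == "__main__":
    preds = {
        "logistic_regression": ["positive", "negative", "negative", "positive"],
        "finetuned_bert": ["positive", "negative", "positive", "positive"],
        "pretrained_bert": ["positive", "positive", "positive", "positive"],
    }
    print(pairwise_agreement(preds))
    print(majority_vote(preds))
```

Low agreement between a pair of models points to tweets worth inspecting by hand, since accuracy figures alone do not show where the models differ.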

Screenshots

Model Visualizations

Fine-tuned BERT Model

Logistic Regression Model

Pre-trained BERT Model

Model Comparison

Scraped Tweets CSV File

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Nitter for providing a Twitter front-end
Hugging Face Transformers for BERT implementations
