This project provides a complete workflow for fine-tuning a BERT model for sentiment analysis on the IMDb dataset. It is optimized for use with NVIDIA GPUs and includes a Streamlit web application for real-time inference.
- Efficient Fine-Tuning: Uses Hugging Face Transformers and PyTorch to train quickly on CUDA-enabled GPUs.
- Pre-configured Scripts: Includes scripts for setup, training, and running the application.
- Interactive Web App: A Streamlit app to test the fine-tuned model with your own text.
The project requires:
- Python 3.9+
- PyTorch
- Hugging Face Transformers
- Streamlit
- Scikit-learn
- Accelerate
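These dependencies map onto a requirements.txt roughly like the one below. The exact entries and version pins in the repository may differ, and `datasets` is an assumption on our part (the training step downloads IMDb, which is typically done through that library):

```text
torch
transformers
streamlit
scikit-learn
accelerate
datasets  # assumed: used to download the IMDb dataset
```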
Before installing, make sure you have:
- NVIDIA GPU: A CUDA-compatible GPU is required for training.
- NVIDIA Drivers: Ensure you have the latest NVIDIA drivers installed.
- Python 3.9+: Make sure you have a compatible Python version installed.
Clone the repository:

```bash
git clone https://github.com/your-username/gpu-powered-bert-finetuning.git
cd gpu-powered-bert-finetuning
```
Run the setup script:

```bash
./setup.sh
```

This will create a virtual environment, install the required dependencies, and check for GPU availability.
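That GPU check is presumably similar in spirit to the sketch below; the actual contents of src/gpu_check.py may differ:

```python
# gpu_check_sketch.py -- minimal CUDA availability probe (illustrative,
# not the repository's exact src/gpu_check.py).
import torch

if torch.cuda.is_available():
    print(f"CUDA available: {torch.cuda.get_device_name(0)}")
    print(f"Device count: {torch.cuda.device_count()}")
else:
    raise SystemExit("No CUDA-capable GPU detected; training requires one.")
```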
To fine-tune the BERT model on the IMDb dataset, run the setup script with the `--train` flag:

```bash
./setup.sh --train
```

The script will download the dataset, tokenize it, and train the model. The fine-tuned model will be saved in the `model/fine_tuned_bert` directory.
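For reference, the fine-tuning step follows the standard Transformers recipe. The sketch below is illustrative rather than the exact contents of src/train_model.py; the hyperparameters, sequence length, and preprocessing are assumptions:

```python
# train_sketch.py -- minimal sketch of the fine-tuning recipe; the real
# src/train_model.py may differ in hyperparameters and preprocessing.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Truncate/pad reviews to a fixed length (256 tokens is an assumption).
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # positive / negative
)

args = TrainingArguments(
    output_dir="finetuned_results",     # matches the checkpoints directory
    num_train_epochs=2,
    per_device_train_batch_size=16,
    fp16=True,                          # mixed precision; requires a CUDA GPU
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"])
trainer.train()

# Save both model and tokenizer so the app can load them from one path.
trainer.save_model("model/fine_tuned_bert")
tokenizer.save_pretrained("model/fine_tuned_bert")
```

Note that `fp16=True` is what makes a CUDA GPU worthwhile here; on a CPU-only machine the flag would have to be dropped.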
Once the model is trained, you can run the Streamlit web application to perform sentiment analysis on your own text:

```bash
streamlit run app/app.py
```

The app will be available at http://localhost:8501.
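The app is likely a thin wrapper around a Transformers pipeline. The following is a minimal sketch of what app/app.py might look like, not its actual contents:

```python
# app_sketch.py -- minimal sketch of the Streamlit app; the real app/app.py
# may add styling, validation, and error handling.
import streamlit as st
from transformers import pipeline

@st.cache_resource  # load the model once per server process
def load_classifier():
    return pipeline("sentiment-analysis", model="model/fine_tuned_bert")

st.title("IMDb Sentiment Analysis")
text = st.text_area("Enter a movie review:")

if st.button("Analyze") and text:
    result = load_classifier()(text)[0]
    st.write(f"Label: {result['label']} (score: {result['score']:.3f})")
```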
```text
.
├── app
│   └── app.py              # Streamlit application
├── finetuned_results       # Training checkpoints
├── model
│   └── fine_tuned_bert     # Saved fine-tuned model
├── src
│   ├── gpu_check.py        # GPU availability check
│   └── train_model.py      # Model training script
├── requirements.txt        # Python dependencies
└── setup.sh                # Setup script
```
The following table shows the performance of the BERT model before and after fine-tuning on the IMDb dataset:
| Stage | Description | Model Used | Accuracy |
|---|---|---|---|
| Baseline (Before Fine-Tuning) | Evaluated the pretrained `bert-base-uncased` model directly on the dataset (no fine-tuning) | BERT (pretrained) | 52.4% |
| Fine-Tuning (Raw Data) | Fine-tuned BERT on the dataset without additional preprocessing | BERT (fine-tuned) | 89.4% |
These results show a 37-point absolute accuracy gain from fine-tuning the model on the target dataset.
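The accuracy figures above are the kind of metric scikit-learn (already in the dependency list) computes. A sketch of a `compute_metrics` hook that could be passed to the Trainer, assuming that is how the project evaluates:

```python
# metrics_sketch.py -- illustrative accuracy computation; the project's
# actual evaluation code may differ.
import numpy as np
from sklearn.metrics import accuracy_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # pick the highest-scoring class
    return {"accuracy": accuracy_score(labels, preds)}

# Passed to the Trainer as: Trainer(..., compute_metrics=compute_metrics)
```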
Contributions are welcome! Please feel free to submit a pull request or open an issue if you have any suggestions or find any bugs.
This project is licensed under the MIT License. See the LICENSE file for more details.