
🗑️ Garbage Classifier


A deep learning-based waste classification system using ResNet18 and PyTorch Lightning

Features • Installation • Usage • Documentation • Results


📋 Overview

This project implements an image classification system to automatically categorize waste materials into six classes: cardboard, glass, metal, paper, plastic, and trash. Built with PyTorch Lightning and ResNet18, it provides a modular, scalable solution for waste management automation.

🎯 Key Features

  • ✨ ResNet18 backbone fine-tuned for garbage classification
  • ⚡ PyTorch Lightning framework for clean, scalable training
  • 📊 Comprehensive EDA with visualization notebooks
  • 🔄 Batch and single-image prediction support
  • 📈 Automatic metrics tracking with loss curves and performance reports
  • 📚 Auto-generated documentation using pdoc3
  • 🎓 LaTeX report generation for academic documentation

🚀 Quick Start

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Activate uv
source $HOME/.local/bin/env

Interactive demo

# Create Virtual Environment
uv venv garbage-env --python 3.12

# Install PyPI package
uv pip install -U -i https://pypi.org/simple/ garbage-classifier

# Launch demo
garbage-app

Full project

# Clone the repository
git clone https://github.com/NeoLafuente/garbage_classifier.git
cd garbage_classifier

# Sync dependencies
uv sync

# Train the model
uv run source/train.py

# Make a prediction
uv run source/predict.py path/to/image.jpg

📦 Dataset

The project uses the Garbage Classification Dataset from Kaggle.

Dataset Preparation

The notebook notebooks/create_sample_dataset.ipynb automatically prepares the dataset:

  • Downloads the Garbage Classification Dataset
  • Creates the sample_dataset folder inside data/raw
  • Reduces dataset size for lightweight experimentation

Dataset Structure:

  • 6 classes: cardboard, glass, metal, paper, plastic, trash
  • Location: data/raw/Garbage_Dataset_Classification/
  • Metadata: Class distributions and image statistics in metadata.csv
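The one-subfolder-per-class layout lends itself to a quick sanity check of the class distribution. Below is a minimal sketch, not the project's actual code; it assumes each class has its own directory under an `images/` folder, and builds a tiny throwaway tree so the example runs anywhere.

```python
# Sketch: count image files per class directory (assumed layout, not project code).
import tempfile
from pathlib import Path

def count_images_per_class(images_dir: Path) -> dict[str, int]:
    """Return {class_name: number_of_files} for each class subfolder."""
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(images_dir.iterdir())
        if class_dir.is_dir()
    }

# Demo on a temporary tree mimicking the dataset layout
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    for cls, n in [("cardboard", 3), ("glass", 2), ("trash", 1)]:
        d = images / cls
        d.mkdir(parents=True)
        for i in range(n):
            (d / f"{cls}_{i}.jpg").touch()
    print(count_images_per_class(images))  # {'cardboard': 3, 'glass': 2, 'trash': 1}
```

Pointing the function at `data/raw/Garbage_Dataset_Classification/images` should reproduce the counts summarized in metadata.csv.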

💻 Usage

Training

Train the GarbageClassifier model (ResNet18 with PyTorch Lightning):

uv run source/train.py

Output:

  • Model checkpoint: models/weights/model_resnet18_garbage.ckpt
  • Loss curves: models/performance/loss_curves/
  • Training logs with metrics (accuracy, precision, recall, F1-score)

Configuration: Edit source/utils/config.py to customize:

  • Batch size
  • Learning rate
  • Number of epochs
  • Train/validation split ratio

Prediction

Load the trained model and classify new images.

📸 Single Image Prediction

# Predict a specific image
uv run source/predict.py img.jpg

# Use default image from config
uv run source/predict.py

πŸ“ Batch Folder Prediction

Process all images in a folder:

uv run source/predict.py data/test_images/
uv run source/predict.py ../new_samples/

Features:

  • Auto-detects valid image files (.jpg, .jpeg, .png, .bmp, .gif, .tiff, .tif)
  • Progress indicators for batch processing
  • Summary table with all predictions
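The auto-detection step above can be sketched with a small extension filter. This mirrors the listed behavior; the exact implementation in source/predict.py may differ.

```python
# Sketch of the image-file filter described above (an assumption, not project code).
import tempfile
from pathlib import Path

VALID_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff", ".tif"}

def find_images(folder: Path) -> list[Path]:
    """Return image files in `folder`, matching extensions case-insensitively."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in VALID_EXTENSIONS
    )

# Demo on a throwaway folder
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["photo.JPG", "notes.txt", "scan.tiff"]:
        (folder / name).touch()
    print([p.name for p in find_images(folder)])  # ['photo.JPG', 'scan.tiff']
```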

Example Output:

Predicting: cardboard_sample.jpg
Class: cardboard | Confidence: 98.5%

πŸ—οΈ Model Architecture

GarbageClassifier

  • Backbone: ResNet18 (pretrained on ImageNet)
  • Framework: PyTorch Lightning
  • Input: 224x224 RGB images
  • Output: 6-class probability distribution
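The 6-class probability distribution in the last bullet comes from applying a softmax to the network's raw scores (logits). A minimal pure-Python illustration, independent of the actual model code, with made-up logit values:

```python
# Softmax: turn six raw scores into a probability distribution over the classes.
import math

CLASSES = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                           # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.1, -1.3, 0.8, 0.5, -0.7]    # hypothetical network output
probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]
print(predicted, f"{max(probs):.1%}")
```

The predicted class is simply the index of the largest probability, which is what the prediction script reports alongside its confidence value.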

Custom Components

  1. GarbageDataModule: PyTorch Lightning DataModule for efficient data loading
  2. LossCurveCallback: Custom callback for tracking and saving loss curves
  3. GarbageClassifier: Main model class with training/validation logic
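The callback idea behind LossCurveCallback can be shown with a framework-free sketch: an object whose hook the training loop calls at the end of each epoch to record the loss, which the real Lightning callback would then plot and save. The names and hook signature here are simplified assumptions, not the project's API.

```python
# Minimal stand-in for a loss-tracking callback (hypothetical names and hooks).
class LossRecorder:
    """Collects the loss reported at the end of every epoch."""
    def __init__(self) -> None:
        self.losses: list[float] = []

    def on_epoch_end(self, epoch: int, loss: float) -> None:
        self.losses.append(loss)

def train(num_epochs: int, callback: LossRecorder) -> None:
    """Toy loop: invokes the callback hook once per epoch with a fake loss."""
    for epoch in range(num_epochs):
        loss = 1.0 / (epoch + 1)              # fake, steadily decreasing loss
        callback.on_epoch_end(epoch, loss)

cb = LossRecorder()
train(3, cb)
print(cb.losses)  # [1.0, 0.5, 0.3333333333333333]
```

In the real project, PyTorch Lightning drives the hooks and the recorded values end up as the plots in models/performance/loss_curves/.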

📚 Documentation

Auto-generated Documentation

HTML documentation is auto-generated from source code docstrings using pdoc3.

View Documentation:

  • Open docs/index.html in your browser

Regenerate Documentation:

uv run scripts/generate_docs.py

📓 Notebooks

Interactive Jupyter notebooks for exploration and analysis:

  • create_sample_dataset.ipynb: Dataset preparation and sampling
  • dataset_exploration.ipynb: EDA with class distribution and visualizations
  • performance_analysis.ipynb: Model evaluation, confusion matrices, and error analysis

📄 Reports

LaTeX Report

Academic-style report with methodology and results:

  • Source: reports/main.tex
  • Compiled PDF: reports/compiled/
  • Figures:
    • reports/figures/EDA/ - Exploratory data analysis
    • reports/figures/performance/ - Model metrics and evaluation

βš™οΈ Configuration

Central configuration in source/utils/config.py:

# Dataset configuration
CLASSES = ['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']
DATA_PATH = 'data/raw/Garbage_Dataset_Classification'

# Model hyperparameters
BATCH_SIZE = 32
LEARNING_RATE = 0.001
NUM_EPOCHS = 50

# Data split
TRAIN_RATIO = 0.8
VAL_RATIO = 0.2
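To see how the ratios translate into dataset sizes, here is a small sketch; the actual GarbageDataModule presumably hands these ratios to something like torch.utils.data.random_split, so the exact mechanics are an assumption. The sample count is hypothetical.

```python
# Sketch: derive (train, val) counts from the configured split ratio.
TRAIN_RATIO = 0.8
VAL_RATIO = 0.2

def split_sizes(n_samples: int, train_ratio: float) -> tuple[int, int]:
    """Split n_samples into (train, val) counts; val absorbs the remainder."""
    n_train = int(n_samples * train_ratio)
    return n_train, n_samples - n_train

print(split_sizes(2500, TRAIN_RATIO))  # (2000, 500)
```

Giving the remainder to the validation set guarantees the two counts always sum to the full dataset, even when the ratio does not divide it evenly.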

πŸ“ Project Organization

.
├── data
│   ├── interim              # Intermediate data transformations
│   ├── processed            # Final preprocessed data
│   └── raw                  # Original unprocessed datasets
│       └── Garbage_Dataset_Classification
│           ├── images       # Image files by class
│           └── metadata.csv # Dataset statistics
├── docs                     # Auto-generated HTML documentation
├── models
│   ├── performance          # Loss curves and metrics
│   │   └── loss_curves
│   └── weights              # Trained model checkpoints (.ckpt)
├── notebooks                # Jupyter notebooks for EDA and analysis
│   ├── create_sample_dataset.ipynb
│   ├── dataset_exploration.ipynb
│   └── performance_analysis.ipynb
├── reports                  # LaTeX reports and figures
│   ├── compiled             # PDF reports
│   ├── figures
│   │   ├── EDA              # Exploratory analysis plots
│   │   └── performance      # Model evaluation plots
│   └── main.tex             # Main report source
├── scripts                  # Utility scripts
│   └── generate_docs.py     # Documentation generator
├── source                   # Main source code
│   ├── predict.py           # Prediction script
│   ├── train.py             # Training script
│   └── utils
│       ├── config.py        # Configuration file
│       └── custom_classes   # Model implementations
│           ├── GarbageClassifier.py
│           ├── GarbageDataModule.py
│           └── LossCurveCallback.py
├── pyproject.toml           # Project dependencies (uv)
└── README.md

Note: dummy.txt files are placeholders to preserve empty folder structure in Git.


🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ™ Acknowledgments


📧 Contact

Neo Lafuente - @NeoLafuente

Project Link: https://github.com/NeoLafuente/garbage_classifier


Made with ❤️ and ♻️ for a cleaner planet