
HistColoriful is a GAN-based colorization project designed to convert grayscale images into realistic color outputs. The system includes LAB preprocessing, a U-Net generator with skip connections, a PatchGAN discriminator, and classical colorization baselines. Results include PSNR, SSIM, ablation studies, and speed comparisons.

zeeza18/HistColoriful-Colorizing-Grayscale-Images-Using-Conditional-GANs

🎨 HISTCOLORIFUL: Colorizing Grayscale Images Using Conditional GANs

A comprehensive capstone project comparing classical color transfer techniques against a custom GAN architecture for automatic image colorization

Python 3.10+ · PyTorch · FastAPI · License: MIT


Maintained by Mohammed Azeezulla (zeeza18) · Introduction to Image Processing


✨ Overview

HISTCOLORIFUL is a comprehensive image colorization research project that answers a fundamental question: How far can classical computer vision methods go before deep learning becomes necessary?

Built on a carefully curated 256×256 LAB subset of the COCO dataset with 10 held-out test images, this project systematically compares:

  • 🔵 Classical Methods: Histogram Matching, K-means Color Transfer, Local Gaussian Transfer
  • 🔴 Deep Learning: Custom Pix2Pix GAN with U-Net Generator + PatchGAN Discriminator

🎯 Research Questions

  1. How do interpretable classical algorithms perform vs. trainable GANs?
  2. What hyperparameters optimize GAN colorization quality?
  3. Can we quantify the speed vs. quality tradeoff?
  4. What are the failure modes of each approach?

πŸ† Key Findings

Method PSNR (dB) SSIM Speed Reference Required
K-means Transfer πŸ‘‘ 22.29 0.8908 Slow βœ… Yes
GAN (Ξ»=50) πŸ€– 20.56 0.8397 Very Fast ❌ No
Histogram Matching 21.85 0.8742 Fast βœ… Yes
Local Gaussian 21.43 0.8621 Fast βœ… Yes

🎯 Key Features

🔬 Classical Computer Vision

  • ✅ Histogram Matching - CDF alignment per LAB channel
  • ✅ K-means Color Transfer - 8-cluster palette mapping
  • ✅ Local Gaussian Transfer - Windowed statistics matching
  • ✅ Vectorized NumPy operations (CPU)
  • ✅ Per-image metric logging

🧠 Deep Learning Pipeline

  • ✅ Pix2Pix GAN Architecture - U-Net generator with skip connections
  • ✅ PatchGAN Discriminator - 70x70 receptive field
  • ✅ Mixed Precision Training - FP16 for faster training steps and lower memory use
  • ✅ Comprehensive Ablation Studies - 6 hyperparameter configurations
  • ✅ Checkpoint Management - Best model serialization

📊 Evaluation Framework

  • ✅ PSNR & SSIM Metrics - Quantitative quality assessment
  • ✅ Statistical Testing - Paired t-tests, confidence intervals
  • ✅ Speed Benchmarking - CPU vs GPU inference timing
  • ✅ Visual Comparisons - Side-by-side method grids
  • ✅ Failure Case Analysis - Worst-performing image documentation
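
As a rough illustration of the PSNR metric listed above, the sketch below computes it with plain NumPy for 8-bit images. The function name `psnr` and the toy arrays are illustrative and not taken from the project; the notebook may well rely on `skimage.metrics.peak_signal_noise_ratio` and `skimage.metrics.structural_similarity` from scikit-image instead.

```python
import numpy as np


def psnr(reference: np.ndarray, estimate: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    ref = reference.astype(np.float64)
    est = estimate.astype(np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Toy example: every pixel is off by exactly 5 levels, so the MSE is 25.
truth = np.full((4, 4, 3), 100, dtype=np.uint8)
shifted = truth + 5
print(f"{psnr(truth, shifted):.2f} dB")
```

Because PSNR depends only on the mean squared error, it rewards conservative, averaged colors, which is one reason it is usually reported alongside SSIM.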

🌐 Production Deployment

  • ✅ FastAPI REST API - Async colorization endpoints
  • ✅ Interactive Web UI - HTML/CSS/JS upload interface
  • ✅ Multi-method Support - Classical + GAN inference
  • ✅ Health Monitoring - System status endpoints
  • ✅ Auto-generated Docs - OpenAPI/Swagger UI

πŸ—οΈ Architecture

HistColoriful GAN Pipeline

🎨 Classical Pipeline

  • Treat the grayscale input as the L (lightness) channel of LAB, so that color is handled in an approximately perceptually uniform space.
  • When classical methods are selected, pull color statistics from a user-provided reference image.
  • Apply the selected algorithm:
    • Histogram Matching aligns per-channel LAB histograms.
    • K-means Transfer builds an 8-color palette from the reference image.
    • Local Gaussian Transfer matches local mean/variance windows.
  • Merge the predicted AB channels with the input L channel and convert back to RGB.
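
The CDF alignment behind Histogram Matching can be sketched for a single channel as follows. This is a minimal NumPy illustration, not the project's code: the function name `match_histogram` and the toy arrays are assumptions, and how the project obtains AB values for a grayscale input before or after this step is not shown here.

```python
import numpy as np


def match_histogram(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Remap source values so their empirical CDF follows the reference CDF."""
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical CDFs evaluated at each distinct value
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source quantile, look up the reference value at the same quantile
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_idx].reshape(source.shape)


# Toy single-channel example: the source ranks are kept, the values come from the reference.
source = np.array([[0, 1], [2, 3]])
reference = np.array([[10, 20], [30, 40]])
print(match_histogram(source, reference))
```

In a per-channel LAB setting, the same function would be applied to each channel independently.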

🤖 Deep Learning Architecture

Generator: U-Net with Skip Connections

  • Input: 1x256x256 grayscale (L channel) image.
  • Encoder: Conv2D blocks with filters [64, 128, 256, 512, 512], BatchNorm on all but the first block, LeakyReLU (0.2), and stride-2 downsampling.
  • Bottleneck: Conv2D(512) + BatchNorm + ReLU.
  • Decoder: ConvTranspose2D blocks with filters [512, 512, 256, 128, 64], BatchNorm throughout, Dropout(0.5) on the first two blocks, and skip connections from the encoder.
  • Output: ConvTranspose2D -> 2x256x256 AB channels with Tanh activation.
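
To check that the layer counts above map a 256x256 input back to a 256x256 output, the sketch below traces the spatial size through the generator. It assumes the usual Pix2Pix choices of 4x4 kernels, stride 2 and padding 1 for every block, and a stride-2 bottleneck; none of these values are stated in this README, so treat them as assumptions.

```python
def conv_out(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Spatial size after a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Spatial size after a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


encoder_filters = [64, 128, 256, 512, 512]
decoder_filters = [512, 512, 256, 128, 64]

size = 256
trace = [size]
for _ in encoder_filters:  # five stride-2 encoder blocks
    size = conv_out(size)
    trace.append(size)
size = conv_out(size)  # bottleneck (stride 2 is an assumption)
trace.append(size)
for _ in decoder_filters:  # five stride-2 decoder blocks
    size = deconv_out(size)
    trace.append(size)
size = deconv_out(size)  # final ConvTranspose2D output layer
trace.append(size)

print(trace)  # [256, 128, 64, 32, 16, 8, 4, 8, 16, 32, 64, 128, 256]
```

Under these assumptions each decoder stage sees a feature map of the same size as one encoder stage, which is what makes the skip connections concatenable.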

Discriminator: PatchGAN (70x70)

  • Input: L channel concatenated with real or generated AB channels.
  • Sequential Conv2D layers with filters [64, 128, 256, 512, 1]; BatchNorm from the second layer onward and LeakyReLU(0.2) activations.
  • Produces a grid of real/fake logits, providing localized supervision per 70x70 patch.
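
The 70x70 figure can be reproduced from the layer layout. The sketch below assumes the standard Pix2Pix configuration for five layers (4x4 kernels with strides 2, 2, 2, 1, 1); the kernel sizes and strides are not given in this README, so they are labeled as assumptions in the code.

```python
def receptive_field(layers):
    """Receptive field of one output unit; layers are (kernel, stride) from input to output."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


# Assumed Pix2Pix-style layout for filters [64, 128, 256, 512, 1]
patchgan_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan_layers))  # 70
```

Each logit in the output grid therefore judges one 70x70 window of the input, and neighboring logits look at overlapping windows.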

Loss Function

L_GAN = E[log D(x, y)] + E[log(1 - D(x, G(x)))]

L_L1 = E[||y - G(x)||_1]

L_Total = L_GAN + lambda_L1 * L_L1, where lambda_L1 = 50 (optimal from ablation)
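
A framework-agnostic NumPy sketch of these losses is shown below. It is not the project's training code: the function names are illustrative, the L1 term is averaged per element, the generator uses the common non-saturating form (labeling its fakes as real) rather than literally minimizing log(1 - D(x, G(x))), and the 0.5 factor on the discriminator loss follows the Pix2Pix paper. In PyTorch the same terms would typically be written with `torch.nn.BCEWithLogitsLoss` and `torch.nn.L1Loss`.

```python
import numpy as np


def bce_with_logits(logits: np.ndarray, target: float) -> float:
    """Mean binary cross-entropy on raw logits, in a numerically stable form."""
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    loss = np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    return float(loss.mean())


def generator_loss(fake_logits, fake_ab, real_ab, lambda_l1: float = 50.0) -> float:
    """Adversarial term (non-saturating form) plus lambda_L1 times the L1 term."""
    adversarial = bce_with_logits(fake_logits, 1.0)
    l1 = float(np.mean(np.abs(real_ab - fake_ab)))
    return adversarial + lambda_l1 * l1


def discriminator_loss(real_logits, fake_logits) -> float:
    """Real patches should score 1, generated patches 0 (0.5 factor is an assumption)."""
    return 0.5 * (bce_with_logits(real_logits, 1.0) + bce_with_logits(fake_logits, 0.0))


# Toy values: an undecided discriminator (logit 0) and a constant AB error of 0.1
patch_logits = np.zeros((1, 30, 30))
fake_ab = np.zeros((2, 4, 4))
real_ab = np.full((2, 4, 4), 0.1)
print(generator_loss(patch_logits, fake_ab, real_ab))
print(discriminator_loss(patch_logits, patch_logits))
```

With lambda_L1 = 50, even a small per-pixel color error dominates the adversarial term, which is the balance the ablation study tunes.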

🛠️ Technologies Used

🧠 Deep Learning & Computer Vision

| Technology | Version | Purpose |
| --- | --- | --- |
| PyTorch | 2.0+ | Neural network framework, GPU acceleration |
| TorchVision | 0.15+ | Image transformations, data augmentation |
| OpenCV | 4.8+ | Image I/O, color space conversion |
| NumPy | 1.24+ | Vectorized operations, numerical computing |
| scikit-image | 0.21+ | PSNR/SSIM metrics, advanced processing |
| SciPy | 1.11+ | Statistical tests, signal processing |

🌐 Web Framework & API

| Technology | Version | Purpose |
| --- | --- | --- |
| FastAPI | 0.104+ | Async REST API, auto-documentation |
| Uvicorn | 0.24+ | ASGI server, WebSocket support |
| Pydantic | 2.4+ | Data validation, settings management |

📊 Data Science & Visualization

| Technology | Version | Purpose |
| --- | --- | --- |
| Pandas | 2.0+ | Tabular data, metric aggregation |
| Matplotlib | 3.7+ | Loss curves, result plots |
| Seaborn | 0.12+ | Statistical visualizations |
| Pillow | 10.0+ | Image format handling |

🛠️ Development Tools

| Technology | Purpose |
| --- | --- |
| Jupyter | Interactive experimentation |
| Git LFS | Large file storage (models, datasets) |
| tqdm | Progress bars, training monitoring |
| pytest | Unit testing, API validation |

💻 Hardware Acceleration

| Component | Specification | Usage |
| --- | --- | --- |
| GPU | RTX 5090 (32GB VRAM) | GAN training, inference |
| CUDA | 11.8+ | Parallel tensor operations |
| cuDNN | 8.6+ | Optimized deep learning primitives |
| CPU | Threadripper/Xeon | Classical method processing |
| RAM | 64GB DDR5 | Large dataset loading |
| Storage | NVMe SSD | Fast data I/O |

📁 Project Structure

HISTCOLORIFUL/
│
├── 📓 HISTCOLORIFUL.ipynb           # Main experiment notebook (Sections 1-7)
├── 📋 project_summary.json          # High-level findings & configurations
├── 📦 requirements.txt              # Python dependencies
├── 📖 README.md                     # This file
├── 🔒 .gitignore                    # Git exclusions
│
├── 🌐 app/                          # FastAPI Production Service
│   ├── main.py                      # API endpoints & inference logic
│   ├── models.py                    # PyTorch model definitions
│   ├── utils.py                     # Image preprocessing utilities
│   ├── config.py                    # Environment configuration
│   └── static/                      # Frontend assets
│       ├── index.html               # Upload interface
│       ├── style.css                # UI styling
│       └── script.js                # Client-side logic
│
├── 💾 data/                         # Dataset (not committed, except test)
│   ├── train/                       # 5,000 training pairs (256×256 LAB)
│   ├── val/                         # 500 validation pairs
│   └── test/                        # 10 held-out test images ✅
│
├── 🎨 classical_results/            # Classical method outputs
│   ├── histogram_matching/          # 10 colorized + metrics
│   ├── kmeans_transfer/             # 10 colorized + metrics
│   ├── local_gaussian/              # 10 colorized + metrics
│   └── comparison_summary.csv       # Cross-method PSNR/SSIM
│
├── 🤖 gan_results/                  # GAN outputs
│   ├── test_colorized/              # 10 generated images
│   ├── metrics.csv                  # Per-image PSNR, SSIM, time
│   └── training_history.pkl         # Loss curves, LR schedule
│
├── 🖼️ final_comparisons/            # Side-by-side visualizations
│   ├── comparison_image_01.png      # 5-panel comparison
│   ├── comparison_image_02.png
│   └── ...
│
├── 💾 model/                        # Saved checkpoints
│   ├── ablation_lambda50.pt         # Best GAN (λ=50, epoch 180)
│   ├── generator_only.pt            # Deployment-ready generator
│   └── training_config.json         # Hyperparameters
│
├── 📊 results_csv/                  # Quantitative analysis
│   ├── final_project_summary.csv    # Overall rankings
│   ├── ablation_study_summary.csv   # Hyperparameter sweep
│   ├── speed_comparison.csv         # Inference benchmarks
│   └── statistical_tests.csv        # T-tests, confidence intervals
│
└── 📈 results_png/                  # Visualizations
    ├── training_losses.png          # Loss curves
    ├── ablation_lambda_sweep.png    # PSNR vs lambda_L1
    ├── failure_cases.png            # Worst-performing images
    ├── final_results_dashboard.png  # Composite summary
    └── timing_boxplots.png          # Speed distributions

⚡ Quick Start

1️⃣ Clone Repository

git clone https://github.com/zeeza18/HistColoriful-Colorizing-Grayscale-Images-Using-Conditional-GANs.git
cd HistColoriful-Colorizing-Grayscale-Images-Using-Conditional-GANs

2️⃣ Create Virtual Environment

# Linux/macOS
python -m venv .venv
source .venv/bin/activate

# Windows PowerShell
python -m venv .venv
.venv\Scripts\Activate.ps1

3️⃣ Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt

4️⃣ Verify GPU (Optional but Recommended)

python -c "import torch; print(f'🚀 CUDA Available: {torch.cuda.is_available()}')"

Expected Output:

🚀 CUDA Available: True

🧪 Running Experiments

📓 Jupyter Notebook Workflow

jupyter notebook HISTCOLORIFUL.ipynb

Notebook Sections:

| Section | Description | Runtime |
| --- | --- | --- |
| 1. Data Preparation | Load COCO subset, split train/val/test | ~5 min |
| 2. Classical Baselines | Run histogram, K-means, Gaussian methods | ~10 min |
| 3. GAN Training | Train Pix2Pix for 180 epochs | ~4 hours (GPU) |
| 4. Evaluation | Compute PSNR, SSIM, generate visualizations | ~15 min |
| 5. Ablation Studies | Sweep lambda_L1, learning rate, epochs | ~24 hours (GPU) |
| 6. Statistical Analysis | T-tests, confidence intervals, rankings | ~5 min |
| 7. Final Synthesis | Generate comparison grids, dashboards | ~10 min |

🎯 Quick Evaluation (Pre-trained Model)

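import cv2

from app.models import load_generator
from app.utils import colorize_image

# Load best checkpoint
generator = load_generator('model/ablation_lambda50.pt')

# Read the test image as a single-channel grayscale array
grayscale = cv2.imread('data/test/image_01_gray.png', cv2.IMREAD_GRAYSCALE)
if grayscale is None:  # cv2.imread returns None instead of raising on a bad path
    raise FileNotFoundError('data/test/image_01_gray.png')

# Colorize and save the result
colorized = colorize_image(generator, grayscale)
cv2.imwrite('output.png', colorized)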

🚀 FastAPI Demo

🌐 Start Server

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Console Output:

INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

🖥️ Access Web Interface

Open browser to: http://localhost:8000

📡 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Serve web UI |
| /api/info | GET | List available methods |
| /api/colorize | POST | Upload & colorize image |
| /health | GET | System status check |
| /docs | GET | Interactive API documentation |

🧪 Test with cURL

# Colorize with GAN (no reference needed)
curl -X POST "http://localhost:8000/api/colorize" \
  -F "grayscale=@test_gray.png" \
  -F "method=gan" \
  -o colorized_gan.png

# Colorize with K-means (requires reference)
curl -X POST "http://localhost:8000/api/colorize" \
  -F "grayscale=@test_gray.png" \
  -F "reference=@reference_color.png" \
  -F "method=kmeans" \
  -o colorized_kmeans.png

🐍 Python Client Example

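import requests

url = "http://localhost:8000/api/colorize"
data = {'method': 'kmeans'}  # or 'gan', 'histogram', 'gaussian'

# Open both files in a context manager so the handles are closed after the request
with open('input_gray.png', 'rb') as gray, open('reference.png', 'rb') as ref:
    files = {
        'grayscale': gray,
        'reference': ref,  # Needed for classical methods; optional for 'gan'
    }
    response = requests.post(url, files=files, data=data, timeout=60)

# Stop on HTTP errors instead of writing an error body to disk
response.raise_for_status()

with open('output.png', 'wb') as f:
    f.write(response.content)

print("✅ Colorization complete!")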

📊 Results

🏆 Overall Performance Comparison

| Method | PSNR ↑ | SSIM ↑ | Inference Time ↓ | Reference Required |
| --- | --- | --- | --- | --- |
| K-means Transfer 👑 | 22.29 dB | 0.8908 | 145 ms | ✅ Yes |
| Histogram Matching | 21.85 dB | 0.8742 | 89 ms | ✅ Yes |
| Local Gaussian | 21.43 dB | 0.8621 | 178 ms | ✅ Yes |
| GAN (λ=50) 🤖 | 20.56 dB | 0.8397 | 312 ms | ❌ No |

Benchmarked on RTX 5090, averaged over 10 test images

📈 Ablation Study Results

lambda_L1 Hyperparameter Sweep:

| lambda_L1 | PSNR | SSIM | Training Stability | Notes |
| --- | --- | --- | --- | --- |
| 10 | 18.92 dB | 0.7845 | ⚠️ Unstable | Too low, mode collapse |
| 25 | 19.74 dB | 0.8156 | ✅ Stable | Underfitting |
| 50 | 20.56 dB | 0.8397 | ✅ Stable | Optimal balance ✨ |
| 100 | 20.21 dB | 0.8302 | ✅ Stable | Over-regularized |
| 150 | 19.68 dB | 0.8189 | ⚠️ Unstable | Blurry outputs |
| 200 | 18.34 dB | 0.7901 | ❌ Unstable | Severe overfitting |

🖼️ Visual Comparison

Figure: Method Comparison. Example colorization: Grayscale -> Histogram -> K-means -> Gaussian -> GAN -> Ground Truth

⚠️ Failure Case Analysis

Classical Methods Struggle With:

  • 🌈 Scenes with unusual color palettes (no good reference)
  • 🎨 Abstract textures without clear semantic boundaries
  • 🌃 Low-light images with poor L-channel separation

GAN Struggles With:

  • πŸ—οΈ Fine architectural details (mode averaging)
  • πŸ‘€ Skin tones (dataset bias toward outdoor scenes)
  • πŸ“ Text/signage (semantic understanding required)

📈 Performance Metrics

⚡ Speed Benchmarks

Test Configuration: RTX 5090, 256×256 images, averaged over 100 runs

| Method | CPU Time | GPU Time | Speedup | Memory |
| --- | --- | --- | --- | --- |
| Histogram Matching | 89 ms | N/A (CPU-only) | 1.00× | 45 MB |
| K-means Transfer | 145 ms | N/A (CPU-only) | 1.00× | 78 MB |
| Local Gaussian | 178 ms | N/A (CPU-only) | 1.00× | 92 MB |
| GAN | 1,247 ms | 312 ms | 4.0× | 1.2 GB |
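
A timing harness of the kind that could produce such averages is sketched below. It is an assumption about the methodology, not the project's benchmark script: the `benchmark` helper, the warm-up count, and the toy workload are illustrative. When timing GPU inference in PyTorch, `torch.cuda.synchronize()` should be called before each clock read, because CUDA kernels launch asynchronously.

```python
import statistics
import time


def benchmark(fn, *args, warmup: int = 5, runs: int = 100):
    """Return (mean_ms, stdev_ms) of fn(*args) over `runs` timed calls."""
    for _ in range(warmup):  # warm caches before measuring
        fn(*args)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


# Toy workload standing in for a colorization call
mean_ms, std_ms = benchmark(sum, range(10_000), runs=20)
print(f"{mean_ms:.3f} ms ± {std_ms:.3f} ms")
```

Reporting the spread alongside the mean makes it easier to tell whether differences between methods exceed run-to-run noise.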

📊 Statistical Significance

Paired t-tests (significance level α = 0.05):

| Comparison | PSNR Δ | Significant? | Effect Size (Cohen's d) |
| --- | --- | --- | --- |
| K-means vs GAN | +1.73 dB | ✅ Yes | 0.82 (large) |
| Histogram vs GAN | +1.29 dB | ✅ Yes | 0.67 (medium) |
| K-means vs Histogram | +0.44 dB | ❌ No | 0.21 (small) |
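
For readers unfamiliar with the paired setup, the sketch below computes the paired t statistic and a paired-samples Cohen's d (mean difference divided by the standard deviation of the differences) using only the standard library. The per-image PSNR values are hypothetical, and which Cohen's d variant the project used is not stated, so this is one plausible reading. A p-value for the same data could be obtained with `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_stats(a, b):
    """Return (mean difference, paired t statistic, Cohen's d for paired samples)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_d / (sd_d / math.sqrt(n))
    cohens_d = mean_d / sd_d
    return mean_d, t_stat, cohens_d


# Hypothetical per-image PSNR values (dB) for two methods on the same images
method_a = [22.0, 23.0, 24.0]
method_b = [21.0, 21.0, 21.0]
mean_d, t_stat, cohens_d = paired_stats(method_a, method_b)
print(f"mean diff = {mean_d:.2f} dB, t = {t_stat:.3f}, d = {cohens_d:.2f}")
```

Pairing matters here because every method is evaluated on the same 10 test images, so per-image difficulty cancels out in the differences.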

💾 Model Size & Deployment

| Component | Parameters | Disk Size | Quantization Support |
| --- | --- | --- | --- |
| Full GAN (G+D) | 54.3M | 208 MB | ✅ FP16, INT8 |
| Generator Only | 27.1M | 104 MB | ✅ FP16, INT8 |
| Discriminator | 27.2M | 104 MB | N/A (training only) |
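
The disk sizes follow roughly from the parameter counts. The back-of-the-envelope sketch below assumes weights dominate the checkpoint and that "MB" in the table means binary megabytes (MiB); the FP16 and INT8 figures are estimates derived from that assumption, not measurements from the project.

```python
def checkpoint_size_mib(num_params: float, bytes_per_param: int = 4) -> float:
    """Approximate size of the raw weights in MiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / 2**20


generator_params = 27.1e6  # from the table above
for name, width in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    print(f"{name}: {checkpoint_size_mib(generator_params, width):.1f} MiB")
```

The FP32 estimate lands within about 1 MiB of the 104 MB listed for the generator, with the remainder plausibly explained by non-parameter buffers and serialization overhead.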

🎓 Academic Context

📚 Course Information

  • Institution: DePaul University
  • Course: Introduction to Image Processing (CSC 381/481)
  • Quarter: Winter 2025
  • Instructor: Kenny Davila

🎯 Learning Objectives Addressed

✅ Implement classical color transfer algorithms
✅ Design and train conditional GANs
✅ Conduct rigorous ablation studies
✅ Apply statistical hypothesis testing
✅ Deploy ML models via REST APIs
✅ Document research methodology

📖 Key References

  1. Pix2Pix: Isola et al. (2017) - "Image-to-Image Translation with Conditional Adversarial Networks"
  2. U-Net: Ronneberger et al. (2015) - "U-Net: Convolutional Networks for Biomedical Image Segmentation"
  3. PatchGAN: Li & Wand (2016) - "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks"
  4. Color Transfer: Reinhard et al. (2001) - "Color Transfer between Images"

📝 Citation

@misc{histcoloriful2025,
  title={HISTCOLORIFUL: A Comparative Study of Classical and Deep Learning Image Colorization},
  author={Mohammed Azeezulla},
  year={2025},
  institution={DePaul University},
  course={Introduction to Image Processing (CSC 381/481)},
  howpublished={\url{https://github.com/zeeza18/HistColoriful-Colorizing-Grayscale-Images-Using-Conditional-GANs}}
}

🤝 Contributing

This is an academic project, but suggestions are welcome!

🐛 Found a Bug?

  1. Check existing issues
  2. Open a new issue with a detailed description
  3. Include error logs and environment details

💡 Have an Idea?

  • New classical method? Add to app/main.py
  • Architecture improvement? Modify app/models.py
  • Better evaluation metric? Update notebook Section 4

🔧 Development Setup

# Install dev dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Format code
black app/ --line-length 88

# Type checking
mypy app/

📝 License

This project is licensed under the MIT License - see LICENSE file for details.

MIT License

Copyright (c) 2025 Mohammed Azeezulla

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

🙏 Acknowledgments

  • PyTorch Team - Excellent deep learning framework
  • COCO Dataset - High-quality training data
  • FastAPI Team - Modern web framework
  • Research Community - Pix2Pix, U-Net, PatchGAN papers

📧 Contact

Developer: Mohammed Azeezulla
Email: mdazeezulla2001@gmail.com
GitHub: @zeeza18
LinkedIn: Not provided


🌟 If this project helped you, consider giving it a star! 🌟

Made with ❤️ using PyTorch, FastAPI, and lots of ☕
