FastViT Hand Gesture Recognition

Real-time hand gesture recognition using FastViT, a hybrid vision transformer.

🔑 Key Features

Real-time gesture recognition through webcam integration
19 hand gestures recognized, with 97.5% accuracy on the validation set
Efficient model architecture using FastViT for low-latency predictions
Simple deployment through Google Colab (no local setup required)

🧠 Model & Architecture

This project uses the FastViT architecture, a hybrid vision transformer developed by Apple that offers an excellent balance between accuracy and computational efficiency:

Backbone: fastvit_t8.apple_in1k pretrained model
Training approach: Transfer learning with frozen backbone
Input size: 256×256 px
Classes: 19 hand gestures

FastViT was chosen over alternatives such as ConvNeXt for its efficiency: it maintains high accuracy in a resource-constrained environment such as Colab.
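The configuration listed above can be sketched roughly as follows. This is a minimal sketch, assuming the backbone is loaded through the timm library (the name fastvit_t8.apple_in1k is a timm model identifier); the helper names build_model and freeze_backbone are hypothetical and are not taken from the project's notebooks, which may set this up differently.

```python
def freeze_backbone(model):
    """Freeze every parameter, then re-enable gradients for the classifier head.

    Works with any model exposing parameters() and get_classifier(),
    which timm models do.
    """
    for param in model.parameters():
        param.requires_grad = False
    for param in model.get_classifier().parameters():
        param.requires_grad = True
    return model


def build_model(num_classes=19, pretrained=True):
    """Create a FastViT-T8 backbone with a new head for the gesture classes.

    Assumption: timm is installed. It is imported lazily so that
    freeze_backbone can be used on its own.
    """
    import timm

    model = timm.create_model(
        "fastvit_t8.apple_in1k",
        pretrained=pretrained,
        num_classes=num_classes,  # replaces the 1000-class ImageNet head
    )
    return freeze_backbone(model)


if __name__ == "__main__":
    model = build_model()
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable parameters: {trainable}")
```

With the backbone frozen, only the new 19-class head is updated during training, which keeps memory use and training time low on Colab hardware. Input images would be resized to 256×256 px before being passed to the model.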

📊 Dataset

The model was trained on the Hand Gesture Recognition Image Dataset (HaGRID) 150k subset:

19 gesture classes including common gestures like "thumbs up", "peace sign", and "stop"
The 150k version was used because the full dataset is too large to train on in Colab
Split into training and validation sets
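One way to produce such a split is a per-class (stratified) split, so that every gesture appears in both sets in similar proportion. The sketch below is illustrative only: the 80/20 ratio, the seed, and the function name stratified_split are assumptions, not the values or code used in this project.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_fraction=0.2, seed=42):
    """Split (path, label) pairs into train and validation lists, class by class.

    val_fraction and seed are illustrative defaults, not the project's values.
    Each class contributes at least one sample to the validation set.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # local generator keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        n_val = max(1, round(len(paths) * val_fraction))
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val


if __name__ == "__main__":
    # Placeholder file names and labels, for demonstration only.
    demo = [(f"{g}/{i}.jpg", g) for g in ("like", "peace", "stop") for i in range(10)]
    train, val = stratified_split(demo)
    print(len(train), "training samples,", len(val), "validation samples")
```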

🚀 Usage

Option 1: Run in Google Colab

Open the training notebook
Run all cells to train the model or load pretrained weights
Follow instructions for webcam integration

Option 2: Inference with Pretrained Model

Open the inference notebook
Upload the pretrained model file (sign_lang_model.pkl)
Run the webcam inference cell to start real-time detection
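For each webcam frame, the classifier's raw outputs have to be turned into a gesture label. The following is a minimal sketch of that step, assuming the model returns one score (logit) per class; the confidence threshold, the class names, and the function names are hypothetical and may differ from what the inference notebook does.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits, class_names, min_confidence=0.6):
    """Return (label, confidence), or (None, confidence) if below the threshold.

    min_confidence is an illustrative value, not one taken from the project.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None, probs[best]
    return class_names[best], probs[best]


if __name__ == "__main__":
    names = ["like", "peace", "stop"]  # placeholder subset of the 19 classes
    print(decode_prediction([0.1, 4.0, 0.2], names))
```

Rejecting low-confidence frames is one simple way to avoid flickering labels when no clear gesture is in view.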

📈 Results

97.5% accuracy on the validation set
Robust performance across different lighting conditions
Real-time inference capability (>30 FPS on modern hardware)
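A frame-rate figure like the one above can be checked on your own hardware with a simple timing loop. This is a hedged sketch: measure_fps is a hypothetical helper, and predict stands for whatever callable runs the model on a single frame.

```python
import time


def measure_fps(predict, frames, warmup=5):
    """Return the frames per second achieved by predict over frames.

    The first warmup frames are run untimed so one-off start-up costs
    (for example lazy initialization) do not distort the result.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    for frame in frames[:warmup]:
        predict(frame)

    start = time.perf_counter()
    for frame in frames:
        predict(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


if __name__ == "__main__":
    # Stand-in for a model call: sleeps for about 10 ms per frame.
    fps = measure_fps(lambda frame: time.sleep(0.01), list(range(50)))
    print(f"{fps:.1f} FPS")
```

When timing a model on a GPU, the device should be synchronized before reading the clock, otherwise queued asynchronous work makes the measured rate look higher than it is.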

🔮 Future Work

Expanded gesture vocabulary: Scale to cover the entire sign language alphabet and common phrases
Improved deployment: Create a standalone application for integration with video conferencing platforms
Sequence modeling: Incorporate temporal information for dynamic gesture recognition
Model optimization: Further quantization and pruning for edge device deployment

⭐ If you find this project useful, please consider giving it a star!
