maubaum/face-recognition

RTSP Face Recognition System

Python OpenCV ONNX NumPy License

A real-time face recognition system for RTSP video streams

Features • Installation • Usage • Documentation • Troubleshooting


📋 Overview

A real-time face recognition system that processes RTSP video streams using YuNet for face detection and ArcFace for face recognition. The system can identify enrolled individuals and log detections with automatic image capture.

πŸ› οΈ Technologies

  • Python 3.8+ - Core programming language
  • OpenCV - Computer vision and face detection (YuNet)
  • ONNX Runtime - Efficient model inference
  • NumPy - Numerical operations and embeddings
  • ArcFace - Deep learning face recognition model

Features

  • Real-time Processing: Monitors RTSP video streams continuously
  • Face Detection: Uses YuNet (OpenCV) for robust face detection
  • Face Recognition: Employs ArcFace embeddings for accurate face matching
  • Auto-enrollment: Simple image-based enrollment system
  • Smart Logging: Captures detected faces with configurable cooldown periods
  • Unknown Face Detection: Optional logging of unrecognized faces
  • Performance Optimized: Configurable frame sampling and downscaling

Prerequisites

  • Python 3.8+
  • RTSP camera stream access
  • ONNX models (see Model Setup)

Installation

  1. Clone the repository:
     git clone <repository-url>
     cd <repository-name>
  2. Install required dependencies:
     pip install opencv-python numpy onnxruntime python-dotenv
  3. Create required directories:
     mkdir -p enroll models output/matches output/unknown

Model Setup

Download the required ONNX models and place them in the models/ directory:

  1. YuNet Face Detector: face_detection_yunet_2023mar.onnx

  2. ArcFace Recognition Model: w600k_r50.onnx
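A startup check along the following lines can catch a missing model before the stream is opened. This is a minimal sketch, not code from cam.py: the helper name `missing_models` is hypothetical, and only the two file names and the models/ directory come from this README.

```python
from pathlib import Path

# File names taken from the Model Setup list above.
REQUIRED_MODELS = (
    "face_detection_yunet_2023mar.onnx",
    "w600k_r50.onnx",
)


def missing_models(models_dir="models"):
    """Return the required model file names that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_MODELS if not (root / name).is_file()]
```

Calling `missing_models()` from the project root returns an empty list when both files are in place.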

Configuration

Create a .env file in the project root:

RTSP_URL=rtsp://username:password@camera-ip:port/stream
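The snippet below sketches one way the RTSP_URL value could be read and sanity-checked at startup. It is an illustration under assumptions, not the loader in cam.py: the function name `load_rtsp_url` and the accepted schemes (rtsp, rtsps) are hypothetical choices. It uses `load_dotenv` from python-dotenv when that package is installed and otherwise falls back to variables already present in the environment.

```python
import os
from urllib.parse import urlsplit


def load_rtsp_url(env=None):
    """Read RTSP_URL and check that it looks like an RTSP URL."""
    if env is None:
        try:
            # python-dotenv copies the key/value pairs from .env into os.environ.
            from dotenv import load_dotenv
            load_dotenv()
        except ImportError:
            pass  # rely on variables already exported in the shell
        env = os.environ

    url = (env.get("RTSP_URL") or "").strip()
    if not url:
        raise ValueError("RTSP_URL is not set; add it to the .env file")

    parts = urlsplit(url)
    # Assumption: only rtsp:// and rtsps:// URLs with a host are acceptable.
    if parts.scheme not in ("rtsp", "rtsps") or not parts.hostname:
        raise ValueError("RTSP_URL must look like rtsp://user:pass@host:port/path")
    return url
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.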

Configurable Parameters

Edit the following constants in cam.py:

| Parameter | Default | Description |
|---|---|---|
| DETECT_EVERY_N_FRAMES | 10 | Process every Nth frame for performance |
| DOWNSCALE_DETECT | 0.5 | Scale factor for detection (0.5 = 50% size) |
| MIN_FACE_SIZE | 40 | Minimum face size in pixels to process |
| RECOG_THRESHOLD | 0.45 | Recognition threshold (lower = stricter) |
| SAVE_COOLDOWN_SEC | 3.0 | Seconds between saves for same person |
| SAVE_UNKNOWN | True | Whether to save unknown faces |
| PAD_FACTOR | 0.1 | Padding around detected faces (10%) |
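To make the SAVE_COOLDOWN_SEC setting concrete, here is a small sketch of a per-person cooldown. The class name `SaveCooldown` and its method are hypothetical and may differ from how cam.py tracks saves; the only value taken from this README is the 3.0 second default.

```python
import time

SAVE_COOLDOWN_SEC = 3.0  # default from the parameter table


class SaveCooldown:
    """Allow at most one save per person every cooldown_sec seconds."""

    def __init__(self, cooldown_sec=SAVE_COOLDOWN_SEC):
        self.cooldown_sec = cooldown_sec
        self._last_saved = {}  # person name -> time of the last accepted save

    def should_save(self, name, now=None):
        # A monotonic clock is unaffected by system clock adjustments.
        now = time.monotonic() if now is None else now
        last = self._last_saved.get(name)
        if last is not None and now - last < self.cooldown_sec:
            return False  # still cooling down; rejected attempts do not reset the timer
        self._last_saved[name] = now
        return True
```

Each name has its own timer, so a save for one person does not delay a save for another.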

Usage

1. Enroll Faces

Add images to the enroll/ directory with the naming convention:

enroll/
├── john_1.jpg
├── john_2.jpeg
├── jane_001.png
└── bob_photo.jpg

Naming Rules:

  • The person's name is everything before the first underscore
  • Example: john_1.jpg → Person name: "john"
  • Supported formats: .jpg, .jpeg, .png
  • Include multiple photos per person for better accuracy
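The naming rules can be expressed as a short helper. This is a sketch rather than the parser used in cam.py: the function name `person_name` is hypothetical, and two behaviors are assumptions not stated above (extensions are matched case-insensitively, and a file name without an underscore uses its whole stem as the name).

```python
from pathlib import Path

SUPPORTED_EXTS = {".jpg", ".jpeg", ".png"}


def person_name(filename):
    """Return the person name encoded in an enrollment file name, or None."""
    path = Path(filename)
    if path.suffix.lower() not in SUPPORTED_EXTS:
        return None  # unsupported format, skip the file
    # Everything before the first underscore is the person's name.
    name = path.stem.split("_", 1)[0]
    return name or None
```

Because only the first underscore splits the name, a file such as mary_ann_2.jpeg enrolls as "mary", so names themselves should not contain underscores.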

2. Run the System

python cam.py

The system will:

  1. Load and process enrollment images
  2. Connect to the RTSP stream
  3. Detect and recognize faces in real-time
  4. Save matched and unknown faces to output/

3. Review Results

Output images are saved in:

  • output/matches/ - Recognized faces with names
  • output/unknown/ - Unrecognized faces

File naming format:

{name}_{timestamp}_d{distance}.jpg
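As an illustration of this format, the helper below builds a file name from a name, a timestamp, and a distance. The timestamp layout (date, time, microseconds) and the three decimal places are inferred from the sample console output in this README, and the function name `output_filename` is hypothetical.

```python
from datetime import datetime


def output_filename(name, distance, when=None):
    """Build '{name}_{timestamp}_d{distance}.jpg' for a saved detection."""
    when = when or datetime.now()
    # %f appends microseconds, which keeps names unique within one second.
    stamp = when.strftime("%Y%m%d_%H%M%S_%f")
    return f"{name}_{stamp}_d{distance:.3f}.jpg"
```

Sorting files by name then groups them by person and orders each person's detections chronologically.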

How It Works

Detection Pipeline

  1. Stream Capture: Connects to RTSP stream with minimal buffering
  2. Frame Sampling: Processes every Nth frame to optimize performance
  3. Downscaling: Reduces frame size for faster detection
  4. Face Detection: YuNet identifies faces and bounding boxes
  5. Face Extraction: Crops and pads detected faces
  6. Embedding: ArcFace generates 512-dimensional embeddings
  7. Matching: Compares embeddings with enrolled gallery
  8. Logging: Saves annotated frames and face crops
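Steps 2, 3, and 5 involve some coordinate bookkeeping: a box found on the downscaled frame has to be mapped back to full resolution before it is padded and cropped. The sketch below shows that arithmetic with plain Python; the function names are hypothetical, boxes are assumed to be (x, y, width, height) tuples, and cam.py may organize this differently.

```python
DETECT_EVERY_N_FRAMES = 10
DOWNSCALE_DETECT = 0.5
PAD_FACTOR = 0.1


def should_process(frame_idx, every_n=DETECT_EVERY_N_FRAMES):
    """Frame sampling: run detection only on every Nth frame."""
    return frame_idx % every_n == 0


def upscale_bbox(bbox, downscale=DOWNSCALE_DETECT):
    """Map (x, y, w, h) from the downscaled frame to full-resolution pixels."""
    return tuple(int(round(v / downscale)) for v in bbox)


def pad_and_clip(bbox, frame_w, frame_h, pad_factor=PAD_FACTOR):
    """Grow the box by pad_factor on each side and clip it to the frame.

    Returns corner coordinates (x0, y0, x1, y1) suitable for slicing
    a frame array as frame[y0:y1, x0:x1].
    """
    x, y, w, h = bbox
    pad_x = int(round(w * pad_factor))
    pad_y = int(round(h * pad_factor))
    x0 = max(0, x - pad_x)
    y0 = max(0, y - pad_y)
    x1 = min(frame_w, x + w + pad_x)
    y1 = min(frame_h, y + h + pad_y)
    return x0, y0, x1, y1
```

Clipping matters for faces near the frame edge, where the padded box would otherwise extend outside the image.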

Recognition Process

  • Uses cosine distance between normalized embeddings
  • Distance threshold determines match/unknown classification
  • Lower distance = higher similarity (0 = identical, 2 = opposite)
  • Cooldown prevents duplicate saves of the same person
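The matching rule can be sketched as follows. The example uses only the standard library so it runs without dependencies; with NumPy, the distance between two normalized embeddings is simply 1 minus their dot product. The function names, the (name, embedding) gallery layout, and the choice of `<=` for the threshold comparison are assumptions, not details confirmed by cam.py.

```python
import math

RECOG_THRESHOLD = 0.45


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def cosine_distance(a, b):
    """1 - cosine similarity: 0 = identical direction, 2 = opposite."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def match(query, gallery, threshold=RECOG_THRESHOLD):
    """Return (name, distance) for the closest enrolled embedding.

    gallery is a list of (name, embedding) pairs. The name is None when
    the best distance exceeds the threshold, meaning the face is unknown.
    """
    best_name, best_dist = None, float("inf")
    for name, emb in gallery:
        dist = cosine_distance(query, emb)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist
```

Enrolling several photos per person adds several gallery entries with the same name, so a query only needs to be close to one of them to match.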

Troubleshooting

No Faces Detected in Enrollment

Symptoms: "Sem rosto" ("No face") messages during enrollment

Solutions:

  • Ensure faces are clearly visible and well-lit
  • Use higher resolution images (recommended: 640px minimum)
  • Check that faces occupy at least 20% of the image
  • Verify image files are not corrupted

RTSP Connection Failed

Symptoms: "Não abriu RTSP" ("Could not open RTSP") error

Solutions:

  • Verify RTSP URL format and credentials
  • Test stream with VLC or ffplay first
  • Check network connectivity to camera
  • Ensure camera supports RTSP protocol

Poor Recognition Accuracy

Symptoms: Wrong matches or too many unknowns

Solutions:

  • Adjust RECOG_THRESHOLD: lower it (stricter) to reduce wrong matches, raise it (looser) to reduce unknowns
  • Add more enrollment photos per person (5-10 recommended)
  • Use varied angles and lighting in enrollment photos
  • Increase MIN_FACE_SIZE to filter distant faces

Performance Issues

Symptoms: Lag or high CPU usage

Solutions:

  • Increase DETECT_EVERY_N_FRAMES to process fewer frames
  • Reduce DOWNSCALE_DETECT further (try 0.3 or 0.25)
  • Use GPU acceleration through the CUDA execution provider in ONNX Runtime (see GPU Acceleration)
  • Lower camera stream resolution at source

Advanced Configuration

GPU Acceleration

To use GPU acceleration, modify the ONNX Runtime provider:

arc_sess = ort.InferenceSession(
    ARCFACE_PATH, 
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)

Requires: onnxruntime-gpu and NVIDIA GPU with CUDA support
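If the same script has to run on machines with and without a GPU, the provider list can be built from what the installed ONNX Runtime build reports. The helper below is a sketch with a hypothetical name, `pick_providers`; it is a pure function so that it can be checked without a GPU, and the commented usage relies on `ort.get_available_providers()`, which returns the provider names available in the current installation.

```python
PREFERRED_PROVIDERS = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available):
    """Keep the preferred providers that are actually available, in order."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    # Always fall back to CPU so that session creation has a usable provider.
    return chosen or ["CPUExecutionProvider"]


# Usage inside cam.py (assumes onnxruntime is imported as ort):
#   providers = pick_providers(ort.get_available_providers())
#   arc_sess = ort.InferenceSession(ARCFACE_PATH, providers=providers)
```

Inspecting `arc_sess.get_providers()` after the session is created shows which providers are in use.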

Custom Detection Parameters

Adjust YuNet detector settings:

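import cv2

# YUNET_PATH is the path to models/face_detection_yunet_2023mar.onnx
detector = cv2.FaceDetectorYN.create(
    YUNET_PATH,
    "",                   # No separate config file is needed for ONNX models
    (320, 320),           # Input size (width, height)
    score_threshold=0.6,  # Higher = fewer false positives
    nms_threshold=0.3,    # Non-maximum suppression overlap threshold
    top_k=5000            # Max candidate boxes kept before NMS
)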

Output Format

Saved Images

Each detection can save up to two images:

  1. Annotated Frame: Full frame with bounding box and label (saved by default)
  2. Face Crop: Extracted face region (the save call is commented out by default)

Console Output

[ENROLL] OK: john <- john_1.jpg
[GALLERY] embeddings: 3
RTSP aberto com sucesso!
[SAVE] output/matches/john_20260115_143022_123456_d0.234.jpg | ... (name=john, 0, d=0.234, bbox=120,80,150,180)

Project Structure

.
├── cam.py                  # Main application
├── .env                    # Configuration (not in git)
├── enroll/                 # Enrollment images
├── models/                 # ONNX model files
│   ├── face_detection_yunet_2023mar.onnx
│   └── w600k_r50.onnx
└── output/                 # Detection results
    ├── matches/           # Recognized faces
    └── unknown/           # Unrecognized faces

Security Considerations

  • Store .env file securely (never commit to git)
  • Use strong RTSP credentials
  • Implement access controls for output directory
  • Consider data retention policies for saved images
  • Ensure compliance with local privacy regulations

Acknowledgments

  • OpenCV YuNet face detection model
  • InsightFace ArcFace recognition model
  • ONNX Runtime for efficient inference

Support

For issues and questions:

  • Open an issue on GitHub
  • Check troubleshooting section above
  • Review debug logs in console output
