A real-time face recognition system for RTSP video streams
Features • Installation • Usage • Documentation • Troubleshooting
A real-time face recognition system that processes RTSP video streams using YuNet for face detection and ArcFace for face recognition. The system can identify enrolled individuals and log detections with automatic image capture.
- Python 3.8+ - Core programming language
- OpenCV - Computer vision and face detection (YuNet)
- ONNX Runtime - Efficient model inference
- NumPy - Numerical operations and embeddings
- ArcFace - Deep learning face recognition model
- Real-time Processing: Monitors RTSP video streams continuously
- Face Detection: Uses YuNet (OpenCV) for robust face detection
- Face Recognition: Employs ArcFace embeddings for accurate face matching
- Auto-enrollment: Simple image-based enrollment system
- Smart Logging: Captures detected faces with configurable cooldown periods
- Unknown Face Detection: Optional logging of unrecognized faces
- Performance Optimized: Configurable frame sampling and downscaling
- Python 3.8+
- RTSP camera stream access
- ONNX models (see Model Setup)
- Clone the repository:
git clone <repository-url>
cd <repository-name>
- Install required dependencies:
pip install opencv-python numpy onnxruntime python-dotenv
- Create required directories:
mkdir -p enroll models output/matches output/unknown
Download the required ONNX models and place them in the models/ directory:
- YuNet Face Detector: face_detection_yunet_2023mar.onnx
- ArcFace Recognition Model: w600k_r50.onnx
Create a .env file in the project root:
RTSP_URL=rtsp://username:password@camera-ip:port/stream
Edit the following constants in cam.py:
| Parameter | Default | Description |
|---|---|---|
| DETECT_EVERY_N_FRAMES | 10 | Process every Nth frame for performance |
| DOWNSCALE_DETECT | 0.5 | Scale factor for detection (0.5 = 50% size) |
| MIN_FACE_SIZE | 40 | Minimum face size in pixels to process |
| RECOG_THRESHOLD | 0.45 | Recognition threshold (lower = stricter) |
| SAVE_COOLDOWN_SEC | 3.0 | Seconds between saves for same person |
| SAVE_UNKNOWN | True | Whether to save unknown faces |
| PAD_FACTOR | 0.1 | Padding around detected faces (10%) |
Add images to the enroll/ directory with the naming convention:
enroll/
├── john_1.jpg
├── john_2.jpeg
├── jane_001.png
└── bob_photo.jpg
Naming Rules:
- The person's name is everything before the first underscore
- Example: john_1.jpg → Person name: "john"
- Supported formats: .jpg, .jpeg, .png
- Include multiple photos per person for better accuracy
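The naming rules can be sketched in a few lines of Python. This is only an illustration, not the code in cam.py: the helper names `person_name` and `group_by_person` are hypothetical, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Formats listed in this README; case-insensitive matching is an assumption.
SUPPORTED_EXTS = {".jpg", ".jpeg", ".png"}


def person_name(filename):
    """Return the enrolled name for a file, or None if the format is unsupported."""
    path = Path(filename)
    if path.suffix.lower() not in SUPPORTED_EXTS:
        return None
    # The person's name is everything before the first underscore.
    return path.stem.split("_", 1)[0]


def group_by_person(filenames):
    """Map each person's name to the list of enrollment files for that person."""
    groups = {}
    for filename in filenames:
        name = person_name(filename)
        if name is not None:
            groups.setdefault(name, []).append(filename)
    return groups


if __name__ == "__main__":
    files = ["john_1.jpg", "john_2.jpeg", "jane_001.png", "bob_photo.jpg"]
    print(group_by_person(files))
```

Grouping files this way makes it easy to spot people who have only a single enrollment photo.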
python cam.py
The system will:
- Load and process enrollment images
- Connect to the RTSP stream
- Detect and recognize faces in real-time
- Save matched and unknown faces to output/
Output images are saved in:
- output/matches/ - Recognized faces with names
- output/unknown/ - Unrecognized faces
File naming format:
{name}_{timestamp}_d{distance}.jpg
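As a rough sketch, a file name in this format could be built as follows. The timestamp layout (date, time, microseconds) and the three-decimal distance are inferred from the sample `[SAVE]` log line in this README, and `output_filename` is a hypothetical helper rather than a function taken from cam.py.

```python
from datetime import datetime


def output_filename(name, distance, when=None):
    """Build '{name}_{timestamp}_d{distance}.jpg' for a saved detection."""
    when = when or datetime.now()
    # Assumed layout: YYYYMMDD_HHMMSS_microseconds.
    stamp = when.strftime("%Y%m%d_%H%M%S_%f")
    return f"{name}_{stamp}_d{distance:.3f}.jpg"


if __name__ == "__main__":
    print(output_filename("john", 0.234, datetime(2026, 1, 15, 14, 30, 22, 123456)))
```

Including microseconds keeps names unique even when several faces are saved within the same second.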
- Stream Capture: Connects to RTSP stream with minimal buffering
- Frame Sampling: Processes every Nth frame to optimize performance
- Downscaling: Reduces frame size for faster detection
- Face Detection: YuNet identifies faces and bounding boxes
- Face Extraction: Crops and pads detected faces
- Embedding: ArcFace generates 512-dimensional embeddings
- Matching: Compares embeddings with enrolled gallery
- Logging: Saves annotated frames and face crops
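The sampling, downscaling, and cropping steps above can be sketched with plain arithmetic. The constants mirror the defaults in the configuration table; the helper names and the exact rounding and clipping behaviour are assumptions, not the logic in cam.py.

```python
# Defaults taken from the configuration table in this README.
DETECT_EVERY_N_FRAMES = 10
DOWNSCALE_DETECT = 0.5
MIN_FACE_SIZE = 40
PAD_FACTOR = 0.1


def should_detect(frame_index):
    """Run detection only on every Nth frame."""
    return frame_index % DETECT_EVERY_N_FRAMES == 0


def to_full_res(box, scale=DOWNSCALE_DETECT):
    """Map an (x, y, w, h) box found on the downscaled frame back to full size."""
    return tuple(int(round(v / scale)) for v in box)


def big_enough(box, min_size=MIN_FACE_SIZE):
    """Assumption: a face is kept when its shorter side reaches min_size pixels."""
    _, _, w, h = box
    return min(w, h) >= min_size


def pad_and_clip(box, frame_w, frame_h, pad=PAD_FACTOR):
    """Grow the box by pad on each side and clip it to the frame.

    Returns crop corners (x0, y0, x1, y1).
    """
    x, y, w, h = box
    dx, dy = int(w * pad), int(h * pad)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(frame_w, x + w + dx), min(frame_h, y + h + dy)
    return x0, y0, x1, y1


if __name__ == "__main__":
    box = to_full_res((60, 40, 75, 90))
    if big_enough(box):
        print(box, pad_and_clip(box, 1920, 1080))
```

Detecting on the downscaled frame while cropping from the full-resolution frame keeps detection fast without degrading the image passed to ArcFace.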
- Uses cosine distance between normalized embeddings
- Distance threshold determines match/unknown classification
- Lower distance = higher similarity (0 = identical, 2 = opposite)
- Cooldown prevents duplicate saves of the same person
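The matching rules above can be illustrated with a small, dependency-free sketch. It uses plain Python lists in place of NumPy arrays for clarity; the names `best_match` and `SaveCooldown`, and the choice of `<=` for the threshold comparison, are assumptions rather than details taken from cam.py.

```python
import math

RECOG_THRESHOLD = 0.45    # default from the configuration table
SAVE_COOLDOWN_SEC = 3.0   # default from the configuration table


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_distance(a, b):
    """1 - cosine similarity: 0 = identical direction, 2 = opposite."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def best_match(embedding, gallery, threshold=RECOG_THRESHOLD):
    """gallery is a list of (name, embedding); returns (name or None, distance)."""
    best_name, best_dist = None, float("inf")
    for name, ref in gallery:
        dist = cosine_distance(embedding, ref)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist  # treated as an unknown face


class SaveCooldown:
    """Allow one save per person every `seconds` seconds."""

    def __init__(self, seconds=SAVE_COOLDOWN_SEC):
        self.seconds = seconds
        self._last_saved = {}

    def ready(self, name, now):
        last = self._last_saved.get(name)
        if last is not None and now - last < self.seconds:
            return False
        self._last_saved[name] = now
        return True


if __name__ == "__main__":
    gallery = [("john", [1.0, 0.0]), ("jane", [0.0, 1.0])]
    print(best_match([1.0, 0.1], gallery))
```

In a real pipeline the embeddings would be 512-dimensional and `now` would come from a clock such as `time.monotonic()`.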
Symptoms: "Sem rosto" ("No face") messages during enrollment
Solutions:
- Ensure faces are clearly visible and well-lit
- Use higher resolution images (recommended: 640px minimum)
- Check that faces occupy at least 20% of the image
- Verify image files are not corrupted
Symptoms: "Não abriu RTSP" ("Could not open RTSP") error
Solutions:
- Verify RTSP URL format and credentials
- Test stream with VLC or ffplay first
- Check network connectivity to camera
- Ensure camera supports RTSP protocol
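Before testing the stream itself, the URL format can be sanity-checked with the standard library. This is a hedged sketch: `check_rtsp_url` and `redact` are hypothetical helpers, not part of cam.py, and the script assumes RTSP_URL is already present in the environment (for example after calling `load_dotenv()` from python-dotenv).

```python
import os
from urllib.parse import urlsplit


def check_rtsp_url(url):
    """Return a list of problems found in an RTSP URL (empty list = looks fine)."""
    problems = []
    parts = urlsplit(url.strip())
    if parts.scheme not in ("rtsp", "rtsps"):
        problems.append("scheme should be rtsp:// (or rtsps://)")
    if not parts.hostname:
        problems.append("missing camera host")
    try:
        parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        problems.append("port is not a valid number")
    return problems


def redact(url):
    """Hide the password so the URL can be printed in logs."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.netloc.rpartition("@")[2]
    return parts._replace(netloc=f"{parts.username}:***@{host}").geturl()


if __name__ == "__main__":
    url = os.environ.get("RTSP_URL", "")
    print(redact(url), check_rtsp_url(url) or "looks fine")
```

Note that the placeholder URL from the .env example is flagged because `port` is not a number, which is a quick way to catch an unedited template.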
Symptoms: Wrong matches or too many unknowns
Solutions:
- Adjust RECOG_THRESHOLD (lower = stricter, higher = looser)
- Add more enrollment photos per person (5-10 recommended)
- Use varied angles and lighting in enrollment photos
- Increase MIN_FACE_SIZE to filter distant faces
Symptoms: Lag or high CPU usage
Solutions:
- Increase DETECT_EVERY_N_FRAMES to process fewer frames
- Reduce DOWNSCALE_DETECT further (try 0.3 or 0.25)
- Use GPU acceleration with CUDA providers in ONNX Runtime
- Lower camera stream resolution at source
To use GPU acceleration, modify the ONNX Runtime provider:
arc_sess = ort.InferenceSession(
    ARCFACE_PATH,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
Requires: onnxruntime-gpu and NVIDIA GPU with CUDA support
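To fall back to CPU cleanly on machines without CUDA, the provider list can be filtered against what the installed ONNX Runtime build reports. In this sketch `choose_providers` is a hypothetical helper; `ort.get_available_providers()` is the ONNX Runtime call that lists the providers compiled into the installed package.

```python
# Preference order: GPU first, CPU as the fallback.
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def choose_providers(available, preferred=PREFERRED):
    """Keep preferred providers that are actually available, in preference order."""
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


# Possible use in cam.py (assumes onnxruntime is installed and ARCFACE_PATH is set):
#   import onnxruntime as ort
#   providers = choose_providers(ort.get_available_providers())
#   arc_sess = ort.InferenceSession(ARCFACE_PATH, providers=providers)

if __name__ == "__main__":
    print(choose_providers(["CPUExecutionProvider"]))
```

Printing the chosen list at startup makes it obvious whether inference is really running on the GPU.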
Adjust YuNet detector settings:
detector = cv2.FaceDetectorYN.create(
    YUNET_PATH,
    "",
    (320, 320),
    score_threshold=0.6,  # Higher = fewer false positives
    nms_threshold=0.3,    # Overlap threshold for non-maximum suppression
    top_k=5000            # Candidate boxes kept before NMS
)
Each detection can save two images:
- Annotated Frame: Full frame with bounding box and label
- Face Crop: Extracted face region (commented out by default)
[ENROLL] OK: john <- john_1.jpg
[GALLERY] embeddings: 3
RTSP aberto com sucesso!
[SAVE] output/matches/john_20260115_143022_123456_d0.234.jpg | ... (name=john, 0, d=0.234, bbox=120,80,150,180)
.
├── cam.py              # Main application
├── .env                # Configuration (not in git)
├── enroll/             # Enrollment images
├── models/             # ONNX model files
│   ├── face_detection_yunet_2023mar.onnx
│   └── w600k_r50.onnx
└── output/             # Detection results
    ├── matches/        # Recognized faces
    └── unknown/        # Unrecognized faces
- Store the .env file securely (never commit to git)
- Use strong RTSP credentials
- Implement access controls for output directory
- Consider data retention policies for saved images
- Ensure compliance with local privacy regulations
- OpenCV YuNet face detection model
- InsightFace ArcFace recognition model
- ONNXRuntime for efficient inference
For issues and questions:
- Open an issue on GitHub
- Check troubleshooting section above
- Review debug logs in console output