YounesBensafia/VisionCompressor

VisionCompressor

Block-matching motion estimation for image sequences. Takes a source/reference image and a target image, splits the target into NxN blocks, and finds the best matching block for each in the source using a three-step search. Outputs motion vectors and residuals; can reconstruct the target from source + vectors + residuals.

How it works

Video codecs (MPEG-1, H.261, H.264) avoid storing every frame independently. Instead they encode the difference between a frame and a prediction built from a previously decoded frame. The prediction is improved by compensating for motion: each block in the current frame is predicted by shifting a block from the reference frame by some (dx, dy), called a motion vector. The difference that remains after compensation is the residual.
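To make the vocabulary concrete, here is a minimal NumPy sketch of motion compensation on a toy 8×8 frame. The frame contents, the `predict_block` helper, and the chosen block position are illustrative assumptions, not code from this repository.

```python
import numpy as np

# Hypothetical 8x8 reference frame with distinct pixel values.
reference = np.arange(64, dtype=np.int16).reshape(8, 8)

# Simulate motion: the content moves 2 pixels down and 1 pixel right.
current = np.roll(reference, shift=(2, 1), axis=(0, 1))


def predict_block(ref, top, left, size, dx, dy):
    """Return the size x size block of `ref` displaced by (dx, dy)."""
    return ref[top + dy:top + dy + size, left + dx:left + dx + size]


top, left, size = 4, 4, 2
block = current[top:top + size, left:left + size]

# Without compensation, the co-located reference block differs everywhere.
uncompensated = block - predict_block(reference, top, left, size, dx=0, dy=0)

# With the motion vector (dx, dy) = (-1, -2), the prediction is exact,
# so the residual is all zeros and costs almost nothing to store.
residual = block - predict_block(reference, top, left, size, dx=-1, dy=-2)
```

Real frames rarely match this cleanly, which is why the residual is stored alongside the vector.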

This project implements that motion estimation step for two images:

  1. Pad both images so dimensions are divisible by block_size
  2. For each block_size×block_size block in the target:
    • Extract a search window (±search_padding px) from the source
    • Convert both to YCrCb; matching is done on the Y (luminance) channel only since the human eye is less sensitive to colour detail
    • Run a three-step search to find the best-matching block:
      • Start at the centre of the search window, check 9 positions spaced by step
      • Move the centre to the best match, halve step, repeat until step < 1
    • Compute the motion vector and the residual (pixel differences)
    • If MSE < threshold, the match is good enough — store zero residual instead
  3. Reconstruct: for each block, take source block at motion-vector offset, add its residual
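The steps above can be sketched end to end in NumPy. This is a simplified illustration on single-channel (luminance-like) arrays, not the code in `block_matcher.py` or `reconstructor.py`: the function names, the edge padding, the initial step of `(search_padding + 1) // 2`, and indexing the source directly instead of extracting a search window are all assumptions made for the example.

```python
import numpy as np


def pad_to_multiple(img, block_size):
    """Edge-pad a 2-D array so both dimensions are multiples of block_size."""
    h, w = img.shape
    return np.pad(img, ((0, (-h) % block_size), (0, (-w) % block_size)), mode="edge")


def mse(a, b):
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.mean(diff ** 2))


def three_step_search(source, block, top, left, search_padding):
    """Return ((dx, dy), error) for the best match found by a three-step search."""
    size = block.shape[0]
    h, w = source.shape
    best = (0, 0)
    best_err = mse(source[top:top + size, left:left + size], block)
    step = max(1, (search_padding + 1) // 2)
    while step >= 1:
        cx, cy = best
        # Check the 3x3 grid of positions spaced by `step` around the centre.
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                x, y = cx + dx, cy + dy
                r, c = top + y, left + x
                inside = 0 <= r <= h - size and 0 <= c <= w - size
                if max(abs(x), abs(y)) > search_padding or not inside:
                    continue
                err = mse(source[r:r + size, c:c + size], block)
                if err < best_err:
                    best, best_err = (x, y), err
        step //= 2  # halve the step; the loop ends once it drops below 1
    return best, best_err


def encode(source, target, block_size=8, search_padding=7, threshold=0.0):
    """Compute one motion vector and one residual per target block."""
    src = pad_to_multiple(source, block_size).astype(np.int16)
    tgt = pad_to_multiple(target, block_size).astype(np.int16)
    vectors, residuals = {}, {}
    for top in range(0, tgt.shape[0], block_size):
        for left in range(0, tgt.shape[1], block_size):
            block = tgt[top:top + block_size, left:left + block_size]
            (dx, dy), err = three_step_search(src, block, top, left, search_padding)
            match = src[top + dy:top + dy + block_size, left + dx:left + dx + block_size]
            residual = block - match
            if err < threshold:
                # Match is good enough: store a zero residual instead.
                residual = np.zeros_like(residual)
            vectors[(top, left)] = (dx, dy)
            residuals[(top, left)] = residual
    return src, vectors, residuals


def reconstruct(src, vectors, residuals, block_size=8):
    """Rebuild the target from source blocks at their offsets plus residuals."""
    out = np.zeros_like(src)
    for (top, left), (dx, dy) in vectors.items():
        match = src[top + dy:top + dy + block_size, left + dx:left + dx + block_size]
        out[top:top + block_size, left:left + block_size] = match + residuals[(top, left)]
    return out


rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(30, 45), dtype=np.uint8)
target = np.roll(source, shift=(1, 2), axis=(0, 1))
src, vectors, residuals = encode(source, target)
rebuilt = reconstruct(src, vectors, residuals)
exact = np.array_equal(rebuilt, pad_to_multiple(target, 8))
```

With `threshold=0.0` every residual is kept, so the reconstruction is lossless whichever vectors the search picks. Raising the threshold trades reconstruction accuracy for more zero residuals.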

For a search range of ±n pixels, the three-step search evaluates O(log n) candidate positions per block, versus O(n²) for an exhaustive search. It does not guarantee the global minimum MSE, but it works well for typical motion magnitudes.
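A quick count shows the gap. The sketch below assumes a search range of ±p pixels and an initial step of `(p + 1) // 2`, as in the classic formulation; the project's actual starting step may differ. The three-step figure is an upper bound, since positions outside the frame are skipped in practice.

```python
def exhaustive_positions(p):
    """Every integer displacement in a (2p + 1) x (2p + 1) window."""
    return (2 * p + 1) ** 2


def three_step_positions(p):
    """Initial centre plus 8 new neighbours per round, halving the step each round."""
    step = max(1, (p + 1) // 2)
    count = 1
    while step >= 1:
        count += 8
        step //= 2
    return count


for p in (7, 15, 31):
    print(p, exhaustive_positions(p), three_step_positions(p))
# 7 225 25
# 15 961 33
# 31 3969 41
```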

Modes

| Mode | Match metric | Residual stored on | Use case |
|---|---|---|---|
| ycbcr (default) | Y-channel MSE | YCrCb difference | Standard, good compression |
| bgr | Y-channel MSE | BGR difference | Full-color reconstruction |
| y_only | Y-channel MSE | Y-channel only | Grayscale, most efficient |
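All three modes match on the Y channel and differ only in what the residual is computed on. As a rough illustration of the match metric, the sketch below derives luma from 8-bit BGR pixels with the BT.601 weights and compares two blocks by Y-channel MSE. The helper names are hypothetical, and the project itself presumably relies on OpenCV for the colour conversion.

```python
import numpy as np


def luma_from_bgr(bgr):
    """Approximate the Y channel of a BGR image using BT.601 weights."""
    b, g, r = (bgr[..., i].astype(np.float64) for i in range(3))
    return 0.299 * r + 0.587 * g + 0.114 * b


def y_channel_mse(block_a, block_b):
    """Match metric: mean squared error between the luma of two BGR blocks."""
    diff = luma_from_bgr(block_a) - luma_from_bgr(block_b)
    return float(np.mean(diff ** 2))


white = np.full((2, 2, 3), 255, dtype=np.uint8)
green = np.zeros((2, 2, 3), dtype=np.uint8)
green[..., 1] = 255  # channel order is B, G, R

luma_white = luma_from_bgr(white)
luma_green = luma_from_bgr(green)
same_block_error = y_channel_mse(white, white)
```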

Usage

GUI

uv run python main.py

Drag & drop images onto the drop zones, or click to browse. Configure block size, threshold, and mode in the controls bar.

CLI

# Single frame pair
uv run python -m visioncompressor.cli single source.png target.png -o output/

# Batch process numbered frames (0.png, 1.png, 2.png ...)
uv run python -m visioncompressor.cli batch input_folder/ -o output/

# Create video from reconstructed frames
uv run python -m visioncompressor.cli video output/reconstructed/ -o video.avi

# Options
uv run python -m visioncompressor.cli single --mode bgr --block-size 16 --threshold 24 a.png b.png

Project Structure

VisionCompressor/
├── main.py                              # GUI entry point
├── visioncompressor/
│   ├── cli.py                           # CLI entry point
│   ├── core/
│   │   ├── block_matcher.py             # Three-step search, MSE matching
│   │   ├── reconstructor.py             # Frame reconstruction, batch, video
│   │   └── types.py                     # Default constants
│   ├── gui/
│   │   ├── main_window.py               # PyQt6 GUI
│   │   └── workers.py                   # QThread workers
│   └── utils/
│       └── mse.py                       # Mean squared error
└── tests/
    └── test_block_matcher.py

Install

Requires Python ≥ 3.13 and uv.

git clone https://github.com/YounesBensafia/VisionCompressor.git
cd VisionCompressor
uv sync

This creates a virtual environment and installs all dependencies (opencv-python, matplotlib, PyQt6) pinned from uv.lock. The commands in the Usage section should be run with uv run python ... or after source .venv/bin/activate.

About

Block-matching motion estimation, inspired by the technique used in the video codecs behind Twitch, YouTube, and Zoom. Feed it two images and inspect the motion vectors and residual heatmaps. Three-step search, batch processing, YCrCb/BGR/Y-only modes. Interactive PyQt6 GUI and CLI.
