Structure from Motion Pipeline

A config-driven Structure from Motion (SfM) pipeline using OpenCV for feature detection and GTSAM for bundle adjustment. It performs incremental 3D reconstruction from multiple overlapping images with robust optimization and achieves sub-pixel accuracy (0.001 px RMS).

Example Reconstruction

(figure: example reconstruction)

Pipeline Flow

(figure: pipeline diagram)

Installation

Clone the repository

git clone git@github.com:deep-zspace/Structure_from_motion_GTSAM.git
cd Structure_from_motion_GTSAM

Install dependencies

pip install -r requirements.txt

Required packages:

  • numpy: Numerical computations
  • opencv-python: Image processing and feature detection
  • opencv-contrib-python: Additional OpenCV algorithms (SIFT)
  • matplotlib: Static visualization
  • PyYAML: Configuration file parsing
  • gtsam: Bundle adjustment optimization
  • open3d: Interactive 3D visualization

GTSAM Installation

GTSAM can be challenging to install. If pip installation fails, try:

# Ubuntu/Debian
sudo apt-get install libgtsam-dev
pip install gtsam

# Or build from source
git clone https://github.com/borglab/gtsam.git
cd gtsam
mkdir build && cd build
cmake -DGTSAM_BUILD_PYTHON=ON ..
make -j$(nproc)
sudo make install
make python-install  # installs the Python wrapper

Quick Start

1. Prepare your images

Place your images in a folder (default: ./images):

mkdir images
cp /path/to/your/images/*.jpg images/

Requirements:

  • Use sequential images with overlapping views
  • Images should be from the same calibrated camera
  • JPEG or PNG format
  • At least 10 images recommended for good reconstruction
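The checks above are easy to script before launching a long run. A small helper for validating an image folder (hypothetical, not part of the repository):

```python
from pathlib import Path

def check_image_folder(folder, min_images=10):
    """Return the sorted JPEG/PNG files in `folder`, warning if too few."""
    exts = {".jpg", ".jpeg", ".png"}
    images = sorted(p for p in Path(folder).iterdir()
                    if p.suffix.lower() in exts)
    if len(images) < min_images:
        print(f"Warning: only {len(images)} images found; "
              f"at least {min_images} are recommended.")
    return images
```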

2. Configure camera parameters

Edit config.yaml with your camera calibration:

camera:
  image_width: 1080      # Your image width
  image_height: 1920     # Your image height
  focal_length: 1484.5   # Your focal length in pixels

To estimate focal length from EXIF data:

focal_length_pixels = (focal_length_mm / sensor_width_mm) * image_width_pixels

Common sensor widths: 36mm (full frame), 23.6mm (APS-C), 5.7mm (phone)
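The EXIF formula above in code form (a trivial convenience helper, not part of the repository):

```python
def focal_length_pixels(focal_length_mm, sensor_width_mm, image_width_pixels):
    """Convert a focal length from millimeters to pixels via the sensor width."""
    return focal_length_mm / sensor_width_mm * image_width_pixels
```

For example, a 26 mm lens on a full-frame (36 mm) sensor at 4000 px width gives roughly 2889 px.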

3. Run the pipeline

python run_sfm.py --config config.yaml

The pipeline will:

  1. Load and detect features in all images
  2. Find the best initial image pair
  3. Initialize reconstruction with two-view geometry
  4. Incrementally register remaining cameras
  5. Triangulate 3D points between views
  6. Run bundle adjustment to optimize cameras and points
  7. Save results to output folder
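Step 5 above is the classic two-view triangulation problem. A minimal linear (DLT) sketch with NumPy, for illustration only and not the pipeline's actual implementation:

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover one 3D point from its
    pixel observations x1, x2 under 3x4 projection matrices P1, P2."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector for the
    # smallest singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```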

4. Visualize results

Interactive 3D viewer:

python tools/visualize_pointcloud.py --reconstruction output/reconstruction.pkl

Static visualization:

python tools/visualize.py --reconstruction output/reconstruction.pkl

Configuration Guide

The config.yaml file controls all pipeline parameters. Key sections:

Paths

paths:
  images: "./images"           # Input images folder
  output: "./output"           # Output directory
  point_cloud: "sparse_point_cloud.ply"
  reconstruction: "reconstruction.pkl"

Feature Detection

features:
  method: "SIFT"               # SIFT or ORB

  sift:
    n_features: 5000           # Max features per image
    contrast_threshold: 0.04   # Lower = more features, less stable

  matching:
    ratio_test: 0.75           # Lowe's ratio (0.7-0.8 typical)
    min_matches: 150           # Minimum matches between pairs
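The ratio_test value implements Lowe's ratio test: a candidate match is kept only when its best descriptor distance is clearly smaller than the second-best. A minimal sketch in pure Python (illustrative, not the pipeline's code):

```python
def lowe_ratio_filter(match_distances, ratio=0.75):
    """Keep match i only when best_dist < ratio * second_best_dist.
    match_distances: list of (best_dist, second_best_dist) pairs."""
    return [i for i, (best, second) in enumerate(match_distances)
            if best < ratio * second]
```

A lower ratio keeps only the most unambiguous matches; a higher ratio admits more (and noisier) matches.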

Camera Calibration

camera:
  image_width: 1080
  image_height: 1920
  focal_length: 1484.5         # In pixels
  distortion: [0.0, 0.0, 0.0, 0.0, 0.0]  # [k1, k2, p1, p2, k3]
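From these values the pipeline can build the 3x3 intrinsic matrix K. The sketch below assumes square pixels and a principal point at the image center, the usual default when only a focal length is configured (an assumption, not stated by the repository):

```python
def intrinsic_matrix(image_width, image_height, focal_length):
    """Build a pinhole intrinsic matrix K, assuming square pixels and
    the principal point at the image center."""
    cx, cy = image_width / 2.0, image_height / 2.0
    return [[focal_length, 0.0, cx],
            [0.0, focal_length, cy],
            [0.0, 0.0, 1.0]]
```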

Bundle Adjustment Parameters

The bundle adjustment parameters control optimization quality and convergence:

bundle_adjustment:
  # Measurement noise (most important parameter)
  noise_sigma: 500.0           # Pixel measurement uncertainty
                               # Higher = more flexible optimization
                               # Lower = stricter fitting to measurements
                               # Range: 100-1000 typical

  # Robust loss
  use_robust: true             # Enable Huber loss (recommended)
  huber_threshold: 1.345       # Standard value, rarely needs tuning

  # Optimization convergence
  max_iterations: 150          # Maximum optimizer iterations
  relative_tolerance: 1.0e-10  # Stop when relative change is small
  absolute_tolerance: 1.0e-10  # Stop when absolute error is small

  # Levenberg-Marquardt damping
  lambda_initial: 1.0          # Starting damping factor
  lambda_upper_bound: 1.0e10   # Max damping (prevents overshooting)
  lambda_lower_bound: 0.0      # Min damping (Gauss-Newton)

  # Outlier removal (optional)
  iterative_outlier_removal: true
  max_outlier_passes: 100
  outlier_threshold: 5.0       # Reject observations above N pixels error
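For intuition, huber_threshold k is where the loss switches from quadratic to linear (in whitened units), which caps the influence of outlier observations. A minimal sketch of the Huber loss itself, not GTSAM's implementation:

```python
def huber_loss(r, k=1.345):
    """Huber loss: quadratic for |r| <= k, linear beyond, continuous at k."""
    a = abs(r)
    if a <= k:
        return 0.5 * r * r
    return k * a - 0.5 * k * k
```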

Tuning Bundle Adjustment

Most important parameter to tune is noise_sigma:

If reconstruction is failing or unstable:

  • Increase noise_sigma to 800-1000
  • This makes optimization more flexible and helps convergence
  • Good for noisy data or imprecise calibration

If reconstruction is good but you want better accuracy:

  • Decrease noise_sigma to 200-400
  • This enforces stricter fitting to observations
  • Only use with good calibration and clean data

Other parameters:

  • max_iterations: Increase to 200-300 if not converging
  • lambda_initial: Increase to 10.0 if optimization is unstable
  • outlier_threshold: Decrease to 3.0 for stricter outlier rejection

Leave other parameters at default unless you understand Levenberg-Marquardt optimization.

SfM Settings

sfm:
  initialization:
    min_matches: 100           # Matches required for initialization
    max_num_image_pairs: 18    # How many pairs to test

  registration:
    min_points: 5              # Min 3D points for PnP
    ransac_reproj_error: 2.0   # RANSAC threshold in pixels

  run_ba_every: 5              # Run BA every N cameras (0 = only at end)
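ransac_reproj_error is compared against each observation's reprojection error during PnP registration. A minimal sketch of that error (illustrative, not the pipeline's code):

```python
import numpy as np

def reprojection_error(K, R, t, X, x_obs):
    """Pixel distance between observation x_obs and the projection of
    3D point X under rotation R, translation t, and intrinsics K."""
    x_cam = R @ X + t
    x_img = K @ x_cam
    x_proj = x_img[:2] / x_img[2]
    return float(np.linalg.norm(x_proj - x_obs))
```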

Output Files

After running the pipeline, check the output/ folder:

  • sparse_point_cloud.ply - 3D point cloud (can open in MeshLab, CloudCompare)
  • reconstruction.pkl - Complete reconstruction data (cameras, points, tracks)
  • reconstruction_3d.png - Static visualization image
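reconstruction.pkl can be inspected directly with the standard library; the exact contents depend on the pipeline's serialization, so treat the structure as something to explore:

```python
import pickle

def load_reconstruction(path):
    """Load a pickled reconstruction saved by the pipeline."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

For example, `rec = load_reconstruction("output/reconstruction.pkl")`, then inspect `rec` for the cameras, points, and tracks listed above.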

Visualization Options

Basic visualization

# Interactive viewer (Open3D)
python tools/visualize_pointcloud.py --reconstruction output/reconstruction.pkl

# Static matplotlib plot
python tools/visualize.py --reconstruction output/reconstruction.pkl

Running on Custom Dataset

  1. Prepare images from the same camera with overlapping views
  2. Calibrate your camera or estimate focal length
  3. Update config.yaml with camera parameters
  4. Place images in ./images folder
  5. Run python run_sfm.py --config config.yaml
  6. Check logs for any errors
  7. If reconstruction fails:
    • Increase noise_sigma to 800-1000
    • Increase max_num_image_pairs to test more initial pairs
    • Reduce the matching min_matches (e.g. from 150 to 100) if images have few features
    • Check that focal length is correct

License

This project is provided as-is for educational and research purposes.
