A configurable Structure from Motion (SfM) pipeline using OpenCV for feature detection and GTSAM for bundle adjustment. This implementation performs incremental 3D reconstruction from multiple images with robust optimization.
```
git clone git@github.com:deep-zspace/Structure_from_motion_GTSAM.git
cd sfm
pip install -r requirements.txt
```

Required packages:
- numpy: Numerical computations
- opencv-python: Image processing and feature detection
- opencv-contrib-python: Additional OpenCV algorithms (SIFT)
- matplotlib: Static visualization
- PyYAML: Configuration file parsing
- gtsam: Bundle adjustment optimization
- open3d: Interactive 3D visualization
GTSAM can be challenging to install. If pip installation fails, try:
```
# Ubuntu/Debian
sudo apt-get install libgtsam-dev
pip install gtsam

# Or build from source
git clone https://github.com/borglab/gtsam.git
cd gtsam
mkdir build && cd build
cmake -DGTSAM_BUILD_PYTHON=ON ..
make install
```

Place your images in a folder (default: `./images`):

```
mkdir images
cp /path/to/your/images/*.jpg images/
```

Requirements:
- Use sequential images with overlapping views
- Images should be from the same calibrated camera
- JPEG or PNG format
- At least 10 images recommended for good reconstruction
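These requirements can be sanity-checked before running the pipeline; `check_image_folder` below is a hypothetical helper for illustration, not part of the repo:

```python
import glob
import os

def check_image_folder(folder, min_images=10):
    """Collect JPEG/PNG images and warn if below the recommended minimum."""
    patterns = ("*.jpg", "*.jpeg", "*.png", "*.JPG", "*.PNG")
    images = sorted(
        p for pattern in patterns for p in glob.glob(os.path.join(folder, pattern))
    )
    if len(images) < min_images:
        print(f"Warning: only {len(images)} images found; "
              f"{min_images}+ recommended for a good reconstruction")
    return images

if __name__ == "__main__":
    print(check_image_folder("./images"))
```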
Edit config.yaml with your camera calibration:
```yaml
camera:
  image_width: 1080      # Your image width
  image_height: 1920     # Your image height
  focal_length: 1484.5   # Your focal length in pixels
```

To estimate focal length from EXIF data:

```
focal_length_pixels = (focal_length_mm / sensor_width_mm) * image_width_pixels
```

Common sensor widths: 36mm (full frame), 23.6mm (APS-C), 5.7mm (phone)
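The conversion translates directly to code; the lens and sensor values below are illustrative:

```python
def focal_length_pixels(focal_length_mm, sensor_width_mm, image_width_pixels):
    """Convert a focal length in mm to pixels using the physical sensor width."""
    return (focal_length_mm / sensor_width_mm) * image_width_pixels

# Example: a 50 mm lens on a full-frame (36 mm wide) sensor, 1080 px wide images
print(focal_length_pixels(50.0, 36.0, 1080))  # 1500.0
```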
```
python run_sfm.py --config config.yaml
```

The pipeline will:
- Load and detect features in all images
- Find the best initial image pair
- Initialize reconstruction with two-view geometry
- Incrementally register remaining cameras
- Triangulate 3D points between views
- Run bundle adjustment to optimize cameras and points
- Save results to output folder
Interactive 3D viewer:
```
python tools/visualize_pointcloud.py --reconstruction output/reconstruction.pkl
```

Static visualization:

```
python tools/visualize.py --reconstruction output/reconstruction.pkl
```

The config.yaml file controls all pipeline parameters. Key sections:
```yaml
paths:
  images: "./images"            # Input images folder
  output: "./output"            # Output directory
  point_cloud: "sparse_point_cloud.ply"
  reconstruction: "reconstruction.pkl"

features:
  method: "SIFT"                # SIFT or ORB
  sift:
    n_features: 5000            # Max features per image
    contrast_threshold: 0.04    # Lower = more features, less stable

matching:
  ratio_test: 0.75              # Lowe's ratio (0.7-0.8 typical)
  min_matches: 150              # Minimum matches between pairs

camera:
  image_width: 1080
  image_height: 1920
  focal_length: 1484.5          # In pixels
  distortion: [0.0, 0.0, 0.0, 0.0, 0.0]  # [k1, k2, p1, p2, k3]
```

The bundle adjustment parameters control optimization quality and convergence:
```yaml
bundle_adjustment:
  # Measurement noise (most important parameter)
  noise_sigma: 500.0           # Pixel measurement uncertainty
                               # Higher = more flexible optimization
                               # Lower = stricter fitting to measurements
                               # Range: 100-1000 typical

  # Robust loss
  use_robust: true             # Enable Huber loss (recommended)
  huber_threshold: 1.345       # Standard value, rarely needs tuning

  # Optimization convergence
  max_iterations: 150          # Maximum optimizer iterations
  relative_tolerance: 1.0e-10  # Stop when relative change is small
  absolute_tolerance: 1.0e-10  # Stop when absolute error is small

  # Levenberg-Marquardt damping
  lambda_initial: 1.0          # Starting damping factor
  lambda_upper_bound: 1.0e10   # Max damping (prevents overshooting)
  lambda_lower_bound: 0.0      # Min damping (Gauss-Newton)

  # Outlier removal (optional)
  iterative_outlier_removal: true
  max_outlier_passes: 100
  outlier_threshold: 5.0       # Reject observations above N pixels error
```

The most important parameter to tune is `noise_sigma`:
If reconstruction is failing or unstable:
- Increase `noise_sigma` to 800-1000; this makes optimization more flexible and helps convergence
- Good for noisy data or imprecise calibration
If reconstruction is good but you want better accuracy:
- Decrease `noise_sigma` to 200-400; this enforces stricter fitting to observations
- Only use with good calibration and clean data
Other parameters:
- `max_iterations`: Increase to 200-300 if not converging
- `lambda_initial`: Increase to 10.0 if optimization is unstable
- `outlier_threshold`: Decrease to 3.0 for stricter outlier rejection
Leave other parameters at default unless you understand Levenberg-Marquardt optimization.
```yaml
sfm:
  initialization:
    min_matches: 100            # Matches required for initialization
    max_num_image_pairs: 18     # How many pairs to test
  registration:
    min_points: 5               # Min 3D points for PnP
    ransac_reproj_error: 2.0    # RANSAC threshold in pixels
  run_ba_every: 5               # Run BA every N cameras (0 = only at end)
```

After running the pipeline, check the `output/` folder:
- `sparse_point_cloud.ply` - 3D point cloud (can open in MeshLab, CloudCompare)
- `reconstruction.pkl` - Complete reconstruction data (cameras, points, tracks)
- `reconstruction_3d.png` - Static visualization image
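A quick way to sanity-check the exported point cloud without opening a viewer is to read its PLY header; `ply_vertex_count` below is a small stdlib-only helper for illustration, not part of the repo:

```python
def ply_vertex_count(path):
    """Read a PLY header and return the declared number of vertices (points)."""
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                return int(line.split()[-1])
            if line == b"end_header":
                break
    raise ValueError("no vertex element found in header")

# Example: print(ply_vertex_count("output/sparse_point_cloud.ply"))
```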
```
# Interactive viewer (Open3D)
python tools/visualize_pointcloud.py --reconstruction output/reconstruction.pkl

# Static matplotlib plot
python tools/visualize.py --reconstruction output/reconstruction.pkl
```

- Prepare images from the same camera with overlapping views
- Calibrate your camera or estimate focal length
- Update `config.yaml` with camera parameters
- Place images in the `./images` folder
- Run `python run_sfm.py`
- Check logs for any errors
- If reconstruction fails:
  - Increase `noise_sigma` to 800-1000
  - Increase `max_num_image_pairs` to test more initial pairs
  - Reduce `min_matches` to 100 if images have few features
  - Check that focal length is correct
This project is provided as-is for educational and research purposes.
- OpenCV: https://opencv.org/
- GTSAM: https://gtsam.org/
- Open3D: http://www.open3d.org/
- Snavely et al., "Photo Tourism: Exploring Photo Collections in 3D", SIGGRAPH 2006
- Levenberg-Marquardt: https://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm
