Modelling-Buildings-with-vision-language-models

A lightweight pipeline to extract building materials, heights, and roof types/conditions from street-level imagery using modern vision–language models (VLMs) plus classic CNN backbones.


Table of Contents

  • Introduction
  • Datasets

Introduction

Urban planners, insurers, and emergency responders increasingly rely on up-to-date information about the materials, heights, and roof conditions of buildings, since these attributes drive decisions around safety, maintenance, and risk. This project explores whether modern vision–language models (VLMs) and lightweight fine-tuning pipelines can extract such attributes efficiently from street-level imagery.

We adopt a segmentation-first workflow that combines GroundedSAM (object-level precision) with CLIPSeg (region semantics). The merged masks isolate building regions, which then feed two downstream tasks (each sketched after the list below):

  • Material / roof type & condition classification via CLIP fine-tuning or a YOLOv8 classification head.
  • Height estimation from depth cues and calibrated geometry, achieving an average relative error of roughly 23% on our internal benchmarks.
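
As a concrete illustration of the segmentation-first step, the sketch below queries CLIPSeg through the Hugging Face transformers API and merges its mask with a GroundedSAM mask. Because Grounded-SAM is usually run from its own toolchain, its output is stubbed in here as a precomputed array; the file paths, the threshold, and the union merge rule are illustrative assumptions, not necessarily this repository's exact logic.

```python
# Sketch of the segmentation-first step: CLIPSeg region semantics merged with a
# GroundedSAM mask. The GroundedSAM mask is a placeholder array; in practice it
# would come from running Grounding DINO + SAM on the same image.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

def clipseg_building_mask(image: Image.Image, threshold: float = 0.4) -> np.ndarray:
    """Binary building mask from CLIPSeg, resized to the input image size."""
    inputs = processor(text=["a building"], images=[image], return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # low-resolution relevance map
    probs = torch.sigmoid(logits).squeeze().cpu().numpy()
    resized = Image.fromarray((probs * 255).astype(np.uint8)).resize(image.size)
    return np.asarray(resized) / 255.0 > threshold

image = Image.open("street_view.jpg").convert("RGB")    # hypothetical input path
grounded_sam_mask = np.load("grounded_sam_mask.npy")    # placeholder: precomputed mask

# Union keeps pixels either model considers "building"; an intersection would
# trade recall for precision. The exact merge rule is an assumption here.
building_mask = clipseg_building_mask(image) | grounded_sam_mask.astype(bool)
```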
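
For the classification branch, a zero-shot CLIP pass over the masked crop is the natural starting point before any fine-tuning. The sketch below scores a building crop against a handful of candidate materials; the label set and prompt template are assumptions, and the repository's fine-tuned CLIP head (or YOLOv8 classifier) would replace the frozen weights used here.

```python
# Zero-shot material scoring with CLIP on a masked building crop. Treat this as
# the zero-shot baseline only; fine-tuning adapts it to the target label set.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

MATERIALS = ["brick", "concrete", "glass", "wood", "stone"]  # assumed label set
prompts = [f"a building facade made of {m}" for m in MATERIALS]

crop = Image.open("building_crop.jpg").convert("RGB")  # masked region from the step above
inputs = processor(text=prompts, images=crop, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # (1, num_labels) similarity scores
probs = logits.softmax(dim=-1).squeeze(0)
print(dict(zip(MATERIALS, probs.tolist())))
```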
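
The height branch is not spelled out in this README; one common reading of "depth cues and calibrated geometry" is a pinhole-camera back-projection, sketched below under that assumption. The function name, the depth source, and the example numbers are all hypothetical.

```python
# Pinhole-geometry height estimate: given a depth (metres) to the facade and the
# camera's focal length in pixels, a mask spanning `pixel_height` rows subtends
# height = depth * pixel_height / focal_length_px. The depth value would come
# from a monocular depth model or other depth cues in practice.
import numpy as np

def building_height_m(mask: np.ndarray, depth_m: float, focal_px: float) -> float:
    """Approximate real-world height of the masked building."""
    rows = np.where(mask.any(axis=1))[0]   # rows containing building pixels
    if rows.size == 0:
        raise ValueError("empty building mask")
    pixel_height = rows[-1] - rows[0] + 1  # vertical extent in pixels
    return depth_m * pixel_height / focal_px

# Example: a mask 800 px tall, facade ~20 m away, focal length 1400 px
# -> roughly 20 * 800 / 1400 ≈ 11.4 m.
```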

This design blends zero-shot transfer with lightweight adaptation, keeping the system practical for city-scale deployment. Future work targets adaptive ensembles that route images to CLIP/YOLO/SAM variants based on image quality and task difficulty.


Datasets
