GitHub - code-wizard123/image-segmentation

Abstract

We performed semantic segmentation on a satellite imagery dataset of Dubai, using transfer learning with a UNet CNN model whose encoder is InceptionResNetV2. To artificially increase the amount of data and reduce overfitting, we applied data augmentation to the training set. The model achieved a Dice coefficient of ~81% and an accuracy of ~86% on the validation set.
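For readers unfamiliar with the metric, the Dice coefficient between a ground-truth mask A and a predicted mask B is commonly defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal NumPy sketch of that definition, not the code used in the notebooks; the function name `dice_coefficient` and the `smooth` term (added to avoid division by zero) are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Dice coefficient between two masks of the same shape.

    `smooth` is a hypothetical stabilising constant so that two empty
    masks do not cause a division by zero.
    """
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)


# Toy example: one of the two foreground pixels is predicted correctly.
truth = np.array([[1, 1], [0, 0]])
pred = np.array([[1, 0], [0, 0]])
score = dice_coefficient(truth, pred)  # 2 * 1 / (2 + 1), roughly 0.667
```

For a multi-class mask, one common choice is to one-hot encode both masks and apply the same formula over all channels.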

Dataset Link: https://www.kaggle.com/datasets/humansintheloop/semantic-segmentation-of-aerial-imagery

Model Link: https://drive.google.com/file/d/1Y5yWuJVVVFAnKjsC6z_RRxf1z955eH0D/view

Approach

Data Augmentation using Albumentations Library

Albumentations is a Python library for fast and flexible image augmentations. Albumentations efficiently implements a rich variety of image transform operations that are optimized for performance, and does so while providing a concise, yet powerful image augmentation interface for different computer vision tasks, including object classification, segmentation, and detection.

The dataset contains only 72 images (of varying resolutions), of which we used 56 (~78%) for the training set and the remaining 16 (~22%) for the validation set. Because this is a very small amount of data, we used data augmentation to artificially enlarge the training set and reduce overfitting. This increased the training data to 9 times its original size: after augmentation, the training set contains 504 images (56 original + 448 augmented), while the validation set keeps its 16 original images.
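The split and augmentation counts can be checked with a few lines of arithmetic. The figure of 8 augmented copies per training image is an inference from 448 / 56, not something the text states directly.

```python
total_images = 72
train_images = 56
val_images = total_images - train_images  # 16

train_share = train_images / total_images  # about 0.78
val_share = val_images / total_images      # about 0.22

# Assumption: 448 augmented images / 56 originals = 8 copies per image.
augmented_per_image = 448 // train_images
augmented_images = train_images * augmented_per_image
train_after_augmentation = train_images + augmented_images

growth_factor = train_after_augmentation / train_images
print(train_after_augmentation, growth_factor)
```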

Data augmentation is done by the following techniques:

Random Cropping
Horizontal Flipping
Vertical Flipping
Rotation
Random Brightness & Contrast
Contrast Limited Adaptive Histogram Equalization (CLAHE)
Grid Distortion
Optical Distortion

InceptionResNetV2 Encoder based UNet Model

InceptionResNetV2 Architecture

Source: https://arxiv.org/pdf/1602.07261v2.pdf

UNet Architecture

Source: https://arxiv.org/pdf/1505.04597.pdf

InceptionResNetV2-UNet Architecture

An InceptionResNetV2 model pre-trained on the ImageNet dataset is used as the encoder network.
A decoder network is extended from the last layer of the pre-trained model, and its layers are concatenated with the corresponding encoder layers, in the manner of UNet skip connections.
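To make the concatenation step concrete, here is a shape-only NumPy illustration of how a UNet-style decoder joins upsampled features with encoder features along the channel axis. It is not the actual model: the feature-map sizes and channel counts are hypothetical, the helper names are invented for this sketch, and a real decoder would also apply convolutions after each concatenation.

```python
import numpy as np


def upsample2x(features):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return features.repeat(2, axis=0).repeat(2, axis=1)


def decoder_step(decoder_features, encoder_features):
    """Upsample the decoder features and concatenate the encoder skip."""
    upsampled = upsample2x(decoder_features)
    if upsampled.shape[:2] != encoder_features.shape[:2]:
        raise ValueError("spatial sizes must match before concatenation")
    return np.concatenate([upsampled, encoder_features], axis=-1)


# Hypothetical encoder outputs, from deepest to shallowest.
bottleneck = np.zeros((8, 8, 64), dtype=np.float32)
skip_mid = np.zeros((16, 16, 32), dtype=np.float32)
skip_shallow = np.zeros((32, 32, 16), dtype=np.float32)

x = decoder_step(bottleneck, skip_mid)  # shape (16, 16, 96)
x = decoder_step(x, skip_shallow)       # shape (32, 32, 112)
```

The spatial-size check reflects a practical detail: encoder feature maps are not always exact powers of two in size, so an implementation may need padding or resizing before the skip can be concatenated.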

A detailed layout of the model is available here.

References

C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, “Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning,” arXiv.org, 23-Aug-2016. [Online]. Available: https://arxiv.org/abs/1602.07261.
O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv.org, 18-May-2015. [Online]. Available: https://arxiv.org/abs/1505.04597.

Contributors

Raunak Singh Kalsi

Dhruv Sapra
