Diffusion Model based Data Augmentation for Remote Sensing Imagery

Master's thesis by Hubert Kriebitzsch at TU Berlin, Faculty IV, Computer Vision and Remote Sensing Department (GitHub repository)

Abstract

Data augmentation is a crucial challenge in deep learning, especially in remote sensing, where data is often difficult and costly to acquire, particularly for rare events such as natural disasters. Many solutions have been proposed, and data augmentation using synthetic data, mainly generated using Generative Adversarial Networks, is one of the most recent and effective approaches to counter the effects of class imbalance. In this thesis, we further study data augmentation with synthetic data using state-of-the-art generative models: we use diffusion models to generate new remote sensing images for data augmentation. To generate high-fidelity satellite images of active fires, we finetune the foundation model Stable Diffusion using DreamBooth on existing wildfire images. We apply it to the task of active fire detection by inpainting synthetic wildfires into existing satellite images, which allows us to augment semantic segmentation datasets and not only image classification datasets. We conduct a series of experiments to measure the effectiveness of the proposed methods and compare different pretrained and finetuned diffusion models as well as different inpainting masks. We evaluate this approach on a small, manually annotated active fire detection dataset and improve the Dice coefficient from 58.5% up to 72.7%. This work provides new insights into remote sensing data generation with diffusion models and into the effectiveness of data augmentation with the synthetic data they produce. It presents a novel way to generate semantic segmentation data in remote sensing.
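For readers unfamiliar with the metric reported above, the following is a minimal sketch of the Dice coefficient for a single pair of binary segmentation masks. It is not taken from the thesis code: the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, and the thesis may aggregate the score differently (for example, averaged over images).

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two binary masks.

    `pred` and `target` are 2D lists of 0/1 values with the same shape
    (hypothetical representation chosen to keep the example dependency-free).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as fire in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total number of fire pixels in each mask, added together.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)

    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 2 overlapping fire pixels, 3 fire pixels in each mask.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
score = dice_coefficient(predicted, ground_truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted fire pixels coincide exactly with the annotated ones, while 0.0 means there is no overlap.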

Example of active fire inpainting

Remote Sensing Active Fire Inpainting (RSAFI) 1.5

A Stable Diffusion Inpainting v1.5 model finetuned with DreamBooth. Both the U-Net and the text encoder were finetuned on a dataset of active fire satellite images.
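As a rough illustration of how a finetuned inpainting checkpoint of this kind might be used, here is a minimal sketch based on the Hugging Face `diffusers` library. This README does not document an inference procedure, so several things here are assumptions: that the weights are stored in a `diffusers`-compatible format, the `model_path` argument (a hypothetical local directory or hub identifier), the default prompt (the actual DreamBooth instance prompt is not stated here), and the helper names `load_inpainting_pipeline` and `inpaint_fire`.

```python
def load_inpainting_pipeline(model_path, device="cuda"):
    """Load an inpainting checkpoint (model_path is a hypothetical location)."""
    # Imported lazily so inpaint_fire can be used without diffusers installed.
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(model_path)
    return pipe.to(device)


def inpaint_fire(pipe, image, mask, prompt="a satellite image of an active fire"):
    """Repaint the masked region of `image` according to `prompt`.

    `image` and `mask` are expected to be PIL images of the same size.
    White mask pixels are regenerated and black pixels are kept, which is
    the convention of the diffusers inpainting pipeline. The default
    prompt is a placeholder, not the one used in the thesis.
    """
    result = pipe(prompt=prompt, image=image, mask_image=mask)
    return result.images[0]
```

Under these assumptions, a call would look like `inpaint_fire(load_inpainting_pipeline("path/to/rsafi-checkpoint"), image, mask)`, where the path is a placeholder. Because the mask marks exactly where the synthetic fire is drawn, it could plausibly also serve as the segmentation label for the generated image, which is what makes inpainting suitable for augmenting segmentation datasets.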