This repository contains the dataset and code for the paper "FLOGA: A machine learning ready dataset, a benchmark and a novel deep learning model for burnt area mapping with Sentinel-2" (Sdraka et al., 2024).
You can download the FLOGA dataset from Dropbox.
Reading the downloaded .hdf files requires the hdf5plugin Python module, because the data have been compressed using BZip2. These files include the raw Sentinel-2 and MODIS imagery aligned to a common grid, along with the labels and the various masks.
After downloading the .hdf files, you can create an analysis-ready dataset with the create_dataset.py script. This Python script reads the .hdf files, crops the images into smaller patches, and then performs a train/val/test split on the patches.
For example,
python create_dataset.py --floga_path path/to/hdf/files --out_path data/ --out_size 256 256 --sample 1
The above command crops the images into 256x256 patches and exports 3 pickle files with the train, validation and test splits, respectively. With --sample 1, for each positive patch (i.e. a patch that contains at least 1 burnt pixel) one negative patch (i.e. a patch with no burnt pixels) is included. Run python create_dataset.py --help for more information on the various options.
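The cropping and sampling described above can be illustrated with a short NumPy sketch. It is not the code of create_dataset.py: the function names, the (C, H, W) array layout, the non-overlapping tiling, the dropping of incomplete border patches and the fixed random seed are all assumptions made for illustration.

```python
import random

import numpy as np


def crop_patches(image, label, size=256):
    """Split an image (C, H, W) and its label (H, W) into non-overlapping patches.

    Border regions that do not fill a whole patch are dropped in this sketch.
    """
    patches = []
    height, width = label.shape
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            img = image[:, top:top + size, left:left + size]
            lbl = label[top:top + size, left:left + size]
            patches.append((img, lbl))
    return patches


def balance_patches(patches, negatives_per_positive=1, seed=0):
    """Keep every positive patch and sample a fixed number of negatives per positive."""
    # A patch is positive if it contains at least one burnt pixel (label 1).
    positives = [p for p in patches if (p[1] == 1).any()]
    negatives = [p for p in patches if not (p[1] == 1).any()]
    rng = random.Random(seed)
    n_keep = min(len(negatives), negatives_per_positive * len(positives))
    return positives + rng.sample(negatives, n_keep)
```

With negatives_per_positive=1 this reproduces a 1:1 ratio of positive to negative patches; a larger value keeps more background patches at the cost of a more imbalanced dataset.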
FLOGA contains aligned Sentinel-2 and MODIS imagery for 326 wildfire events in Greece over the period 2017-2021, along with high-resolution burnt area mappings produced by the Hellenic Fire Service. For each event, the dataset offers:
- Pre-fire Sentinel-2/MODIS imagery
- Post-fire Sentinel-2/MODIS imagery
- Cloud masks for the pre-fire imagery
- Cloud masks for the post-fire imagery
- Water mask
- Corine Land Cover mask
- Ground truth label
The labels can contain the following values: 0 for non-burnt pixels, 1 for burnt pixels, and 2 for pixels burnt in other fire events of the same year. The pixels marked with 2 may or may not contain burnt areas (this depends on the timestamps of the fires as well as the timestamps of the selected satellite imagery), so we have marked them with this unique value in order to facilitate their exclusion from the training/evaluation process.
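One way to exclude the pixels marked with 2 from evaluation is sketched below with NumPy. The function name burnt_iou and the choice of IoU as the metric are illustrative assumptions and do not describe the evaluation code in this repository.

```python
import numpy as np

IGNORE_VALUE = 2  # pixels burnt in other fire events of the same year


def burnt_iou(prediction, label, ignore_value=IGNORE_VALUE):
    """IoU of the burnt class, computed only over pixels whose label is 0 or 1."""
    valid = label != ignore_value
    pred_burnt = (prediction == 1) & valid
    true_burnt = (label == 1) & valid
    union = np.logical_or(pred_burnt, true_burnt).sum()
    if union == 0:
        return float("nan")  # no burnt pixels in either map
    return float(np.logical_and(pred_burnt, true_burnt).sum() / union)
```

During training, a similar effect is commonly obtained by passing ignore_index=2 to torch.nn.CrossEntropyLoss, so that these pixels contribute nothing to the loss.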
A useful notebook with an exploration of the dataset can be found in Data_exploration.ipynb.
The train/val/test splits used in the paper can be found here. A ratio of 1:1 was selected (1 negative patch for each positive patch) and sea and cloud patches were removed.
This repo also contains the code for the proposed BAM-CD model for burnt area mapping with bitemporal Sentinel-2 imagery. The model can be found inside the folder models/bam_cd/. The implementation is heavily based on segmentation_models.pytorch.
V1
You can find the weights of the pretrained BAM-CD (version 1) model here. This model is the one described in the corresponding publication.
V2
A new BAM-CD (version 2) model has been trained and achieves more robust results. The new model adopts a pseudo-siamese scheme, and extensive data augmentation is applied during training. The weights can be found here. To use this model, change the appropriate lines in the configs/method/bam_cd.json file as denoted in the comments.
- 9/4/2024: An issue with MODIS imagery on the 2019 data has been fixed.
- 17/10/2024: Added new model weights.
- Upload dataset to HuggingFace.
- Interpolate Sentinel-2 cloud masks for GSD 10m.
- Provide better cloud masks.
If you would like to use our work, please cite our paper:
@ARTICLE{10479972,
author={Sdraka, Maria and Dimakos, Alkinoos and Malounis, Alexandros and Ntasiou, Zisoula and Karantzalos, Konstantinos and Michail, Dimitrios and Papoutsis, Ioannis},
journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
title={FLOGA: A Machine-Learning-Ready Dataset, a Benchmark, and a Novel Deep Learning Model for Burnt Area Mapping With Sentinel-2},
year={2024},
volume={17},
number={},
pages={7801-7824},
doi={10.1109/JSTARS.2024.3381737}
}