This repo was forked from MLULC. The main purpose of this repository is to provide the source code that was used to produce the ECOCLIMAP-SG-ML land cover map, which is described in Rieutord et al. (2024). Land cover maps are translated with auto-encoders, as illustrated in the following figure. ECOCLIMAP-SG-ML is obtained by map translation from ESA World Cover to ECOCLIMAP-SG+.
The main dependencies of this repository are PyTorch, TorchGeo, NumPy, Pandas, h5py, netCDF4 and Matplotlib. This code has been used with Conda environments.
- Create a new environment with
conda create -n mmt python=3.11; conda activate mmt
- Clone the package locally and install it with
pip install -e .
The program data-download.sh is provided to help download and unpack the data.
Copy it and execute it in the directory that will receive the data (the data directory, or another directory that will be linked as data).
bash data-download.sh
The data original to this work is accessible in this Zenodo archive. It contains the TIF files of ECOCLIMAP-SG-ML, the HDF5 files for training and testing and the weights of the neural network.
Note that the ECOCLIMAP-SG land cover is downloaded and extracted with a Python program.
From the package root directory, and after installing the software, the command is as follows (also given at the end of data-download.sh):
python scripts/download_ecoclimapsg.py --landingdir data/tiff_data/ECOCLIMAP-SG
All data is assumed to be found in the data folder of the repository.
We recommend using symbolic links to adapt it to your file system.
The data folder should be organised as follows:
data
├── outputs -> where the inference output will be stored
|
├── saved_models -> where the model checkpoints are stored
|
├── tiff_data -> where the original land cover maps are stored in TIF format
|   ├── ECOCLIMAP-SG
|   ├── ECOCLIMAP-SG-ML
|   ├── ECOCLIMAP-SG-plus
|   └── ESA-WorldCover-2021
|
└── hdf5_data -> where the training data is stored
    ├── ecosg.hdf5
    ├── ecosg-train.hdf5
    ├── ecosg-test.hdf5
    ├── ecosg-val.hdf5
    ├── esawc.hdf5
    └── ...
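The symbolic-link recommendation above can be sketched in Python as follows. This is a minimal sketch under assumptions, not part of the package: the function name `link_data_dir` and its arguments are hypothetical, and only the four sub-folder names come from the layout above. It creates the sub-folders in a storage directory of your choice and links that directory as `data` in the repository root, leaving any existing `data` folder or link untouched.

```python
from pathlib import Path

# Sub-folders taken from the layout described in this README.
SUBFOLDERS = ("outputs", "saved_models", "tiff_data", "hdf5_data")


def link_data_dir(storage_dir, repo_root):
    """Create the expected sub-folders in `storage_dir` and link it as `<repo_root>/data`.

    Hypothetical helper for illustration; it is not provided by the MMT package.
    """
    storage = Path(storage_dir).resolve()
    for name in SUBFOLDERS:
        (storage / name).mkdir(parents=True, exist_ok=True)

    link = Path(repo_root) / "data"
    if link.is_symlink() or link.exists():
        # Never overwrite an existing data folder or link.
        return link
    link.symlink_to(storage, target_is_directory=True)
    return link
```

For example, `link_data_dir("/path/to/storage", ".")` could be called from the repository root, where `/path/to/storage` is a placeholder for a location with enough free space.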
The full data-download.sh program takes approximately 4 hours to run. The volume downloaded (for all data) is approximately 56 GB. Once unzipped, the volume occupied by the data is approximately 370 GB, distributed as follows:
0 ./outputs
12M ./saved_models
266G ./tiff_data
103G ./hdf5_data
369G .
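Before launching the download, it may be worth checking that the target file system has enough free space. The sketch below uses only the Python standard library. The function name `has_enough_space` is hypothetical, and the default requirement (download volume plus unpacked volume) is a conservative assumption that the archives and the extracted files coexist on disk for some time, which this README does not state.

```python
import shutil

# Approximate volumes quoted in this README, in gigabytes.
DOWNLOAD_GB = 56
UNPACKED_GB = 370


def has_enough_space(path=".", required_gb=DOWNLOAD_GB + UNPACKED_GB):
    """Return True if the file system holding `path` has at least `required_gb` free.

    Hypothetical helper; the default threshold is an assumption, not a
    requirement stated by the authors.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb
```

If only part of the data is needed, a smaller `required_gb` can be passed, for instance the size of the `hdf5_data` folder alone.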
Depending on how you intend to use this repository, the amount of data can be reduced.
To do so, remove the parts you don't need from data-download.sh.
To check the software installation:
python tests/import_test.py
To check the data installation:
python tests/is_data_there.py [--tiff] [--weights] [--hdf5] [--all]
Once the land covers are available in the data/tiff_data folder, they can be visualized using the look_at_map.py program.
For example, to look at ECOCLIMAP-SG-ML over the EURAT domain with a resolution of 0.1 degrees, the command is:
python -i scripts/look_at_map.py --lcname=EcoclimapSGML --domainname=eurat --res=0.1
See the header of look_at_map.py for more examples.
Alternatively, you can export maps in various formats (netCDF, DIR/HDR) using the export method of the land cover classes.
See the documentation of the method for more information.
Once the landcover and the weights are correctly installed, you can perform inference on any domain for which ESA World Cover is available.
The program to make the inference is scripts/inference_and_merging.py.
python scripts/inference_and_merging.py
python -i scripts/look_at_map.py --lcname=<path given by the previous program>
See the documentation inside to run it.
The results presented in the manuscript can be reproduced with the programs scripts/scores_from_inference.py and scripts/qualitative_evaluation.py.
python -i scripts/qualitative_evaluation.py
python -i scripts/scores_from_inference.py
See the documentation and variables inside.
To train the model, make sure you have set the correct parameters in a config file (a template is provided in the configs directory).
Point to this config file in the run.sh program.
Then launch ./run.sh.
The repository has the following directories:
- assets: contains images for the documentation
- configs: contains the various configurations (YAML files) for the training
- data: contains all the data, as described earlier in this README
- drafts: contains draft programs using the package
- experiments: contains all the files created when training a model (logs, checkpoints, visualizations...)
- mmt: contains the source code of the MMT package
- tests: contains programs to test the installation
- scripts: contains programs ready for use
Specifically, the mmt folder organises the MMT package into the following modules and sub-modules:
mmt
├── agents
│   ├── __init__.py
│   ├── base.py
│   └── multiLULC.py
├── datasets
│   ├── __init__.py
│   ├── landcovers.py
│   ├── landcover_to_landcover.py
│   └── transforms.py
├── graphs
│   ├── __init__.py
│   └── models
│       ├── __init__.py
│       ├── custom_layers
│       │   ├── __init__.py
│       │   ├── double_conv.py
│       │   ├── down_block.py
│       │   └── up_block.py
│       ├── attention_autoencoder.py
│       ├── position_encoding.py
│       └── universal_embedding.py
├── inference
│   ├── __init__.py
│   ├── io.py
│   └── translators.py
└── utils
    ├── __init__.py
    ├── aliases.py
    ├── config.py
    ├── domains.py
    ├── misc.py
    ├── plt_utils.py
    └── scores.py
The modules agents, graphs, datasets and utils are mostly inherited from the MLULC repository.
The other modules are specific additions for the ECOCLIMAP-SG-ML generation.
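To confirm that the sub-modules of the package can be imported after installation, a check along the following lines could be used. This is a hedged sketch, not the content of tests/import_test.py: the function name `failed_imports` is hypothetical, and the module names are a subset derived from the file layout of the mmt folder given in this README.

```python
import importlib

# Dotted names derived from the file layout of the mmt folder (subset).
MMT_MODULES = [
    "mmt.agents.base",
    "mmt.agents.multiLULC",
    "mmt.datasets.landcovers",
    "mmt.datasets.landcover_to_landcover",
    "mmt.datasets.transforms",
    "mmt.graphs.models.attention_autoencoder",
    "mmt.inference.io",
    "mmt.inference.translators",
    "mmt.utils.domains",
    "mmt.utils.scores",
]


def failed_imports(module_names):
    """Try to import each module and return {name: error} for those that fail.

    Hypothetical helper for illustration only.
    """
    failed = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as err:  # ImportError, or any error raised at import time
            failed[name] = repr(err)
    return failed
```

Calling `failed_imports(MMT_MODULES)` in the environment where the package is installed should return an empty dictionary if every listed module imports cleanly; otherwise the returned errors usually point to a missing dependency.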
Two modules contain customised families of classes for which we provide the inheritance diagram here.
Landcovers are used to access the data from multiple TIF files:
mmt.datasets.landcovers
└── torchgeo.datasets.RasterDataset (-> https://torchgeo.readthedocs.io/en/v0.4.1/api/datasets.html#rasterdataset)
    ├── _TorchgeoLandcover
    |   ├── ESAWorldCover
    |   ├── EcoclimapSG
    |   |   ├── SpecialistLabelsECOSGplus
    |   |   ├── InferenceResults
    |   |   └── EcoclimapSGML
    |   └── _CompositeMap
    |       ├── EcoclimapSGplus
    |       └── EcoclimapSGMLcomposite
    |
    ├── _ScoreMap
    |   └── ScoreECOSGplus
    |
    └── _ProbaLandcover
        └── InferenceResultsProba
Translators are used to perform map translation in inference mode:
mmt.inference.translators
└── _MapTranslator
    ├── MapMerger
    └── EsawcToEsgp
        ├── EsawcToEsgpMembers
        ├── EsawcToEsgpProba
        └── EsawcToEsgpAsMap -- landcovers.InferenceResults
            └── EsawcToEsgpShowEnsemble
Thanks to
- Geoffrey Bessardon for creating the ECOCLIMAP-SG+ map and providing early releases, used as a reference in this work.
- Luc Baudoux for the initial implementation of the map translation network and the training data.
- Met Éireann for providing the computing facilities for this work.
This project is licensed under the MIT License. See the LICENSE.txt file for details.