A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes1, Tuan-Hung Vu1,2, Andrei Bursuc1,2, Patrick Pérez3, Raoul de Charette1
1 Inria, 2 valeo.ai, 3 Kyutai

Project page: https://astra-vision.github.io/FAMix/
Paper: https://arxiv.org/abs/2311.17922

TL;DR: FAMix (for Freeze, Augment, and Mix) is a simple method for domain generalized semantic segmentation, based on minimal fine-tuning, language-driven patch-wise style augmentation, and patch-wise mixing of original and augmented styles.

Citation

@InProceedings{fahes2024simple,
  title={A Simple Recipe for Language-guided Domain Generalized Segmentation},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  booktitle={CVPR},
  year={2024}
}

Demo

Tested on unseen YouTube videos recorded in different cities
Training dataset: GTA5
Backbone: ResNet-50
Segmenter: DeepLabv3+

Watch the full video on YouTube

Installation

Dependencies

First create a new conda environment with the required packages:

conda env create --file environment.yml

Then activate the environment:

conda activate famix_env

Datasets

  • ACDC: Download ACDC images and labels from ACDC. Please follow the dataset directory structure:

    <ACDC_DIR>/                   % ACDC dataset root
    ├── rgb_anon/                 % input images (rgb_anon_trainvaltest.zip)
    └── gt/                       % semantic segmentation labels (gt_trainval.zip)
  • BDD100K: Download BDD100K images and labels from BDD100K. Please follow the dataset directory structure:

    <BDD100K_DIR>/              % BDD100K dataset root
    ├── images/                 % input image
    └── labels/                 % semantic segmentation labels
  • Cityscapes: Follow the instructions in Cityscapes to download the images and semantic segmentation labels. Please follow the dataset directory structure:

    <CITYSCAPES_DIR>/             % Cityscapes dataset root
    ├── leftImg8bit/              % input image (leftImg8bit_trainvaltest.zip)
    └── gtFine/                   % semantic segmentation labels (gtFine_trainvaltest.zip)
  • GTA5: Download GTA5 images and labels from GTA5. Please follow the dataset directory structure:

    <GTA5_DIR>/                   % GTA5 dataset root
    ├── images/                   % input image 
    └── labels/                   % semantic segmentation labels
  • Mapillary: Download Mapillary images and labels from Mapillary. Please follow the dataset directory structure:

    <MAPILLARY_DIR>/              % Mapillary dataset root
    ├── training/                 % training subset
    │   ├── images/               % input images
    │   └── labels/               % semantic segmentation labels
    └── validation/               % validation subset
        ├── images/               % input images
        └── labels/               % semantic segmentation labels
  • Synthia: Download Synthia images and labels from SYNTHIA-RAND-CITYSCAPES and split them following SPLIT-DATA. Please follow the dataset directory structure:

    <SYNTHIA>/                 % Synthia dataset root
    ├── RGB/                   % input image 
    └── GT/                    % semantic segmentation labels
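Before launching a run, it can help to check that each dataset root matches the layout listed above. The following is a minimal sketch, not part of the FAMix code base: the helper `missing_subdirs` and the `EXPECTED_SUBDIRS` table are hypothetical names introduced here, and the dictionary keys are plain labels that may differ from the values accepted by `--dataset`.

```python
from pathlib import Path

# Expected sub-directories per dataset, taken from the directory trees above.
# "rgb_anon" follows the archive name rgb_anon_trainvaltest.zip.
# Keys are illustrative labels, not necessarily valid --dataset values.
EXPECTED_SUBDIRS = {
    "ACDC": ["rgb_anon", "gt"],
    "BDD100K": ["images", "labels"],
    "Cityscapes": ["leftImg8bit", "gtFine"],
    "GTA5": ["images", "labels"],
    "Mapillary": [
        "training/images",
        "training/labels",
        "validation/images",
        "validation/labels",
    ],
    "Synthia": ["RGB", "GT"],
}


def missing_subdirs(dataset, root):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS[dataset]
            if not (root / sub).is_dir()]
```

An empty list means the root contains every expected folder. The check only looks at directory names, so it says nothing about whether the images and labels inside are complete.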

Trained models

The trained models are available here.

Running FAMix

Style mining

python3 patch_PIN.py \
  --dataset <dataset_name> \
  --data_root <dataset_root> \
  --resize_feat \
  --save_dir <path_for_learnt_parameters_saving>

Training

python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_root> \
--total_itrs  40000 \
--batch_size 8 \
--val_interval 750 \
--transfer \
--data_aug \
--ckpts_path <path_to_save_checkpoints> \
--path_for_stats <path_for_mined_styles>
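Style mining and training are run back to back, so the two commands can be assembled from one set of paths. The sketch below only builds argument lists from the flags shown above; `famix_commands` is a hypothetical helper, and it assumes that the directory passed to `--save_dir` during style mining is the one training expects in `--path_for_stats`.

```python
import shlex


def famix_commands(dataset, data_root, styles_dir, ckpts_dir):
    """Build the style-mining and training commands as argument lists."""
    mining = [
        "python3", "patch_PIN.py",
        "--dataset", dataset,
        "--data_root", data_root,
        "--resize_feat",
        "--save_dir", styles_dir,
    ]
    training = [
        "python3", "main.py",
        "--dataset", dataset,
        "--data_root", data_root,
        "--total_itrs", "40000",
        "--batch_size", "8",
        "--val_interval", "750",
        "--transfer",
        "--data_aug",
        "--ckpts_path", ckpts_dir,
        # Assumption: training reads the styles written to the mining --save_dir.
        "--path_for_stats", styles_dir,
    ]
    return mining, training


if __name__ == "__main__":
    # Print both commands with placeholder values for inspection.
    for cmd in famix_commands("<dataset_name>", "<dataset_root>",
                              "<styles_dir>", "<ckpts_dir>"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which stops the sequence if style mining fails.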

Evaluation

python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_root> \
--ckpt <path_to_tested_model> \
--test_only \
--ACDC_sub <ACDC_subset_if_tested_on_ACDC>   

Inference & Visualization

To test a model on arbitrary images and visualize the output, add the images to the predict_test directory and run:

python3 predict.py \
--ckpt <ckpt_path> \
--save_val_results_to <directory_for_saved_output_images>
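Images stored elsewhere first have to be placed in predict_test. A minimal sketch of that step follows; `stage_images` is a hypothetical helper, and the set of file extensions is an assumption, since the formats accepted by predict.py are not listed here.

```python
import shutil
from pathlib import Path

# Assumption: illustrative extensions only; predict.py may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def stage_images(src_dir, dest_dir="predict_test"):
    """Copy image files from src_dir into dest_dir and return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).iterdir()):
        # Skip sub-directories and files that do not look like images.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Files are copied rather than moved, so the source folder is left untouched.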

License

FAMix is released under the Apache 2.0 license.

Acknowledgement

The code is based on this implementation of DeepLabv3+, and uses code from CLIP, PODA and RobustNet.

