This repository contains the code for our CVPR 2024 paper HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation. [Paper] [Website]
We tested our codebase with PyTorch 1.12.0 and CUDA 11.6. Please install the PyTorch and CUDA versions that match your computational resources.
Then install the remaining required packages by running pip install -r requirements.txt. This includes jupyter, which you need to run the notebooks.
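After installation, it can help to confirm which PyTorch and CUDA versions are actually in use. The following is a minimal sketch, not part of this repository: the helper names `major_minor`, `matches_tested`, and `installed_versions` are hypothetical, and the check only compares major and minor version numbers against the tested configuration stated above.

```python
import importlib.util

# Versions this codebase was tested with, as stated in this README.
TESTED_TORCH = "1.12.0"
TESTED_CUDA = "11.6"


def major_minor(version):
    """Return (major, minor) from strings such as '1.12.0+cu116' or '11.6'."""
    core = str(version).split("+")[0]
    parts = core.split(".")
    return int(parts[0]), int(parts[1])


def matches_tested(torch_version, cuda_version):
    """True if both versions agree with the tested ones up to the minor number."""
    if cuda_version is None:  # CPU-only PyTorch builds report no CUDA version
        return False
    return (major_minor(torch_version) == major_minor(TESTED_TORCH)
            and major_minor(cuda_version) == major_minor(TESTED_CUDA))


def installed_versions():
    """Return (torch_version, cuda_version), or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return str(torch.__version__), torch.version.cuda


if __name__ == "__main__":
    versions = installed_versions()
    if versions is None:
        print("PyTorch is not installed yet.")
    else:
        print("torch:", versions[0], "cuda:", versions[1],
              "matches tested setup:", matches_tested(*versions))
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the tested one.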
We use the Visual Genome dataset in this work, which consists of 108,077 images, each annotated with objects and relations. Following previous work, we filter the dataset to use the most frequent 150 object classes and 50 predicate classes for experiments.
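To illustrate what frequency-based filtering means here, the sketch below keeps only the most frequent object and predicate classes and drops relations that reference removed objects. It is not the preprocessing actually used for this dataset: the annotation layout (an `objects` list of labels and a `relations` list of `(subject_index, predicate, object_index)` tuples) and the function names are assumptions made for the example.

```python
from collections import Counter


def top_k(counter, k):
    """Return the set of the k most frequent keys in a Counter."""
    return {name for name, _ in counter.most_common(k)}


def filter_annotations(images, num_objects=150, num_predicates=50):
    """Keep only the most frequent object and predicate classes.

    `images` is a list of dicts with (hypothetical) keys:
      "objects":   list of object labels
      "relations": list of (subject_index, predicate, object_index)
    """
    object_counts = Counter(label for img in images for label in img["objects"])
    predicate_counts = Counter(
        pred for img in images for _, pred, _ in img["relations"]
    )
    keep_objects = top_k(object_counts, num_objects)
    keep_predicates = top_k(predicate_counts, num_predicates)

    filtered = []
    for img in images:
        # Map old object indices to new ones for the objects that survive.
        new_index = {}
        objects = []
        for old, label in enumerate(img["objects"]):
            if label in keep_objects:
                new_index[old] = len(objects)
                objects.append(label)
        # A relation survives only if its predicate and both endpoints survive.
        relations = [
            (new_index[s], pred, new_index[o])
            for s, pred, o in img["relations"]
            if pred in keep_predicates and s in new_index and o in new_index
        ]
        filtered.append({"objects": objects, "relations": relations})
    return filtered
```

The preprocessed metadata described below already has this filtering applied, so this step does not need to be run by hand.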
You can download the images from the two links below, then extract both zip files and put all the images in a single folder:
Part I: https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip
Part II: https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip
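Extracting both archives into one flat folder can also be scripted. The sketch below is one possible way to do it with the Python standard library; the function name `extract_images_flat` and the output folder in the usage note are hypothetical, and it assumes image file names are unique across the two archives.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def extract_images_flat(zip_paths, out_dir):
    """Extract every image from the given archives directly into out_dir.

    Any folder structure inside the archives is discarded so that all
    images end up in a single directory. Returns the number of images written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                name = PurePosixPath(info.filename).name
                if info.is_dir() or Path(name).suffix.lower() not in IMAGE_SUFFIXES:
                    continue
                # Write under the bare file name, dropping the archive's folders.
                with archive.open(info) as src, open(out_dir / name, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                count += 1
    return count
```

For example, `extract_images_flat(["images.zip", "images2.zip"], "data/vg_images")` would place all images in `data/vg_images`, where that folder name is only a placeholder.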
Then download the VG metadata preprocessed by IMP (annotations, class info, and image metadata) and copy those three files into a single folder.
Finally, update config.py with the path to the aforementioned data, as well as the absolute path to this directory.
We also provide links to two pre-trained checkpoints:
- Download the pre-trained Faster-RCNN checkpoint trained by MotifNet from https://www.dropbox.com/s/cfyqhskypu7tp0q/vg-24.tar?dl=0 and place it in checkpoints/vgdet/vg-24.
- Download the pre-trained GB-Net checkpoint vgrel-11 from https://github.com/alirezazareian/gbnet and place it in checkpoints/vgdet/vgrel-11.
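Before launching the notebooks, a quick check that both checkpoints are where the code expects them can save a failed run. This is a minimal sketch: the helper `missing_paths` is hypothetical and not part of the repository, while the two relative paths are the ones listed above.

```python
from pathlib import Path

# Checkpoint locations named in this README, relative to the repository root.
CHECKPOINTS = [
    "checkpoints/vgdet/vg-24",
    "checkpoints/vgdet/vgrel-11",
]


def missing_paths(root, relative_paths):
    """Return the relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".", CHECKPOINTS)
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```

Run it from the repository root; the same helper can be pointed at the data folder configured in config.py by passing a different root and list of file names.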
If you want to train from scratch, you can pre-train the model using the Faster-RCNN checkpoint. However, we recommend training from the GB-Net checkpoint.
You can simply follow the instructions in the notebooks to run HiKER-SGG experiments:
- For the PredCls task: train: ipynb/train_predcls/hikersgg_predcls_train.ipynb, evaluate: ipynb/eval_predcls/hikersgg_predcls_test.ipynb.
- For the SGCls task: train: ipynb/train_sgcls/hikersgg_sgcls_train.ipynb, evaluate: ipynb/eval_sgcls/hikersgg_sgcls_train.ipynb.
Note that for the PredCls task we start training from the GB-Net checkpoint, while for the SGCls task we start training from the best PredCls checkpoint.
In our paper, we introduce a new synthetic VG-C benchmark for SGG, containing 20 challenging image corruptions, including simple transformations and severe weather conditions.
We include the code for generating these 20 corruptions in dataloaders/corruptions.py. To use it, you also need to modify the code in dataloaders/visual_genome.py and enable -test_n in the evaluation notebook.
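For readers unfamiliar with image corruptions, the sketch below shows the general idea with one simple example, additive Gaussian noise on raw 8-bit pixel data. It is not the implementation in dataloaders/corruptions.py: the function name, the five severity levels, and the noise strengths are all assumptions chosen for illustration.

```python
import random

# Illustrative noise strengths per severity level (not taken from the paper).
NOISE_SIGMA = [8, 16, 24, 32, 48]


def gaussian_noise(pixels, severity=1, seed=None):
    """Add Gaussian noise to raw 8-bit pixel data.

    `pixels` is a bytes-like object of channel values in [0, 255].
    Returns a new bytes object of the same length, clamped to [0, 255].
    """
    if not 1 <= severity <= len(NOISE_SIGMA):
        raise ValueError("severity must be between 1 and %d" % len(NOISE_SIGMA))
    sigma = NOISE_SIGMA[severity - 1]
    rng = random.Random(seed)  # a fixed seed makes the corruption reproducible
    noisy = bytearray(len(pixels))
    for i, value in enumerate(pixels):
        noisy[i] = min(255, max(0, round(value + rng.gauss(0.0, sigma))))
    return bytes(noisy)
```

A corruption of this kind is applied to each test image before it is passed to the model, so the same trained checkpoint can be evaluated on clean and corrupted inputs.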
Our codebase is adapted from GB-Net and EB-Net. We thank the authors for releasing their code!
If you have any questions, please contact us at [email protected].
If you find this code useful, please consider citing our work:
@inproceedings{zhang2024hiker,
title={HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation},
author={Zhang, Ce and Stepputtis, Simon and Campbell, Joseph and Sycara, Katia and Xie, Yaqi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={28233--28243},
year={2024}
}