PyTorch code for Reasoning with Heterogeneous Graph Alignment for Video Question Answering.
Requirements: Python 3.6, PyTorch 1.1
- TGIF-QA. Please cite Link.
- MSVD-QA and MSRVTT-QA. Please cite Link.
- Extracted features of the two datasets are available Here. Please cite Link.
We provide four pre-trained models for the TGIF-QA dataset (Google Drive). The file path is, for example, HGA/saved_models/MMModel/Count_4.092.pth.
- Trans_80.95.pth
- FrameQA_54.99.pth
- Count_4.092.pth
- Action_75.5.pth
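The checkpoint names above appear to follow a `<Task>_<score>.pth` pattern. As a minimal sketch, the helper below splits each name into its task and score and builds the expected path. `parse_checkpoint` and `checkpoint_table` are hypothetical names, not part of the repository, and the sketch assumes all four files sit in the same directory as the Count example.

```python
import re
from pathlib import PurePosixPath

# Directory taken from the example path in this README.
# Assumption: all four checkpoints are stored in this same directory.
MODEL_DIR = PurePosixPath("HGA/saved_models/MMModel")

# File names exactly as listed in this README.
CHECKPOINTS = [
    "Trans_80.95.pth",
    "FrameQA_54.99.pth",
    "Count_4.092.pth",
    "Action_75.5.pth",
]

# Assumed naming pattern: <Task>_<score>.pth
_NAME = re.compile(r"^(?P<task>[A-Za-z]+)_(?P<score>\d+(?:\.\d+)?)\.pth$")


def parse_checkpoint(filename):
    """Split '<Task>_<score>.pth' into (task, score). Hypothetical helper."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError("unexpected checkpoint name: {!r}".format(filename))
    return match.group("task"), float(match.group("score"))


def checkpoint_table(filenames=CHECKPOINTS):
    """Map each task name to (expected path, score encoded in the file name)."""
    table = {}
    for name in filenames:
        task, score = parse_checkpoint(name)
        table[task] = (str(MODEL_DIR / name), score)
    return table


if __name__ == "__main__":
    for task, (path, score) in checkpoint_table().items():
        print("{:8s} {:7.3f}  {}".format(task, score, path))
```

Running the script prints one line per task, which can help confirm that the downloaded files are where the test command expects them.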
Testing is carried out by loading the pre-trained models above, which achieve performance similar to that reported in the paper.
Task of TGIF-QA | Performance (Count: MSE, lower is better; others: accuracy %) |
---|---|
Count | 4.092 |
Action | 75.5 |
Trans | 80.95 |
FrameQA | 54.99 |
CUDA_VISIBLE_DEVICES=0 python main.py --test --task Count --num_workers 2 --batch_size 64
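To evaluate all four tasks in one go, the test command above can be wrapped in a small script. This is a sketch under assumptions: only `--task Count` and `--task Action` appear in this README, so `Trans` and `FrameQA` are assumed to be valid `--task` values based on the checkpoint names, and `build_test_command` and `run_all` are hypothetical helpers.

```python
import os
import subprocess

# Count and Action appear in this README's commands; Trans and FrameQA are
# assumed to be accepted by --task as well, based on the checkpoint names.
TASKS = ["Count", "Action", "Trans", "FrameQA"]


def build_test_command(task, num_workers=2, batch_size=64):
    """Return the argv list for the README's test command for one task."""
    if task not in TASKS:
        raise ValueError("unknown task: {!r}".format(task))
    return [
        "python", "main.py", "--test",
        "--task", task,
        "--num_workers", str(num_workers),
        "--batch_size", str(batch_size),
    ]


def run_all(gpu="0", dry_run=True):
    """Print (dry_run=True) or execute the test command for every task."""
    # Same effect as prefixing the command with CUDA_VISIBLE_DEVICES=<gpu>.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for task in TASKS:
        cmd = build_test_command(task)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    run_all()  # prints the commands; pass dry_run=False to execute them
```

The dry-run default only prints the commands, so the script is safe to run before the checkpoints and features are in place.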
We give a base training example for the Action subtask of the TGIF-QA dataset, with the training tricks removed. You can adjust the parameters freely for the corresponding dataset. Please note that we have not tested the performance of this base model.
CUDA_VISIBLE_DEVICES=0 python main.py --task Action --num_workers 2 --batch_size 64 --lr 0.0001 --model 7 --dropout 0.3 --change_lr none --ablation none
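When trying different parameter values, it can help to generate the training command rather than edit it by hand. The sketch below rebuilds the base command above from a dictionary and lets individual flags be overridden. `build_train_command` is a hypothetical helper; the flag names and defaults are copied from the command above, and which values each flag accepts is not specified in this README.

```python
import shlex

# Flags and values copied from the base Action example in this README.
BASE_ARGS = {
    "task": "Action",
    "num_workers": 2,
    "batch_size": 64,
    "lr": 0.0001,
    "model": 7,
    "dropout": 0.3,
    "change_lr": "none",
    "ablation": "none",
}


def build_train_command(gpu="0", **overrides):
    """Return the base training command as a shell string, with overrides."""
    unknown = set(overrides) - set(BASE_ARGS)
    if unknown:
        # Refuse flags that do not appear in the README's example command.
        raise ValueError("unknown flags: {}".format(sorted(unknown)))
    args = dict(BASE_ARGS)
    args.update(overrides)
    parts = ["python", "main.py"]
    for flag in BASE_ARGS:  # keep the flag order of the README command
        parts += ["--" + flag, str(args[flag])]
    quoted = " ".join(shlex.quote(p) for p in parts)
    return "CUDA_VISIBLE_DEVICES={} {}".format(gpu, quoted)


if __name__ == "__main__":
    # Example: print commands for a few dropout values to try.
    for dropout in (0.1, 0.3, 0.5):
        print(build_train_command(dropout=dropout))
```

Called without overrides, the helper reproduces the base command as written above.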
@inproceedings{jiang2020reasoning,
title={Reasoning with Heterogeneous Graph Alignment for Video Question Answering},
author={Jiang, Pin and Han, Yahong},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2020}
}