- SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
[Paper][Homepage]
10,080 in-the-wild videos and 62,535 annotated QA pairs
- Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence (CVPR 2019)
[Paper][Homepage]
1,250 videos, 7,500 questions, 30,000 correct answers and 22,500 incorrect answers
- TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering (CVPR 2017)
[Paper][Homepage]
165K QA pairs for the animated GIFs from the TGIF dataset
- MovieQA: Story Understanding Benchmark (CVPR 2016)
[Paper][Homepage]
14,944 questions, 408 movies
- MarioQA: Answering Questions by Watching Gameplay Videos (ICCV 2017)
[Paper][Homepage]
13 hours of gameplay, 187,757 examples with automatically generated QA pairs; 92,874 unique QA pairs, and each video clip contains 11.3 events on average
- TVQA: Localized, Compositional Video Question Answering (EMNLP 2018)
[Paper][Homepage]
152,545 QA pairs from 21,793 clips, spanning over 460 hours of video