Addition: GITA-7B/13B & GVLQA Dataset (Accepted by NeurIPS 2024) #191

Open
WEIYanbin1999 opened this issue Nov 3, 2024 · 4 comments

@WEIYanbin1999

Dear Authors,
We'd like to add "GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning", accepted by NeurIPS 2024, to this repository. Paper.
GITA is the first work to explore and establish vision-language question answering for graph-related reasoning. It systematically enables VLMs to handle general language-based graph reasoning tasks.
In this paper, we provide new pre-trained VLM weights for graph reasoning:
Model: GITA-7B/13B; the weights are available in both the GitHub repo and the Hugging Face model repo.
We also propose GVLQA, the first dataset for vision-language graph reasoning, consisting of image-text-query-answer VQA pairs. GVLQA Datasets.
We wish you smooth research and look forward to your reply!
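
(A minimal loading sketch, assuming the GVLQA pairs are pulled with the Hugging Face datasets library; the repo ID and split name below are placeholders, so take the actual values from the GVLQA dataset card linked above.)

# Rough sketch only; "WEIYanbin1999/GVLQA" and "train" are placeholder values,
# not taken from the dataset card.
from datasets import load_dataset

gvlqa = load_dataset("WEIYanbin1999/GVLQA", split="train")

sample = gvlqa[0]
# Expect roughly the image-text-query-answer structure described above;
# exact field names come from the dataset card.
print(sample.keys())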

WEIYanbin1999 changed the title from "May you add NeurIPS GITA & GVLQA" to "May you add GITA & GVLQA (Accepted by NeurIPS 2024)" on Nov 3, 2024
@WEIYanbin1999 (Author)

Update

WEIYanbin1999 changed the title from "May you add GITA & GVLQA (Accepted by NeurIPS 2024)" to "May you add GITA-7B/13B & GVLQA Dataset (Accepted by NeurIPS 2024)" on Nov 7, 2024
WEIYanbin1999 changed the title from "May you add GITA-7B/13B & GVLQA Dataset (Accepted by NeurIPS 2024)" to "Addition: GITA-7B/13B & GVLQA Dataset (Accepted by NeurIPS 2024)" on Nov 7, 2024
@xjtupanda (Collaborator)

Thanks for sharing! We've incorporated the work into our repo.
Please also consider citing our works:

@article{fu2023mme,
  title={MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models},
  author={Fu, Chaoyou and Chen, Peixian and Shen, Yunhang and Qin, Yulei and Zhang, Mengdan and Lin, Xu and Yang, Jinrui and Zheng, Xiawu and Li, Ke and Sun, Xing and others},
  journal={arXiv preprint arXiv:2306.13394},
  year={2023}
}

@article{fu2024vita,
  title={VITA: Towards Open-Source Interactive Omni Multimodal LLM},
  author={Fu, Chaoyou and Lin, Haojia and Long, Zuwei and Shen, Yunhang and Zhao, Meng and Zhang, Yifan and Wang, Xiong and Yin, Di and Ma, Long and Zheng, Xiawu and He, Ran and Ji, Rongrong and Wu, Yunsheng and Shan, Caifeng and Sun, Xing},
  journal={arXiv preprint arXiv:2408.05211},
  year={2024}
}

@article{fu2024video,
  title={Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis},
  author={Fu, Chaoyou and Dai, Yuhan and Luo, Yondong and Li, Lei and Ren, Shuhuai and Zhang, Renrui and Wang, Zihan and Zhou, Chenyu and Shen, Yunhang and Zhang, Mengdan and others},
  journal={arXiv preprint arXiv:2405.21075},
  year={2024}
}

@article{yin2023survey,
  title={A survey on multimodal large language models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Li, Ke and Sun, Xing and Xu, Tong and Chen, Enhong},
  journal={arXiv preprint arXiv:2306.13549},
  year={2023}
}

@article{yin2023woodpecker,
  title={Woodpecker: Hallucination correction for multimodal large language models},
  author={Yin, Shukang and Fu, Chaoyou and Zhao, Sirui and Xu, Tong and Wang, Hao and Sui, Dianbo and Shen, Yunhang and Li, Ke and Sun, Xing and Chen, Enhong},
  journal={arXiv preprint arXiv:2310.16045},
  year={2023}
}

@WEIYanbin1999 (Author) commented Nov 9, 2024

Thanks, and have a good day.
By the way, could you please also consider adding GVLQA to the dataset/benchmark section? It has 529K VQA samples across 5 subsets with special visual graph augmentations, covering 4 representative graph VQA tasks.
Its uses include:

  1. As an alignment VQA dataset for pretraining/fine-tuning models.
  2. As a high-quality benchmark for evaluating the reasoning ability of various VLMs on structural data (a minimal evaluation sketch follows below).

Best wishes
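
(A rough sketch of use 2, assuming GVLQA is loadable from the Hugging Face Hub; the repo ID, split, field names, and the vlm_answer callable are all illustrative placeholders, not taken from the dataset card.)

# Sketch: run a VLM over GVLQA image-question pairs and score by exact match.
# All names below (repo ID, split, field names, vlm_answer) are assumptions.
from datasets import load_dataset

def evaluate(vlm_answer, repo_id="WEIYanbin1999/GVLQA", split="test"):
    ds = load_dataset(repo_id, split=split)
    correct = 0
    for ex in ds:
        pred = vlm_answer(ex["image"], ex["question"])  # hypothetical field names
        # Graph VQA answers (cycle, connectivity, etc.) are short and categorical,
        # so normalized exact match is a reasonable first-pass metric.
        correct += int(pred.strip().lower() == str(ex["answer"]).strip().lower())
    return correct / len(ds)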

WEIYanbin1999 reopened this on Nov 9, 2024
@xjtupanda (Collaborator)

> By the way, could you please also consider adding GVLQA to the dataset/benchmark section? It has 529K VQA samples across 5 subsets with special visual graph augmentations, covering 4 representative graph VQA tasks. Its uses include: (1) as an alignment VQA dataset for pretraining/fine-tuning models; (2) as a high-quality benchmark for evaluating the reasoning ability of various VLMs on structural data.

Sure. We've added it to the benchmark section.
