Data Preparation

Dataset card

Prepare the data required for each stage according to the dataset card below.

Stage	Data Name	Data Path
MiniGPT-4 DPO Training	Visual Genome, HA-DPO data, CCSBUAlign	`ha_dpo/data/VG`, `ha_dpo/data/hadpo/minigpt4`, `ha_dpo/data/cc_sbu_align`
LLaVA-1.5 DPO Training	Visual Genome, HA-DPO data	`ha_dpo/data/VG`, `ha_dpo/data/hadpo/llava-v1.5`
InstructBLIP DPO Training	Visual Genome, HA-DPO data	`ha_dpo/data/VG`, `ha_dpo/data/hadpo/InstructBLIP`
SHR Evaluation	Visual Genome, SHR annotation	`ha_dpo/data/VG`, `ha_dpo/data/shr`
POPE Evaluation	COCO val2014, POPE annotation	`ha_dpo/data/coco2014`, `ha_dpo/data/POPE`

For example, to train MiniGPT-4, the following data are needed:

MiniGPT-4 hallucination-aware positive-negative data
Visual Genome
CCSBUAlign
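
Before launching a stage, it can help to confirm that its directories are in place. The following is a minimal sketch, not part of the repository: the names `REQUIRED_DIRS` and `missing_dirs` and the stage keys are hypothetical, the paths are collected from the dataset card and the per-dataset sections of this document (using the `ha_dpo/data/hadpo` layout), and the script is assumed to run from the repository root.

```python
from pathlib import Path

# Directories each stage needs, collected from this document.
# The stage keys are illustrative labels, not names used by the repository.
REQUIRED_DIRS = {
    "minigpt4_dpo": [
        "ha_dpo/data/VG",
        "ha_dpo/data/hadpo/minigpt4",
        "ha_dpo/data/cc_sbu_align",
    ],
    "llava15_dpo": ["ha_dpo/data/VG", "ha_dpo/data/hadpo/llava-v1.5"],
    "instructblip_dpo": ["ha_dpo/data/VG", "ha_dpo/data/hadpo/InstructBLIP"],
    "shr_eval": ["ha_dpo/data/VG", "ha_dpo/data/shr"],
    "pope_eval": ["ha_dpo/data/coco2014", "ha_dpo/data/POPE"],
}


def missing_dirs(stage, root="."):
    """Return the directories required by `stage` that do not exist under `root`."""
    return [d for d in REQUIRED_DIRS[stage] if not (Path(root) / d).is_dir()]


if __name__ == "__main__":
    for stage in REQUIRED_DIRS:
        missing = missing_dirs(stage)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{stage}: {status}")
```

An empty list from `missing_dirs` only means the directories exist; it does not verify their contents.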

Hallucination-aware positive-negative data

We provide hallucination-aware positive-negative data targeted at each LVLM. Download the data for your model from one of the sources below and put it under ha_dpo/data/hadpo:

MiniGPT-4	LLaVA-1.5	InstructBLIP
huggingface, opendatalab	huggingface, opendatalab	huggingface, opendatalab

data structure

ha_dpo/data/hadpo
├── llava-v1.5
│   ├── desc_data.json
│   └── pope_data.json
├── InstructBLIP
│   ├── desc_data.json
│   └── pope_data.json
└── minigpt4
    ├── desc_data.json
    └── pope_data.json
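
After downloading, a quick way to catch a missing or truncated file is to parse each JSON file in the layout above. The sketch below makes no assumption about the record schema; it only reports the top-level JSON type and entry count. The helper names `summarize_json` and `check_hadpo` are hypothetical.

```python
import json
from pathlib import Path

# Folder and file names taken from the directory tree above.
MODELS = ("llava-v1.5", "InstructBLIP", "minigpt4")
FILES = ("desc_data.json", "pope_data.json")


def summarize_json(path):
    """Parse a JSON file and return (top-level type name, number of entries)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)  # raises json.JSONDecodeError if the file is corrupt
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


def check_hadpo(root="ha_dpo/data/hadpo"):
    """Map each expected file to its summary, or to None if the file is missing."""
    report = {}
    for model in MODELS:
        for name in FILES:
            path = Path(root) / model / name
            report[f"{model}/{name}"] = summarize_json(path) if path.is_file() else None
    return report


if __name__ == "__main__":
    for key, summary in check_hadpo().items():
        print(key, "missing" if summary is None else summary)
```

If only one model is being trained, missing entries for the other two models are expected.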

Visual Genome

Download the following data from Visual Genome v2 and put it under ha_dpo/data/VG:

images (both part1 and part2)
image meta-data
region descriptions

data structure

ha_dpo/data/VG
├── image_data.json
├── region_descriptions.json
├── VG_100K
│    └──...
└── VG_100K_2
     └──...

SHR annotation

Download the human-annotated SHR factual annotations from the download link and put them under ha_dpo/data/shr.

data structure

ha_dpo/data/shr
├── shr_factual_part1.jsonl
├── shr_factual_part2.jsonl
└── val_images_final.json
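
The two `.jsonl` files use the JSON Lines convention of one JSON value per line, so they can be validated line by line without loading everything at once. This is a minimal sketch that assumes nothing about the fields inside each record; `count_jsonl_records` is a hypothetical helper name.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count non-empty lines in a JSONL file, checking that each one parses as JSON."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            json.loads(line)  # raises json.JSONDecodeError on a corrupt line
            count += 1
    return count


if __name__ == "__main__":
    shr_dir = Path("ha_dpo/data/shr")
    for name in ("shr_factual_part1.jsonl", "shr_factual_part2.jsonl"):
        path = shr_dir / name
        if path.is_file():
            print(name, count_jsonl_records(path))
        else:
            print(name, "missing")
```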

POPE

Clone the POPE repository from its official website and put it under ha_dpo/data.

data structure

ha_dpo/data/
└── POPE
     └── ...

COCO2014 val images

POPE evaluation requires the COCO2014 validation images. Download them from COCO and put them under ha_dpo/data/coco2014.

data structure

ha_dpo/data/coco2014
└── val2014
     └── ...

CCSBUAlign

CCSBUAlign is the SFT data used in MiniGPT-4. We use this data only during MiniGPT-4 model training, to ensure the stability of preference learning. Refer to the official website to obtain the CCSBUAlign data, and put it under ha_dpo/data/cc_sbu_align.

data structure

ha_dpo/data/cc_sbu_align
├── filter_cap.json
└── image
     └── ...
