Data preparation #13

sunhope54 · 2025-01-09T00:43:44Z

Thank you for your extraordinary work! I would like to know which dataset to download, given the various choices on the official website.

sunrainyg · 2025-01-09T15:00:19Z

Thank you for your interest. We used the 2014 training, validation, and test images and the corresponding annotations.

sunhope54 · 2025-01-10T00:57:17Z

Thank you very much for your timely reply. Excuse me again. I would like to ask how to obtain the multi-modal and multi-task datasets in your training process. Aren't the storage formats of each dataset different? My main problem is that I didn't quite understand the content of DATASET.md. I'm sorry to have taken up your time. Please accept my apologies again!

sunrainyg · 2025-01-10T02:25:55Z

You can gather all the necessary multi-modal data for various tasks by following the instructions in DATASET.md to execute the scripts. Once the process is complete, all training data will be stored in the data/image_pairs_train directory.

This data must be generated before starting the training. During the training phase, the model will utilize data from different tasks for training.
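A minimal sketch of checking that the data exists before training starts. The helper name `has_generated_data` is hypothetical, and the default path is taken from the sentence above; adjust it if your `--save_root` differs.

```python
from pathlib import Path

def has_generated_data(save_root="data/image_pairs_train"):
    """Return True if save_root is a directory containing at least one entry."""
    root = Path(save_root)
    # any() stops at the first entry, so large directories are not fully listed.
    return root.is_dir() and any(root.iterdir())

if has_generated_data():
    print("Training data found.")
else:
    print("No training data found; run the build_data scripts first.")
```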

To begin, you can run the following command:

python build_data/format_dataset_rp.py --save_root './image_pairs_train' --tasks ['det'] --data_root './data/coco'

Afterwards, you can modify the --tasks or --data_root parameters to generate data for other tasks.
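As a rough sketch, the per-task commands can be generated from a list of (task, data root) pairs. The helper `build_command` and the `JOBS` list are hypothetical; only the `det` / `./data/coco` pair is confirmed in this thread, so any other pair is an assumption to check against DATASET.md. The sketch only prints the commands and does not run them.

```python
# Sketch only: prints one command line per (task, data_root) pair.
# The command template is copied from the reply above; nothing is executed.

SCRIPT = "build_data/format_dataset_rp.py"

def build_command(task, data_root, save_root="./image_pairs_train"):
    """Return the command line for one task, in the form used in this thread."""
    return (
        f"python {SCRIPT} --save_root '{save_root}' "
        f"--tasks ['{task}'] --data_root '{data_root}'"
    )

# Only the 'det' / './data/coco' pair is confirmed in this thread.
# Append further (task, data_root) pairs once checked against DATASET.md.
JOBS = [("det", "./data/coco")]

for task, data_root in JOBS:
    print(build_command(task, data_root))
```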

Let me know if you have any further questions.

sunhope54 · 2025-01-16T02:52:42Z

Thank you very much for your previous answers, and I apologize again for my questions. I am still having some issues with building a multi-task instruction-tuning dataset. Can I build the dataset by executing the following code:
python build_data/format_dataset_rp.py --save_root './image_pairs' --tasks ['det'] --data_root './data/coco'
python build_data/format_dataset_rp.py --save_root './image_pairs' --tasks ['seg'] --data_root './data/ADE20k'
python build_data/format_dataset_rp.py --save_root './image_pairs' --tasks ['cls'] --data_root './data/Oxford-IIIT'
python build_data/format_dataset_rp.py --save_root './image_pairs' --tasks ['depes'] --data_root './data/NYUV2'
Also, when I process datasets other than coco, the following errors occur:

It seems that the code is still processing the coco dataset. How can I solve this problem?
Finally, thank you for taking the time to look at my problem. Best regards.
