The following datasets can be loaded with the current code once they are downloaded (see the example scripts):
FR Dataset | Description | NR Dataset | Description |
---|---|---|---|
PIPAL | 2AFC | FLIVE(PaQ-2-PiQ) | Tech & Aesthetic |
BAPPS | 2AFC | SPAQ | Mobile |
PieAPP | 2AFC | AVA | Aesthetic |
KADID-10k | | KonIQ-10k(++) | |
LIVEM | | LIVEChallenge | |
LIVE | | PIQ2023 | Portrait dataset |
TID2013 | | GFIQA | Face IQA Dataset |
TID2008 | | | |
CSIQ | | | |
For more details, please see Awesome Image Quality Assessment.
Here are some other resources to download the dataset:
We provide general interfaces for FR and NR datasets in `pyiqa/data/general_fr_dataset.py` and `pyiqa/data/general_nr_dataset.py`. The main arguments are:

- `opt` contains all dataset options, including
  - `dataroot_target`: path of the target image folder.
  - `dataroot_ref [optional]`: path of the reference image folder.
  - `meta_info_file`: file containing meta information of the images, including relative image paths, mos labels and other labels.
  - `augment [optional]`: data augmentation transform list
    - `hflip`: flip input images or pairs
    - `random_crop`: int or tuple, random crop input images or pairs
  - `split_file [optional]`: `train/val/test` split file `*.pkl`. If not specified, the whole dataset is loaded.
  - `split_index [optional]`: default `1`, which split to use; only valid when `split_file` is specified.
  - `dmos_max`: some datasets use difference of mos (dmos). Setting this to a non-zero value converts dmos to mos with `mos = dmos_max - dmos`.
  - `phase`: phase labels [train, val, test]
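
As a rough illustration, the options above could be collected in a plain dictionary like the one below. This is only a sketch: the paths are placeholders, and the exact keys a given dataset class expects should be checked against `pyiqa/data/general_nr_dataset.py`. The helper `dmos_to_mos` is hypothetical and only restates the `mos = dmos_max - dmos` conversion described above.

```python
# Sketch of an options dictionary for an NR dataset.
# All paths are placeholders; key names follow the argument list above.
opt = {
    "dataroot_target": "./datasets/my_dataset/images",        # placeholder path
    "meta_info_file": "./datasets/meta_info/my_dataset.csv",  # placeholder path
    "split_file": "./datasets/meta_info/my_dataset.pkl",      # optional, placeholder path
    "split_index": 1,   # which split to use (default 1)
    "dmos_max": 0,      # a non-zero value converts dmos labels to mos
    "phase": "train",   # one of train, val, test
    "augment": {"hflip": True, "random_crop": 224},
}


def dmos_to_mos(label, dmos_max):
    """Hypothetical helper: apply mos = dmos_max - dmos when dmos_max is non-zero."""
    return dmos_max - label if dmos_max else label
```

With `dmos_max` left at `0` the label is returned unchanged, which mirrors the "set this to non-zero" behaviour described in the list.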
The above interface requires two files to provide the dataset information, i.e., the `meta_info_file` and the `split_file`. The `meta_info_file` is a `.csv` file with the following general format:
- For NR datasets: name, mos(mean), std
```
100.bmp 32.56107532210109 19.12472638223644
```
- For FR datasets: ref_name, dist_name, mos(mean), std
```
I01.bmp I01_01_1.bmp 5.51429 0.13013
```
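
To make the NR layout concrete, here is a minimal sketch that parses such a file with Python's standard `csv` module. The header row, the column names and the comma delimiter are assumptions made for illustration; the function name `read_nr_meta` is hypothetical, and the values are taken from the NR example above.

```python
import csv
import io

# In-memory stand-in for an NR meta_info_file. The header row and the
# comma delimiter are assumptions; the values come from the example above.
nr_meta_csv = "img_name,mos,std\n100.bmp,32.56107532210109,19.12472638223644\n"


def read_nr_meta(fileobj):
    """Return a list of (name, mos, std) tuples from an NR meta info CSV."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the assumed header row
    return [(name, float(mos), float(std)) for name, mos, std in reader]


rows = read_nr_meta(io.StringIO(nr_meta_csv))
print(rows[0])
```

An FR file would carry one extra leading column for the reference image name, so the same approach applies with a four-element unpacking.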
The `split_file` is a `.pkl` file that stores the `train/val/test` information as a Python dictionary in the following format:

    {
        train_index: {
            train: [train_index_list]
            val: [val_index_list] # blank if no validation split
            test: [test_index_list] # blank if no test split
        }
    }

The `train_index` starts from `1`, and the sample indexes correspond to the row indexes of the `meta_info_file`, starting from `0`. We have already generated these files for mainstream public datasets with the scripts in the `./scripts/` folder.
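
The following sketch shows how a split file with this structure might be written and read back with `pickle`. The string keys `"train"`, `"val"` and `"test"`, the file name and the index values are assumptions chosen for illustration; they are not taken from the shipped split files.

```python
import os
import pickle
import tempfile

# Two hypothetical splits over a 10-row meta_info_file.
# Outer keys are split indexes (starting from 1); inner lists hold
# 0-based row indexes into the meta_info_file.
split_dict = {
    1: {"train": [0, 1, 2, 3, 4, 5, 6, 7], "val": [8, 9], "test": []},
    2: {"train": [2, 3, 4, 5, 6, 7, 8, 9], "val": [0, 1], "test": []},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    split_path = os.path.join(tmp_dir, "example_split.pkl")  # placeholder name
    with open(split_path, "wb") as f:
        pickle.dump(split_dict, f)
    with open(split_path, "rb") as f:
        loaded = pickle.load(f)

# Select the rows of one phase from one split.
split_index, phase = 1, "val"
val_rows = loaded[split_index][phase]
print(val_rows)
```

Here the empty `test` list plays the role of the "blank if no test split" entry in the format above.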
Note that we generate the `train/val/test` splits following the principles below:
- For datasets which have official splits, we follow their splits.
- For official splits which have no `val` part, e.g., the AVA dataset, we randomly separate 5% of the training data as validation.
- For small datasets which require n-split results, we use a `train:val=8:2` ratio.
- All random seeds are set to `123` when needed.
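
The n-split rule can be sketched as follows. This is not the code from `./scripts/`: it uses Python's built-in `random` module, so the resulting indexes will not match the shipped split files even with the same seed. The function name `make_splits` and its parameters are hypothetical.

```python
import random


def make_splits(num_samples, num_splits, train_ratio=0.8, seed=123):
    """Build {split_index: {'train': [...], 'val': [...], 'test': []}}."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    splits = {}
    for split_index in range(1, num_splits + 1):  # split indexes start from 1
        indexes = list(range(num_samples))        # 0-based row indexes
        rng.shuffle(indexes)
        cut = int(num_samples * train_ratio)
        splits[split_index] = {
            "train": sorted(indexes[:cut]),
            "val": sorted(indexes[cut:]),
            "test": [],  # blank: no test split in this sketch
        }
    return splits


splits = make_splits(num_samples=100, num_splits=10)
```

Each of the ten splits reshuffles the same 100 row indexes, so every split has 80 training and 20 validation samples with no overlap between the two.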
Some of the supported datasets have different label formats and file organizations, so we create specific dataloaders for them:
- Live Challenge. The first 7 samples are usually removed in the related works.
- AVA. Different label formats.
- PieAPP. Different label formats.
- BAPPS. Different label formats.
You may use `tests/test_datasets.py` to test whether a dataset can be correctly loaded.