Pixel based Video Quality Models for 4K

This repository contains four pixel-based video quality prediction models: a no-reference model, a full-reference model, and, for each of these two baseline types, a hybrid mode 0 variant.

All models can be used for video quality prediction of short video sequences (up to 10 seconds) at UHD-1/4K resolution with 60 fps.

This repository also ships pre-trained models (see pixelmodels/models) that were trained on dedicated subjective video quality tests. All tools required to build new models are included as well: besides each model's prediction code (e.g. nofu), a train_{modelname} script is provided. The pixelmodels use quat for features and processing.

Requirements

The models and software have only been tested on Linux systems (e.g. Ubuntu 19.04, 20.04, 22.04).

The following software is required:

python3, python3-pip, python3-venv
poetry >= 1.0.3 (e.g. pip3 install poetry==1.0.3)
ffmpeg
git

To install all python3 dependencies, run the following command in the folder of this repository:

poetry install

(if you have problems with pip, run pip3 install --user -U pip)

poetry will manage a local virtual environment with suitable versions of all dependencies of the models.

Included video quality models

This repository includes four video quality models:

nofu: no-reference pixel based model
hyfu: hybrid no-reference pixel based model
fume: full-reference pixel based model
hyfr: hybrid full-reference model

Both hybrid models require access to bitrate, resolution, codec and framerate. This metadata is extracted automatically from the given video files. A full description of the models is given in the paper goering2021pixel, see Acknowledgments.
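As a rough illustration of what this metadata looks like, the following sketch reads bitrate, resolution, codec and framerate with ffprobe (part of ffmpeg). It is not the extraction code used by the models. The function names parse_probe and probe_video are hypothetical, and using the container-level bitrate is an assumption, chosen because per-stream bitrates are often missing in MKV files.

```python
import json
import subprocess
from fractions import Fraction


def parse_probe(probe: dict) -> dict:
    """Pick the hybrid-model metadata out of ffprobe's JSON output."""
    # first video stream; raises StopIteration if the file has none
    video = next(s for s in probe["streams"] if s.get("codec_type") == "video")
    # avg_frame_rate is a rational string such as "60/1" or "60000/1001"
    fps = float(Fraction(video["avg_frame_rate"]))
    return {
        "codec": video["codec_name"],
        "width": int(video["width"]),
        "height": int(video["height"]),
        "framerate": fps,
        # assumption: container-level bitrate in bit/s
        "bitrate": int(probe["format"]["bit_rate"]),
    }


def probe_video(path: str) -> dict:
    """Run ffprobe on a video file and return the parsed metadata."""
    cmd = ["ffprobe", "-v", "error", "-show_format", "-show_streams",
           "-of", "json", path]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_probe(json.loads(out))
```

With ffmpeg installed, probe_video("test_videos/test_video_h264.mkv") would return such a dictionary for the test video shipped with this repository.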

Usage nofu

To use the provided tool, run for example:

poetry run nofu predict test_videos/test_video_h264.mkv

Retraining the models

Retraining the models requires CSV files that follow the format used by AVT-VQDB-UHD-1.

Training the rating distribution requires additional data that is not an official part of the AVT-VQDB-UHD-1 dataset; if you need it, contact the authors of this database.

For each model, a train_{modelname} tool is provided. For nofu, for example, it can be started with the following command line:

poetry run train_nofu data/4k_databases_full/test_1/per_user.csv

Important: training uses all features stored in the feature directory. This ensures that multiple databases can be used for an incremental and overall training.

The file per_user.csv needs to have the following structure:

video_name,user1,...,userX,mos,ci
american_football_harmonic_200kbps_360p_59.94fps_h264.mp4,1,...,1,1.0,0.0

The userY columns contain the ratings of the individual users of the subjective test. These columns are optional; if they do not exist, they are skipped during training.
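A minimal sketch of reading a file in this layout with Python's csv module. The helper name read_per_user is hypothetical and this is not the repository's own loader; it only shows how the optional user columns can be detected and skipped when absent. The two user columns in the example data are chosen for illustration.

```python
import csv
import io


def read_per_user(csv_file) -> list:
    """Read per_user.csv rows; the user columns are optional."""
    rows = []
    for row in csv.DictReader(csv_file):
        # every column starting with "user" is treated as an individual rating
        user_cols = [c for c in row if c.startswith("user")]
        rows.append({
            "video_name": row["video_name"],
            "mos": float(row["mos"]),
            "ci": float(row["ci"]),
            # stays empty if the file has no user columns
            "ratings": [float(row[c]) for c in user_cols],
        })
    return rows


example = io.StringIO(
    "video_name,user1,user2,mos,ci\n"
    "american_football_harmonic_200kbps_360p_59.94fps_h264.mp4,1,1,1.0,0.0\n"
)
rows = read_per_user(example)
print(rows[0]["video_name"], rows[0]["ratings"], rows[0]["mos"])
```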

For a full-reference model, the corresponding source video is deduced automatically from the filename by longest matching prefix; in the example, american_football_harmonic.mkv is selected as the source video. For the given example, the lookup folder for source videos is data/4k_databases_full/src_videos. If a different structure or format is used, pixelmodels.train_common.read_database must be adjusted.
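The longest-matching-prefix lookup can be sketched as follows. This is an illustration of the idea, not the code in pixelmodels.train_common.read_database; the function name find_source_video and the extra source file names are hypothetical.

```python
from pathlib import Path


def find_source_video(distorted_name: str, source_names: list):
    """Return the source file whose stem is the longest prefix of distorted_name."""
    candidates = [
        s for s in source_names if distorted_name.startswith(Path(s).stem)
    ]
    # the longest stem wins, so "american_football_harmonic"
    # is preferred over the shorter "american_football"
    return max(candidates, key=lambda s: len(Path(s).stem), default=None)


# hypothetical content of the source video folder
sources = [
    "american_football.mkv",
    "american_football_harmonic.mkv",
    "bigbuck_bunny.mkv",
]
match = find_source_video(
    "american_football_harmonic_200kbps_360p_59.94fps_h264.mp4", sources
)
print(match)
```

If no source stem is a prefix of the distorted filename, the sketch returns None, so a caller can report the missing source video explicitly.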

General Architecture

All models share the same general architecture. First, the videos are re-scaled (e.g. reference and distorted videos are unified to the same resolution, framerate and pixel format) and center-cropped (the default is a 360p center crop). Next, per-frame features and motion-related features are estimated for each video (e.g. only the distorted video, or the distorted and the reference video); the number of features depends on the model. After feature extraction, all extracted values of a given input video are temporally pooled to remove time dependencies. The pooled feature values are then fed into a random-forest-based machine learning pipeline.
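Two of these steps, the center crop and the temporal pooling, can be sketched in a few lines. The actual processing is done by quat; the function names below are hypothetical, and both the aspect-ratio-preserving crop width and the choice of pooling statistics (mean, standard deviation, minimum, maximum) are assumptions made for illustration.

```python
import statistics


def center_crop_window(width: int, height: int, crop_height: int = 360):
    """Return (x, y, w, h) of a centered crop that keeps the aspect ratio."""
    crop_width = round(crop_height * width / height)
    x = (width - crop_width) // 2
    y = (height - crop_height) // 2
    return x, y, crop_width, crop_height


def temporal_pooling(per_frame_values: list) -> dict:
    """Collapse one feature's per-frame values into time-independent statistics."""
    return {
        "mean": statistics.fmean(per_frame_values),
        "std": statistics.pstdev(per_frame_values),
        "min": min(per_frame_values),
        "max": max(per_frame_values),
    }


# crop window for a UHD-1 frame, and pooling of three per-frame values
print(center_crop_window(3840, 2160))
print(temporal_pooling([0.2, 0.4, 0.6]))
```

Pooling each feature this way yields a fixed-length vector per video regardless of its duration, which is what a random forest needs as input.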

All developed models can be used to predict:

mos as classification
mos as regression
rating distribution as multi instance regression
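To make the three prediction targets concrete, the following sketch derives them from the individual ratings of one video. It assumes a 5-point rating scale and a class defined as the MOS rounded to the nearest scale value; both are assumptions for illustration, as are the function name prediction_targets and the example ratings. How the models define these targets exactly is described in the paper.

```python
from collections import Counter


def prediction_targets(ratings: list, scale=(1, 2, 3, 4, 5)) -> dict:
    """Derive the three training targets from one video's individual ratings."""
    mos = sum(ratings) / len(ratings)
    counts = Counter(ratings)
    return {
        # mean opinion score as a continuous value
        "mos_regression": mos,
        # nearest scale value, ties rounded up (assumption)
        "mos_class": int(mos + 0.5),
        # share of users per rating value, one entry per scale value
        "rating_distribution": [counts[v] / len(ratings) for v in scale],
    }


targets = prediction_targets([4, 5, 4, 3, 4])
print(targets)
```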

For more details about the features and pipeline, check out the paper goering2021pixel.

Acknowledgments

If you use this software in your research, please include a link to the repository and reference the following paper.

@article{goering2021pixel,
  title={Modular Framework and Instances of Pixel-based Video Quality Models for UHD-1/4K},
  author={Steve G\"oring and Rakesh {Rao Ramachandra Rao} and Bernhard Feiten and Alexander Raake},
  journal={IEEE Access},
  volume={9},
  pages={31842-31864},
  year={2021},
  publisher={IEEE},
  doi={10.1109/ACCESS.2021.3059932},
  url={https://ieeexplore.ieee.org/document/9355144}
}

License

GNU General Public License v3. See LICENSE.md file in this repository.
