IEEE Access (Jan 2021)
Modular Framework and Instances of Pixel-Based Video Quality Models for UHD-1/4K
Abstract
The popularity of video on-demand streaming services increased tremendously over the last years. Most services use http-based adaptive video streaming methods. Today's movies and TV shows are typically recorded in UHD-1/4K and streamed using settings attuned to the end-device and current network conditions. Video quality prediction models can be used to perform an extensive analysis of video codec settings to ensure high quality. Hence, we present a framework for the development of pixel-based video quality models. We instantiate four different model variants (hyfr, hyfu, fume and nofu) for short-term video quality estimation targeting various use cases. Our models range from a no-reference video quality model to a full-reference model including hybrid model extensions that incorporate client accessible meta-data. All models share a similar architecture and the same core features, depending on their mode of operation. Besides traditional mean opinion score prediction, we tackle quality estimation as a classification and multi-output regression problem. Our performance evaluation is based on the publicly available AVT-VQDB-UHD-1 dataset. We further evaluate the introduced center-cropping approach to speed up calculations. Our analysis shows that our hybrid full-reference model (hyfr) performs best, e.g. 0.92 PCC for MOS prediction, followed by the hybrid no-reference model (hyfu), full-reference model (fume) and no-reference model (nofu). We further show that our models outperform popular state-of-the-art models. The introduced features and machine-learning pipeline are publicly available for use by the community for further research and extension.
Keywords