Blind Video Quality Assessment for Ultra-High-Definition Video Based on Super-Resolution and Deep Reinforcement Learning

Zefeng Ying; Da Pan; Ping Shi

doi:10.3390/s23031511

Sensors (Jan 2023)

Blind Video Quality Assessment for Ultra-High-Definition Video Based on Super-Resolution and Deep Reinforcement Learning

Zefeng Ying,
Da Pan,
Ping Shi

Affiliations

Zefeng Ying: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Da Pan: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Ping Shi: School of Information and Communication Engineering, Communication University of China, Beijing 100024, China

DOI: https://doi.org/10.3390/s23031511
Journal volume & issue: Vol. 23, no. 3
p. 1511

Abstract

Read online

Ultra-high-definition (UHD) video has brought new challenges to objective video quality assessment (VQA) due to its high resolution and high frame rate. Most existing VQA methods are designed for non-UHD videos—when they are employed to deal with UHD videos, the processing speed will be slow and the global spatial features cannot be fully extracted. In addition, these VQA methods usually segment the video into multiple segments, predict the quality score of each segment, and then average the quality score of each segment to obtain the quality score of the whole video. This breaks the temporal correlation of the video sequences and is inconsistent with the characteristics of human visual perception. In this paper, we present a no-reference VQA method, aiming to effectively and efficiently predict quality scores for UHD videos. First, we construct a spatial distortion feature network based on a super-resolution model (SR-SDFNet), which can quickly extract the global spatial distortion features of UHD videos. Then, to aggregate the spatial distortion features of each UHD frame, we propose a time fusion network based on a reinforcement learning model (RL-TFNet), in which the actor network continuously combines multiple frame features extracted by SR-SDFNet and outputs an action to adjust the current quality score to approximate the subjective score, and the critic network outputs action values to optimize the quality perception of the actor network. Finally, we conduct large-scale experiments on UHD VQA databases and the results reveal that, compared to other state-of-the-art VQA methods, our method achieves competitive quality prediction performance with a shorter runtime and fewer model parameters.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors

About the journal

Abstract

Keywords