Efficient Partition Decision Based on Visual Perception and Machine Learning for H.266/Versatile Video Coding

Mei-Juan Chen; Cheng-An Lee; Yu-Hsiang Tsai; Chieh-Ming Yang; Chia-Hung Yeh; Lih-Jen Kau; Chuan-Yu Chang

doi:10.1109/ACCESS.2022.3168155

IEEE Access (Jan 2022)

Affiliations

Mei-Juan Chen: Department of Electrical Engineering, National Dong Hwa University, Hualien, Taiwan
Cheng-An Lee: Department of Electrical Engineering, National Dong Hwa University, Hualien, Taiwan
Yu-Hsiang Tsai: Department of Electrical Engineering, National Dong Hwa University, Hualien, Taiwan
Chieh-Ming Yang: Department of Electrical Engineering, National Dong Hwa University, Hualien, Taiwan
Chia-Hung Yeh: Department of Electrical Engineering, National Taiwan Normal University, Taipei, Taiwan
Lih-Jen Kau: Department of Electronic Engineering, National Taipei University of Technology, Taipei, Taiwan
Chuan-Yu Chang: Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin, Taiwan

DOI: https://doi.org/10.1109/ACCESS.2022.3168155
Journal volume & issue: Vol. 10, pp. 42141-42150

Abstract

H.266/Versatile Video Coding (VVC) is the latest international video coding standard, designed to encode ultra-high-definition video effectively. Its quadtree with nested multi-type tree (QT-MTT) structure provides coding tree partitions of various sizes and allows nested binary tree (BT) and ternary tree (TT) splits at each QT level. The H.266/VVC encoder also includes numerous advanced coding tools. As a result, the encoding time increases tremendously. Previous research on fast coding algorithms for H.266/VVC seldom considers perceptual redundancy. This paper uses the just noticeable difference (JND) model of human vision to extract the visually distinguishable pixels that may affect visual perception. We observe that the distributions obtained from the horizontal and vertical projections of the visually distinguishable pixels within a coding unit are related to the corresponding MTT splitting modes. These distributions, which represent the perceptual information of human vision, are therefore used as the input features for machine learning. A fast MTT decision based on random forest models is proposed to quickly select the partition for intra coding. Experimental results demonstrate that the proposed method effectively accelerates the intra coding process while maintaining good bitrate and video quality, based on the properties of visual perception. The proposed algorithm outperforms the previous work.
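
The projection features mentioned in the abstract can be illustrated with a minimal sketch. It assumes the visually distinguishable pixels of a coding unit are already available as a binary mask; the JND model that produces the mask is not specified in the abstract and is not reproduced here. The function name projection_features, the normalization by block size, and the 4x4 example block are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def projection_features(mask: Sequence[Sequence[int]]) -> List[float]:
    """Concatenate the horizontal and vertical projections of a binary mask.

    mask[y][x] is 1 where a pixel is assumed to be visually distinguishable
    and 0 elsewhere. How the mask is derived (the JND step) is outside this
    sketch.
    """
    height = len(mask)
    width = len(mask[0])

    # Horizontal projection: number of marked pixels in each row.
    row_counts = [sum(row) for row in mask]
    # Vertical projection: number of marked pixels in each column.
    col_counts = [sum(mask[y][x] for y in range(height)) for x in range(width)]

    # Assumption: divide by the block dimension so that features from
    # coding units of different sizes fall in the same 0..1 range.
    rows = [count / width for count in row_counts]
    cols = [count / height for count in col_counts]
    return rows + cols


# Hypothetical 4x4 coding unit whose marked pixels fill the top half.
cu_mask = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
features = projection_features(cu_mask)
print(features)
```

In this example the row profile drops from 1.0 to 0.0 halfway down the block while the column profile stays flat at 0.5. A vector of this kind is the sort of input a random forest classifier could be trained on to predict a split mode; which profile shapes correspond to which MTT modes is learned from data in the paper and is not asserted here.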

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
