Reducing Video Coding Complexity Based on CNN-CBAM in HEVC

Huayu Li; Geng Wei; Ting Wang; ThiOanh Bui; Qian Zeng; Ruliang Wang

doi:10.3390/app131810135

Applied Sciences (Sep 2023)

Affiliations

Huayu Li: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China
Geng Wei: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China
Ting Wang: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China
ThiOanh Bui: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China
Qian Zeng: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China
Ruliang Wang: School of Physics and Electronics, Nanning Normal University, Nanning 530100, China

DOI: https://doi.org/10.3390/app131810135
Journal volume & issue: Vol. 13, no. 18, p. 10135

Abstract

High-efficiency video coding (HEVC) outperforms H.264 in coding efficiency. However, the rate–distortion optimization (RDO) process in coding tree unit (CTU) partitioning requires an exhaustive exploration of all possible quad-tree partitions, resulting in high encoding complexity. To simplify this process, this paper proposes a convolutional neural network (CNN)-based optimization algorithm combined with a hybrid attention mechanism module. First, we design a CNN compatible with the current coding unit (CU) size to accurately predict the CU partitions. We also design a convolution block to enhance the information interaction between CU blocks. Then, we introduce the convolutional block attention module (CBAM) into the CNN; the resulting network is called CNN-CBAM. This module concentrates on the important regions of the image and attends to the target object correctly. Finally, we integrate CNN-CBAM into the HEVC coding framework to predict CU partitions in advance. The proposed network was trained, validated, and tested using a large-scale dataset covering various scenes and objects, which provides extensive samples for intra-frame CU partition prediction in HEVC. The experimental results demonstrate that our scheme reduces the coding time by 64.05% on average compared to a traditional HM16.5 encoder, with only a 0.09 dB degradation in BD-PSNR and a 1.94% increase in BD-BR.
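
To make the cost of the exhaustive quad-tree search concrete, the following minimal sketch counts the partitions of one CTU and contrasts that with a top-down partition driven by a split predictor. It assumes CU sizes from 64×64 down to 8×8; the function names and `toy_predictor` are hypothetical illustrations and stand in for the trained network, they are not the authors' implementation.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64  # assumed CTU size
MIN_CU = 8     # assumed smallest CU size

Block = Tuple[int, int, int]  # (x, y, size)


def count_partitions(size: int, min_size: int = MIN_CU) -> int:
    """Number of distinct quad-tree partitions of a size x size block."""
    if size == min_size:
        return 1  # a minimum-size CU cannot be split further
    # Either keep the block whole, or split it into four sub-blocks
    # that are each partitioned independently.
    return 1 + count_partitions(size // 2, min_size) ** 4


def count_cus_checked(size: int, min_size: int = MIN_CU) -> int:
    """Number of CUs an exhaustive search must evaluate at all depths."""
    if size == min_size:
        return 1
    return 1 + 4 * count_cus_checked(size // 2, min_size)


def predicted_partition(
    x: int,
    y: int,
    size: int,
    should_split: Callable[[int, int, int], bool],
    min_size: int = MIN_CU,
) -> List[Block]:
    """Top-down partition: ask the predictor once per visited CU."""
    if size == min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += predicted_partition(x + dx, y + dy, half,
                                          should_split, min_size)
    return leaves


def toy_predictor(x: int, y: int, size: int) -> bool:
    # Hypothetical rule, NOT the paper's model:
    # split only the block anchored at the CTU's top-left corner.
    return x == 0 and y == 0


print(count_partitions(CTU_SIZE))    # 83522 possible partitions
print(count_cus_checked(CTU_SIZE))   # 85 CUs evaluated exhaustively
leaves = predicted_partition(0, 0, CTU_SIZE, toy_predictor)
print(len(leaves))                   # 10 leaf CUs for this toy rule
```

With a predictor, the encoder evaluates only the CUs along the predicted splits rather than all 85 candidates, which is the source of the time savings the abstract reports.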

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
