IEEE Access (Jan 2023)

Attention-Based Inter-Prediction for Versatile Video Coding

  • Quang Nhat Tran,
  • Shih-Hsuan Yang

DOI
https://doi.org/10.1109/ACCESS.2023.3303510
Journal volume & issue
Vol. 11
pp. 84313–84322

Abstract


Versatile Video Coding (VVC) is the latest video coding standard, offering significant coding-efficiency gains over its predecessors through new coding tools and added flexibility. In this paper, we propose a generative adversarial network (GAN)-based inter-picture prediction approach for VVC. The proposed method comprises two major parts: deep attention-map estimation and deep frame interpolation. For every other frame, the adjacent VVC-coded frames serve as reference data for the proposed inter-picture prediction. The deep attention map classifies pixels as high-interest or low-interest. Low-interest pixels are replaced by data generated through frame interpolation, at no extra bit cost, while the remaining pixels are encoded with the conventional VVC coding tools. The generation of the attention map and the interpolated frame can be incorporated into the VVC encoding algorithm under a unified framework. Experimental results show that the proposed method improves the coding efficiency of VVC with a moderate (26.7%) increase in runtime. On average, a BD-rate savings of 1.91% was achieved over the VVC reference software in the Random-Access configuration, with a particularly notable bitrate reduction for the chroma components (U and V).
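The core blending step described in the abstract — keeping high-interest pixels from the conventional coded signal and substituting low-interest pixels with frame-interpolation output — can be sketched as follows. This is a minimal illustration only, not the authors' implementation; the function name, the scalar attention threshold, and the toy arrays are assumptions for demonstration.

```python
import numpy as np

def blend_prediction(attention_map, coded_pixels, interpolated_pixels, threshold=0.5):
    """Combine coded and interpolated pixels per the attention map.

    Pixels whose attention score exceeds `threshold` (a hypothetical
    cutoff) are treated as high-interest and taken from the
    conventionally coded signal; the rest are replaced by the
    frame-interpolation output, which requires no extra coded bits.
    """
    high_interest = attention_map > threshold            # boolean mask
    return np.where(high_interest, coded_pixels, interpolated_pixels)

# Toy 2x2 luma block: top row high-interest, bottom row low-interest.
att = np.array([[0.9, 0.8],
                [0.1, 0.2]])
coded = np.full((2, 2), 100)    # stand-in for VVC-coded pixels
interp = np.full((2, 2), 50)    # stand-in for interpolated pixels
pred = blend_prediction(att, coded, interp)
# pred -> [[100, 100], [50, 50]]
```

In the actual codec pipeline the mask would be derived by the attention network at both encoder and decoder, so no side information is needed to signal which pixels were replaced.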

Keywords