Electronics (Mar 2024)

An Efficient Multi-Scale Attention Feature Fusion Network for 4K Video Frame Interpolation

  • Xin Ning,
  • Yuhang Li,
  • Ziwei Feng,
  • Jinhua Liu,
  • Youdong Ding

DOI
https://doi.org/10.3390/electronics13061037
Journal volume & issue
Vol. 13, no. 6
p. 1037

Abstract

Video frame interpolation aims to generate intermediate frames in a video to reveal finer motion details. However, most methods are trained and tested only on low-resolution datasets, and the 4K video frame interpolation problem remains under-explored. This limitation makes it challenging to handle high-frame-rate, high-resolution video processing in real-world scenarios. In this paper, we propose a 4K, 120 fps video dataset, named UHD4K120FPS, which contains large motion. We also propose a novel framework for the 4K video frame interpolation task, based on a multi-scale pyramid network structure. We introduce self-attention to capture long-range dependencies and self-similarities in pixel space, overcoming the limitations of convolutional operations. To reduce computational cost, we lighten self-attention with a simple mapping-based approach while retaining content-aware aggregation weights. Extensive quantitative and qualitative experiments demonstrate the excellent performance of our model on the UHD4K120FPS dataset and illustrate the effectiveness of our method for 4K video frame interpolation. In addition, we evaluate the robustness of the model on low-resolution benchmark datasets.
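To make the idea of "mapping-based" lightweight attention with content-aware aggregation weights concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it assumes weights over a small K×K neighborhood are predicted by a 1×1 convolutional mapping instead of a dense query-key dot product, and shows a toy two-level pyramid fusion around it. All module names, layer sizes, and the neighborhood size are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's code): content-aware aggregation weights
# are produced by a small conv mapping, avoiding the O(N^2) attention matrix.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightweightAttention(nn.Module):
    """Predicts per-pixel aggregation weights over a KxK neighborhood
    with a 1x1 conv mapping, then aggregates the unfolded values."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        # Mapping network: features -> K*K content-aware weights per pixel.
        self.to_weights = nn.Conv2d(channels, kernel_size * kernel_size, 1)
        self.to_value = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Content-aware weights, normalized over the neighborhood positions.
        w_map = F.softmax(self.to_weights(x), dim=1)            # (B, K*K, H, W)
        v = self.to_value(x)
        # Gather each pixel's KxK neighborhood of values.
        v = F.unfold(v, self.k, padding=self.k // 2)             # (B, C*K*K, H*W)
        v = v.view(b, c, self.k * self.k, h, w)
        # Weighted aggregation replaces the dense attention matrix.
        out = (v * w_map.unsqueeze(1)).sum(dim=2)                # (B, C, H, W)
        return out + x

class TwoLevelPyramidFusion(nn.Module):
    """Toy multi-scale fusion: refine a downsampled copy with the lightweight
    attention, upsample, and fuse with the full-resolution features."""
    def __init__(self, channels):
        super().__init__()
        self.attn = LightweightAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat):
        coarse = F.avg_pool2d(feat, 2)
        coarse = self.attn(coarse)
        coarse = F.interpolate(coarse, size=feat.shape[-2:],
                               mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([feat, coarse], dim=1))

if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)
    print(TwoLevelPyramidFusion(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```

The sketch only illustrates why a conv-predicted weight map keeps aggregation content-dependent while scaling linearly with resolution, which matters at 4K; the actual network in the paper uses its own multi-scale pyramid design.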

Keywords