Video frame interpolation via spatial multi‐scale modelling

Zhe Qu; Weijing Liu; Lizhen Cui; Xiaohui Yang

doi:10.1049/cvi2.12281

IET Computer Vision (Jun 2024)

Affiliations

Zhe Qu: School of Software Shandong University Jinan China
Weijing Liu: School of Information Science and Engineering University of Jinan Jinan China
Lizhen Cui: School of Software Shandong University Jinan China
Xiaohui Yang: School of Information Science and Engineering University of Jinan Jinan China

DOI: https://doi.org/10.1049/cvi2.12281
Journal volume & issue: Vol. 18, no. 4
pp. 458 – 472

Abstract

Video frame interpolation (VFI) synthesises intermediate frames between adjacent original video frames to increase the temporal resolution of a video. However, existing methods usually rely on heavy model architectures with a large number of parameters. The authors introduce an efficient VFI network based on multiple lightweight convolutional units and a local three‐scale encoding (LTSE) structure. Firstly, the authors introduce an LTSE structure with two‐level attention cascades, designed to capture details and contextual information efficiently across diverse image scales. Secondly, the authors introduce recurrent convolutional layers (RCL) and residual operations, designing the recurrent residual convolutional unit to optimise the LTSE structure. Additionally, a lightweight convolutional unit named the separable recurrent residual convolutional unit is introduced to reduce the number of model parameters. Finally, the authors obtain three‐scale decoding features from the decoder and warp them into a set of three‐scale pre‐warped maps, which are fused into the synthesis network to generate high‐quality interpolated frames. The experimental results indicate that the proposed approach achieves superior performance with fewer model parameters.
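
As a rough illustration of why a separable convolutional unit can reduce model parameters, the sketch below compares the parameter count of a standard convolution with that of a depthwise separable one. It assumes that "separable" refers to a depthwise convolution followed by a pointwise 1 x 1 convolution, which the abstract does not confirm. The function names, the 64-channel width and the 3 x 3 kernel are illustrative choices, not values from the paper.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def separable_conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise k x k convolution followed by
    a pointwise 1 x 1 convolution (assumed form of 'separable')."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Illustrative layer size (hypothetical, not taken from the paper).
standard = conv_params(64, 64, 3)
separable = separable_conv_params(64, 64, 3)
print(f"standard: {standard}, separable: {separable}")
```

Under these assumptions, the separable layer needs several times fewer parameters than the standard one, and the gap widens as the channel count grows.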

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
