Applied Sciences (Aug 2022)

Adaptive Multi-Modal Ensemble Network for Video Memorability Prediction

  • Jing Li,
  • Xin Guo,
  • Fumei Yue,
  • Fanfu Xue,
  • Jiande Sun

DOI
https://doi.org/10.3390/app12178599
Journal volume & issue
Vol. 12, no. 17
p. 8599

Abstract


Video memorability prediction aims to quantify how likely a video is to be remembered based on its content, which provides significant value in advertising design, social media recommendation, and other applications. However, the main attributes that affect memorability have not been determined, which makes the design of prediction models more challenging. Therefore, in this study, we analyze and experimentally verify how to select the factors with the greatest impact on video memorability. Furthermore, we design a new framework, the Adaptive Multi-modal Ensemble Network, based on the chosen impact factors to predict video memorability efficiently. Specifically, we first identify three main factors that affect video memorability, i.e., temporal 3D information, spatial information, and semantics, derived from the video, image, and caption, respectively. Then, the Adaptive Multi-modal Ensemble Network integrates three individual base learners (i.e., ResNet3D, Deep Random Forest, and Multi-Layer Perceptron) into a weighted ensemble framework to score video memorability. In addition, we design an adaptive learning strategy that updates the ensemble weights according to the predictive importance of each base learner, rather than assigning weights manually. Finally, experiments on the public VideoMem dataset demonstrate that the proposed method provides competitive results and high efficiency for video memorability prediction.
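To make the ensemble idea concrete, the following is a minimal sketch of a weighted ensemble over three base learners' memorability scores. The arrays standing in for ResNet3D, Deep Random Forest, and Multi-Layer Perceptron outputs are placeholders, and the weight-update rule (normalized inverse validation error) is an illustrative assumption, not the paper's exact adaptive learning strategy.

```python
# Hedged sketch: weighted ensemble of per-learner memorability scores.
# Base-learner predictions are toy placeholders; the inverse-MSE
# weighting is an assumed stand-in for the paper's adaptive strategy.
import numpy as np

def adaptive_weights(val_preds, val_targets):
    """Weight each base learner by its inverse mean-squared error
    on held-out validation clips, normalized to sum to 1."""
    errors = np.array([np.mean((p - val_targets) ** 2) for p in val_preds])
    inv = 1.0 / (errors + 1e-8)  # small epsilon avoids division by zero
    return inv / inv.sum()

def ensemble_score(preds, weights):
    """Weighted average of the learners' memorability scores."""
    return np.tensordot(weights, np.array(preds), axes=1)

# Toy example: three learners' scores for four validation clips.
val_targets = np.array([0.9, 0.6, 0.8, 0.4])
val_preds = [
    np.array([0.88, 0.62, 0.79, 0.41]),  # "ResNet3D" (temporal 3D)
    np.array([0.80, 0.55, 0.70, 0.50]),  # "Deep Random Forest" (spatial)
    np.array([0.95, 0.70, 0.85, 0.30]),  # "MLP" (semantic / caption)
]
w = adaptive_weights(val_preds, val_targets)

# Score a new clip: each learner emits one memorability prediction.
test_preds = [np.array([0.75]), np.array([0.70]), np.array([0.80])]
print(ensemble_score(test_preds, w))
```

The most accurate learner on validation data automatically receives the largest weight, so the combination adapts to the data rather than relying on manually tuned coefficients.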

Keywords