Unleashing the Power of Contrastive Learning for Zero-Shot Video Summarization

Zongshang Pang; Yuta Nakashima; Mayu Otani; Hajime Nagahara

doi:10.3390/jimaging10090229

Journal of Imaging (Sep 2024)

Affiliations

Zongshang Pang: Intelligence and Sensing Lab, Osaka University, Suita 565-0871, Japan
Yuta Nakashima: Intelligence and Sensing Lab, Osaka University, Suita 565-0871, Japan
Mayu Otani: CyberAgent, Inc., Tokyo 150-0042, Japan
Hajime Nagahara: Intelligence and Sensing Lab, Osaka University, Suita 565-0871, Japan

Journal volume & issue: Vol. 10, no. 9, p. 229

Abstract

Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing. Past efforts have invariably involved training summarization models with annotated summaries or heuristic objectives. In this work, we reveal that features pre-trained on image-level tasks contain rich semantic information that can be readily leveraged to quantify frame-level importance for zero-shot video summarization. Leveraging pre-trained features and contrastive learning, we propose three metrics that characterize a desirable keyframe: local dissimilarity, global consistency, and uniqueness. We show that these metrics capture well the diversity and representativeness of frames, the criteria commonly used for the unsupervised generation of video summaries, and that they achieve competitive or better performance compared to past methods while requiring no training. We further propose a contrastive learning-based pre-training strategy on unlabeled videos to enhance the quality of the proposed metrics and thus improve the evaluated performance on the public benchmarks TVSum and SumMe.

Published in Journal of Imaging

ISSN: 2313-433X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Photography; Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.mdpi.com/journal/jimaging
