Scientific Reports (Sep 2023)

A lightweight weak semantic framework for cinematographic shot classification

  • Yuzhi Li,
  • Tianfeng Lu,
  • Feng Tian

DOI
https://doi.org/10.1038/s41598-023-43281-w
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 12

Abstract

The shot is one of the fundamental units in the content structure of a film and can provide insight into the film director’s ideas. By analyzing the properties and types of shots, we can gain a better understanding of a film’s visual language. In this paper, we study the task of shot type classification in depth and propose two hypotheses: that multimodal video inputs can effectively improve accuracy on the task, and that shot type classification is closely related to low-level spatiotemporal semantic features. To this end, we propose a Lightweight Weak Semantic Relevance Framework (LWSRNet) for classifying cinematographic shot types. Our framework comprises two modules: a Linear Modalities Fusion module (LMF Module) capable of fusing an arbitrary number of video modalities, and a Weak Semantic 3D-CNN-based Feature Extraction Backbone (WSFE Module) for classifying shot movement and shot scale. Moreover, to support practical cinematographic analysis, we collect FullShots, a large film shot dataset containing 27K shots from 19 movies with professional annotations for movement and scale information. Experimental results validate our proposed hypotheses, and our framework outperforms previous methods in accuracy with fewer parameters and less computation on both the FullShots and MovieShots datasets. Our code is available at (https://github.com/litchiar/ShotClassification).