Video summarisation by deep visual and categorical diversity

Pedro Atencio; Sánchez‐Torres German; John William Branch; Claudio Delrieux

doi:10.1049/iet-cvi.2018.5436

IET Computer Vision (Sep 2019)

Affiliations

Pedro Atencio: Faculty of Engineering, Instituto Tecnológico Metropolitano, Medellin, Colombia
Sánchez‐Torres German: Faculty of Engineering, Universidad del Magdalena, Santa Marta, Colombia
John William Branch: Faculty of Mines, Universidad Nacional de Colombia, Medellin, Colombia
Claudio Delrieux: Electric and Computing Engineering Department, Universidad Nacional del Sur, Bahia Blanca, Argentina

DOI: https://doi.org/10.1049/iet-cvi.2018.5436
Journal volume & issue: Vol. 13, no. 6
pp. 569 – 577

Abstract

The authors propose a video‐summarisation method based on visual and categorical diversity, using pre‐trained deep visual and categorical models. The method extracts visual features from a pre‐trained deep convolutional network (DCN) and categorical features from a pre‐trained word‐embedding matrix. From this visual and categorical information they obtain a video diversity estimate, which is used as an importance score to select the segments of the input video that best describe it. The method also supports queries during the search process, so that the resulting summaries can be personalised for a particular intended purpose. The performance of the method is evaluated with different pre‐trained DCN models in order to select the architecture with the best throughput. The authors then compare it with other state‐of‐the‐art video‐summarisation proposals using a data‐driven approach on the public SumMe dataset, which contains videos annotated with per‐fragment importance. The results show that their method outperforms the other proposals in most of the examples. As an additional advantage, the method is simple and direct to implement and requires no training stage.
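
The idea of using diversity as an importance score can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give. It assumes that each segment already has a visual feature vector and a categorical embedding, that diversity is the mean cosine distance to all other segments, that the two scores are mixed with a hypothetical weight `alpha`, and that segments are picked greedily under a hypothetical length `budget`. Query-based personalisation is not covered.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero vectors count as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def diversity_scores(features):
    """Mean cosine distance of each segment to every other segment (assumed measure)."""
    n = len(features)
    if n < 2:
        return [0.0] * n
    scores = []
    for i in range(n):
        total = sum(
            cosine_distance(features[i], features[j]) for j in range(n) if j != i
        )
        scores.append(total / (n - 1))
    return scores


def select_segments(visual, categorical, lengths, budget, alpha=0.5):
    """Greedily pick the most diverse segments whose total length fits the budget.

    alpha (hypothetical) weights visual against categorical diversity.
    Returns the chosen segment indices in temporal order and all importance scores.
    """
    vis = diversity_scores(visual)
    cat = diversity_scores(categorical)
    importance = [alpha * v + (1 - alpha) * c for v, c in zip(vis, cat)]
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + lengths[i] <= budget:
            chosen.append(i)
            used += lengths[i]
    return sorted(chosen), importance


# Toy input: segments 0 and 1 are identical, segment 2 differs from both.
visual = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
categorical = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
chosen, importance = select_segments(visual, categorical, [10, 10, 10], budget=10)
print(chosen, importance)
```

In the toy input the third segment differs from the two identical ones, so it receives the highest score and is the only one that fits a budget of one segment. In practice the feature vectors would come from a pre‐trained DCN and a word‐embedding matrix, as the abstract describes.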

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
