IEEE Access (Jan 2019)

User-Guided Clustering for Video Segmentation on Coarse-Grained Feature Extraction

  • Xinhui Peng,
  • Rui Li,
  • Jilong Wang,
  • Hao Shang

DOI
https://doi.org/10.1109/ACCESS.2019.2946889
Journal volume & issue
Vol. 7
pp. 149820 – 149832

Abstract

Video segmentation is the task of temporally dividing a video into semantic sections, each typically based on a specific concept or theme that is usually defined by the user's intention. However, previous studies of video segmentation have thus far not taken the user's intention into consideration. In this paper, a two-stage user-guided video segmentation framework is presented, consisting of dimension reduction and temporal clustering. During the dimension reduction stage, coarse-grained feature extraction is performed by a deep convolutional neural network pre-trained on ImageNet. In the temporal clustering stage, information about the user's intention is used to segment videos in the time domain with a proposed operator, which calculates the similarity distance between dimension-reduced frames. To provide more insight into the videos, a hierarchical clustering method that allows users to segment videos at different granularities is proposed. Evaluation on the Open Video Scene Detection (OVSD) dataset shows that the proposed method achieves an average F-score of 0.72, even though only coarse-grained feature extraction is adopted. The evaluation also demonstrates that the proposed method not only produces different segmentation results according to the user's intention, but also produces hierarchical segmentation results from a low level to a higher level of abstraction.
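
As a rough illustration of the temporal, hierarchical clustering idea described in the abstract, the following minimal sketch greedily merges temporally adjacent segments of per-frame feature vectors. It is not the operator proposed in the paper: the cosine distance, the mean-pooled segment representation, the greedy merge rule, and the names `temporal_hierarchy` and `segment` are all assumptions made for illustration. The toy 2-D vectors stand in for CNN features, and the user's intention is modeled only as a choice of granularity.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors (assumed metric, not the paper's operator)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def temporal_hierarchy(features):
    """Build a hierarchy by repeatedly merging the closest pair of
    temporally adjacent segments.

    Returns a list of levels. Level k holds len(features) - k segments,
    each an inclusive (start, end) frame range.
    """
    segments = [(i, i) for i in range(len(features))]
    levels = [list(segments)]
    while len(segments) > 1:
        means = [mean_vector(features[s:e + 1]) for s, e in segments]
        gaps = [cosine_distance(means[i], means[i + 1])
                for i in range(len(segments) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        # Only neighbours in time may merge, so segments stay contiguous.
        segments[i:i + 2] = [(segments[i][0], segments[i + 1][1])]
        levels.append(list(segments))
    return levels


def segment(features, num_segments):
    """Pick the hierarchy level with the requested number of segments."""
    if not 1 <= num_segments <= len(features):
        raise ValueError("num_segments must be between 1 and the frame count")
    return temporal_hierarchy(features)[len(features) - num_segments]


# Toy stand-ins for dimension-reduced frame features (hypothetical values).
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]]
print(segment(frames, 2))
```

Requesting a different `num_segments` reads off a coarser or finer level of the same hierarchy, which loosely mirrors segmenting a video at different granularities.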

Keywords