Machine Learning with Applications (Sep 2022)

Compressed video ensemble based pseudo-labeling for semi-supervised action recognition

  • Hayato Terao,
  • Wataru Noguchi,
  • Hiroyuki Iizuka,
  • Masahito Yamamoto

Journal volume & issue
Vol. 9
p. 100336

Abstract

Several recent studies have applied deep-learning-based semi-supervised learning to action recognition. However, their training is difficult to scale up because they take RGB frames as input, and obtaining those frames incurs computational and storage costs. In this paper, we propose a semi-supervised action recognition method that makes training easier to scale up by using features stored in compressed videos. Our method extracts multiple types of input features directly from compressed videos without any decoding and generates artificial labels for unlabeled videos by ensembling the predictions made from these features. In addition to standard supervised training on labeled videos, our models are trained to predict the artificial labels from strongly augmented features of unlabeled compressed videos. We show that, on some widely used datasets, our method is more efficient and achieves better classification performance than conventional semi-supervised learning methods that use RGB frames.
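The ensembling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how predictions are combined, so simple averaging of class probabilities is assumed here, and the confidence threshold is an assumed filter borrowed from common pseudo-labeling practice. The feature-type names (`iframe`, `motion_vector`, `residual`) and the function `ensemble_pseudo_labels` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def ensemble_pseudo_labels(
    probs_by_feature: Dict[str, Sequence[Sequence[float]]],
    threshold: float = 0.8,
) -> List[Optional[int]]:
    """Average per-feature class probabilities; return one pseudo-label per video.

    probs_by_feature maps a feature-type name to a list of per-video
    probability vectors. A video gets None when the averaged confidence is
    below `threshold` (an assumption, not taken from the abstract).
    """
    streams = list(probs_by_feature.values())
    if not streams:
        return []
    num_videos = len(streams[0])
    labels: List[Optional[int]] = []
    for i in range(num_videos):
        num_classes = len(streams[0][i])
        # Element-wise mean of the probability vectors across feature types.
        mean = [
            sum(stream[i][c] for stream in streams) / len(streams)
            for c in range(num_classes)
        ]
        best = max(range(num_classes), key=lambda c: mean[c])
        # Keep the label only if the ensemble is confident enough.
        labels.append(best if mean[best] >= threshold else None)
    return labels


# Toy predictions for two unlabeled videos and three classes (made-up numbers).
predictions = {
    "iframe": [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]],
    "motion_vector": [[0.8, 0.1, 0.1], [0.2, 0.5, 0.3]],
    "residual": [[0.85, 0.1, 0.05], [0.3, 0.3, 0.4]],
}
pseudo_labels = ensemble_pseudo_labels(predictions, threshold=0.8)
print(pseudo_labels)  # [0, None]
```

In this toy run the first video receives label 0 because the three feature types agree with high confidence, while the second is left unlabeled because the averaged prediction is too uncertain. In a training loop of the kind the abstract describes, the retained labels would then serve as targets for predictions made from strongly augmented versions of the same features.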

Keywords