IEEE Access (Jan 2024)

Dynamic Texture Classification Using AutoEncoder-Based Local Features and Fisher Vector Encoding

  • Zhe Li,
  • Xiaochao Zhao,
  • Tianfan Zhang,
  • Xiao Jing,
  • Wei Shi,
  • Qian Chen

DOI
https://doi.org/10.1109/ACCESS.2024.3421666
Journal volume & issue
Vol. 12
pp. 90768–90781

Abstract


Dynamic texture classification has been widely studied because of its applications in various computer vision tasks. The key to classifying dynamic textures lies in describing them, i.e., extracting features from them. A variety of traditional dynamic texture descriptors have been carefully designed in prior studies, and some researchers directly use pre-trained deep models for feature extraction. However, training a deep model from scratch for dynamic texture description is rarely explored due to the lack of a large-scale dynamic texture dataset. In this paper, we propose to train a deep model on existing small-scale dynamic texture datasets for feature extraction. We first randomly sample a number of 3D cubes from each training video. A simple AutoEncoder network is then trained on these cubes, and its encoder serves as a local feature extractor. The features extracted from all training cubes are used to fit a Gaussian mixture model, which is later used for Fisher vector encoding. Finally, given a video, we densely sample cubes, feed them into the encoder, and encode the output local features into a global feature vector using the learned Gaussian mixture model. The proposed method is evaluated on three benchmark datasets under various evaluation protocols, and the competitive results verify its effectiveness.
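
As a rough illustration of the pipeline described in the abstract (not the authors' released code), the sketch below trains a small 3D-convolutional AutoEncoder on sampled cubes, fits a diagonal-covariance Gaussian mixture model on the encoder outputs, and encodes a video's densely sampled local features into an improved Fisher vector. The cube size (16x16x16, single channel), feature dimension, number of mixture components, and names such as CubeAutoEncoder and fisher_vector are illustrative assumptions.

import numpy as np
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

# Encoder/decoder for 1-channel 16x16x16 cubes; the encoder output is the local feature.
class CubeAutoEncoder(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 16^3 -> 8^3
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 8^3 -> 4^3
            nn.Flatten(),
            nn.Linear(32 * 4 * 4 * 4, feat_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 32 * 4 * 4 * 4), nn.ReLU(),
            nn.Unflatten(1, (32, 4, 4, 4)),
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 4^3 -> 8^3
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),              # 8^3 -> 16^3
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def fisher_vector(local_feats, gmm):
    # Improved Fisher vector (mean and variance derivatives) for a (T, D) feature set.
    T = local_feats.shape[0]
    gamma = gmm.predict_proba(local_feats)                  # (T, K) soft assignments
    w, mu = gmm.weights_, gmm.means_                        # (K,), (K, D)
    sigma = np.sqrt(gmm.covariances_)                       # (K, D) with 'diag' covariance
    diff = (local_feats[:, None, :] - mu[None]) / sigma[None]
    g_mu = (gamma[..., None] * diff).sum(0) / (T * np.sqrt(w)[:, None])
    g_sig = (gamma[..., None] * (diff ** 2 - 1)).sum(0) / (T * np.sqrt(2 * w)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                  # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)                # L2 normalization

# Training sketch: each cube tensor has shape (N, 1, 16, 16, 16).
def train_and_encode(train_cubes, video_cubes, feat_dim=64, n_components=32, epochs=20):
    model = CubeAutoEncoder(feat_dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        recon, _ = model(train_cubes)
        loss = nn.functional.mse_loss(recon, train_cubes)   # reconstruction objective
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        train_feats = model.encoder(train_cubes).numpy()
        video_feats = model.encoder(video_cubes).numpy()
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag").fit(train_feats)
    return fisher_vector(video_feats, gmm)                  # global descriptor for one video

The resulting Fisher vector can then be fed to any standard classifier (e.g., a linear SVM), which is the usual choice for Fisher-vector-encoded features.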

Keywords