Applied Sciences (Jul 2023)

Enhanced Spatial Stream of Two-Stream Network Using Optical Flow for Human Action Recognition

  • Shahbaz Khan,
  • Ali Hassan,
  • Farhan Hussain,
  • Aqib Perwaiz,
  • Farhan Riaz,
  • Maazen Alsabaan,
  • Wadood Abdul

DOI
https://doi.org/10.3390/app13148003
Journal volume & issue
Vol. 13, no. 14
p. 8003

Abstract

Introduction: Convolutional neural networks (CNNs) remain the dominant deep learning approach for human action recognition (HAR) and other computer vision tasks. However, their performance is limited by the need for large amounts of training data. Method: This paper builds on the two-stream network, in which CNNs are trained on both the spatial and the temporal aspects of an activity, exploiting the strengths of both streams to achieve better accuracy. Contributions: Our contribution is twofold. First, we deploy an enhanced spatial stream and demonstrate that using models pre-trained on a larger dataset in the spatial stream yields good performance without training the entire model from scratch. Second, we present a dataset augmentation technique to reduce overfitting of CNNs, in which the dataset is enlarged by applying transformations such as rotation and flipping to the images. Results: Our architecture was trained and validated on UCF101, a standard benchmark dataset for action videos. It achieved higher accuracy than the other two-stream networks it was compared with.
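
The augmentation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nested-list image representation, the function names, and the choice of a 90-degree rotation are assumptions made for illustration, since the abstract does not state the rotation angles or the tooling used.

```python
from typing import List

# Hypothetical representation: a grayscale frame as rows of pixel values.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise (angle chosen for illustration)."""
    # Reverse the row order, then transpose.
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image) -> List[Image]:
    """Return the original image together with its flipped and rotated variants."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img)]


frame = [[1, 2],
         [3, 4]]
augmented = augment(frame)
print(len(augmented))  # one original frame becomes three training samples
```

Each source frame yields several training samples, which is how this kind of augmentation enlarges the dataset. In practice an image library would be used to apply such transformations to full video frames.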

Keywords