Applied Artificial Intelligence (Jul 2021)

Scene Recognition by Joint Learning of DNN from Bag of Visual Words and Convolutional DCT Features

  • Abdul Rehman,
  • Summra Saleem,
  • Usman Ghani Khan,
  • Saira Jabeen,
  • M. Omair Shafiq

DOI
https://doi.org/10.1080/08839514.2021.1881296
Journal volume & issue
Vol. 35, no. 9
pp. 623 – 641

Abstract

Scene recognition underpins many computer vision applications, including information retrieval, robotics, real-time monitoring, and event classification. Because the task is complex, it has benefited greatly from deep learning architectures trained on large, comprehensive datasets. This paper presents a scene classification method in which local and global features are concatenated with the DCT-convolutional features of AlexNet, and the combined features are fed into AlexNet's fully connected layers for classification. The local and global features are made efficient by choosing an appropriate Bag of Visual Words (BOVW) size and by applying feature selection techniques, both of which are evaluated in the experimentation section. We modify AlexNet by adding further fully connected (dense) layers and compare its results with those of a model pre-trained on the Places365 dataset. We also compare our model with other scene recognition methods, and it achieves higher accuracy than they do.
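
The feature fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook, descriptor dimensions, and the stand-in "convolutional" vector are random placeholders, and the helper names (`nearest_word`, `bovw_histogram`, `fuse_features`) are hypothetical. It only shows how local descriptors might be quantized into a BOVW histogram and concatenated with another feature vector before a classifier.

```python
import random

def nearest_word(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor
    (squared Euclidean distance)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))

def bovw_histogram(descriptors, codebook):
    """Count how often each visual word is the nearest codeword,
    then L1-normalize so the histogram sums to 1."""
    hist = [0.0] * len(codebook)
    for desc in descriptors:
        hist[nearest_word(desc, codebook)] += 1.0
    total = sum(hist) or 1.0  # avoid division by zero for empty input
    return [h / total for h in hist]

def fuse_features(bovw_hist, conv_features):
    """Concatenate the BOVW histogram with another feature vector."""
    return list(bovw_hist) + list(conv_features)

# Assumed toy sizes: 5 visual words, 8-D descriptors, 16-D conv vector.
random.seed(0)
DIM, WORDS = 8, 5
codebook = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(WORDS)]
descriptors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(40)]
conv_features = [random.gauss(0, 1) for _ in range(16)]  # placeholder values

fused = fuse_features(bovw_histogram(descriptors, codebook), conv_features)
print(len(fused))  # 5 histogram bins + 16 placeholder features = 21
```

In practice the codebook would be learned (for example by clustering descriptors from training images), and the fused vector would be passed to fully connected layers. Both steps are outside this sketch.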