IET Computer Vision (Sep 2022)

A novel visual classification framework on panoramic attention mechanism network

  • Wenshu Li,
  • Shenhao Li,
  • Lingzhi Yin,
  • Xiaoying Guo,
  • Xu Yang

DOI
https://doi.org/10.1049/cvi2.12105
Journal volume & issue
Vol. 16, no. 6
pp. 479 – 488

Abstract

Fine-grained classification is challenging because discriminative features are hard to find and the regions that contain them are hard to localize. To address these challenges, a novel visual classification framework based on a panoramic attention mechanism is proposed; it combines multiple attention networks to locate and identify features of greater semantic interest. First, building on a classical convolutional neural network, the global information of the image features is expressed by linear fusion. Second, a foreground attention branch further extracts the distinguishing details of the salient features. Third, a background attention branch mines additional features from the complementary object area to learn a more complete fine-grained feature representation. Finally, the three network branches are trained together to strengthen the network's ability to express representative features of fine-grained images. The model can be viewed as a multi-branch network whose branches benefit one another and are optimized jointly. Experiments were conducted on the CUB-200-2011, Stanford Dogs and FGVC-Aircraft datasets, with accuracy as the quantitative measure. The results show that the proposed method achieves the highest accuracy among the compared methods, with an average accuracy of 89.8%, indicating that it is effective and superior to current advanced methods.
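
The joint training of the three branches can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the equal loss weights, the thresholded complementary mask used to stand in for the background branch's input, and the averaging of branch probabilities at inference are all hypothetical choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss of one sample for one branch."""
    return -math.log(softmax(logits)[label])


def complementary_mask(attention, threshold=0.5):
    """Hypothetical background mask: 1 where foreground attention is weak,
    0 where it is salient, so a second branch looks at the remaining area."""
    return [[0.0 if a >= threshold else 1.0 for a in row] for row in attention]


def joint_loss(branch_logits, label, weights=(1.0, 1.0, 1.0)):
    """Assumed joint objective: weighted sum of the per-branch losses
    (global, foreground, background), so all branches are optimized together."""
    return sum(w * cross_entropy(l, label) for w, l in zip(weights, branch_logits))


def fused_prediction(branch_logits):
    """Assumed inference rule: average the branch probabilities, take the argmax."""
    probs = [softmax(l) for l in branch_logits]
    n_classes = len(probs[0])
    avg = [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


# Toy scores for three classes from the global, foreground and background branches.
logits = [[2.0, 0.5, 0.1], [1.5, 0.2, 0.3], [0.2, 1.0, 0.1]]
total = joint_loss(logits, label=0)
predicted = fused_prediction(logits)
mask = complementary_mask([[0.9, 0.2], [0.4, 0.7]])
```

Summing the branch losses means a gradient step on the total updates every branch at once, which is one simple way to realize the "trained together" idea; the paper's actual loss and fusion rule may differ.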

Keywords