Weakly Supervised Fine-Grained Image Classification via Salient Region Localization and Different Layer Feature Fusion

Fangxiong Chen; Guoheng Huang; Jiaying Lan; Yanhui Wu; Chi-Man Pun; Wing-Kuen Ling; Lianglun Cheng

doi:10.3390/app10134652

Applied Sciences (Jul 2020)

Affiliations

Fangxiong Chen: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
Guoheng Huang: School of Computers, Guangdong University of Technology, Guangzhou 510006, China
Jiaying Lan: School of Computers, Guangdong University of Technology, Guangzhou 510006, China
Yanhui Wu: School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
Chi-Man Pun: Department of Computer and Information Science, University of Macau, Macau SAR 999078, China
Wing-Kuen Ling: School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
Lianglun Cheng: School of Computers, Guangdong University of Technology, Guangzhou 510006, China

DOI: https://doi.org/10.3390/app10134652
Journal volume & issue: Vol. 10, no. 13, p. 4652

Abstract

Fine-grained image classification aims to distinguish between highly similar object classes. The task is difficult because intra-class variance is large and inter-class variance is small. For this reason, improving a model's accuracy on the task has relied heavily on annotations of discriminative parts and regional parts. This dependence on fine annotations limits the practicality of such models. To tackle this issue, this article proposes a weakly supervised fine-grained image classification model built on a saliency module. Through its salient region localization module, the proposed model uses saliency maps to localize essential regional parts when only image class annotations are provided. In addition, a bilinear attention module improves feature extraction by using higher- and lower-level layers of the network to fuse regional features with global features. Building on the bilinear attention architecture, we propose a different layer feature fusion module to improve the expressive power of the model's features. We tested and verified our model on public datasets released specifically for fine-grained image classification. The results show that the proposed model achieves close to state-of-the-art classification performance on various datasets while requiring only minimal training data. This result indicates that the practicality of our model is greatly improved, since fine-grained image datasets are expensive to obtain.
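
The two ideas named in the abstract, localizing a salient region from a saliency map and fusing features from different layers bilinearly, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `salient_box` and `bilinear_fuse`, the relative threshold `ratio`, and the toy inputs are all assumptions made for illustration, and plain Python lists stand in for the network's feature tensors.

```python
def salient_box(saliency, ratio=0.5):
    """Bounding box (top, left, bottom, right) of cells whose saliency
    is at least `ratio` times the map's maximum.

    `ratio` is a hypothetical threshold, not a value from the paper.
    """
    peak = max(max(row) for row in saliency)
    threshold = ratio * peak
    rows = [r for r, row in enumerate(saliency)
            if any(v >= threshold for v in row)]
    cols = [c for c in range(len(saliency[0]))
            if any(row[c] >= threshold for row in saliency)]
    return rows[0], cols[0], rows[-1], cols[-1]


def bilinear_fuse(low, high):
    """Average outer product of two feature sets over shared positions.

    low:  list of positions, each a list of C1 values (lower layer)
    high: list of positions, each a list of C2 values (higher layer)
    Returns a flattened vector of length C1 * C2.
    """
    n = len(low)
    c1, c2 = len(low[0]), len(high[0])
    fused = [0.0] * (c1 * c2)
    for x, y in zip(low, high):
        for i in range(c1):
            for j in range(c2):
                fused[i * c2 + j] += x[i] * y[j] / n
    return fused


# Toy saliency map: the bright 2x2 block in the middle is the "object".
saliency_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
box = salient_box(saliency_map)

# Toy features at two spatial positions from a lower and a higher layer.
fused = bilinear_fuse([[1.0, 2.0], [3.0, 4.0]], [[1.0], [1.0]])
```

In this sketch the box would be used to crop a regional view of the image, and the fused vector would feed a classifier. How the paper computes its saliency maps and attention weights is not specified in the abstract and is not reproduced here.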

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
