Learning Attention-Aware Interactive Features for Fine-Grained Vegetable and Fruit Classification

Yimin Wang; Zhifeng Xiao; Lingguo Meng

doi:10.3390/app11146533

Applied Sciences (Jul 2021)

Affiliations

Yimin Wang: School of Microelectronics, Shandong University, Jinan 250101, China
Zhifeng Xiao: School of Engineering, Penn State Erie, The Behrend College, Erie, PA 16563, USA
Lingguo Meng: School of Microelectronics, Shandong University, Jinan 250101, China

DOI: https://doi.org/10.3390/app11146533
Journal volume & issue: Vol. 11, no. 14, p. 6533

Abstract

Vegetable and fruit recognition can be considered a fine-grained visual categorization (FGVC) task, which is challenging due to the large intraclass variances and small interclass variances. A mainstream direction to address the challenge is to exploit fine-grained local/global features to enhance the feature extraction and representation in the learning pipeline. However, unlike the human visual system, most of the existing FGVC methods only extract features from individual images during training. In contrast, human beings can learn discriminative features by comparing two different images. Inspired by this intuition, a recent FGVC method, named Attentive Pairwise Interaction Network (API-Net), takes as input an image pair for pairwise feature interaction and demonstrates superior performance in several open FGVC data sets. However, the accuracy of API-Net on VegFru, a domain-specific FGVC data set, is lower than expected, potentially due to the lack of spatialwise attention. Following this direction, we propose an FGVC framework named Attention-aware Interactive Features Network (AIF-Net) that refines the API-Net by integrating an attentive feature extractor into the backbone network. Specifically, we employ a region proposal network (RPN) to generate a collection of informative regions and apply a biattention module to learn global and local attentive feature maps, which are fused and fed into an interactive feature learning subnetwork. The novel neural structure is verified through extensive experiments and shows consistent performance improvement over the state of the art on the VegFru data set, demonstrating its superiority in fine-grained vegetable and fruit recognition. We also discover that a concatenation fusion operation applied in the feature extractor, along with three top-scoring regions suggested by an RPN, can effectively boost the performance.
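The last finding, concatenation fusion over the three top-scoring regions, can be illustrated with a minimal sketch. This is not the authors' implementation: plain Python lists stand in for the attentive feature maps, the RPN scores are supplied by hand, the biattention module and the interactive subnetwork are omitted, and the function names `top_k_region_indices` and `fuse_by_concatenation` are hypothetical.

```python
from typing import List, Sequence


def top_k_region_indices(scores: Sequence[float], k: int = 3) -> List[int]:
    """Return the indices of the k highest-scoring regions, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def fuse_by_concatenation(
    global_feat: Sequence[float],
    region_feats: Sequence[Sequence[float]],
    region_scores: Sequence[float],
    k: int = 3,
) -> List[float]:
    """Concatenate the global feature with the top-k region features.

    Assumption: each feature is already a flat vector; in a real network
    these would be pooled CNN feature maps.
    """
    fused = list(global_feat)
    for i in top_k_region_indices(region_scores, k):
        fused.extend(region_feats[i])
    return fused


# Toy data: one global vector and four candidate regions with scores.
global_feat = [1.0, 2.0]
region_feats = [[0.1], [0.2], [0.3], [0.4]]
region_scores = [0.9, 0.1, 0.8, 0.7]

fused = fuse_by_concatenation(global_feat, region_feats, region_scores)
print(fused)
```

Concatenation keeps the global and local features in separate slots of the fused vector, so a later layer can weight them independently; the fused vector grows with the number of regions kept.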

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
