Jisuanji kexue (Jan 2022)

Dual-stream Reconstruction Network for Multi-label and Few-shot Learning

  • FANG Zhong-li, WANG Zhe, CHI Zi-qiu

DOI
https://doi.org/10.11896/jsjkx.201100143
Journal volume & issue
Vol. 49, no. 1
pp. 212 – 218

Abstract

Multi-label image classification is one of the central problems in computer vision: the model must predict every label present in an image. An image usually carries more than one label, and variation in the size, pose, and position of objects makes classification harder still. Improving the expressive power of image features is therefore a pressing problem. To address it, a novel dual-stream reconstruction network is proposed for extracting image features. First, a dual-stream attention network extracts features from channel information and from spatial information, and the two are concatenated so that the resulting image features carry both channel detail and spatial detail. Second, a reconstruction loss function is introduced to constrain the features of the dual-stream network, forcing the two divergent features to have the same expressive ability and thereby pushing the extracted dual-stream features toward the ground-truth features. Experimental results on the multi-label image datasets VOC 2007 and MS COCO show that the proposed dual-stream reconstruction network extracts salient features accurately and effectively and achieves better classification accuracy. In addition, because the reconstruction loss has a sparsifying effect on the model's features, the proposed method is also applied to few-shot learning. The experimental results show that the proposed model achieves good classification accuracy for few-shot learning as well.
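The abstract does not give the architecture or the loss in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a sigmoid gate on each channel's global average for the channel stream, a sigmoid gate on the cross-channel average at each location for the spatial stream, and a mean squared error between the two streams as the reconstruction loss. All function names are hypothetical, and the feature map is a plain nested list of shape C x H x W.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_stream(fmap):
    """Scale each channel by a sigmoid gate on its global average (assumed form)."""
    out = []
    for ch in fmap:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out


def spatial_stream(fmap):
    """Scale each location by a sigmoid gate on its cross-channel average (assumed form)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[fmap[k][i][j] * gates[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def flatten(fmap):
    return [v for ch in fmap for row in ch for v in row]


def reconstruction_loss(a, b):
    """Mean squared error between the two stream outputs (assumed form)."""
    fa, fb = flatten(a), flatten(b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)


def dual_stream(fmap):
    """Return the concatenated features (2C channels) and the reconstruction loss."""
    ch = channel_stream(fmap)
    sp = spatial_stream(fmap)
    fused = ch + sp  # list concatenation along the channel axis
    return fused, reconstruction_loss(ch, sp)


# Toy input: 2 channels of 2x2 values.
features = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
fused, loss = dual_stream(features)
```

In training, a loss of this kind would be added to the classification loss so that the two streams are pulled toward each other; how the terms are weighted is not stated in the abstract.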

Keywords