Electronics (Feb 2023)

An Improved Vision Transformer Network with a Residual Convolution Block for Bamboo Resource Image Identification

  • Qing Zou,
  • Xiu Jin,
  • Yi Song,
  • Lianglong Wang,
  • Shaowen Li,
  • Yuan Rao,
  • Xiaodan Zhang,
  • Qijuan Gao

DOI
https://doi.org/10.3390/electronics12041055
Journal volume & issue
Vol. 12, no. 4
p. 1055

Abstract

Bamboo is an important economic crop with a large number of species. Because bamboo species are widely distributed, it is difficult to collect images and to build a recognition model for bamboo species from only a small number of images. In this paper, 3220 images of nineteen bamboo species are collected and divided into a training dataset, a validation dataset and a test dataset. A residual vision transformer algorithm named ReVI is proposed, which improves the main structure of the vision transformer network (ViT) by combining it with convolution and residual mechanisms. The experiments explore how reducing the amount of bamboo training data affects the performance of ReVI and ViT on the bamboo dataset. ReVI generalizes better than ViT when trained on small-scale bamboo training data. The per-species performance of ReVI, ViT, ResNet18, VGG16, DenseNet121 and Xception is then compared; ReVI performs best, with an average accuracy of 90.21%, and the reasons for the poor performance on some species are discussed. ReVI was found to identify bamboo species efficiently from few images. The ReVI algorithm proposed in this manuscript therefore offers the possibility of accurate and intelligent classification and recognition of bamboo resource images.
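
The reduced-training-data experiment described in the abstract can be illustrated with a small sketch. The abstract does not state how the training set was reduced, so the per-species (stratified) subsampling below is an assumption, as are the function name `stratified_subsample`, the fixed seed, and the placeholder label list with 100 images per species (the real per-species counts are not given here).

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices that keep roughly `fraction` of each class.

    Hypothetical helper: at least one sample per class is always kept,
    so no species disappears from the reduced training set.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    kept = []
    for label in sorted(by_class):
        indices = by_class[label]
        n_keep = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Placeholder data: 19 species with 100 training images each (assumed counts).
labels = [species for species in range(19) for _ in range(100)]
subset = stratified_subsample(labels, fraction=0.5)
```

Training ReVI and ViT on several such subsets (for example with `fraction` set to 0.75, 0.5 and 0.25, values chosen here only for illustration) and evaluating on the same fixed test set is one way to compare how the two models degrade as training data shrinks.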

Keywords