Frontiers in Plant Science (Jul 2024)

CTHNet: a network for wheat ear counting with local-global features fusion based on hybrid architecture

  • Qingqing Hong,
  • Wei Liu,
  • Yue Zhu,
  • Tianyu Ren,
  • Changrong Shi,
  • Zhixin Lu,
  • Yunqin Yang,
  • Ruiting Deng,
  • Jing Qian,
  • Changwei Tan

DOI
https://doi.org/10.3389/fpls.2024.1425131
Journal volume & issue
Vol. 15

Abstract

Accurate wheat ear counting provides one of the key indicators for wheat phenotyping. Convolutional neural network (CNN) algorithms for counting wheat ears have evolved into sophisticated tools. However, because of their limited receptive fields, CNNs cannot model global context information, which degrades counting performance. In this study, we present a hybrid attention network (CTHNet) for wheat ear counting from RGB images that combines local features and global context information. On the one hand, to extract multi-scale local features, a convolutional neural network is built using the Cross Stage Partial framework. On the other hand, to acquire better global context information, tokenized image patches from the CNN feature maps are encoded as input sequences using a Pyramid Pooling Transformer. A feature fusion module then merges the local features with the global context information to significantly enhance the feature representation. The proposed model is evaluated on the Global Wheat Head Detection Dataset and the Wheat Ear Detection Dataset, where it achieves mean absolute errors of 3.40 and 5.21, respectively. The performance of the proposed model was significantly better than that reported in previous studies.
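
As a rough illustration of the error figures quoted above, the average (mean) absolute error for a counting task can be computed as the mean of the per-image absolute differences between predicted and ground-truth ear counts. The sketch below assumes this standard definition; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over all images (standard MAE definition)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one image is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical per-image ear counts, for illustration only.
predicted_counts = [42, 37, 55, 61]
true_counts = [40, 41, 55, 58]

mae = mean_absolute_error(predicted_counts, true_counts)
print(f"MAE = {mae:.2f} ears per image")  # prints "MAE = 2.25 ears per image"
```

Under this definition, a value of 3.40 would mean that the predicted count differs from the true count by about 3.4 ears per image on average.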

Keywords