Frontiers in Plant Science (Nov 2021)

Multi-Modal Deep Learning for Weeds Detection in Wheat Field Based on RGB-D Images

  • Ke Xu,
  • Yan Zhu,
  • Weixing Cao,
  • Xiaoping Jiang,
  • Zhijian Jiang,
  • Shuailong Li,
  • Jun Ni

DOI
https://doi.org/10.3389/fpls.2021.732968
Journal volume & issue
Vol. 12

Abstract

Single-modal images carry limited information for feature representation, and RGB images fail to detect grass weeds in wheat fields because of their similarity to wheat in shape. We propose a framework based on multi-modal information fusion for accurate detection of weeds in wheat fields in a natural environment, overcoming the limitations of a single modality in weed detection. First, we recode the single-channel depth image into a new three-channel image with a structure like that of an RGB image, making it suitable for feature extraction by a convolutional neural network (CNN). Second, multi-scale object detection is realized by fusing the feature maps output by different convolutional layers. The three-channel network structure is designed to account for both the independence of the RGB and depth information and the complementarity of the multi-modal information, and ensemble learning is carried out by weight allocation at the decision level to realize effective fusion of the multi-modal information. The experimental results show that, compared with a weed detection method based on RGB images alone, the accuracy of our method is significantly improved. Experiments with ensemble learning show a mean average precision (mAP) of 36.1% for grass weeds and 42.9% for broad-leaf weeds, and an overall detection precision, as indicated by intersection over ground truth (IoG), of 89.3%, with the weights of the RGB and depth images at α = 0.4 and β = 0.3. The results suggest that our method can accurately detect the dominant species of weeds in wheat fields, and that multi-modal fusion can effectively improve object detection performance.
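The two fusion steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact depth-recoding scheme is not given in the abstract, so here the depth map is simply normalized and replicated across three channels, and the decision-level fusion assumes the remaining weight (1 − α − β = 0.3) goes to a joint RGB-D branch.

```python
import numpy as np

def recode_depth(depth, d_min=None, d_max=None):
    """Recode a single-channel depth map into a three-channel image
    with an RGB-like structure, so a standard CNN backbone can take it
    as input. (Sketch: normalize to [0, 1] and replicate; the paper's
    actual encoding may differ.)"""
    d = depth.astype(np.float32)
    d_min = d.min() if d_min is None else d_min
    d_max = d.max() if d_max is None else d_max
    norm = (d - d_min) / max(d_max - d_min, 1e-6)  # scale to [0, 1]
    return np.stack([norm, norm, norm], axis=-1)   # H x W -> H x W x 3

def fuse_scores(s_rgb, s_depth, s_joint, alpha=0.4, beta=0.3):
    """Decision-level weighted fusion of per-class confidence scores
    from the three branches. Assigning the residual weight to a joint
    RGB-D branch is an assumption for illustration."""
    return alpha * s_rgb + beta * s_depth + (1 - alpha - beta) * s_joint
```

With α = 0.4 and β = 0.3, a detection is kept or rejected based on the fused score rather than on any single branch, which is how the decision-level ensemble compensates for the weakness of RGB alone on grass weeds.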

Keywords