Information Processing in Agriculture (Jun 2024)
An improved lightweight network based on deep learning for grape recognition in unstructured environments
Abstract
In unstructured environments, densely growing grape bunches and occlusion make recognition difficult, which seriously degrades the performance of grape-picking robots. To address these problems, this study improves the YOLOX-Tiny model and proposes a new grape detection model, YOLOX-RA, which can quickly and accurately identify densely growing and occluded grape bunches. YOLOX-RA replaces the Focus layer with a 3 × 3 convolutional layer with a stride of 2 to reduce the computational burden. The CBS layer in the ResBlock_Body module of the second, third, and fourth layers of the backbone is removed, and the CSPLayer module is replaced by the ResBlock-M module to speed up detection. An auxiliary network (AlNet) built from residual network blocks is added after the ResBlock-M module to improve detection accuracy. In the neck, two depthwise separable convolutions (DSC) replace standard convolutions to reduce the computational cost. We evaluated the detection performance of SSD, YOLOv4, YOLOv4-Tiny, YOLO-Grape, YOLOv5-X, YOLOX-Tiny, and YOLOX-RA on a grape test set. The results show that YOLOX-RA has the best detection performance, achieving 88.75% mAP, a recognition speed of 84.88 FPS, and a model size of 17.53 MB. It accurately detects densely growing and occluded grape bunches, which can effectively improve the performance of grape-picking robots.
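To illustrate why the two convolution substitutions reduce cost, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise separable convolution, and computes the output size of a 3 × 3 convolution with a stride of 2. The channel counts (128), the input resolution (416), and the padding of 1 are illustrative assumptions, not values taken from the paper, and the helper names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dsc_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution:
    one k x k filter per input channel, then a 1 x 1 pointwise mix."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def conv_out_size(size: int, k: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - k) // stride + 1


# Assumed layer shape: 128 -> 128 channels, 3 x 3 kernel.
standard = conv_params(128, 128, 3)
separable = dsc_params(128, 128, 3)
reduction = standard / separable

# Assumed 416-pixel input; a 3 x 3 convolution with stride 2 and
# padding 1 halves the resolution, as a Focus layer also does.
downsampled = conv_out_size(416, k=3, stride=2, padding=1)

print(standard, separable, round(reduction, 2), downsampled)
```

Under these assumed shapes the separable form needs roughly one eighth of the weights, and the stride-2 convolution halves the feature-map resolution in a single layer.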