Smart Agricultural Technology (Mar 2025)
Data generation using Pix2Pix to improve YOLO v8 performance in UAV-based Yuzu detection
Abstract
Detecting fruit in unmanned aerial vehicle (UAV) imagery with deep learning plays a crucial role in the pre-harvest estimation of yuzu (Citrus junos) yield. However, the detection performance of deep learning models depends heavily on the quantity and quality of training data. Labeling that data is difficult and expensive because of the high density of fruits, the similarity in color between fruits and leaves, and the varying lighting conditions in the captured images of fruit trees. To address these challenges, we propose using generative adversarial networks (GANs) for data generation and then using the generated data to improve the yuzu detection performance of YOLO (You Only Look Once) v8 models.

In this study, the experimental images were captured by UAVs in two orchards of the Kochi Agricultural Research Center between 2020 and 2022. In our approach, we first trained a conditional GAN called Pix2Pix on pairs of images, where the training inputs are images of fruit trees with all fruits removed and the training targets are the original images. Subsequently, we created new regions of interest on the images of fruit trees and used the trained Pix2Pix network to generate yuzu fruits within these regions, thereby producing new labeled images. In the experiments, we merged real and generated images to train YOLO v8-series models and explored how far the proposed data augmentation approach can reduce the dependency on real training images.

The results showed that training on the combined generated and real images can significantly improve the detection performance of YOLO v8-series models, with maximum improvements of 5.4% in F1-score, 5.6% in mAP50, and 7.1% in mAP50–90. Moreover, the proposed data augmentation approach allowed for up to a 50% reduction in the number of real training images while still achieving improved detection results.
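The data preparation outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "removing" fruits means overwriting each labeled bounding box with a constant fill value (the abstract does not say how removal is done), that images are small grayscale grids stored as nested lists, and that the new regions of interest are written out in the normalized `class x_center y_center width height` text format commonly used for YOLO training labels. The names `blank_boxes` and `boxes_to_yolo_labels` are hypothetical.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; the max edges are exclusive.
Box = Tuple[int, int, int, int]


def blank_boxes(image: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of a grayscale image with every box region set to `fill`.

    Assumption: this stands in for the "all fruits removed" Pix2Pix input;
    the unmodified image would serve as the paired training target.
    """
    out = [row[:] for row in image]
    height = len(out)
    width = len(out[0]) if height else 0
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image bounds before overwriting pixels.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                out[y][x] = fill
    return out


def boxes_to_yolo_labels(boxes: List[Box], width: int, height: int, class_id: int = 0) -> List[str]:
    """Convert pixel boxes to normalized YOLO-style label lines."""
    lines = []
    for x0, y0, x1, y1 in boxes:
        cx = (x0 + x1) / 2 / width   # normalized box center, x
        cy = (y0 + y1) / 2 / height  # normalized box center, y
        w = (x1 - x0) / width        # normalized box width
        h = (y1 - y0) / height       # normalized box height
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


# Toy example: a 4x4 image with one region of interest in the middle.
image = [[9] * 4 for _ in range(4)]
rois = [(1, 1, 3, 3)]
pix2pix_input = blank_boxes(image, rois)      # fruit regions blanked out
labels = boxes_to_yolo_labels(rois, 4, 4)     # labels for the generated image
```

In this sketch the same boxes play two roles: they mark where the generator is expected to paint fruit, and they become the labels of the resulting image, which is why generated images need no manual annotation.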