Information Processing in Agriculture (Sep 2024)
Detection of tiger puffer using improved YOLOv5 with prior knowledge fusion
Abstract
Tiger puffer is a commercially important fish cultured in high-density environments, and its accurate detection is essential for monitoring growth conditions and enabling precise feeding. However, detection precision and recall for farmed tiger puffer are low because targets are blurred and occluded in real farming environments. To solve this problem, we proposed a farmed tiger puffer detection model, knowledge aggregation YOLO (KAYOLO), which fuses prior knowledge with an improved YOLOv5. To alleviate the feature loss caused by target blurring, we drew on the human practice of reasoning from prior knowledge when recognizing blurred targets, and used prior knowledge to strengthen the tiger puffer's features and improve detection precision. To address missed detections caused by mutual occlusion in high-density farming environments, we proposed a prediction box aggregation method that aggregates the prediction boxes belonging to the same object, reducing interference among different objects and improving detection recall. To validate the proposed methods, ablation experiments, model performance experiments, and model robustness experiments were designed. The experimental results showed that KAYOLO reached a detection precision of 94.92% and a recall of 92.21%, improvements of 1.29% and 1.35%, respectively, over YOLOv5. Compared with recent state-of-the-art underwater object detection models, such as SWIPENet, RoIMix, FERNet, and SK-YOLOv5, KAYOLO achieved 2.09%, 1.63%, 1.13%, and 0.85% higher precision and 1.2%, 0.18%, 1.74%, and 0.39% higher recall, respectively. Experiments were also conducted on different datasets to verify the model's robustness, and the precision and recall of KAYOLO were approximately 1.3% higher than those of YOLOv5. The study showed that KAYOLO effectively enhanced farmed tiger puffer detection by reducing the effects of blurring and occlusion.
Additionally, the model generalized well across different datasets, indicating that it is robust and can be adapted to different environments.
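The abstract describes the prediction box aggregation step only at a high level, so the following is a minimal sketch of the general idea rather than KAYOLO's actual procedure. It assumes that boxes are grouped by an IoU threshold and that each group is merged by confidence-weighted averaging. The function names `iou` and `aggregate_boxes`, the `iou_thr` parameter, and its default value are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2, score), with score > 0.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def aggregate_boxes(boxes: List[Box], iou_thr: float = 0.6) -> List[Box]:
    """Group boxes that likely cover the same object, then merge each group.

    Assumption (not from the paper): a box joins the first cluster whose
    highest-scoring member overlaps it by at least iou_thr.
    """
    clusters: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        for cluster in clusters:
            if iou(cluster[0], box) >= iou_thr:
                cluster.append(box)
                break
        else:
            # No sufficiently overlapping cluster: treat as a new object.
            clusters.append([box])

    fused: List[Box] = []
    for cluster in clusters:
        total = sum(b[4] for b in cluster)
        # Confidence-weighted average of each coordinate.
        coords = [sum(b[i] * b[4] for b in cluster) / total for i in range(4)]
        best = max(b[4] for b in cluster)
        fused.append((coords[0], coords[1], coords[2], coords[3], best))
    return fused


detections = [
    (10.0, 10.0, 50.0, 50.0, 0.9),
    (12.0, 12.0, 52.0, 52.0, 0.6),      # overlaps the first box heavily
    (100.0, 100.0, 140.0, 140.0, 0.8),  # a separate fish
]
merged = aggregate_boxes(detections)
```

Unlike plain non-maximum suppression, which discards lower-scoring overlapping boxes outright, this kind of merging lets every box in a group contribute to the final position. A high `iou_thr` keeps the boxes of neighbouring, partially occluded fish in separate groups, which is the situation the abstract ties to missed detections.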