Scientific Reports (Jul 2024)

Underwater small target detection under YOLOv8-LA model

  • Shenming Qu,
  • Can Cui,
  • Jiale Duan,
  • Yongyong Lu,
  • Zilong Pang

DOI
https://doi.org/10.1038/s41598-024-66950-w
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 13

Abstract

In marine environmental engineering, the fast and accurate detection of underwater targets is of considerable importance. Recently, methods based on Convolutional Neural Networks (CNNs) have been applied to improve the detection of such targets. However, deep neural networks usually require a large number of parameters, which slows processing. Existing methods also struggle to accurately detect small, densely arranged underwater targets. To address these issues, we propose a new neural network model, YOLOv8-LA, to improve the detection of underwater targets. First, we design a Lightweight Efficient Partial Convolution (LEPC) module that optimizes spatial feature extraction by selectively processing input channels, which improves efficiency and significantly reduces redundant computation and storage requirements. Second, we develop the AP-FasterNet architecture for the small targets that are common in underwater datasets. By integrating depthwise separable convolutions with different expansion rates into FasterNet, AP-FasterNet enhances the model's ability to capture the detailed features of small targets. Finally, we integrate the lightweight and efficient Content-Aware ReAssembly of FEatures (CARAFE) up-sampling operator into YOLOv8, which improves model performance by aggregating contextual information over a large receptive field and mitigating information loss during up-sampling. Evaluation results on the URPC2021 dataset show that YOLOv8-LA achieves 84.7% mean average precision (mAP) and runs at 189.3 frames per second (FPS) on a single NVIDIA GeForce RTX 3090, outperforming existing state-of-the-art methods. These results show that the model maintains high detection accuracy while retaining real-time processing capability.
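
The efficiency argument behind processing only a subset of input channels can be illustrated with a rough cost count. The sketch below is not the authors' LEPC module. It assumes the general partial-convolution scheme used in FasterNet, where a k x k convolution is applied to a fraction of the channels and the remaining channels pass through unchanged. The function names, the 1/4 channel ratio, and the 40 x 40 x 256 feature-map size are illustrative assumptions, not values from the paper.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a dense k x k convolution
    (stride 1, 'same' padding) on an h x w feature map."""
    return h * w * k * k * c_in * c_out


def split_channels(channels, ratio=0.25):
    """Return (convolved, untouched) channel counts for a partial convolution.
    The 1/4 ratio is an assumed, illustrative default."""
    c_p = int(channels * ratio)
    return c_p, channels - c_p


def partial_conv_macs(h, w, k, channels, ratio=0.25):
    """Cost when only the first `ratio` share of channels is convolved
    and the rest are copied through unchanged (which costs no MACs)."""
    c_p, _ = split_channels(channels, ratio)
    return conv_macs(h, w, k, c_p, c_p)


if __name__ == "__main__":
    # Hypothetical feature-map size, chosen only for illustration.
    h, w, k, c = 40, 40, 3, 256
    dense = conv_macs(h, w, k, c, c)
    partial = partial_conv_macs(h, w, k, c)
    print(f"dense conv:   {dense:,} MACs")
    print(f"partial conv: {partial:,} MACs")
    print(f"reduction:    {dense / partial:.0f}x")
```

Because the cost of a convolution grows with the product of input and output channel counts, convolving a quarter of the channels costs one sixteenth of the dense operation under these assumptions. A real block would normally follow the partial convolution with pointwise layers that mix information across all channels, and that extra cost is not counted here.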

Keywords