Journal of Marine Science and Engineering (Mar 2023)

A Marine Organism Detection Framework Based on Dataset Augmentation and CNN-ViT Fusion

  • Xiao Jiang,
  • Yaxin Zhang,
  • Mian Pan,
  • Shuaishuai Lv,
  • Gang Yang,
  • Zhu Li,
  • Jingbiao Liu,
  • Haibin Yu

DOI
https://doi.org/10.3390/jmse11040705
Journal volume & issue
Vol. 11, no. 4
p. 705

Abstract

Underwater vision-based detection plays an important role in marine resource exploration, marine ecological protection, and other fields. Because carrier movement is restricted and some marine organisms cluster together, many organisms appear very small in underwater images and the samples in the dataset are highly imbalanced, both of which make vision-based detection of marine organisms more difficult. To address these problems, this study proposes a marine organism detection framework that combines a dataset augmentation strategy with a Convolutional Neural Network (CNN)-Vision Transformer (ViT) fusion model. The framework adopts two data augmentation methods, random expansion of small objects and non-overlapping filling of scarce samples, to significantly improve the quality of the dataset. It takes YOLOv5 as the baseline model and introduces ViT, deformable convolution, and a trident block into the feature extraction network, extracting richer features of marine organisms through multi-scale receptive fields and the fusion of CNN and ViT. The experimental results show that, compared with various one-stage detection models, the proposed framework can improve the mean average precision (mAP) by 27%. It also balances detection accuracy and real-time performance, enabling high-precision real-time detection of marine organisms on underwater mobile platforms.
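
The abstract does not spell out how the non-overlapping filling of scarce samples is carried out. As a rough illustration only, the Python sketch below assumes one plausible reading: patches of a rare class are pasted at random positions that do not intersect any existing bounding box. The function names, the (x1, y1, x2, y2) box format, and the retry limit are all hypothetical and are not taken from the paper.

```python
import random


def boxes_overlap(a, b):
    """Return True if two axis-aligned boxes (x1, y1, x2, y2) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_without_overlap(img_w, img_h, existing, patch_w, patch_h, rng, max_tries=100):
    """Sample random positions until the patch overlaps no existing box.

    Returns the placed box, or None if the patch does not fit in the image
    or no free position was found within max_tries (an assumed limit).
    """
    if patch_w > img_w or patch_h > img_h:
        return None
    for _ in range(max_tries):
        x1 = rng.randint(0, img_w - patch_w)
        y1 = rng.randint(0, img_h - patch_h)
        candidate = (x1, y1, x1 + patch_w, y1 + patch_h)
        if not any(boxes_overlap(candidate, box) for box in existing):
            return candidate
    return None


def fill_scarce_samples(img_w, img_h, existing, patch_sizes, seed=0):
    """Place each scarce-class patch so that it overlaps nothing already present.

    Only box coordinates are computed here; copying pixels into the image
    and writing the new labels is left out of this sketch.
    """
    rng = random.Random(seed)
    occupied = list(existing)
    added = []
    for patch_w, patch_h in patch_sizes:
        box = place_without_overlap(img_w, img_h, occupied, patch_w, patch_h, rng)
        if box is not None:
            occupied.append(box)
            added.append(box)
    return added
```

Calling `fill_scarce_samples(200, 200, [(0, 0, 50, 50)], [(20, 20)] * 5)` returns the boxes where five 20x20 patches could be pasted without covering the existing object or each other. Rejection sampling is used here only because it is simple; the sampling scheme actually used in the paper may differ.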

Keywords