Heliyon (Feb 2024)

A lightweight CNN for multi-source infrared ship detection from unmanned marine vehicles

  • Liqian Wang,
  • Yakui Dong,
  • Cheng Fei,
  • Junliang Liu,
  • Shuzhen Fan,
  • Yunxia Liu,
  • Yongfu Li,
  • Zhaojun Liu,
  • Xian Zhao

Journal volume & issue
Vol. 10, no. 4
p. e26229

Abstract

Infrared ship detection is of great significance due to its broad applicability in maritime surveillance, traffic safety and security. Multiple infrared sensors with different spectral sensitivities provide enhanced sensing capabilities, facilitating ship detection in complex environments. Nevertheless, current research lacks discussion and exploration of infrared imagers in different spectral ranges for marine object detection. Furthermore, for unmanned marine vehicles (UMVs), e.g., unmanned surface vehicles (USVs) and unmanned ships (USs), detection and perception are usually performed on embedded devices with limited memory and computational resources, which makes it difficult for traditional convolutional neural network (CNN)-based detection methods to realize their advantages. Aimed at the task of sea surface object detection on USVs, this paper provides lightweight CNNs with high inference speed that can be deployed on embedded devices. It also discusses the advantages and disadvantages of using different sensors in marine object detection, providing a reference for the perception and decision-making modules of USVs. The proposed method can detect ships in short-wave infrared (SWIR), long-wave infrared (LWIR) and fused images with high performance and high inference speed on an embedded device. Specifically, the backbone is built from bottleneck depth-separable convolutions with residuals, and redundant feature maps are generated using cheap linear operations in the neck and head networks. The learning and representation capacities of the network are improved by introducing channel and spatial attention and by redesigning the sizes of the anchor boxes. Comparative experiments are conducted on the infrared ship dataset that we have released, which contains SWIR, LWIR and fused images. The results indicate that the proposed method achieves high accuracy with fewer parameters, and the inference speed is nearly 60 frames per second (FPS) on an embedded device.
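
To illustrate why depth-separable convolutions and cheap linear operations reduce model size, the following is a minimal sketch that counts weights for three layer types. It is not the authors' implementation: the function names, the layer sizes, and the Ghost-style formulation of "cheap linear operations" (a small primary convolution followed by depthwise operations) are assumptions made for illustration, and biases and normalization layers are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def ghost_style_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost-style module (assumed form, not from the paper).

    A primary k x k conv produces c_out // ratio intrinsic maps; each
    intrinsic map then yields (ratio - 1) extra maps through a cheap
    cheap_k x cheap_k depthwise operation. Assumes c_out % ratio == 0.
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for the comparison.
    c_in, c_out = 128, 256
    print("standard 3x3:           ", standard_conv_params(c_in, c_out, 3))
    print("depthwise separable 3x3:", depthwise_separable_params(c_in, c_out, 3))
    print("standard 1x1:           ", standard_conv_params(c_in, c_out, 1))
    print("ghost-style (1x1 + 3x3):", ghost_style_params(c_in, c_out, 1))
```

For these assumed sizes, the depthwise separable layer needs roughly one ninth of the weights of the standard 3x3 convolution, and the Ghost-style module needs roughly half the weights of a standard 1x1 convolution with the same number of output channels.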

Keywords