IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)

Step-by-Step: Efficient Ship Detection in Large-Scale Remote Sensing Images

  • Wei Cao,
  • Guangluan Xu,
  • Yingchao Feng,
  • Hongqi Wang,
  • Siyu Hu,
  • Min Li

DOI
https://doi.org/10.1109/JSTARS.2024.3429395
Journal volume & issue
Vol. 17
pp. 13426 – 13438

Abstract

In object detection for large-scale remote sensing images, achieving a good tradeoff between model accuracy and speed is a long-standing challenge. Most inference time is spent on background regions that contain no objects, which makes real-time detection difficult in practical applications. Common approaches partition a large-scale image into smaller patches and then use an additional classification network, or a detector attached to the final feature map of the backbone, to identify and filter out patches devoid of objects, thereby improving detection efficiency. This article proposes a novel model, called OPD-Swin-Transformer, for ship detection in large-scale remote sensing images. The model integrates a simple, lightweight object presence detector (OPD) at each stage of the Swin-transformer and uses a step-by-step, progressively more challenging strategy to filter out background image patches, which improves overall detection speed. The entire network is optimized end-to-end with a multitask loss function, which also improves detection accuracy. An optimal threshold generation strategy based on the weighted Youden index lets the model maintain a higher recall rate for ships while filtering out background images, achieving an optimal balance between speed and accuracy. OPD-Swin-Transformer is integrated into two mainstream detectors and evaluated on two popular benchmarks for ship detection. The experiments show that, compared with other state-of-the-art methods, the approach increases inference speed by more than 40% while also improving detection accuracy.
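The abstract does not give the formula behind its threshold generation strategy, so the following is only a minimal sketch of how a threshold might be chosen with a weighted Youden index. It assumes the commonly used form J_w = 2(w * sensitivity + (1 - w) * specificity) - 1, which reduces to the classic Youden index at w = 0.5. The function name `weighted_youden_threshold`, the weight parameter `w`, and the per-patch score and label arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def weighted_youden_threshold(scores, labels, w=0.5):
    """Return (threshold, J_w) maximizing an assumed weighted Youden index.

    scores: object-presence scores, one per image patch (hypothetical input).
    labels: 1 if the patch contains a ship, 0 if it is pure background.
    w: weight on sensitivity (recall); w = 0.5 gives the classic Youden index.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one ship patch and one background patch")

    # Every distinct score is a candidate threshold (sorted ascending).
    candidates = np.unique(scores)
    best_t, best_j = candidates[0], -np.inf
    for t in candidates:
        pred = scores >= t  # patches kept for further processing
        sensitivity = np.sum(pred & (labels == 1)) / n_pos
        specificity = np.sum(~pred & (labels == 0)) / n_neg
        # Assumed weighted form; not confirmed by the abstract.
        j_w = 2.0 * (w * sensitivity + (1.0 - w) * specificity) - 1.0
        if j_w > best_j:  # ties keep the lower threshold (favors recall)
            best_t, best_j = t, j_w
    return float(best_t), float(best_j)
```

Under this assumed form, choosing w above 0.5 penalizes missed ship patches more heavily than retained background patches, so the selected threshold tends to move lower and recall tends to stay high, which matches the behavior the abstract describes. In a staged design, one such threshold could plausibly be computed for each stage's OPD on held-out patches, though the abstract does not say how the thresholds are assigned per stage.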

Keywords