IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)
An Intelligent System for Outfall Detection in UAV Images Using Lightweight Convolutional Vision Transformer Network
Abstract
Unmanned aerial vehicle (UAV) aerial photography has become a crucial tool for detecting outfalls that discharge into rivers and oceans. However, identifying outfalls in aerial images currently relies heavily on visual interpretation by skilled experts, which is time-consuming and inefficient. To address this issue, we propose a lightweight deep-learning model for detecting outfall objects in aerial images. The backbone of the proposed model is a lightweight convolutional vision transformer network built from two novel blocks: separated downsampled self-attention and a convolutional feedforward network with a shortcut. These blocks are designed to capture information at different granularities in the feature map and to build both local and global representations. The model uses a path aggregation feature pyramid network as the neck and a lightweight decoupled network as the head. Experiments demonstrate that our model achieves the highest accuracy (81.5%) while using only 2.47 M parameters and 3.95 GFLOPs. Visualization analysis shows that our model pays more attention to true outfall objects. Additionally, we have developed an intelligent outfall detection system based on the proposed model, and experimental results show that it performs well on the outfall detection task.
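As a rough illustration of why downsampling inside self-attention keeps a model lightweight, the sketch below lets every token act as a query while keys and values come from an average-pooled copy of the tokens. This is not the paper's separated downsampled self-attention block, whose details are not given in the abstract; the function names, the average pooling, the `stride` parameter, and the omission of learned projections are all assumptions made for illustration.

```python
import math


def avg_pool_tokens(tokens, stride):
    """Average consecutive groups of `stride` tokens (the last group may be shorter)."""
    pooled = []
    for start in range(0, len(tokens), stride):
        group = tokens[start:start + stride]
        dim = len(group[0])
        pooled.append([sum(t[d] for t in group) / len(group) for d in range(dim)])
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def downsampled_self_attention(tokens, stride):
    """Hypothetical sketch: queries are all tokens, keys/values are pooled tokens.

    Learned query/key/value projections are omitted (identity) for brevity.
    Returns the attended outputs and the attention weights.
    """
    kv = avg_pool_tokens(tokens, stride)
    scale = 1.0 / math.sqrt(len(tokens[0]))
    outputs, weights = [], []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in kv]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[d] for wj, v in zip(w, kv)) for d in range(len(q))]
        )
    return outputs, weights
```

With N tokens and stride s, the attention weights form an N x ceil(N/s) matrix instead of N x N, which is where the savings in computation come from in this simplified setting.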
Keywords