IEEE Access (Jan 2024)
UW-DETR: Feature Fusion Enhanced RT-DETR for Improving Underwater Object Detection
Abstract
Deep learning models for object detection have shown favorable results in controlled environments. In underwater environments, however, these models suffer from target blurring caused by noise and from visual interference caused by the protective coloration of underwater organisms, and they are not sufficiently equipped to address underwater object detection. To address the detection challenges caused by underwater blurriness and object camouflage, this study proposes a method called Spatial Semantic Encoding Fusion (SSEF). SSEF uses pooling and bilinear interpolation to standardize features across different scales. The standardized features are then fused using the Hadamard product. The fused features are encoded separately for spatial information and channel semantic information through a dual-branch structure. Finally, the dual-branch features are combined by point-wise addition. SSEF is used to enhance the multi-scale feature fusion of RT-DETR-r18, yielding an underwater object detection framework named Underwater-DETR (UW-DETR). Experiments are conducted on the UTDAC2020 and Brackish datasets. The results show that UW-DETR outperforms other underwater object detection methods and meets the requirements for underwater applications.
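The SSEF data flow described above (scale standardization, Hadamard fusion, dual-branch encoding, point-wise addition) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the assumption of three input scales with equal channel counts, and the two sigmoid gates standing in for the spatial and channel encoders are all hypothetical; the abstract does not specify the internals of either branch.

```python
import numpy as np

def avg_pool(x, k):
    """Average-pool a (C, H, W) feature map by an integer factor k."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def bilinear_resize(x, out_h, out_w):
    """Bilinear interpolation of a (C, H, W) map (corner-aligned sampling)."""
    c, h, w = x.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[None, :, None]   # vertical interpolation weights
    wx = (xs - x0)[None, None, :]   # horizontal interpolation weights
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bot = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ssef_sketch(high_res, mid, low_res):
    """Illustrative SSEF-style fusion of three (C, H, W) maps with equal C."""
    c, h, w = mid.shape
    # 1. Standardize scales: pool the larger map, interpolate the smaller one.
    down = avg_pool(high_res, high_res.shape[1] // h)
    up = bilinear_resize(low_res, h, w)
    # 2. Fuse with the Hadamard (element-wise) product.
    fused = down * mid * up
    # 3. Dual-branch encoding (hypothetical stand-ins for the paper's encoders):
    #    a spatial gate of shape (1, H, W) and a channel gate of shape (C, 1, 1).
    spatial = fused * sigmoid(fused.mean(axis=0, keepdims=True))
    channel = fused * sigmoid(fused.mean(axis=(1, 2), keepdims=True))
    # 4. Combine the two branches by point-wise addition.
    return spatial + channel

# Usage: three scales with strides differing by a factor of 2.
rng = np.random.default_rng(0)
out = ssef_sketch(rng.random((4, 16, 16)), rng.random((4, 8, 8)), rng.random((4, 4, 4)))
print(out.shape)
```

The output keeps the resolution of the middle scale. In a trained model the gates would be learned layers rather than fixed sigmoid functions of the mean.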
Keywords