IEEE Access (Jan 2024)
IST-DETR: Improved DETR for Infrared Small Target Detection
Abstract
This study introduces the Infrared Small-Target Detection Transformer (IST-DETR), a novel model designed to address the challenges of low-resolution infrared images and small target scales. IST-DETR integrates a backbone network, a hybrid encoder featuring a Mutual Feature Screening (MFS) module, and a decoder with auxiliary prediction heads. The hybrid encoder uses Learning Position Encoding to reduce information redundancy and a mutual feature screening mechanism to enhance the interaction between high-level semantic features and low-level positional features, facilitating more accurate detection of small infrared targets. In addition, a customized IoU metric and a novel sample weighting function are employed to address dataset imbalance, significantly improving detection performance. Experiments on the FLIR, HIT-UAV, and IVFlying datasets yielded average precision (AP) values of 44.1%, 34.0%, and 58.5%, respectively, at a processing speed of 74 frames per second. IST-DETR outperforms contemporary algorithms such as YOLOv8, CO-DETR, and DINO, demonstrating a superior balance of speed and accuracy, particularly in recognizing small infrared targets across diverse and complex scenarios.
Keywords