Learning Motion Constraint-Based Spatio-Temporal Networks for Infrared Dim Target Detections

Jie Li; Pengxi Liu; Xiayang Huang; Wennan Cui; Tao Zhang

doi:10.3390/app122211519

Applied Sciences (Nov 2022)

Affiliations

Jie Li: Key Laboratory of Intelligent Infrared Perception, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
Pengxi Liu: Key Laboratory of Intelligent Infrared Perception, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
Xiayang Huang: Key Laboratory of Intelligent Infrared Perception, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
Wennan Cui: Key Laboratory of Intelligent Infrared Perception, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
Tao Zhang: Key Laboratory of Intelligent Infrared Perception, Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China

DOI: https://doi.org/10.3390/app122211519
Journal volume & issue: Vol. 12, no. 22, p. 11519

Abstract

Efficient detection of dim infrared targets is hampered by low signal-to-noise ratios (SNRs). Traditional methods rely on gradient differences and fixed-parameter models, so they fail to adapt to complex and variable real-world scenes. To address this, the paper proposes a deep learning method based on a spatio-temporal network. The model is built from Convolutional Long Short-Term Memory (Conv-LSTM) cells and 3D convolution (3D-Conv) cells. It is trained to learn the motion constraint of moving targets (spatio-temporal constraint module, called STM) and to fuse multiscale local features of the target and background (deep spatial features module, called DFM). In addition, a variable interval search module (state-aware module, called STAM) is added at inference. This submodule triggers a global search of the image only when the target is lost because of fast motion, occlusion, or frame loss. Comprehensive experiments indicate that the proposed method outperforms all baseline methods. On the mid-wave infrared datasets collected by the authors, the proposed method achieves a 95.87% detection rate. The SNR of the dataset is around 1–3 dB, and the backgrounds in the sequences include sky, asphalt road, and buildings.
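The state-aware behaviour described in the abstract, searching locally while the target is tracked and falling back to a global search only when it is lost, can be illustrated with a minimal sketch. This is not the authors' STAM implementation: the function names `peak` and `track`, the fixed `radius` window, the `threshold` value, and the use of a precomputed per-pixel score map in place of the network output are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]  # hypothetical per-pixel target scores for one frame
Pos = Tuple[int, int]


def peak(frame: Frame, rows: range, cols: range) -> Tuple[float, Pos]:
    """Return (score, (row, col)) of the highest score inside the given window."""
    best_score, best_pos = float("-inf"), (-1, -1)
    for r in rows:
        for c in cols:
            if frame[r][c] > best_score:
                best_score, best_pos = frame[r][c], (r, c)
    return best_score, best_pos


def track(frames: List[Frame], radius: int = 2,
          threshold: float = 0.5) -> List[Tuple[str, Optional[Pos]]]:
    """Search near the last known position; search the whole frame only when lost.

    `radius` and `threshold` are assumed values, not taken from the paper.
    """
    last: Optional[Pos] = None
    results = []
    for frame in frames:
        h, w = len(frame), len(frame[0])
        mode = "global"
        score = float("-inf")
        pos: Optional[Pos] = None
        if last is not None:
            # Local search in a window centred on the previous detection.
            r0, c0 = last
            rows = range(max(0, r0 - radius), min(h, r0 + radius + 1))
            cols = range(max(0, c0 - radius), min(w, c0 + radius + 1))
            score, pos = peak(frame, rows, cols)
            mode = "local"
        if last is None or score < threshold:
            # Target lost (or never acquired): fall back to a global search.
            score, pos = peak(frame, range(h), range(w))
            mode = "global"
        # Keep the position only if the detection is confident enough.
        last = pos if score >= threshold else None
        results.append((mode, last))
    return results
```

Each returned entry records which search mode produced the final decision for that frame and the detected position, or `None` when nothing exceeded the threshold. The design choice being illustrated is that the expensive full-frame search runs only on the first frame and after a loss, while normal tracking touches a small window.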

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
