RODFormer: High-Precision Design for Rotating Object Detection with Transformers

Yaonan Dai; Jiuyang Yu; Dean Zhang; Tianhao Hu; Xiaotao Zheng

doi:10.3390/s22072633

Sensors (Mar 2022)

Affiliations

Yaonan Dai: Hubei Provincial Engineering Technology Research Center of Green Chemical Equipment, School of Mechanical and Electrical Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Jiuyang Yu: Hubei Provincial Engineering Technology Research Center of Green Chemical Equipment, School of Mechanical and Electrical Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Dean Zhang: Hubei Provincial Engineering Technology Research Center of Green Chemical Equipment, School of Mechanical and Electrical Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Tianhao Hu: Hubei Provincial Engineering Technology Research Center of Green Chemical Equipment, School of Mechanical and Electrical Engineering, Wuhan Institute of Technology, Wuhan 430205, China
Xiaotao Zheng: Hubei Provincial Engineering Technology Research Center of Green Chemical Equipment, School of Mechanical and Electrical Engineering, Wuhan Institute of Technology, Wuhan 430205, China

DOI: https://doi.org/10.3390/s22072633
Journal volume & issue: Vol. 22, no. 7, p. 2633

Abstract

To address the Transformer's lack of a local spatial receptive field and the discontinuous boundary loss in rotating object detection, we propose RODFormer, a Transformer-based high-precision rotating object detection model. First, RODFormer uses a structured Transformer architecture to collect feature information at different resolutions, which widens the range over which features are gathered. Second, a new feed-forward network (spatial-FFN) is constructed. Spatial-FFN fuses the local spatial features of 3 × 3 depthwise separable convolutions with the global channel features of a multilayer perceptron (MLP) to compensate for the FFN's weakness in local spatial modeling. Finally, based on the spatial-FFN architecture, a detection head is built using the CIOU-smooth L1 loss function; it regresses only the horizontal box when the rotated box is close to horizontal, which alleviates the loss discontinuity of the rotated box. Ablation experiments on the DOTA dataset show that the structured Transformer module, the spatial-FFN module, and the CIOU-smooth L1 loss function each improve the detection accuracy of RODFormer. Compared with 12 rotating object detection models on the DOTA dataset, RODFormer achieves the highest average detection accuracy (up to 75.60%), which indicates that it is competitive in rotating object detection accuracy.
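
The abstract names a CIOU-smooth L1 loss but does not give its formula. As a rough illustration only, the sketch below shows the two standard ingredients separately: the commonly used Complete-IoU (CIoU) measure for horizontal (axis-aligned) boxes and the smooth L1 penalty. The function names, the `(x1, y1, x2, y2)` box layout, and the `beta` and `eps` parameters are assumptions made for this example; how RODFormer combines the two terms, and how it handles the rotation angle, is not stated here and is not reproduced.

```python
import math


def smooth_l1(diff, beta=1.0):
    """Standard smooth L1 penalty on a scalar residual (beta is an assumed default)."""
    a = abs(diff)
    # Quadratic near zero, linear for large residuals.
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU for two horizontal boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter + eps
    iou = inter / union

    # Squared centre distance, normalised by the squared diagonal
    # of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw * cw + ch * ch + eps
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v
```

A loss built on this measure is usually written as `1 - ciou(pred, target)`, so identical boxes give a value near zero and disjoint boxes are still penalised through the centre-distance term.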

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
