HVConv: Horizontal and Vertical Convolution for Remote Sensing Object Detection

Jinhui Chen; Qifeng Lin; Haibin Huang; Yuanlong Yu; Daoye Zhu; Gang Fu

doi:10.3390/rs16111880

Remote Sensing (May 2024)

Affiliations

Jinhui Chen: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Qifeng Lin: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Haibin Huang: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Yuanlong Yu: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Daoye Zhu: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Gang Fu: Department of Computing, The Hong Kong Polytechnic University, Hong Kong 999077, China

DOI: https://doi.org/10.3390/rs16111880
Journal volume & issue: Vol. 16, no. 11, p. 1880

Abstract

Objects of interest in aerial images differ markedly from objects in natural scenes; remote sensing objects in particular tend to have more distinctive aspect ratios. Existing convolutional networks use receptive fields with equal aspect ratios, so a receptive field either contains irrelevant information or fails to cover the entire object. To address this, we propose Horizontal and Vertical Convolution (HVConv), a plug-and-play module for handling objects with different aspect ratios. Horizontal convolution and vertical convolution expand the receptive field in the horizontal and vertical directions, respectively, and reduce redundant receptive field area, so that remote sensing objects with different aspect ratios obtain better receptive field coverage and thus a more accurate feature representation. In addition, we design an attention module that dynamically aggregates these two sub-modules to achieve more accurate feature coverage. Extensive experiments on the DOTA and HRSC2016 datasets show that HVConv improves accuracy across diverse detection architectures and obtains SOTA accuracy (mAP of 77.60% with DOTA single-scale training and 81.07% with DOTA multi-scale training). Various ablation studies further verify the effectiveness of our model.
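
The receptive field argument in the abstract can be illustrated with a small piece of arithmetic. The sketch below is not the HVConv module itself and is not taken from the paper; it is a toy model that assumes an axis-aligned object and receptive field sharing the same center, and the function name `overlap_stats` and the 32 x 4 object size are hypothetical choices made only for illustration. It computes how much of an elongated object a receptive field covers and how much of the receptive field falls on background.

```python
def overlap_stats(obj_w, obj_h, rf_w, rf_h):
    """Compare a centered, axis-aligned receptive field with a centered object box.

    Returns (coverage, waste):
      coverage = fraction of the object area that lies inside the receptive field
      waste    = fraction of the receptive field area that lies outside the object
    """
    # Both boxes share a center, so the intersection is the per-axis minimum.
    inter = min(obj_w, rf_w) * min(obj_h, rf_h)
    coverage = inter / (obj_w * obj_h)
    waste = 1.0 - inter / (rf_w * rf_h)
    return coverage, waste


# Hypothetical elongated object (for example a ship): 32 x 4 feature-map cells.
obj = (32, 4)

small_square = overlap_stats(*obj, 4, 4)    # fits the height, misses most of the length
large_square = overlap_stats(*obj, 32, 32)  # covers the object, mostly background
horizontal = overlap_stats(*obj, 32, 4)     # elongated field matching the object

for name, (cov, waste) in [
    ("small square 4x4", small_square),
    ("large square 32x32", large_square),
    ("horizontal 32x4", horizontal),
]:
    print(f"{name}: coverage={cov:.3f}, waste={waste:.3f}")
```

In this toy setting a square receptive field must trade coverage against background content, while a field elongated along the object's long axis avoids both problems, which is the motivation the abstract gives for expanding receptive fields separately in the horizontal and vertical directions.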

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
