PanoDetNet: Multi-Resolution Panoramic Object Detection With Adaptive Feature Attention

Wenhao Liu; Taijie Zhang; Shiting Xu; Qingling Chang; Yan Cui

doi:10.1109/ACCESS.2024.3435764

IEEE Access (Jan 2024)


Affiliations

Wenhao Liu: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Taijie Zhang: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Shiting Xu: Zhuhai 4Dage Network Technology, Zhuhai, China
Qingling Chang: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Yan Cui: Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China

DOI: https://doi.org/10.1109/ACCESS.2024.3435764
Journal volume & issue: Vol. 12, pp. 104300–104316

Abstract

Panoramic image object detection has significant applications in autonomous driving, robotic navigation, and security monitoring. However, most current object detection algorithms are trained on pinhole images and cannot be directly applied to panoramic images, which have a large field-of-view (FOV) and distortion. Additionally, research on panoramic image object detection lacks dedicated dataset support, and these images face challenges such as target distortion, occlusion, and multi-scale variations. Existing methods for panoramic image object detection have not yielded satisfactory performance. To address these issues, we propose PanoDetNet, an object detection model based on YOLOv7. We introduce two new modules: the Multi-Scale Feature Fusion (MSFF) module and the Adaptive Panoramic Feature Attention (APFA) module. The MSFF module enhances detection precision for targets of different scales by fusing feature maps of various sizes, while also reducing the number of parameters and simplifying the model structure. The APFA module adaptively addresses the distortion in panoramic images, improving the model’s ability to locate and recognize objects in complex backgrounds and under occlusion. We trained PanoDetNet on our self-built panoramic image object detection dataset, PanoDet. This dataset was collected using a self-developed panoramic camera and manually annotated with the Labelme tool. Experimental results show that PanoDetNet achieves mAP@0.5, mAP@0.5:.95, and accuracy scores of 95.3%, 75.2%, and 94.1%, respectively. These results represent improvements of 1.6%, 2.5%, and 3.3% over YOLOv7. Our code is available at https://github.com/github98317/PanoDetNet.
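
The two mAP figures quoted in the abstract differ in the box-overlap threshold at which a detection counts as correct. The sketch below illustrates that distinction only; it is not the authors' evaluation code. It assumes the common COCO-style convention (IoU threshold 0.5 for mAP@0.5, and ten thresholds from 0.5 to 0.95 in steps of 0.05 for mAP@0.5:.95), and the box coordinates and function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: one for mAP@0.5, ten for mAP@0.5:.95.
MAP50_THRESHOLDS = [0.5]
MAP50_95_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at(iou_value, thresholds):
    """For each threshold, report whether the detection would count as a match."""
    return [iou_value >= t for t in thresholds]


# Hypothetical predicted and ground-truth boxes, offset by 10 pixels in x.
pred_box = (0, 0, 100, 100)
gt_box = (10, 0, 110, 100)

overlap = iou(pred_box, gt_box)
hits_50 = matches_at(overlap, MAP50_THRESHOLDS)
hits_50_95 = matches_at(overlap, MAP50_95_THRESHOLDS)

print(f"IoU = {overlap:.3f}")
print(f"match at 0.5: {hits_50[0]}")
print(f"matches across 0.5:.95: {sum(hits_50_95)} of {len(hits_50_95)}")
```

Under this convention a slightly misplaced box still counts fully toward mAP@0.5 but only partly toward mAP@0.5:.95, which averages over stricter thresholds. That is consistent with the second figure in the abstract being lower than the first.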

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
