MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment

Lang Zhang; Zhan Ao Huang; Canghong Shi; Hongjiang Ma; Xiaojie Li; Xi Wu

doi:10.1007/s40747-024-01580-3

Complex & Intelligent Systems (Aug 2024)

Affiliations

Lang Zhang: School of Computer, Chengdu University of Information Technology
Zhan Ao Huang: School of Computer, Chengdu University of Information Technology
Canghong Shi: School of Computer and Software Engineering, Xihua University
Hongjiang Ma: School of Computer, Chengdu University of Information Technology
Xiaojie Li: School of Computer, Chengdu University of Information Technology
Xi Wu: School of Computer, Chengdu University of Information Technology

DOI: https://doi.org/10.1007/s40747-024-01580-3
Journal volume & issue: Vol. 10, no. 6
pp. 8095 – 8108

Abstract

Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream approaches to this task, has shown performance far beyond that of traditional prohibited item detection methods. However, most neural network architectures still lack sufficient local feature representation ability for overlapping and small targets, and they ignore the semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection architecture based on an improved YOLOV7, to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter redundant information in target regions and to enhance the local feature representation of overlapping objects. Here, a squeeze-excitation (SE) block filters the background to reduce redundant information in target regions, and a multi-scale feature extraction module (MFEM) is designed for local feature representation to enhance the feature expression ability of overlapping objects. In addition, to obtain richer context information, we design an adaptive fusion feature pyramid network (AF-FPN) that combines the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the test results show that it obtains the highest mAP (68.7%), an improvement of 3.5% over YOLOV7. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.
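The squeeze-excitation step mentioned in the abstract can be illustrated with a minimal sketch. This is the generic SE operation (global average pooling, a two-layer bottleneck, and per-channel rescaling), not the authors' implementation; the function name `se_block`, the weight matrices `w1` and `w2`, and the omission of biases are assumptions made for illustration, and how the block is wired into the MAM backbone is not described in this record.

```python
import math


def se_block(x, w1, w2):
    """Generic squeeze-excitation over a feature map x shaped [C][H][W].

    w1 ([C // r][C]) and w2 ([C][C // r]) are hypothetical fully connected
    weights with reduction ratio r; biases are omitted for brevity.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

    # Excitation: bottleneck FC + ReLU, then FC + sigmoid gives gates in (0, 1).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Scale: each channel is multiplied by its gate, damping weak channels.
    out = [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]
    return out, gates


# Toy input: 2 channels of size 2x2, reduction ratio r = 2.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 1.0]]
w2 = [[0.0], [0.0]]  # zero weights make every gate sigmoid(0) = 0.5
scaled, gates = se_block(x, w1, w2)
```

With learned weights, channels that mostly respond to background would receive gates near 0 and contribute less to later layers, which is the sense in which an SE block can "filter" redundant information.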

Published in Complex & Intelligent Systems

ISSN: 2199-4536 (Print); 2198-6053 (Online)
Publisher: Springer
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.springer.com/journal/40747
