FSA-FPN reconstruction method that fused self-attention mechanism based on YOLOX

An Henan; Guan Cong; Deng Wucai; Yang Jiazhou; Ma Chao

doi:10.16157/j.issn.0258-7998.223139

Dianzi Jishu Yingyong (Mar 2023)

Affiliations

An Henan: College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518000, China
Guan Cong: Institute of Microscale Optoelectronics, Shenzhen University, Shenzhen 518000, China
Deng Wucai: College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518000, China
Yang Jiazhou: Institute of Microscale Optoelectronics, Shenzhen University, Shenzhen 518000, China
Ma Chao: Institute of Microscale Optoelectronics, Shenzhen University, Shenzhen 518000, China

DOI: https://doi.org/10.16157/j.issn.0258-7998.223139
Journal volume & issue: Vol. 49, no. 3
pp. 61 – 66

Abstract

As the resolution of input images in object detection tasks increases, the feature information extracted by a feature extraction network whose receptive field remains unchanged becomes increasingly limited, and the information overlap between adjacent feature points grows. This paper proposes FSA-FPN (fusion self-attention FPN) and designs an SAU (self-attention upsample) module. Internally, SAU performs cross computation between the self-attention mechanism and the CNN for further feature fusion, and reconstructs the FCU (feature coupling unit) to eliminate the feature misalignment between the two and bridge the semantic gap. Comparative experiments are carried out on the Pascal VOC2007 dataset using YOLOX-Darknet53 as the backbone network. The results show that, compared with the FPN of the original network, replacing it with FSA-FPN improves mAP@[.5:.95] by 1.5%, and the predicted box positions are also more accurate. The method therefore has good application value in detection scenarios that require higher accuracy.
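To illustrate why mAP@[.5:.95] rewards more accurate box positions, the following is a minimal sketch and not the paper's evaluation code. It assumes the COCO-style convention that this notation usually denotes: average precision is averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The helper names `iou` and `thresholds_passed` and the example boxes are hypothetical. The sketch only counts how many thresholds a single prediction clears; the full metric also requires per-class precision-recall curves.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Count the IoU thresholds at which pred_box matches gt_box."""
    value = iou(pred_box, gt_box)
    return sum(value >= t for t in IOU_THRESHOLDS)


# Hypothetical boxes: one ground truth, one loose and one tight prediction.
gt = (0, 0, 100, 100)
loose = (10, 10, 110, 110)   # shifted by 10 px, IoU about 0.68
tight = (2, 2, 102, 102)     # shifted by 2 px, IoU about 0.92

print(thresholds_passed(loose, gt))  # 4
print(thresholds_passed(tight, gt))  # 9
```

Both predictions would count as correct at the single IoU threshold of 0.5, but the tighter box also matches at the stricter thresholds, which is why gains in localization show up in mAP@[.5:.95].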

Published in Dianzi Jishu Yingyong

ISSN: 0258-7998 (Print)
Publisher: National Computer System Engineering Research Institute of China
Country of publisher: China
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: http://journal.chinaaet.com/en
