Lightweight transformer image feature extraction network

Wenfeng Zheng; Siyu Lu; Youshuai Yang; Zhengtong Yin; Lirong Yin

doi:10.7717/peerj-cs.1755

PeerJ Computer Science (Jan 2024)

Affiliations

Wenfeng Zheng: School of Automation, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Siyu Lu: School of Automation, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Youshuai Yang: School of Automation, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Zhengtong Yin: College of Resource and Environment Engineering, Guizhou University, Guiyang, Guizhou, China
Lirong Yin: Department of Geography and Anthropology, Louisiana State University, Baton Rouge, LA, United States of America

DOI: https://doi.org/10.7717/peerj-cs.1755
Journal volume & issue: Vol. 10, p. e1755

Abstract

In recent years, Transformer-based image feature extraction has become a research hotspot. However, when a Transformer is used for image feature extraction, the model’s computational complexity increases quadratically with the number of input tokens. This quadratic complexity makes vision-transformer backbone networks computationally expensive and prevents them from modelling high-resolution images. To address this issue, this study proposes two approaches to speed up Transformer models. First, the quadratic complexity of the self-attention mechanism is reduced to linear, increasing the model’s internal processing speed. Second, a parameter-free lightweight pruning method is introduced, which adaptively samples the input image to filter out unimportant tokens, reducing irrelevant input. Finally, the two methods are combined to create an efficient attention mechanism. Experimental results demonstrate that the combined methods can reduce the computation of the original Transformer model by 30%–50%, while the efficient attention mechanism achieves a 60%–70% reduction in computation.
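
The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the linear attention or the pruning is built, so the elu(x) + 1 feature map, the top-k selection rule, and every function name below are assumptions chosen for illustration. The sketch shows that reordering the matrix products gives the same attention output without forming an N x N matrix, and that a score-based filter with no learned parameters can shrink the token list.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def feature_map(m):
    """elu(x) + 1, a positive feature map (assumed here, not taken from the paper)."""
    return [[x + 1.0 if x > 0 else math.exp(x) for x in row] for row in m]

def quadratic_attention(q, k, v):
    """Kernel attention as (phi(Q) phi(K)^T) V: builds an N x N score matrix."""
    fq, fk = feature_map(q), feature_map(k)
    scores = matmul(fq, transpose(fk))            # N x N, cost grows with N^2
    out = matmul(scores, v)                       # N x d
    return [[x / sum(s) for x in o] for o, s in zip(out, scores)]

def linear_attention(q, k, v):
    """Same output as phi(Q) (phi(K)^T V): never builds the N x N matrix."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)                 # d x d, independent of N
    k_sum = [sum(col) for col in zip(*fk)]        # per-feature sum over all keys
    out = matmul(fq, kv)                          # N x d
    norm = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / n for x in o] for o, n in zip(out, norm)]

def prune_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring tokens in their original order.

    `scores` is a hypothetical per-token importance value; a fixed top-k rule
    stands in for the adaptive sampling the abstract mentions.
    """
    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    return [tokens[i] for i in sorted(ranked[:n_keep])]
```

Pruning before attention and using the linear form inside attention reduce cost in different places, which is presumably why the abstract reports them both separately and in combination.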

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
