Sketch face recognition based on light semantic Transformer network

Lin Cao; Jianqiang Yin; Yanan Guo; Kangning Du; Fan Zhang

doi:10.1049/cvi2.12209

IET Computer Vision (Dec 2023)

Affiliations

Lin Cao: Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing, China
Jianqiang Yin: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Yanan Guo: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Kangning Du: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Fan Zhang: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China

DOI: https://doi.org/10.1049/cvi2.12209
Journal volume & issue: Vol. 17, no. 8, pp. 962–976

Abstract

Sketch face recognition has a wide range of applications in criminal investigation, but it remains a challenging task because of small sample sizes and the semantic deficiencies caused by cross-modality differences. The authors propose a light semantic Transformer network to extract and model the semantic information of cross-modality images. First, the authors employ a meta-learning training strategy to obtain task-related training samples, which addresses the small-sample problem. Then, to resolve the tension between the high complexity of the Transformer and the small-sample problem of sketch face recognition, the authors build the light semantic Transformer network by proposing a hierarchical group linear transformation and introducing parameter sharing, so that the network can extract highly discriminative semantic features on small-scale datasets. Finally, the authors propose a domain-adaptive focal loss to reduce the cross-modality differences between sketches and photos and to improve the training of the light semantic Transformer network. Extensive experiments show that the features extracted by the proposed method are highly discriminative. The authors' method improves the recognition rate by 7.6% on the UoM-SGFSv2 dataset and reaches a recognition rate of 92.59% on the CUFSF dataset.

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
