Cross Task Modality Alignment Network for Sketch Face Recognition

Yanan Guo; Yanan Guo; Lin Cao; Lin Cao; Kangning Du; Kangning Du

doi:10.3389/fnbot.2022.823484

Frontiers in Neurorobotics (Jun 2022)

Cross Task Modality Alignment Network for Sketch Face Recognition

Yanan Guo,
Yanan Guo,
Lin Cao,
Lin Cao,
Kangning Du,
Kangning Du

Affiliations

Yanan Guo: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Yanan Guo: Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing, China
Lin Cao: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Lin Cao: Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing, China
Kangning Du: Key Laboratory of Information and Communication Systems, Ministry of Information Industry, Beijing Information Science and Technology University, Beijing, China
Kangning Du: Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing, China

DOI: https://doi.org/10.3389/fnbot.2022.823484
Journal volume & issue: Vol. 16

Abstract

Read online

The task of sketch face recognition refers to matching cross-modality facial images from sketch to photo, which is widely applied in the criminal investigation area. Existing works aim to bridge the cross-modality gap by inter-modality feature alignment approaches, however, the small sample problem has received much less attention, resulting in limited performance. In this paper, an effective Cross Task Modality Alignment Network (CTMAN) is proposed for sketch face recognition. To address the small sample problem, a meta learning training episode strategy is first introduced to mimic few-shot tasks. Based on the episode strategy, a two-stream network termed modality alignment embedding learning is used to capture more modality-specific and modality-sharable features, meanwhile, two cross task memory mechanisms are proposed to collect sufficient negative features to further improve the feature learning. Finally, a cross task modality alignment loss is proposed to capture modality-related information of cross task features for more effective training. Extensive experiments are conducted to validate the superiority of the CTMAN, which significantly outperforms state-of-the-art methods on the UoM-SGFSv2 set A, set B, CUFSF, and PRIP-VSGC dataset.

Published in Frontiers in Neurorobotics

ISSN: 1662-5218 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry
Website: https://www.frontiersin.org/journals/neurorobotics/

About the journal

Abstract

Keywords