IET Image Processing (Nov 2024)
A modality separation approach for facial sketch synthesis
Abstract
The technology for face-to-sketch synthesis transforms optical face images into a sketch-style format. However, traditional style losses are insufficient to discern the modal differences between optical and sketch domain images, leading to unclear results; moreover, because traditional approaches disregard high-frequency texture, the generated images lack fine detail. To address these issues, a modality separation approach for facial sketch synthesis is proposed. First, a modality separation structure is introduced that uses a quicksort algorithm to merge features of optical and sketch images into a target modality (positive samples), ensuring that the feature distribution of the generated images matches that of real sketches. By controlling the Euclidean distance between the generated images (anchors) and both the target and filtered modalities (positive and negative samples), irrelevant information is effectively filtered out. Next, an edge-promoting module feeds processed blurry sketch images into the discriminator to enhance robustness. Lastly, a detail optimization module uses Laplacian filtering to extract high-frequency texture from optical face images for local enhancement. Experimental validation on the CUHK, AR, and XM2VTS datasets shows that this method outperforms mainstream sketch face synthesis methods in terms of Fréchet inception distance and learned perceptual image patch similarity, producing more realistic and natural images with richer texture details.
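The abstract describes two mechanisms at a high level: a triplet-style constraint on Euclidean distances between anchors, positive (target modality) samples, and negative (filtered modality) samples, and Laplacian filtering for high-frequency texture extraction. The sketch below is a minimal illustration of these two ideas in PyTorch, assuming feature embeddings and batched image tensors; the function names, the `margin` parameter, and the specific Laplacian kernel are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only -- not the paper's released code.
import torch
import torch.nn.functional as F

def modality_separation_loss(anchor, positive, negative, margin=1.0):
    """Triplet-style constraint: pull generated-sketch features (anchors)
    toward the target modality (positive samples) and push them away from
    the filtered modality (negative samples) using Euclidean distances."""
    d_pos = F.pairwise_distance(anchor, positive, p=2)  # anchor-positive distance
    d_neg = F.pairwise_distance(anchor, negative, p=2)  # anchor-negative distance
    return F.relu(d_pos - d_neg + margin).mean()

def laplacian_highfreq(image):
    """Extract high-frequency texture from an optical face image batch
    (B, C, H, W) with a 3x3 Laplacian kernel, in the spirit of the
    described detail optimization module."""
    kernel = torch.tensor([[0., 1., 0.],
                           [1., -4., 1.],
                           [0., 1., 0.]], device=image.device).view(1, 1, 3, 3)
    kernel = kernel.repeat(image.shape[1], 1, 1, 1)  # one kernel per channel
    return F.conv2d(image, kernel, padding=1, groups=image.shape[1])
```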
Keywords