IEEE Access (Jan 2022)
GHOST—A New Face Swap Approach for Image and Video Domains
Abstract
Deepfake refers to face swapping algorithms in which the source and target can each be an image or a video. Researchers have investigated sophisticated generative adversarial networks (GANs), autoencoders, and other approaches to build precise and robust face swapping algorithms. However, the results achieved so far are far from perfect in terms of human and visual evaluation. In this study, we propose GHOST (Generative High-fidelity One Shot Transfer), a new one-shot pipeline for image-to-image and image-to-video face swapping. We take the FaceShifter (image-to-image) architecture as a baseline and propose several major improvements: a new eye-based loss function, a face mask smoothing algorithm, a new face swap pipeline for image-to-video face transfer, a new stabilization technique that reduces face jitter across adjacent frames, and a super-resolution stage. In our experiments, we show that our solution outperforms SoTA face swap architectures on the ID retrieval (+1.5%) and eye gaze preservation (+1%) metrics, and achieves the second-best value on the shape metric. We also conducted an ablation study to estimate the contribution of each pipeline stage to overall accuracy; it showed that the eye loss yields a 2% improvement in ID retrieval and a 45% improvement in eye gaze preservation.
Keywords