IEEE Access (Jan 2022)

DEEPFAKE Image Synthesis for Data Augmentation

  • Nawaf Waqas,
  • Sairul Izwan Safie,
  • Kushsairy Abdul Kadir,
  • Sheroz Khan,
  • Muhammad Haris Kaka Khel

DOI
https://doi.org/10.1109/ACCESS.2022.3193668
Journal volume & issue
Vol. 10
pp. 80847 – 80857

Abstract

Read online

Field of medical imaging is scarce in terms of a dataset that is reliable and extensive enough to train distinct supervised deep learning models. One way to tackle this problem is to use a Generative Adversarial Network to synthesize DEEPFAKE images to augment the data. DEEPFAKE refers to the transfer of important features from the source image (or video) to the target image (or video), such that the target modality appears to animate the source almost close to reality. In the past decade, medical image processing has made significant advances using the latest state-of-art-methods of deep learning techniques. Supervised deep learning models produce super-human results with the help of huge amount of dataset in a variety of medical image processing and deep learning applications. DEEPFAKE images can be a useful in various applications like translating to different useful and sometimes malicious modalities, unbalanced datasets or increasing the amount of datasets. In this paper the data scarcity has been addressed by using Progressive Growing Generative Adversarial Networks (PGGAN). However, PGGAN consists of convolution layer that suffers from the training-related issues. PGGAN requires a large number of convolution layers in order to obtain high-resolution image training, which makes training a difficult task. In this work, a subjective self-attention layer has been added before $256 \times 256$ convolution layer for efficient feature learning and the use of spectral normalization in the discriminator and pixel normalization in the generator for training stabilization - the two tasks resulting into what is referred to as Enhanced-GAN. The performance of Enhanced-GAN is compared to PGGAN performance using the parameters of AM Score and Mode Score. In addition, the strength of Enhanced-GAN and PGGAN synthesized data is evaluated using the U-net supervised deep learning model for segmentation tasks. Dice Coefficient metrics show that U-net trained on Enhanced-GAN DEEPFAKE data optimized with real data performs better than PGGAN DEEPFAKE data with real data.

Keywords