PeerJ Computer Science (Jun 2024)

Detection of fake face images using lightweight convolutional neural networks with stacking ensemble learning method

  • Emre Şafak,
  • Necaattin Barışçı

DOI
https://doi.org/10.7717/peerj-cs.2103
Journal volume & issue
Vol. 10
p. e2103

Abstract

Read online Read online

Images and videos containing fake faces are the most common type of digital manipulation. Such content can lead to negative consequences by spreading false information. The use of machine learning algorithms to produce fake face images has made it challenging to distinguish between genuine and fake content. Face manipulations are categorized into four basic groups: entire face synthesis, face identity manipulation (deepfake), facial attribute manipulation and facial expression manipulation. The study utilized lightweight convolutional neural networks to detect fake face images generated by using entire face synthesis and generative adversarial networks. The dataset used in the training process includes 70,000 real images in the FFHQ dataset and 70,000 fake images produced with StyleGAN2 using the FFHQ dataset. 80% of the dataset was used for training and 20% for testing. Initially, the MobileNet, MobileNetV2, EfficientNetB0, and NASNetMobile convolutional neural networks were trained separately for the training process. In the training, the models were pre-trained on ImageNet and reused with transfer learning. As a result of the first trainings EfficientNetB0 algorithm reached the highest accuracy of 93.64%. The EfficientNetB0 algorithm was revised to increase its accuracy rate by adding two dense layers (256 neurons) with ReLU activation, two dropout layers, one flattening layer, one dense layer (128 neurons) with ReLU activation function, and a softmax activation function used for the classification dense layer with two nodes. As a result of this process accuracy rate of 95.48% was achieved with EfficientNetB0 algorithm. Finally, the model that achieved 95.48% accuracy was used to train MobileNet and MobileNetV2 models together using the stacking ensemble learning method, resulting in the highest accuracy rate of 96.44%.

Keywords