Sensors (Nov 2024)

CSTAN: A Deepfake Detection Network with CST Attention for Superior Generalization

  • Rui Yang
  • Kang You
  • Cheng Pang
  • Xiaonan Luo
  • Rushi Lan

DOI
https://doi.org/10.3390/s24227101
Journal volume & issue
Vol. 24, no. 22
p. 7101

Abstract

As deepfake forgery technology advances, highly realistic fake faces pose serious security risks to sensor-based facial recognition systems. Most recent deepfake detectors are deep-learning binary classifiers. Although they achieve high accuracy when trained and tested on the same dataset, they generalize poorly when evaluated on a different dataset. We propose a deepfake detection model, the Channel-Spatial-Triplet Attention Network (CSTAN), which focuses on the differences between real and fake features and thereby improves the generalization of the detector. To strengthen the model's feature learning on forged image regions, we design the Channel-Spatial-Triplet (CST) attention mechanism, which extracts subtle local information by capturing feature channels and spatial correlations at three different scales. We also propose a feature extraction network, OD-ResNet-34, which embeds ODConv into the backbone to improve its dynamic adaptability to data features. When trained on the FF++ dataset and tested on the Celeb-DF-v1 and Celeb-DF-v2 datasets, our model shows stronger cross-dataset generalization than similar models.
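
The abstract does not give the internals of CST attention, so the sketch below is only a generic illustration of gating a feature map first per channel and then per spatial position. It is not the authors' mechanism: it has no learned weights and no multi-scale branches, and the function name `channel_spatial_gate` and the nested-list tensor layout are assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_spatial_gate(feature_map):
    """Apply a parameter-free channel gate, then a spatial gate.

    feature_map is a list of C channels, each an H x W nested list of floats.
    Hypothetical helper for illustration only; not the CST attention module.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    # Channel gate: global-average-pool each channel, squash to (0, 1).
    channel_weights = [
        sigmoid(sum(sum(row) for row in ch) / (height * width))
        for ch in feature_map
    ]
    reweighted = [
        [[value * w for value in row] for row in ch]
        for ch, w in zip(feature_map, channel_weights)
    ]

    # Spatial gate: average over channels at each position, squash to (0, 1).
    spatial_weights = [
        [
            sigmoid(sum(reweighted[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale every channel by the shared spatial weights.
    return [
        [
            [ch[i][j] * spatial_weights[i][j] for j in range(width)]
            for i in range(height)
        ]
        for ch in reweighted
    ]
```

A real attention block would replace the fixed pooling-plus-sigmoid gates with learned layers; the point here is only the order of operations, in which channel reweighting is followed by a position-wise mask shared across channels.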

Keywords