IEEE Access (Jan 2023)

Audio Deepfake Approaches

  • Ousama A. Shaaban,
  • Remzi Yildirim,
  • Abubaker A. Alguttar

DOI
https://doi.org/10.1109/ACCESS.2023.3333866
Journal volume & issue
Vol. 11
pp. 132652 – 132682

Abstract

This paper reviews techniques for creating and detecting audio deepfakes. The first section provides background on deepfakes in general. The second section outlines and compares the main methods for audio deepfakes. The results discuss various methods for detecting audio deepfakes, including analyzing statistical properties, examining media consistency, and applying machine learning and deep learning algorithms. The major detection methods in the reviewed studies were Support Vector Machines (SVMs), Decision Trees (DTs), Convolutional Neural Networks (CNNs), Siamese CNNs, Deep Neural Networks (DNNs), and a combination of CNNs and Recurrent Neural Networks (RNNs). Reported accuracy ranged from 73.33% for DT to 99% for SVM. The Equal Error Rate (EER) was reported in a few of the studies, ranging from 2% for Deep-Sonar to 12.24% for DNN-HLLs. The tandem detection cost function (t-DCF) was also reported in some of the studies; the Siamese CNN performed best, with a 55% improvement in min-t-DCF and EER compared to other methods.
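
For readers unfamiliar with the Equal Error Rate cited above, the following is a minimal sketch of how it can be approximated from detector scores. It is not the evaluation code of any reviewed study. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely genuine" are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER: the point where the false-acceptance rate (FAR)
    and the false-rejection rate (FRR) are closest to equal.

    Assumption: higher score = more likely genuine (bona fide) audio.
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # FAR: fraction of spoofed clips accepted at threshold t.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # FRR: fraction of genuine clips rejected at threshold t.
        frr = sum(g < t for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy scores: one genuine clip scores low, one spoof scores high.
genuine = [0.9, 0.8, 0.35, 0.6]
spoof = [0.1, 0.2, 0.3, 0.7]
print(equal_error_rate(genuine, spoof))  # 0.25
```

A lower EER indicates a better detector. Published evaluations typically interpolate between thresholds on much larger score sets, so this threshold sweep should be read only as an illustration of the definition.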

Keywords