Time Series-Based Spoof Speech Detection Using Long Short-Term Memory and Bidirectional Long Short-Term Memory

Arsalan R. Mirza; Abdulbasit K. Al-Talabani

doi:10.14500/aro.11636

ARO-The Scientific Journal of Koya University (Sep 2024)

Affiliations

Arsalan R. Mirza: Department of Computer Science, Faculty of Science, Soran University, Soran, Kurdistan Region – F.R. Iraq
Abdulbasit K. Al-Talabani: Department of Software Engineering, Faculty of Engineering, Koya University, Koya KOY45, Kurdistan Region – F.R. Iraq

DOI: https://doi.org/10.14500/aro.11636
Journal volume & issue: Vol. 12, no. 2

Abstract


Detecting spoofed (fake) speech is crucial to the reliability of voice-based authentication systems. Traditional methods often struggle because they cannot model the complex patterns that unfold over time in a speech signal. Our study introduces a deep learning approach based on Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) models, tailored to identifying spoofed speech from its temporal characteristics. We represent each speech signal as a time series of features, namely Mel-frequency cepstral coefficients (MFCC), constant-Q cepstral coefficients (CQCC), and features extracted with the open-source Speech and Music Interpretation by Large-space Extraction (OpenSMILE) toolkit, so that the models learn these temporal patterns directly. We evaluate on the ASVspoof 2019 Logical Access dataset and report the minimum tandem detection cost function (min-tDCF), Equal Error Rate (EER), Recall, Precision, and F1-score. Our results show that LSTM and BiLSTM models significantly enhance the reliability of spoof speech detection systems.
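As an illustration of how some of the metrics named in the abstract can be obtained from detector scores, the following is a minimal sketch in plain Python. It is not the authors' evaluation code. It assumes that higher scores mean "more likely bona fide", that spoof is the positive class for Precision, Recall, and F1-score, and that the threshold is swept over the observed scores. The function names and the toy scores are hypothetical, and min-tDCF is not covered because it also requires speaker-verification scores and cost parameters.

```python
from typing import List, Tuple


def equal_error_rate(bonafide: List[float], spoof: List[float]) -> Tuple[float, float]:
    """Return (EER, threshold), assuming higher scores mean 'more bona fide'."""
    best_gap, eer, best_thr = float("inf"), 1.0, 0.0
    # Sweep every observed score as a candidate decision threshold.
    for thr in sorted(set(bonafide) | set(spoof)):
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < thr for s in bonafide) / len(bonafide)
        # False acceptance rate: spoof trials scored at or above the threshold.
        far = sum(s >= thr for s in spoof) / len(spoof)
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer, best_thr = abs(far - frr), (far + frr) / 2, thr
    return eer, best_thr


def precision_recall_f1(bonafide: List[float], spoof: List[float],
                        thr: float) -> Tuple[float, float, float]:
    """Treat spoof as the positive class: a trial is flagged when score < thr."""
    tp = sum(s < thr for s in spoof)      # spoof trials correctly flagged
    fn = len(spoof) - tp                  # spoof trials missed
    fp = sum(s < thr for s in bonafide)   # bona fide trials wrongly flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy scores, invented purely for illustration (not results from the paper).
bonafide_scores = [0.9, 0.8, 0.7, 0.4]
spoof_scores = [0.1, 0.2, 0.3, 0.6]

eer, thr = equal_error_rate(bonafide_scores, spoof_scores)
precision, recall, f1 = precision_recall_f1(bonafide_scores, spoof_scores, thr)
print(f"EER={eer:.2f} at threshold {thr}, P={precision:.2f}, R={recall:.2f}, F1={f1:.2f}")
```

On real evaluation sets the scores would come from the trained LSTM or BiLSTM model, and interpolating between adjacent thresholds gives a finer EER estimate than this discrete sweep.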

Published in ARO-The Scientific Journal of Koya University

ISSN: 2410-9355 (Print); 2307-549X (Online)
Publisher: Koya University
Country of publisher: Iraq
LCC subjects: Technology; Science
Website: http://aro.koyauniversity.org
