Voice Spoofing Countermeasure for Logical Access Attacks Detection

Tuba Arif; Ali Javed; Mohammed Alhameed; Fathe Jeribi; Ali Tahir

doi:10.1109/ACCESS.2021.3133134

IEEE Access (Jan 2021)

Affiliations

Tuba Arif: Department of Software Engineering, University of Engineering and Technology, Taxila, Taxila, Pakistan
Ali Javed: Department of Computer Science, University of Engineering and Technology, Taxila, Taxila, Pakistan
Mohammed Alhameed: College of Computer Science and Information Technology, Jazan University, Jazan, Saudi Arabia
Fathe Jeribi: College of Computer Science and Information Technology, Jazan University, Jazan, Saudi Arabia
Ali Tahir: College of Computer Science and Information Technology, Jazan University, Jazan, Saudi Arabia

DOI: https://doi.org/10.1109/ACCESS.2021.3133134
Journal volume & issue: Vol. 9, pp. 162857–162868

Abstract

Voice-driven devices (VDDs) such as Google Home and Amazon Alexa, which are well-known connected devices in consumer IoT, have applications in various domains, e.g., home appliance automation, next-generation vehicles, and voice banking. However, these VDDs, which are based on automatic speaker verification systems (ASVs), are vulnerable to voice-based logical access (LA) attacks such as Text-to-Speech (TTS) synthesis and converted voice signals. Intruders can exploit these attacks to bypass the security of such systems and gain access to a victim's bank account or home control. Thus, there is a need for an effective voice spoofing countermeasure that can reliably protect these VDDs against such malicious attacks. This work presents a novel audio feature descriptor, named extended local ternary pattern (ELTP), to capture the attributes dynamically induced by the vocal tract in bonafide speech and the algorithmic artifacts in synthetic and converted speech. We fuse our novel ELTP features with linear frequency cepstral coefficients (LFCC) to further strengthen their capability to capture the traits of bonafide and spoofed signals. We employ the proposed ELTP-LFCC features to train a deep bidirectional Long Short-Term Memory (DBiLSTM) network that classifies signals as bonafide or spoofed (i.e., TTS synthesis, converted speech). The performance of our spoofing countermeasure is measured on the large-scale and diverse ASVspoof 2019 logical access dataset. Experimental results demonstrate that the proposed audio spoofing countermeasure can reliably be used to detect LA spoofing attacks.
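
As a rough illustration of the kind of descriptor the abstract refers to, the sketch below computes a generic one-dimensional local ternary pattern over an audio frame. It is not the authors' ELTP: the abstract does not specify how the pattern is extended, so the fixed `threshold`, the `radius` of the neighborhood, and the function names `ternary_codes` and `code_histogram` are all assumptions made for this example.

```python
def ternary_codes(frame, threshold, radius=4):
    """Generic 1-D local ternary pattern (illustrative, not the paper's ELTP).

    Each interior sample is compared with its `radius` neighbors on both
    sides. A neighbor at least `threshold` above the center sets a bit in
    the "upper" code; one at least `threshold` below sets a bit in the
    "lower" code; anything in between sets no bit.
    """
    frame = list(frame)
    upper, lower = [], []
    for i in range(radius, len(frame) - radius):
        center = frame[i]
        neighbors = frame[i - radius:i] + frame[i + 1:i + radius + 1]
        up = lo = 0
        for bit, value in enumerate(neighbors):
            if value >= center + threshold:
                up |= 1 << bit
            elif value <= center - threshold:
                lo |= 1 << bit
        upper.append(up)
        lower.append(lo)
    return upper, lower


def code_histogram(codes, radius=4):
    """Count how often each of the 2**(2*radius) possible codes occurs."""
    hist = [0] * (2 ** (2 * radius))
    for code in codes:
        hist[code] += 1
    return hist


# Toy frame with hypothetical sample values, using a small neighborhood.
frame = [0.0, 0.5, 1.0, 0.5, 0.0]
upper, lower = ternary_codes(frame, threshold=0.25, radius=1)
features = code_histogram(upper, radius=1) + code_histogram(lower, radius=1)
```

In a pipeline of the kind the abstract outlines, a histogram like `features` would be computed per frame and concatenated with cepstral coefficients before classification; the exact fusion used in the paper is not given here.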

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639