Voice spoofing detection using a neural networks assembly considering spectrograms and mel frequency cepstral coefficients

Carlos Alberto Hernández-Nava; Eric Alfredo Rincón-García; Pedro Lara-Velázquez; Sergio Gerardo de-los-Cobos-Silva; Miguel Angel Gutiérrez-Andrade; Roman Anselmo Mora-Gutiérrez

doi:10.7717/peerj-cs.1740

PeerJ Computer Science (Dec 2023)

Affiliations

Carlos Alberto Hernández-Nava: Posgrado en Ciencias y Tecnologías de la Información, Universidad Autónoma Metropolitana, Ciudad de México, Ciudad de México, México
Eric Alfredo Rincón-García: Departamento de Ingeniería Eléctrica, Universidad Autónoma Metropolitana, Ciudad de México, Ciudad de México, México
Pedro Lara-Velázquez: Departamento de Ingeniería Eléctrica, Universidad Autónoma Metropolitana, Ciudad de México, Ciudad de México, México
Sergio Gerardo de-los-Cobos-Silva: Departamento de Ingeniería Eléctrica, Universidad Autónoma Metropolitana, Ciudad de México, Ciudad de México, México
Miguel Angel Gutiérrez-Andrade: Departamento de Ingeniería Eléctrica, Universidad Autónoma Metropolitana, Ciudad de México, Ciudad de México, México
Roman Anselmo Mora-Gutiérrez: Departamento de Sistemas, Universidad Autónoma Metropolitana de Azcapotzalco, Ciudad de México, Ciudad de México, México

DOI: https://doi.org/10.7717/peerj-cs.1740
Journal volume & issue: Vol. 9, p. e1740

Abstract

Biometric authentication has gained relevance as technological advances have allowed its inclusion in many everyday devices. However, this wider availability has also brought dangers, as spoofing attacks are now more common. This work addresses the vulnerabilities of automatic speaker verification systems, which are prone to attacks arising from new techniques for generating spoofed audio. In this article, we present a countermeasure against these attacks that combines easy-to-implement feature extractors, namely spectrograms and mel frequency cepstral coefficients, with a modular architecture based on deep neural networks. Finally, we evaluate our proposal on the well-known ASVspoof 2017 V2 database. The experiments show that the final architecture obtains the best performance, achieving an equal error rate of 6.66% on the evaluation set.
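
The equal error rate reported above is the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (genuine audio rejected). The abstract does not describe how the authors computed it, so the following is only a minimal sketch of one common way to approximate the metric. The function name `equal_error_rate`, the convention that a higher score means "more likely genuine", and the toy scores are all assumptions made for illustration; they are not data or code from the article.

```python
import numpy as np


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumption: a higher score means the trial is more likely genuine.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score, in ascending order.
    thresholds = np.unique(np.concatenate([genuine, spoof]))

    # False rejection rate: genuine trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Synthetic toy scores, for illustration only.
    toy_genuine = [0.9, 0.8, 0.7, 0.4]
    toy_spoof = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(toy_genuine, toy_spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the estimate is coarse. On a full evaluation set, interpolating between neighbouring thresholds gives a smoother value.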

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
