New Acoustic Features for Synthetic and Replay Spoofing Attack Detection

Linqiang Wei; Yanhua Long; Haoran Wei; Yijie Li

doi:10.3390/sym14020274

Symmetry (Jan 2022)

Affiliations

Linqiang Wei: Key Innovation Group of Digital Humanities Resource and Research, Shanghai Normal University, Shanghai 200234, China
Yanhua Long: Key Innovation Group of Digital Humanities Resource and Research, Shanghai Normal University, Shanghai 200234, China
Haoran Wei: Department of ECE, University of Texas at Dallas, Richardson, TX 75080, USA
Yijie Li: Unisound AI Technology Co., Ltd., Beijing 100096, China

DOI: https://doi.org/10.3390/sym14020274
Journal volume & issue: Vol. 14, no. 2, p. 274

Abstract

With the rapid development of intelligent speech technologies, automatic speaker verification (ASV) has become one of the most natural and convenient biometric speaker recognition approaches. However, most state-of-the-art ASV systems are vulnerable to spoofing attacks such as speech synthesis, voice conversion, and replayed speech. Because genuine (true) speech and spoofed (fake) speech have symmetric distribution characteristics, spoofing attack detection is challenging, and many recent works have focused on ASV anti-spoofing solutions. This work investigates two types of new acoustic features to improve the performance of spoofing attack detection. The first type consists of two cepstral coefficient features and one LogSpec feature, all extracted from linear prediction (LP) residual signals. The second is a harmonic and noise subband ratio feature, which can reflect how the interaction between vocal tract movement and glottal airflow differs between genuine and spoofed speech. The significance of these new features has been investigated in both the t-distributed stochastic neighbor embedding (t-SNE) space and the binary classification modeling space. Experiments on the ASVspoof 2019 database show that the proposed residual features achieve relative equal error rate (EER) reductions of 7% to 51.7% on the development and evaluation sets over the best single-system baseline. Furthermore, a relative EER reduction of more than 31.2% on both the development and evaluation sets shows that the proposed features carry substantial information complementary to the source acoustic features.
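As a rough illustration of the idea behind the LP residual features, the sketch below estimates LP coefficients with the standard autocorrelation method, inverse-filters the signal to obtain the residual, and takes a log power spectrum of it. This is a generic textbook formulation and not the authors' implementation: the function names, the LP order of 12, the Hamming window, and the FFT size are all assumptions made for the example.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP coefficients a[0..order-1], such that
    s[n] is approximated by sum_k a[k-1] * s[n-k] for k = 1..order."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order (zero lag sits at index n-1).
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(frame, order=12):
    """Prediction error e[n] = s[n] - sum_k a_k s[n-k] (assumed order=12)."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}, applied as an FIR filter.
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[: len(frame)]

def log_spectrum(x, n_fft=512):
    """Log power spectrum of one Hamming-windowed frame (assumed settings)."""
    x = np.asarray(x, dtype=float)
    windowed = x * np.hamming(len(x))
    return np.log(np.abs(np.fft.rfft(windowed, n_fft)) ** 2 + 1e-10)
```

In practice the residual would be computed frame by frame over an utterance, and cepstral features would be derived from the residual spectrum; those steps depend on settings the abstract does not specify, so they are left out here.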

Published in Symmetry

ISSN: 2073-8994 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/symmetry/
