Replay Speech Detection Based on Dual-Input Hierarchical Fusion Network

Chenlei Hu; Ruohua Zhou; Qingsheng Yuan

doi:10.3390/app13095350

Applied Sciences (Apr 2023)

Affiliations

Chenlei Hu: School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102627, China
Ruohua Zhou: School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102627, China
Qingsheng Yuan: National Computer Network Emergency Response Technical Team Coordination Center of China, Beijing 100029, China

DOI: https://doi.org/10.3390/app13095350
Journal volume & issue: Vol. 13, no. 9, p. 5350

Abstract

Speech anti-spoofing is a crucial component of speaker recognition systems and has received a great deal of attention in recent years. Deep neural networks achieve satisfactory results on datasets whose training and testing data follow similar distributions, but they generalize poorly to datasets with different distributions. In this paper, we propose a novel dual-input hierarchical fusion network (HFN) to improve generalization. The network takes two inputs, the original speech signal and its time-reversed counterpart, which increases the volume and diversity of the training data. The hierarchical fusion model (HFM) fuses the two inputs after speech feature extraction, which enables more thorough fusion of information from different input levels and improves model performance. We evaluated the system on the ASVspoof 2021 PA (Physical Access) dataset, where it achieved an Equal Error Rate (EER) of 24.46% and a minimum tandem Detection Cost Function (min t-DCF) of 0.6708 on the test set. Compared with the four baseline systems of the ASVspoof 2021 competition, the proposed system reduced the min t-DCF by 28.9%, 31.0%, 32.6%, and 32.9%, and the EER by 35.7%, 38.1%, 45.4%, and 49.7%, respectively.
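
The dual-input idea in the abstract can be illustrated with a minimal sketch: given one waveform, a second input is obtained by flipping it along the time axis. The function name `make_dual_input`, the 16 kHz sample rate, and the synthetic test tone are assumptions made for illustration; the abstract does not describe the authors' preprocessing, feature extraction, or fusion code, so none of that is shown here.

```python
from typing import Tuple

import numpy as np


def make_dual_input(waveform: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """Return (original, time-reversed) versions of a mono waveform.

    Hypothetical helper: only the time-reversal step named in the abstract
    is sketched; feature extraction and fusion are out of scope.
    """
    waveform = np.asarray(waveform, dtype=np.float32)
    if waveform.ndim != 1:
        raise ValueError("expected a mono waveform of shape (num_samples,)")
    # Flip along the time axis; copy() gives a contiguous array.
    reversed_waveform = waveform[::-1].copy()
    return waveform, reversed_waveform


# Assumed example signal: one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t).astype(np.float32)

original, reversed_signal = make_dual_input(signal)
```

Both arrays have the same length, so they could be passed to the same feature extractor before any fusion step. Reversing the second array again recovers the first, which makes for a quick sanity check on the preprocessing.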

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
