A speech enhancement algorithm based on a non-negative hidden Markov model and Kullback-Leibler divergence

Yang Xiang; Liming Shi; Jesper Lisby Højvang; Morten Højfeldt Rasmussen; Mads Græsbøll Christensen

doi:10.1186/s13636-022-00256-5

EURASIP Journal on Audio, Speech, and Music Processing (Sep 2022)

Affiliations

Yang Xiang: CREATE, Aalborg University
Liming Shi: CREATE, Aalborg University
Jesper Lisby Højvang: Capturi A/S
Morten Højfeldt Rasmussen: Capturi A/S
Mads Græsbøll Christensen: CREATE, Aalborg University

DOI: https://doi.org/10.1186/s13636-022-00256-5
Journal volume & issue: Vol. 2022, no. 1, pp. 1-15

Abstract

In this paper, we propose a supervised single-channel speech enhancement method, NMF-HMM, that combines Kullback-Leibler (KL) divergence-based non-negative matrix factorization (NMF) with a hidden Markov model (HMM). Integrating the HMM allows the temporal dynamics of speech signals to be taken into account. The method consists of a training stage and an enhancement stage. In the training stage, a sum of Poisson distributions, which leads to the KL divergence measure, is used as the observation model for each state of the HMM. This ensures that computationally efficient multiplicative updates can be used to estimate the model parameters. In the online enhancement stage, a novel minimum mean square error (MMSE) estimator is proposed for the NMF-HMM. This estimator can be implemented using parallel computing, which reduces the time complexity. Experimental results show that, compared to traditional NMF-based speech enhancement methods, the proposed algorithm improves short-time objective intelligibility (STOI) by 5% and perceptual evaluation of speech quality (PESQ) by 0.18.
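To make the multiplicative-update idea concrete, the following is a minimal sketch of the standard KL-divergence NMF updates on a non-negative matrix, written with NumPy. It is not the authors' NMF-HMM: there is no HMM, no state-dependent dictionary, and no MMSE estimator. The function names `kl_divergence` and `kl_nmf`, the rank, the iteration count, and the small constant `eps` are all illustrative assumptions.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    Uses the classic multiplicative updates for the KL objective.
    Hypothetical helper for illustration only, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # basis (dictionary) matrix
    H = rng.random((rank, n_frames)) + eps  # activation matrix
    history = []
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update the dictionary with the activations held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history


if __name__ == "__main__":
    # Random non-negative matrix standing in for a magnitude spectrogram.
    V = np.random.default_rng(1).random((64, 100)) + 0.1
    W, H, history = kl_nmf(V, rank=8, n_iter=100)
    print(f"KL divergence: first iteration {history[0]:.3f}, last {history[-1]:.3f}")
```

Because each update only multiplies the current factors by non-negative ratios, `W` and `H` stay non-negative without any explicit projection step, which is the main reason this family of updates is computationally convenient.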

Published in EURASIP Journal on Audio, Speech, and Music Processing

ISSN: 1687-4722 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Science: Physics: Acoustics. Sound; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://asmp-eurasipjournals.springeropen.com
