Journal of Electrical and Computer Engineering (Jan 2022)
Research and DSP Implementation of Speech Enhancement Technology Based on Dynamic Mixed Features and Adaptive Mask
Abstract
This paper proposes a deep learning speech enhancement algorithm based on dynamic hybrid features and an adaptive mask, together with its DSP implementation, to address feature loss and improve enhancement performance. The dynamic features combine the log Mel power spectrum, Mel cepstral coefficients, and Multiresolution Auditory Cepstral Coefficients (MRACC), and capture transient speech information through their derivatives, so that the nonlinear structure of speech is represented comprehensively and distortion is reduced. To improve speech quality while keeping speech distortion as low as possible, a soft mask is proposed that adjusts automatically to the signal-to-noise ratio of the input, yielding the mask value appropriate to each signal-to-noise ratio condition; it also incorporates phase difference information, which can improve speech intelligibility. An improved deep neural network model is then designed to further improve speech enhancement performance. Finally, the hardware and algorithm software design of the DSP-based speech enhancement system is given. Experimental simulations are carried out for multiple voices in different noise backgrounds. The results indicate that the proposed method significantly improves the performance indexes compared with existing speech enhancement methods, which verifies its feasibility and superiority.
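As an illustration of how derivatives can add transient information to static features, the following is a minimal sketch in plain Python. It assumes the widely used regression formula for delta coefficients, d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²); the abstract does not state which derivative formula the authors use. The function names `delta` and `add_dynamic_features`, the default `width` of 2, and the edge-padding choice are hypothetical and chosen only for this example.

```python
def delta(frames, width=2):
    """Regression-based delta of a feature sequence (assumed formula).

    frames: list of per-frame feature vectors (lists of floats), equal length.
    width:  frames on each side used in the regression (assumed value).
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    # Normalization term: 2 * sum of n^2 for n = 1..width.
    norm = 2.0 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(dim):
            acc = 0.0
            for n in range(1, width + 1):
                # Clamp indices so the first and last frames are repeated.
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / norm)
        out.append(row)
    return out


def add_dynamic_features(frames, width=2):
    """Concatenate static features with first and second derivatives."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]
```

Applied to a sequence of static feature vectors (for example, the concatenated log Mel power spectrum, Mel cepstral coefficients, and MRACC of each frame), `add_dynamic_features` would triple the feature dimension. For a feature that rises by one unit per frame, the first derivative in the interior of the sequence is 1 and the second derivative is 0.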