Journal of Electronic Science and Technology (Jun 2021)
Comparison of Khasi speech representations with different spectral features and hidden Markov states
Abstract
In this paper, we present a comparison of Khasi speech representations based on four different spectral features, together with a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, hidden Markov model (HMM) based recognizers were built with variations in the number of HMM states and in the Gaussian mixture models (GMMs). Performance was evaluated using the word error rate (WER). The experimental results showed that MFCC provides a better representation of Khasi speech than the other three spectral features.
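As an illustration of the evaluation metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the scoring tool used in the paper; the function name `word_error_rate` and the placeholder token strings are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; the paper does not specify
    its scoring implementation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Khasi transcripts.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Under this definition a lower WER means a better recognizer, and because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.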