IEEE Access (Jan 2016)

Text-Independent Speaker Identification Using the Histogram Transform Model

  • Zhanyu Ma,
  • Hong Yu,
  • Zheng-Hua Tan,
  • Jun Guo

DOI
https://doi.org/10.1109/ACCESS.2016.2646458
Journal volume & issue
Vol. 4
pp. 9733 – 9739

Abstract

In this paper, we propose a novel probabilistic method for text-independent speaker identification (SI). To capture dynamic information during SI, we design super mel-frequency cepstral coefficient (super-MFCC) features by cascading three neighboring MFCC frames. These super-MFCC vectors are used to train a probabilistic model so that the speaker's characteristics are sufficiently captured. The probability density function (PDF) of the super-MFCC features is estimated by the recently proposed histogram transform (HT) method. To alleviate the discontinuity problem that commonly occurs when computing multivariate histograms, the HT method generates additional training data. Using these generated data, a smooth PDF of the super-MFCC vectors is obtained. Compared with typical PDF estimation methods, such as the Gaussian mixture model, the HT-based model yields promising improvements in SI.
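The cascading of three neighboring MFCC frames described above can be illustrated with a minimal sketch. It assumes the neighbors are the immediately adjacent frames (t-1, t, t+1) and that boundary frames without both neighbors are dropped; the abstract does not specify either choice, and the function name `stack_super_frames` is hypothetical.

```python
import numpy as np


def stack_super_frames(mfcc: np.ndarray) -> np.ndarray:
    """Concatenate each frame with its left and right neighbors.

    mfcc: array of shape (T, D), one D-dimensional MFCC vector per frame.
    Returns an array of shape (T - 2, 3 * D); the first and last frames
    are dropped because they lack a neighbor on one side (an assumption).
    """
    if mfcc.ndim != 2 or mfcc.shape[0] < 3:
        raise ValueError("expected a (T, D) array with at least 3 frames")
    # Row i of the result is [frame i, frame i+1, frame i+2] side by side.
    return np.hstack([mfcc[:-2], mfcc[1:-1], mfcc[2:]])


# Toy input: 4 frames of 3 coefficients each (real MFCCs would come
# from a speech front end, which is outside the scope of this sketch).
frames = np.arange(12, dtype=float).reshape(4, 3)
super_frames = stack_super_frames(frames)
```

With T input frames of dimension D, this yields T - 2 vectors of dimension 3D, so each vector carries short-term temporal context in addition to the static coefficients.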

Keywords