PLoS ONE (Jan 2021)

Impacts of multicollinearity on CAPT modalities: An heterogeneous machine learning framework for computer-assisted French phoneme pronunciation training.

  • Yanjing Bi,
  • Chao Li,
  • Yannick Benezeth,
  • Fan Yang

DOI
https://doi.org/10.1371/journal.pone.0257901
Journal volume & issue
Vol. 16, no. 10
p. e0257901

Abstract

Read online

Phoneme pronunciations are usually considered as basic skills for learning a foreign language. Practicing the pronunciations in a computer-assisted way is helpful in a self-directed or long-distance learning environment. Recent researches indicate that machine learning is a promising method to build high-performance computer-assisted pronunciation training modalities. Many data-driven classifying models, such as support vector machines, back-propagation networks, deep neural networks and convolutional neural networks, are increasingly widely used for it. Yet, the acoustic waveforms of phoneme are essentially modulated from the base vibrations of vocal cords, and this fact somehow makes the predictors collinear, distorting the classifying models. A commonly-used solution to address this issue is to suppressing the collinearity of predictors via partial least square regressing algorithm. It allows to obtain high-quality predictor weighting results via predictor relationship analysis. However, as a linear regressor, the classifiers of this type possess very simple topology structures, constraining the universality of the regressors. For this issue, this paper presents an heterogeneous phoneme recognition framework which can further benefit the phoneme pronunciation diagnostic tasks by combining the partial least square with support vector machines. A French phoneme data set containing 4830 samples is established for the evaluation experiments. The experiments of this paper demonstrates that the new method improves the accuracy performance of the phoneme classifiers by 0.21 - 8.47% comparing to state-of-the-arts with different data training data density.