IEEE Access (Jan 2021)

Modeling the Progression of Speech Deficits in Cerebellar Ataxia Using a Mixture Mixed-Effect Machine Learning Framework

  • Bipasha Kashyap,
  • Pubudu N. Pathirana,
  • Malcolm Horne,
  • Laura Power,
  • David J. Szmulewicz

DOI
https://doi.org/10.1109/ACCESS.2021.3114328
Journal volume & issue
Vol. 9
pp. 135343 – 135353

Abstract

Background: Accurate and reliable prediction of changes in the severity of cerebellar ataxia (CA) will be necessary for trials of disease-modifying therapies. Cerebellar dysarthria (CD) is a common feature of CA. This study demonstrated that objective acoustic measures were more sensitive than perceptive analysis through the Scale for the Assessment and Rating of Ataxia (SARA) in assessing the progression of CD, within a time window of two years (mean).

Method: Thirty-seven people with CA were tested at baseline (time point 1, TP1) and two years later (time point 2, TP2). A machine-learning framework with a robust three-step feature selection criterion and a Bayesian data-driven clustering technique based on the multivariate mixture extension of the generalized linear mixed model (GLMM) was used. The outcomes included two (time and cepstral-based) objective speech parameters recorded at TP1 and TP2. Subject testing involved dynamic prediction and was conducted using samples from the posterior distributions of parameter estimates and random effects. This study further employed the penalized expected deviance (PED) criterion for model comparison and the selection of the number of groups in the clustering procedure.

Results: First, the selected objective speech metrics in the individual patients showed a significant worsening of the speech impairment (p<0.001, Kolmogorov–Smirnov test) between TP1 and TP2. Second, the cluster analysis divided the CA patients into two distinct subgroups showing a strong association between objective speech measures and disease duration, with ~96% of observed values falling within the 95% credible intervals. Third, for the training data, our multivariate model ($PED_{Fea1+Fea2}=5175$; number of groups = 2) performed more reliably than the univariate models ($PED_{Fea1}=4225$, $PED_{Fea2}=3850$; number of groups = 2) in discriminating the CA patients. Fourth, the individual-level predictions of the change in profiles of the objective measures over time were performed for the testing data.

Conclusion: Such a framework using objective speech metrics indeed holds promise to predict the rate of clinical progression of CD in individuals with CA.

Keywords