Alzheimer’s Research & Therapy (Jun 2021)

Correlating natural language processing and automated speech analysis with clinician assessment to quantify speech-language changes in mild cognitive impairment and Alzheimer’s dementia

  • Anthony Yeung,
  • Andrea Iaboni,
  • Elizabeth Rochon,
  • Monica Lavoie,
  • Calvin Santiago,
  • Maria Yancheva,
  • Jekaterina Novikova,
  • Mengdan Xu,
  • Jessica Robin,
  • Liam D. Kaufman,
  • Fariya Mostafa

DOI
https://doi.org/10.1186/s13195-021-00848-x
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 10

Abstract

Background: Language impairment is an important marker of neurodegenerative disorders. Despite this, there is no universal system of terminology used to describe these impairments, and large inter-rater variability can exist between clinicians assessing language. The use of natural language processing (NLP) and automated speech analysis (ASA) is emerging as a novel and potentially more objective method to assess language in individuals with mild cognitive impairment (MCI) and Alzheimer’s dementia (AD). No studies have analyzed how variables extracted through NLP and ASA might also be correlated with language impairments identified by a clinician.
Methods: Audio recordings (n = 30) from participants with AD, MCI, and controls were rated by clinicians for word-finding difficulty, incoherence, perseveration, and errors in speech. Speech recordings were also transcribed, and linguistic and acoustic variables were extracted through NLP and ASA. Associations between clinician-rated speech characteristics and the extracted variables were assessed using Spearman’s correlation. Exploratory factor analysis was applied to find common factors between variables for each speech characteristic.
Results: Clinician agreement was high in three of the four speech characteristics: word-finding difficulty (ICC = 0.92, p < 0.001), incoherence (ICC = 0.91, p < 0.001), and perseveration (ICC = 0.88, p < 0.001). Word-finding difficulty and incoherence were useful constructs for distinguishing MCI and AD from controls, while perseveration and speech errors were less relevant. Word-finding difficulty as a construct was explained by three factors that included the number and duration of pauses, word duration, and syntactic complexity. Incoherence was explained by two factors that included increased average word duration, use of past tense, changes in age of acquisition, and more negative valence.
Conclusions: Variables extracted through automated acoustic and linguistic analysis of MCI and AD speech were significantly correlated with clinician ratings of speech and language characteristics. Our results suggest that correlating NLP and ASA with clinician observations is an objective and novel approach to measuring speech and language changes in neurodegenerative disorders.
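
The correlation step described in the Methods can be illustrated with a minimal sketch of Spearman's rank correlation between a clinician rating and one extracted variable. This is not the authors' code: the function names, the variable names, and all numbers below are hypothetical and synthetic, chosen only to show how a rank-based correlation is computed (ties receive the average of their ranks, then Pearson's formula is applied to the ranks).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Synthetic, made-up values for illustration only (NOT data from the study).
word_finding_rating = [0, 1, 1, 2, 3, 3, 4, 5]   # hypothetical clinician scores
pause_count = [2, 3, 5, 4, 8, 7, 9, 12]          # hypothetical ASA feature

rho = spearman(word_finding_rating, pause_count)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure suits ordinal clinician ratings because it depends only on the ordering of the values, not on the spacing between rating levels. Significance testing, which the abstract reports alongside the correlations, is omitted from this sketch.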

Keywords