Alzheimer’s & Dementia: Translational Research & Clinical Interventions (Jan 2021)

Multilingual automation of transcript preprocessing in Alzheimer's disease detection

  • Frédéric Abiven
  • Sylvie Ratté

DOI
https://doi.org/10.1002/trc2.12147
Journal volume & issue
Vol. 7, no. 1

Abstract

Introduction: Analyzing linguistic functions can improve early detection of Alzheimer's disease (AD). To date, no studies have focused on creating a universal pipeline for clinical transcript preprocessing.

Methods: This article presents a simple and efficient method for processing linguistic and phonetic data by sequencing the subproblems of cleaning, normalization, and measure extraction. Because some of these tasks are language- and context-dependent, they were designed to be easily configurable, which increases their scalability when dealing with new corpora.

Results: The results show improved performance over previous studies on this time-consuming preprocessing task. Moreover, some discursive markers extracted from transcripts showed a significant correlation (>0.5) with cognitive impairment severity.

Discussion: This article contributes to the literature on AD by presenting an efficient pipeline that speeds up transcript preprocessing. We invite other researchers to contribute to this work to help improve the quality of the pipeline (https://github.com/LiNCS-lab/usAge).
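As an illustration only, the sequence of cleaning, normalization, and measure extraction described in the Methods could be sketched as below, with the language-dependent parts isolated in a configuration object. This is not the usAge implementation: the names `LanguageConfig`, `clean`, `normalize`, `extract_measures`, and `run_pipeline`, the bracketed-annotation convention, the filler-word lists, and the chosen measures are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LanguageConfig:
    """Hypothetical per-language settings (not taken from the usAge repository)."""
    filler_words: frozenset
    # Assumed transcriber markup such as "[laughs]"; real corpora may differ.
    annotation_pattern: str = r"\[[^\]]*\]"


def clean(text: str, cfg: LanguageConfig) -> str:
    """Cleaning step: strip transcriber annotations."""
    return re.sub(cfg.annotation_pattern, " ", text)


def normalize(text: str, cfg: LanguageConfig) -> str:
    """Normalization step: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def extract_measures(text: str, cfg: LanguageConfig) -> Dict[str, float]:
    """Measure extraction step: a few simple, illustrative lexical measures."""
    tokens = re.findall(r"\w+", text)
    n_tokens = len(tokens)
    n_fillers = sum(1 for tok in tokens if tok in cfg.filler_words)
    return {
        "n_tokens": n_tokens,
        "n_fillers": n_fillers,
        "filler_ratio": n_fillers / n_tokens if n_tokens else 0.0,
        "type_token_ratio": len(set(tokens)) / n_tokens if n_tokens else 0.0,
    }


def run_pipeline(text: str, cfg: LanguageConfig) -> Dict[str, float]:
    """Run the three steps in sequence with one language configuration."""
    return extract_measures(normalize(clean(text, cfg), cfg), cfg)


# Supporting a new corpus or language means adding a configuration, not new code.
CONFIGS = {
    "en": LanguageConfig(filler_words=frozenset({"uh", "um"})),
    "fr": LanguageConfig(filler_words=frozenset({"euh", "ben"})),
}

if __name__ == "__main__":
    print(run_pipeline("Um the boy [laughs] is uh taking the cookie", CONFIGS["en"]))
```

Keeping the language-specific choices in `LanguageConfig` means the three processing functions stay unchanged across languages, which is one plausible reading of the configurability the abstract describes.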

Keywords