IEEE Access (Jan 2024)
Exploring Emotion and Emotional Variability as Digital Biomarkers in Frontotemporal Dementia Speech
Abstract
Frontotemporal Dementia (FTD) encompasses a diverse group of progressive neurodegenerative diseases that impact speech production and comprehension, higher-order cognition, behavior, and motor control. Traditional acoustic speech markers have been extensively studied in FTD, as have assessments capturing apathy and impairments in recognizing and expressing emotion. This work leverages machine learning to track changes in emotional content within the speech of individuals with FTD and healthy controls. The aim of the project is to develop tools for assessing and monitoring emotional changes in individuals with FTD, quantifying these subtle aspects of the disease and thus potentially providing insights for assessing future therapeutic interventions. A retrospective analysis was conducted on a dataset comprising standard elicited speech tasks performed by 78 individuals diagnosed with FTD and 55 healthy elderly controls. We employed an ensemble-based convolutional neural network (CNN) classifier trained on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset to extract emotion scores from processed speech samples. The classifier was applied with a sliding window to the FTD and healthy control narratives to facilitate a granular examination of emotional changes throughout longer speech samples. Analysis of variance (ANOVA) was used to test for group differences in average emotion scores as well as emotional variability over the duration of the speech samples. Compared to healthy controls, people with FTD demonstrated reduced emotional change in a monologue task describing a happy experience, as measured by the interquartile range (IQR) (p < 0.005) and slope of “happy” emotion scores vs. time (p < 0.005). During a picture description task, people with FTD displayed a slightly elevated average level of frustration (p < 0.005). Increased frustration levels in individuals with FTD could potentially indicate their difficulties in accomplishing the task.
This study introduced the application of a pre-trained Speech Emotion Recognition (SER) model on overlapping short segments of extended speech samples, allowing for a detailed examination of emotional changes over time. Capturing the temporal evolution of emotional content offers a nuanced understanding of communication in individuals with FTD. Our findings lay the groundwork for further development of digital biomarkers to refine the assessment, monitoring, and understanding of the emotional and social communication impacts of FTD.
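As an illustration of the variability measures named above, the following minimal Python sketch computes the mean, IQR, and slope of one emotion's scores across overlapping windows. It is a sketch under stated assumptions, not the study's implementation: the window length, hop size, function names, and score values are hypothetical, and the per-window scores would in practice come from the SER classifier, which is not reproduced here.

```python
import numpy as np


def window_starts(duration_s, window_s, hop_s):
    """Start times (s) of overlapping windows that fit fully in a recording.

    window_s and hop_s are hypothetical parameters, not values from the study.
    """
    if duration_s < window_s:
        return np.array([])
    # Small tolerance guards against floating-point error in the division.
    n = int(np.floor((duration_s - window_s) / hop_s + 1e-9)) + 1
    return np.arange(n) * hop_s


def variability_features(times_s, scores):
    """Mean, IQR, and least-squares slope of one emotion's scores over time."""
    times_s = np.asarray(times_s, dtype=float)
    scores = np.asarray(scores, dtype=float)
    q75, q25 = np.percentile(scores, [75, 25])
    # Degree-1 polynomial fit: the first coefficient is the slope vs. time.
    slope, _intercept = np.polyfit(times_s, scores, 1)
    return {
        "mean": float(scores.mean()),
        "iqr": float(q75 - q25),
        "slope": float(slope),
    }


# Hypothetical 10 s sample, 3 s windows, 1 s hop, and made-up "happy" scores
# that rise linearly over time (stand-ins for classifier output).
starts = window_starts(10.0, 3.0, 1.0)
happy_scores = 0.1 + 0.05 * starts
features = variability_features(starts, happy_scores)
```

Per-speaker features of this kind could then be compared between groups, for example with an ANOVA as described in the abstract.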
Keywords