Dementia Detection from Speech: What If Language Models Are Not the Answer?

Mondher Bouazizi; Chuheng Zheng; Siyuan Yang; Tomoaki Ohtsuki

doi:10.3390/info15010002

Information (Dec 2023)

Affiliations

Mondher Bouazizi: Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan
Chuheng Zheng: Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan
Siyuan Yang: Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan
Tomoaki Ohtsuki: Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan

DOI: https://doi.org/10.3390/info15010002
Journal volume & issue: Vol. 15, no. 1, p. 2

Abstract

Researchers are increasingly interested in techniques for automatically detecting dementia from speech samples. Leveraging rapid advances in Deep Learning (DL) and Natural Language Processing (NLP), these techniques have shown great potential. In this context, this paper proposes a method for dementia detection from the transcribed speech of subjects. Unlike conventional methods, which rely on advanced language models to assess the subject’s ability to produce coherent and meaningful sentences, our approach relies on the subject’s center of focus and how it changes over time as the subject describes the content of the cookie theft image, a commonly used image for evaluating one’s cognitive abilities. To do so, we divide the cookie theft image into regions of interest and identify, in each sentence spoken by the subject, which regions are being talked about. We employ a Long Short-Term Memory (LSTM) neural network to learn the distinct patterns of dementia subjects and control subjects, and use it to perform classification evaluated with 10-fold cross-validation. In our experiments on the Pitt corpus from DementiaBank, the method achieved an accuracy of 82.9% at the subject level and 81.0% at the sample level. With data augmentation, these accuracies increased to 83.6% and 82.1%, respectively. The proposed method outperforms most conventional methods, which reach, at best, an accuracy of 81.5% at the subject level.
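
The region-tagging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual region layout or vocabulary, so the region names, the keyword sets, and the helper functions `sentence_to_vector` and `transcript_to_sequence` below are all hypothetical. The sketch only shows how a transcript might be turned into a per-sentence sequence of region indicators, which is the kind of input a sequence model such as an LSTM could consume. It is not the authors' implementation.

```python
import re

# Hypothetical regions of interest and trigger words (assumptions for
# illustration; the paper's real regions and vocabulary are not given here).
REGIONS = {
    "boy_cookie_jar": {"boy", "cookie", "cookies", "jar", "stool"},
    "girl": {"girl", "sister"},
    "mother_sink": {"mother", "woman", "sink", "water", "dishes"},
    "window": {"window", "curtains", "garden"},
}
REGION_ORDER = list(REGIONS)  # fixed order for the indicator vector


def sentence_to_vector(sentence):
    """Return a 0/1 vector marking which regions a sentence mentions."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return [1 if words & REGIONS[name] else 0 for name in REGION_ORDER]


def transcript_to_sequence(transcript):
    """Split a transcript into sentences and tag each one with its regions."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", transcript) if s.strip()]
    return [sentence_to_vector(s) for s in sentences]


example = (
    "The boy is taking cookies from the jar. "
    "The mother is washing dishes and the water is overflowing. "
    "The girl laughs."
)
sequence = transcript_to_sequence(example)
```

Each row of `sequence` corresponds to one sentence, so the order of rows captures how the speaker's focus moves across the image over time. Simple keyword matching is a stand-in here; the paper may identify regions differently.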

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
