Alzheimer’s Research & Therapy (Nov 2023)

Identification of high likelihood of dementia in population-based surveys using unsupervised clustering: a longitudinal analysis

  • Amin Gharbi-Meliani,
  • François Husson,
  • Henri Vandendriessche,
  • Eleonore Bayen,
  • Kristine Yaffe,
  • Anne-Catherine Bachoud-Lévi,
  • Laurent Cleret de Langavant

DOI
https://doi.org/10.1186/s13195-023-01357-9
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 11

Abstract

Background: Dementia is defined as a cognitive decline that affects functional status. Longitudinal ageing surveys often lack a clinical diagnosis of dementia but do measure cognition and daily function over time. We used unsupervised machine learning and longitudinal data to identify transition to probable dementia.

Methods: Multiple Factor Analysis was applied to longitudinal function and cognitive data of 15,278 baseline participants (aged 50 years or older) from the Survey of Health, Ageing, and Retirement in Europe (SHARE) (waves 1, 2 and 4–7, between 2004 and 2017). Hierarchical Clustering on Principal Components discriminated three clusters at each wave. We estimated the prevalence of probable dementia ("Likely Dementia") by sex and age, and used multistate models to assess whether dementia risk factors increased the risk of being assigned probable dementia status. Next, we compared the "Likely Dementia" cluster with self-reported dementia status and replicated our findings in the English Longitudinal Study of Ageing (ELSA) cohort (waves 1–9, between 2002 and 2019, 7,840 participants at baseline).

Results: Our algorithm identified a higher number of probable dementia cases than were self-reported and showed good discriminative power across all waves (AUC ranged from 0.754 [0.722–0.787] to 0.830 [0.800–0.861]). "Likely Dementia" status was more prevalent in older people, displayed a 2:1 female/male ratio, and was associated with nine factors that increased risk of transition to dementia: low education, hearing loss, hypertension, drinking, smoking, depression, social isolation, physical inactivity, diabetes, and obesity. Results were replicated in the ELSA cohort with good accuracy.

Conclusions: Machine learning clustering can be used to study dementia determinants and outcomes in longitudinal population ageing surveys in which a clinical diagnosis of dementia is lacking.
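The AUC values reported in the Results summarise how well "Likely Dementia" cluster membership agrees with self-reported dementia status. The abstract does not say how the AUC was computed, so the following is only a minimal sketch of one standard definition (the Mann-Whitney formulation), not the authors' procedure. The function name `auc_from_scores` and the toy arrays are hypothetical and are not data from SHARE or ELSA.

```python
def auc_from_scores(labels, scores):
    """Mann-Whitney AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case,
    counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up illustration only: 1 = self-reported dementia (reference label),
# 1 = assigned to the "Likely Dementia" cluster (binary score).
self_reported = [1, 1, 0, 0, 0, 1, 0, 0]
likely_dementia = [1, 0, 0, 0, 1, 1, 0, 0]

auc = auc_from_scores(self_reported, likely_dementia)
print(round(auc, 3))
```

With a binary score such as cluster membership, this quantity equals the mean of sensitivity and specificity, which is why a single cluster assignment can still be summarised by an AUC.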

Keywords