Alzheimer’s Research & Therapy (Aug 2022)

An explainable self-attention deep neural network for detecting mild cognitive impairment using multi-input digital drawing tasks

  • Natthanan Ruengchaijatuporn,
  • Itthi Chatnuntawech,
  • Surat Teerapittayanon,
  • Sira Sriswasdi,
  • Sirawaj Itthipuripat,
  • Solaphat Hemrungrojn,
  • Prodpran Bunyabukkana,
  • Aisawan Petchlorlian,
  • Sedthapong Chunamchai,
  • Thiparat Chotibut,
  • Chaipat Chunharas

DOI
https://doi.org/10.1186/s13195-022-01043-2
Journal volume & issue
Vol. 14, no. 1
pp. 1–11

Abstract

Background

Mild cognitive impairment (MCI) is an early stage of cognitive decline that can develop into dementia. Early detection of MCI is a crucial step toward timely prevention and intervention. Recent studies have developed deep learning models to detect MCI and dementia using bedside tasks such as the classic clock drawing test (CDT). However, predicting the early stage of the disease from CDT data alone remains a challenge. Moreover, state-of-the-art deep learning techniques remain black boxes, which makes their deployment in clinical settings questionable.

Methods

We recruited 918 subjects from King Chulalongkorn Memorial Hospital (651 healthy subjects and 267 MCI patients). We propose a novel deep learning framework that incorporates data from the CDT, cube-copying, and trail-making tests. Soft labels and self-attention were applied to improve model performance and provide a visual explanation. The interpretability of our model's visualizations and of the Grad-CAM approach was rated by experienced medical personnel and quantitatively evaluated using the intersection over union (IoU) between the models' heat maps and the regions of interest.

Results

Rather than using a single CDT image, as in the baseline VGG16 model, feeding multiple drawing tasks into our proposed model with soft labels significantly improves classification performance between healthy aging controls and MCI patients. In particular, classification accuracy increases from 0.75 (baseline model) to 0.81, the F1-score increases from 0.36 to 0.65, and the area under the receiver operating characteristic curve (AUC) increases from 0.74 to 0.84. Compared with the multi-input model that also offers interpretable visualization, i.e., Grad-CAM, our model receives higher interpretability scores from experienced medical experts and higher IoUs.

Conclusions

Our model achieves better classification performance at detecting MCI than the baseline model. In addition, it provides visual explanations that are superior to those of the baseline model, as quantitatively evaluated by experienced medical personnel. Our work thus offers an interpretable machine learning model with high classification performance, both of which are crucial aspects of artificial intelligence in medical diagnosis.
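As a rough illustration of the multi-input, self-attention design described in the Methods, the PyTorch sketch below encodes each of the three drawing-task images (clock, cube-copying, trail-making) with a shared convolutional encoder, fuses the per-task embeddings with self-attention, and trains against soft labels. This is a minimal sketch under stated assumptions, not the authors' architecture: the class `MultiInputAttentionClassifier`, the encoder layout, `soft_cross_entropy`, and all hyperparameters are illustrative choices introduced here.

```python
import torch
import torch.nn as nn

class MultiInputAttentionClassifier(nn.Module):
    """Hypothetical sketch: three task images -> shared CNN encoder ->
    self-attention over the three task embeddings -> 2-class logits."""

    def __init__(self, embed_dim=128):
        super().__init__()
        # Small shared convolutional encoder applied to each task image
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Self-attention across the three per-task embeddings
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(embed_dim, 2)

    def forward(self, clock, cube, trail):
        # One embedding per task -> (batch, 3, embed_dim)
        tokens = torch.stack(
            [self.encoder(x) for x in (clock, cube, trail)], dim=1
        )
        attended, weights = self.attn(tokens, tokens, tokens)
        logits = self.head(attended.mean(dim=1))
        return logits, weights  # attention weights can be inspected for explanation

def soft_cross_entropy(logits, soft_targets):
    """Cross-entropy against smoothed (soft) label distributions."""
    log_probs = torch.log_softmax(logits, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

# Toy usage with random images and smoothed "healthy" labels
model = MultiInputAttentionClassifier()
clock = torch.randn(4, 1, 128, 128)
cube = torch.randn(4, 1, 128, 128)
trail = torch.randn(4, 1, 128, 128)
soft_targets = torch.tensor([[0.9, 0.1]] * 4)
logits, attn_weights = model(clock, cube, trail)
loss = soft_cross_entropy(logits, soft_targets)
```

The returned attention weights give a per-task relevance map, which is one plausible way a self-attention model can expose a visual explanation alongside its prediction.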
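The Methods also quantify explanation quality with the intersection over union (IoU) between a model's heat map and an expert-annotated region of interest. Below is a minimal NumPy sketch of that metric, assuming the heat map is binarized at a fixed threshold; the threshold, shapes, and function name are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def heatmap_iou(heatmap, roi_mask, threshold=0.5):
    """IoU between a binarized heat map and an expert ROI mask.

    heatmap: float array scaled to [0, 1]; roi_mask: boolean array, same shape.
    """
    pred = heatmap >= threshold                       # binarize the heat map
    intersection = np.logical_and(pred, roi_mask).sum()
    union = np.logical_or(pred, roi_mask).sum()
    return intersection / union if union > 0 else 0.0

# Toy example: heat map that partially overlaps the annotated region
heatmap = np.zeros((8, 8)); heatmap[2:6, 2:6] = 0.9
roi = np.zeros((8, 8), dtype=bool); roi[4:8, 4:8] = True
print(heatmap_iou(heatmap, roi))  # 4 / 28 ~= 0.143
```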