NeuroImage (Feb 2020)
A distributed multitask multimodal approach for the prediction of Alzheimer’s disease in a longitudinal study
Abstract
Predicting the progression of Alzheimer’s Disease (AD) has been held back for decades due to the lack of sufficient longitudinal data required for the development of novel machine learning algorithms. This study proposes a novel machine learning algorithm for predicting the progression of Alzheimer’s disease using a distributed multimodal, multitask learning method. More specifically, each individual task is defined as a regression model, which predicts cognitive scores at a single time point. Since the prediction tasks for multiple intervals are related to each other in chronological order, multitask regression models have been developed to track the relationship between subsequent tasks. Furthermore, since subjects have various combinations of recording modalities together with other genetic, neuropsychological and demographic risk factors, special attention is given to the fact that each modality may experience a specific sparsity pattern. The model is hence generalized by exploiting multiple individual multitask regression coefficient matrices for each modality. The outcome for each independent modality-specific learner is then integrated with complementary information, known as risk factor parameters, revealing the most prevalent trends of the multimodal data. This new feature space is then used as input to the gradient boosting kernel in search for a more accurate prediction. This proposed model not only captures the complex relationships between the different feature representations, but it also ignores any unrelated information which might skew the regression coefficients. Comparative assessments are made between the performance of the proposed method with several other well-established methods using different multimodal platforms. The results indicate that by capturing the interrelatedness between the different modalities and extracting only relevant information in the data, even in an incomplete longitudinal dataset, will yield minimized prediction errors.