IEEE Access (Jan 2020)

Prediction Model of Dementia Risk Based on XGBoost Using Derived Variable Extraction and Hyper Parameter Optimization

  • Seong-Eun Ryu,
  • Dong-Hoon Shin,
  • Kyungyong Chung

DOI
https://doi.org/10.1109/ACCESS.2020.3025553
Journal volume & issue
Vol. 8
pp. 177708 – 177720

Abstract

Advances in healthcare technology have increased the elderly population, and population ageing has emerged as a social issue. Ageing brings a rise in patients with geriatric disorders, among which dementia severely impairs the elderly's activities of daily living. Earlier studies on dementia risk prediction proposed a deep learning method, but it requires a large amount of image data and a long training time. This study therefore proposes a dementia risk prediction model based on XGBoost that combines derived variable extraction from numerical dementia data with hyper-parameter optimization. The proposed method first uses gradient boosting to extract variable importance from the typical independent variables and then generates derived variables. Variable importance analysis is applied to the generated derived variables to create Top-N groups. Hyper-parameter tuning is then conducted for each Top-N group to achieve the best performance for that group's data characteristics, and the performance of the XGBoost model is evaluated on the optimized groups. In addition, to evaluate the proposed model, goodness-of-fit for machine learning classification models is assessed. In the Top-N group evaluation with different numbers of derived variables, the Top-20 model performed best, with optimized hyper-parameter values of eta = 0.10, gamma = 0, max_depth = 4, and min_child_weight = 1. The proposed XGBoost model achieved an accuracy of 85.61% and an F1-score of 79.28%, the best performance among the compared Decision Tree, Random Forest, SVM, and k-NN models.
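The Top-N grouping and hyper-parameter tuning steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the search was carried out, so an exhaustive grid search is assumed here. The function names, the variable names, the importance scores, and the candidate values in the grid are all hypothetical. The scoring function is left as a parameter; in practice it would train an XGBoost model on the chosen Top-N variables and return a validation metric such as accuracy or F1-score.

```python
from itertools import product


def top_n_features(importance, n):
    """Return the names of the n variables with the highest importance."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def param_grid(space):
    """Yield every combination of candidate hyper-parameter values as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(space, score_fn):
    """Return (best_params, best_score) over the full grid.

    score_fn is supplied by the caller (hypothetical here); it should map a
    parameter dict to a validation score where higher is better.
    """
    best_params, best_score = None, float("-inf")
    for params in param_grid(space):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical importance scores for original and derived variables.
importance = {"var_a": 0.30, "var_b": 0.10, "derived_ab": 0.25, "var_c": 0.05}
top2 = top_n_features(importance, 2)

# Hypothetical search space; only the parameter names come from the abstract.
space = {
    "eta": [0.05, 0.10, 0.30],
    "gamma": [0, 1],
    "max_depth": [4, 6],
    "min_child_weight": [1, 3],
}
candidates = list(param_grid(space))
```

Each Top-N group would get its own call to `grid_search`, so that the selected hyper-parameters reflect that group's variables rather than a single global setting.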

Keywords