PLoS ONE (Jan 2022)
Predicting age at onset of type 1 diabetes in children using regression, artificial neural network and Random Forest: A case study in Saudi Arabia
Abstract
The rising incidence of type 1 diabetes (T1D) among children is an increasing concern globally. A reliable estimate of the age at onset of T1D in children would facilitate intervention plans for medical practitioners to reduce the problems with delayed diagnosis of T1D. This paper has utilised Multiple Linear Regression (MLR), Artificial Neural Network (ANN) and Random Forest (RF) to model and predict the age at onset of T1D in children in Saudi Arabia (S.A.) which is ranked as the 7th for the highest number of T1D and 5th in the world for the incidence rate of T1D. De-identified data between (2010-2020) from three cities in S.A. were used to model and predict the age at onset of T1D. The best subset model selection criteria, coefficient of determination, and diagnostic tests were deployed to select the most significant variables. The efficacy of models for predicting the age at onset was assessed using multi-prediction accuracy measures. The average age at onset of T1D is 6.2 years and the most common age group for onset is (5-9) years. Most of the children in the sample (68%) are from urban areas of S.A., 75% were delivered after a full term pregnancy length and 31% were delivered through a cesarean section. The models of best fit were the MLR and RF models with R2 = (0.85 and 0.95), the root mean square error = (0.25 and 0.15) and mean absolute error = (0.19 and 0.11) respectively for logarithm of age at onset. This study for the first time has utilised MLR, ANN and RF models to predict the age at onset of T1D in children in S.A. These models can effectively aid health care providers to monitor and create intervention strategies to reduce the impact of T1D in children in S.A.