Medicina (Oct 2022)

A Machine Learning Model to Predict Length of Stay and Mortality among Diabetes and Hypertension Inpatients

  • Diana Barsasella,
  • Karamo Bah,
  • Pratik Mishra,
  • Mohy Uddin,
  • Eshita Dhar,
  • Dewi Lena Suryani,
  • Dedi Setiadi,
  • Imas Masturoh,
  • Ida Sugiarti,
  • Jitendra Jonnagaddala,
  • Shabbir Syed-Abdul

DOI: https://doi.org/10.3390/medicina58111568
Journal volume & issue: Vol. 58, no. 11, p. 1568

Abstract

Background and Objectives: Taiwan is among the nations with the highest rates of Type 2 Diabetes Mellitus (T2DM) and Hypertension (HTN). As more cases are reported each year, hospital admissions are rising. This burdens hospitals and affects their overall management and administration. Hence, this study aimed to develop a machine learning (ML) model to predict the Length of Stay (LoS) and mortality among T2DM and HTN inpatients. Materials and Methods: Using Taiwan’s National Health Insurance Research Database (NHIRD), this cohort study consisted of 58,618 patients, where 25,868 had T2DM, 32,750 had HTN, and 6,419 had both T2DM and HTN. We analyzed the data with different ML models for the prediction of LoS and mortality. The models were evaluated using descriptive statistical graphs, feature importance, precision-recall curves, accuracy plots, and the area under the ROC curve (AUC). The data were split into training and testing sets at a ratio of 8:2 before applying the ML algorithms. Results: XGBoost showed the best performance in predicting LoS (R² 0.633; RMSE 0.386; MAE 0.123), and Random Forest (RF) resulted in a slightly lower performance (R² 0.591; RMSE 0.401; MAE 0.027). Logistic Regression (LoR) performed the best in predicting mortality (CV Score 0.9779; Test Score 0.9728; Precision 0.9432; Recall 0.9786; AUC 0.97; and area under the precision-recall curve (AUPR) 0.93), closely followed by Ridge Classifier (CV Score 0.9736; Test Score 0.9692; Precision 0.9312; Recall 0.9463; AUC 0.94; and AUPR 0.89). Conclusions: We developed a robust prediction model for LoS and mortality of T2DM and HTN inpatients. XGBoost showed the best performance for LoS, and Logistic Regression performed the best in predicting mortality. The results showed that ML algorithms can not only help healthcare professionals in data-driven decision-making but can also facilitate early intervention and resource planning.
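The evaluation protocol described in the abstract (an 8:2 train/test split, regression metrics for LoS, and classification metrics for mortality) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes scikit-learn, uses `RandomForestRegressor` as a stand-in for the regression models (XGBoost is left out to avoid an extra dependency), approximates AUPR with average precision, and the function names `evaluate_regressor` and `evaluate_classifier` are hypothetical. Any real use would substitute the actual patient features and targets.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (average_precision_score, mean_absolute_error,
                             mean_squared_error, precision_score, r2_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split


def evaluate_regressor(X, y, seed=0):
    """Fit a regressor on an 8:2 split and report R^2, RMSE, and MAE."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    # Assumption: Random Forest stands in for the paper's regression models.
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    return {
        "r2": r2_score(y_te, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_te, pred))),
        "mae": mean_absolute_error(y_te, pred),
    }


def evaluate_classifier(X, y, seed=0):
    """Fit logistic regression on an 8:2 split and report the abstract's metrics."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    model = LogisticRegression(max_iter=1000)
    # Assumption: 5-fold cross-validated accuracy on the training portion.
    cv_score = cross_val_score(model, X_tr, y_tr, cv=5).mean()
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    prob = model.predict_proba(X_te)[:, 1]
    return {
        "cv_score": float(cv_score),
        "test_score": model.score(X_te, y_te),
        "precision": precision_score(y_te, pred),
        "recall": recall_score(y_te, pred),
        "auc": roc_auc_score(y_te, prob),
        # Average precision is used here as a summary of the PR curve.
        "aupr": average_precision_score(y_te, prob),
    }


if __name__ == "__main__":
    # Synthetic data only; it has no relation to the NHIRD cohort.
    X_r, y_r = make_regression(n_samples=300, n_features=8, noise=5.0,
                               random_state=0)
    print(evaluate_regressor(X_r, y_r))
    X_c, y_c = make_classification(n_samples=400, n_features=8,
                                   random_state=0)
    print(evaluate_classifier(X_c, y_c))
```

Keeping the split inside each helper makes it harder to leak test rows into training by accident, and returning the metrics as a dictionary makes it straightforward to compare several models side by side.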

Keywords