JTAM (Jurnal Teori dan Aplikasi Matematika) (Oct 2024)

Comparative Analysis of Decision Tree and Random Forest Algorithms for Diabetes Prediction

  • Aufar Faiq Fadhlullah,
  • Triyanna Widiyaningtyas

DOI
https://doi.org/10.31764/jtam.v8i4.24388
Journal volume & issue
Vol. 8, no. 4
pp. 1121 – 1132

Abstract


Diabetes mellitus is a chronic medical disorder marked by high blood glucose levels that raise the risk of early mortality and organ failure. It is a growing global health problem, so accurate and timely diagnosis is urgently needed. This experimental study applies data mining prediction techniques to diagnose diabetes mellitus. The prediction process consists of four stages: dataset collection, data pre-processing, data processing, and evaluation. The data were obtained from Electronic Health Records (EHRs), specifically the public "Diabetes Prediction Dataset". Pre-processing involves data filtering, attribute conversion, and class selection. Data processing uses Random Forest and Decision Tree models for diabetes prediction. The models were evaluated using accuracy, precision, and recall. The Random Forest algorithm achieved an accuracy of 93.97%, a precision of 99.88%, and a recall of 66.56%, with a computation time of 16 s. The Decision Tree algorithm achieved an accuracy of 93.89%, a precision of 98.73%, and a recall of 66.88%, with a computation time of less than 1 s. Because the two algorithms do not differ significantly in accuracy, precision, and recall, the study concludes that the Decision Tree algorithm is the more effective choice: it delivers comparable predictive performance in far less computation time, which matters in diabetes detection because a person's life may depend on it.
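
The three evaluation metrics named in the abstract can be computed from the four cells of a binary confusion matrix. The following is a minimal sketch in Python, not the authors' implementation. The `ConfusionCounts` type and the counts in `example` are hypothetical and chosen only for illustration; they are not taken from the study. They show how a classifier can pair high precision with much lower recall when the positive class is a minority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # diabetic cases predicted as diabetic
    fp: int  # non-diabetic cases predicted as diabetic
    tn: int  # non-diabetic cases predicted as non-diabetic
    fn: int  # diabetic cases predicted as non-diabetic


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of true positives that the model actually finds."""
    actual_positive = c.tp + c.fn
    return c.tp / actual_positive if actual_positive else 0.0


# Illustrative counts only (assumption, not data from the paper):
# 90 positives out of 1000 records, with few false alarms but many misses.
example = ConfusionCounts(tp=60, fp=1, tn=909, fn=30)

if __name__ == "__main__":
    print(f"accuracy:  {accuracy(example):.4f}")
    print(f"precision: {precision(example):.4f}")
    print(f"recall:    {recall(example):.4f}")
```

With counts like these, accuracy stays high because the majority class dominates, while recall exposes the missed positive cases. That is why the abstract reports all three metrics rather than accuracy alone.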

Keywords