Informatics (Sep 2024)

Differential Classification of Dengue, Zika, and Chikungunya Using Machine Learning—Random Forest and Decision Tree Techniques

  • Wilson Arrubla-Hoyos
  • Jorge Gómez Gómez
  • Emiro De-La-Hoz-Franco

DOI
https://doi.org/10.3390/informatics11030069
Journal volume & issue
Vol. 11, no. 3
p. 69

Abstract

Dengue, Zika, and chikungunya viruses pose a serious global threat and circulate widely in the Americas. These diseases share similar symptoms in their early stages, which can make early diagnosis difficult. In this study, two predictive models based on Decision Trees and Random Forests were developed to classify dengue, Zika, and chikungunya, with the aim of providing decision support that is easily interpretable for the medical community. To achieve this, a dataset was collected from a clinic in Sincelejo, Colombia, including the signs, symptoms, and laboratory results of these diseases. The Pan American Health Organization (PAHO) Diagnostic Guide 2022 methodology for the differential classification of dengue and chikungunya was applied by assigning evaluative weights to the symptoms in the dataset. In addition, a bootstrap resampling technique based on the central limit theorem was used to balance the target variable, and cross-validation was used to train the models. The main results were obtained with the Random Forest technique, which achieved an accuracy of 99.7% for classifying chikungunya, 99.1% for dengue, and 98.8% for Zika. This study represents a significant advance in the differential prediction of these diseases through the use of machine learning techniques and the integration of clinical and laboratory information.
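
The abstract does not give implementation details, so the following is only a minimal sketch of what balancing a target variable by bootstrap resampling could look like. The function name `bootstrap_balance`, the toy symptom flags, and the choice of the largest class size as the resampling target are all assumptions made for illustration; they are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def bootstrap_balance(rows, labels, n_per_class=None, seed=0):
    """Resample every class with replacement so all classes end up the same size.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Assumption: default target size is the size of the largest class.
    target = n_per_class or max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label in sorted(by_class):
        # random.choices draws with replacement, i.e. a bootstrap sample.
        sample = rng.choices(by_class[label], k=target)
        out_rows.extend(sample)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Made-up toy records: (fever, rash, joint_pain) flags, not real clinical data.
rows = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1), (0, 1, 0)]
labels = ["dengue", "dengue", "dengue", "zika", "chikungunya", "zika"]

balanced_rows, balanced_labels = bootstrap_balance(rows, labels)
print(Counter(balanced_labels))
```

After resampling, each of the three classes contributes the same number of records, which is the property the balancing step is meant to guarantee before a Decision Tree or Random Forest is trained.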

Keywords