PLoS Neglected Tropical Diseases (Apr 2022)

Two-year death prediction models among patients with Chagas Disease using machine learning-based methods.

  • Ariela Mota Ferreira,
  • Laércio Ives Santos,
  • Ester Cerdeira Sabino,
  • Antonio Luiz Pinho Ribeiro,
  • Léa Campos de Oliveira-da Silva,
  • Renata Fiúza Damasceno,
  • Marcos Flávio Silveira Vasconcelos D'Angelo,
  • Maria do Carmo Pereira Nunes,
  • Desirée Sant Ana Haikal

DOI
https://doi.org/10.1371/journal.pntd.0010356
Journal volume & issue
Vol. 16, no. 4
p. e0010356

Abstract

Chagas disease (CD) is recognized by the World Health Organization as one of the thirteen most neglected tropical diseases. More than 80% of people affected by CD will not have access to diagnosis and continued treatment, which partly explains the high morbidity and mortality rates. Machine learning (ML) can identify patterns in data that can be used to increase our understanding of a specific problem or to make predictions about the future. The aim of this study was therefore to evaluate different ML models for predicting death within two years among patients with CD. ML models were developed using different techniques and configurations. The techniques used were Random Forests, Adaptive Boosting, Decision Tree, Support Vector Machine, and Artificial Neural Networks. The configurations considered only interview variables, only complementary exam variables, and both combined. Data from SaMi-Trop, a cohort study of patients with CD, were analyzed. The predictor variables came from the baseline, and the outcome, death, came from the first follow-up. All models were evaluated in terms of sensitivity, specificity, and G-mean. Among the 1694 individuals with CD considered, 134 (7.9%) died within two years of follow-up. Using only the predictor variables from the interview, the different techniques achieved a maximum G-mean of 0.64 in predicting death. Using only the variables from complementary exams, the G-mean reached up to 0.77. In this configuration, NT-proBNP stood out: an ML model using this single variable alone reached a G-mean of 0.76. The configuration that combined interview variables and complementary exams achieved a G-mean of 0.75. ML has the potential to contribute to the management of patients with CD by identifying those with the highest probability of death. Trial Registration: This trial is registered with ClinicalTrials.gov, Trial ID: NCT02646943.
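
The abstract reports G-mean without defining it. A minimal sketch follows, assuming the usual definition (the geometric mean of sensitivity and specificity), which the abstract itself does not spell out. The function names and the second set of confusion-matrix counts are hypothetical and are not taken from the study; only the class totals (134 deaths, 1560 survivors) come from the abstract. The sketch illustrates why G-mean is informative when only 7.9% of patients belong to the positive class: a model that predicts survival for everyone has high accuracy but a G-mean of zero.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual deaths that the model flags (true positive rate)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual survivors that the model clears (true negative rate)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Fraction of all patients classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity(tp, fn) * specificity(tn, fp))


# Class totals from the abstract: 134 deaths among 1694 patients.
deaths, survivors = 134, 1694 - 134

# Trivial model that predicts "survives" for every patient.
print(f"all-negative accuracy: {accuracy(0, deaths, survivors, 0):.3f}")
print(f"all-negative G-mean:   {g_mean(0, deaths, survivors, 0):.3f}")

# HYPOTHETICAL counts for illustration only; not results from the study.
tp, fn, tn, fp = 100, 34, 1170, 390
print(f"hypothetical G-mean:   {g_mean(tp, fn, tn, fp):.3f}")
```

Because the two rates are multiplied, a model cannot score well on G-mean by favoring the majority class, which is one plausible reason for preferring it over accuracy in a cohort this imbalanced.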