PLOS Global Public Health (Jan 2022)
Machine learning with routine electronic medical record data to identify people at high risk of disengagement from HIV care in Tanzania.
Abstract
Machine learning methods for health care delivery optimization have the potential to improve retention in HIV care, a critical target of global efforts to end the epidemic. However, these methods have not been widely applied to medical record data in low- and middle-income countries. We used an ensemble decision tree approach to predict risk of disengagement from HIV care (missing an appointment by ≥28 days) in Tanzania. Our approach used routine electronic medical records (EMR) from the time of antiretroviral therapy (ART) initiation through 24 months of follow-up for 178 adults (63% female). We compared prediction accuracy when using EMR-based predictors alone and in combination with sociodemographic survey data collected by a research study. Models that included only EMR-based indicators and incorporated changes across past clinical visits achieved a mean accuracy of 75.2% for predicting risk of disengagement in the next 6 months, with a mean sensitivity of 54.7% for targeting the 30% highest-risk individuals. Additionally including survey-based predictors only modestly improved model performance. The most important variables for prediction were time-varying EMR indicators including changes in treatment status, body weight, and WHO clinical stage. Machine learning methods applied to existing EMR data in resource-constrained settings can predict individuals' future risk of disengagement from HIV care, potentially enabling better targeting and efficiency of interventions to promote retention in care.
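The two quantities the abstract relies on, the disengagement outcome (an appointment missed by ≥28 days) and the sensitivity obtained when targeting the 30% highest-risk individuals, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the rounding of the targeted group size, the handling of tied scores, and the toy data are all assumptions made for illustration.

```python
DISENGAGEMENT_THRESHOLD_DAYS = 28  # from the abstract: appointment missed by >= 28 days


def is_disengaged(days_late):
    """Return True if an appointment was missed by at least the threshold (hypothetical helper)."""
    return days_late >= DISENGAGEMENT_THRESHOLD_DAYS


def sensitivity_at_top_fraction(risk_scores, disengaged, fraction=0.30):
    """Share of truly disengaged people captured when targeting the highest-risk `fraction`.

    risk_scores: predicted risk per person (higher means more at risk).
    disengaged:  observed outcome per person (1 or True if disengaged, else 0 or False).
    """
    if len(risk_scores) != len(disengaged):
        raise ValueError("risk_scores and disengaged must have the same length")
    total_positive = sum(1 for d in disengaged if d)
    if total_positive == 0:
        raise ValueError("sensitivity is undefined when nobody disengaged")

    # Assumption: the targeted group size is rounded to the nearest whole person,
    # with at least one person targeted.
    n_targeted = max(1, round(fraction * len(risk_scores)))

    # Rank people by descending risk. sorted() is stable, so tied scores keep
    # their input order (an arbitrary tie-breaking choice made for this sketch).
    ranked = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i], reverse=True)
    targeted = ranked[:n_targeted]

    captured = sum(1 for i in targeted if disengaged[i])
    return captured / total_positive


# Toy data, invented for illustration only (not from the study).
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_days_late = [35, 0, 60, 3, 28, 0, 14, 90, 0, 7]
toy_outcomes = [is_disengaged(d) for d in toy_days_late]
toy_sensitivity = sensitivity_at_top_fraction(toy_scores, toy_outcomes, fraction=0.30)
```

In this toy example, three of the ten people are targeted and two of the four who disengaged fall in that group, so the sensitivity is 0.5. The reported 54.7% has the same interpretation: roughly half of those who went on to disengage were among the 30% ranked highest-risk by the model.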