BMJ Open (Sep 2024)
Prevention of adverse HIV treatment outcomes: machine learning to enable proactive support of people at risk of HIV care disengagement in Tanzania
Abstract
Objectives This study aimed to develop a machine learning (ML) model to predict disengagement from HIV care, high viral load or death among people living with HIV (PLHIV) with the goal of enabling proactive support interventions in Tanzania. The algorithm addressed common challenges when applying ML to electronic medical record (EMR) data: (1) imbalanced outcome distribution; (2) heterogeneity across multisite EMR data and (3) evolving virological suppression thresholds.
Design Observational study using a national EMR database.
Setting Conducted in two regions in Tanzania, using data from the National HIV Care database.
Participants The study included over 6 million HIV care visit records from 295 961 PLHIV in two regions in Tanzania’s National HIV Care database from January 2015 to May 2023.
Results Our ML model effectively identified PLHIV at increased risk of adverse outcomes. Key predictors included past disengagement from care, antiretroviral therapy (ART) status (which tracks a patient’s engagement with ART across visits), age and time on ART. The downsampling approach we implemented effectively managed imbalanced data to reduce prediction bias. Site-specific algorithms performed better compared with a universal approach, highlighting the importance of tailoring ML models to local contexts. A sensitivity analysis confirmed the model’s robustness to changes in viral load suppression thresholds.
Conclusions ML models leveraging large-scale databases of patient data offer significant potential to identify PLHIV for interventions to enhance engagement in HIV care in resource-limited settings. Tailoring algorithms to local contexts and flexibility towards evolving clinical guidelines are essential for maximising their impact.
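The abstract reports that downsampling was used to manage the imbalanced outcome distribution but does not describe the procedure. As a minimal sketch only, the following Python example shows one common form of the idea: randomly undersampling the majority class until it matches the minority class. The function name `downsample_majority`, the field `adverse_outcome`, the 1:1 target ratio and the synthetic visit records are all assumptions made for illustration and are not taken from the study.

```python
import random
from collections import Counter


def downsample_majority(rows, label_key="adverse_outcome", ratio=1.0, seed=0):
    """Randomly undersample the majority class.

    Keeps every minority-class row and at most `ratio` times as many
    majority-class rows. Illustrative only; not the study's procedure.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]

    # The shorter list is the minority class, the longer one the majority.
    minority, majority = sorted([positives, negatives], key=len)

    # Number of majority rows to keep, capped at what is available.
    keep = min(len(majority), int(round(ratio * len(minority))))
    sampled = rng.sample(majority, keep)

    balanced = minority + sampled
    rng.shuffle(balanced)  # avoid ordering the rows by class
    return balanced


# Synthetic, hypothetical visit records: 10% have an adverse outcome.
visits = [
    {"visit_id": i, "adverse_outcome": 1 if i % 10 == 0 else 0}
    for i in range(1000)
]

balanced = downsample_majority(visits)
counts = Counter(r["adverse_outcome"] for r in balanced)
print(counts[0], counts[1])
```

In practice, a step like this would typically be applied to the training split only, so that evaluation data keep the real-world outcome prevalence.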