EBioMedicine (Mar 2025)
Machine learning detection of heteroresistance in Escherichia coli
Abstract
Summary:
Background: Heteroresistance (HR) is a significant type of antibiotic resistance, observed in several bacterial species and for several antibiotic classes, in which a susceptible main population contains small subpopulations of resistant cells. Mathematical models, animal experiments and clinical studies associate HR with treatment failure. Currently used susceptibility tests do not detect HR reliably, so heteroresistant isolates can be misclassified as susceptible, which might lead to treatment failure. Here we examined whether whole-genome sequence (WGS) data and machine learning (ML) can be used to detect bacterial HR.
Methods: We classified 467 Escherichia coli clinical isolates as HR or non-HR to the commonly used β-lactam/inhibitor combination piperacillin-tazobactam using pre-screening and Population Analysis Profiling tests. We sequenced the isolates, assembled the whole genomes and created a set of predictors based on current knowledge of HR mechanisms. We then trained several ML models on 80% of this data set, aiming to detect HR isolates. We compared the performance of the best ML models on the remaining 20% of the data set with that of a baseline model based solely on the presence of β-lactamase genes. Furthermore, we sequenced the resistant subpopulations in order to analyse the genetic mechanisms underlying HR.
Findings: The best ML model achieved 100% sensitivity and 84.6% specificity, outperforming the baseline model. The strongest predictors of HR were the total number of β-lactamase genes, β-lactamase gene variants and the presence of IS elements flanking them. Genetic analysis of HR strains confirmed that HR is caused by an increased copy number of resistance genes via gene amplification or plasmid copy number increase. This aligns with the ML model's findings, reinforcing the hypothesis that this mechanism underlies HR in Gram-negative bacteria.
Interpretation: We demonstrate that a combination of WGS and ML can identify HR in bacteria with perfect sensitivity and high specificity. This improved detection would allow for better-informed treatment decisions and could reduce the occurrence of treatment failures associated with HR.
Funding: Funding was provided to DIA by the Swedish Research Council (2021-02091) and the NIH (1U19AI158080-01).
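Illustrative note: the sensitivity and specificity figures reported above are standard confusion-matrix ratios, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The minimal Python sketch below shows how a presence-only baseline of the kind described in the Methods might be scored on a held-out set. The `Isolate` record, the `baseline_predict` rule and the eight toy isolates are hypothetical and chosen only for illustration; they are not the study's data, predictors or implementation.

```python
from dataclasses import dataclass


@dataclass
class Isolate:
    """Hypothetical record for one held-out isolate (not the study's schema)."""
    n_beta_lactamase_genes: int  # assumed predictor: β-lactamase gene count
    is_hr: bool                  # reference label, e.g. from a PAP test


def baseline_predict(isolate: Isolate) -> bool:
    """Assumed presence-only baseline: call HR if any β-lactamase gene is found."""
    return isolate.n_beta_lactamase_genes > 0


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) from paired boolean labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    tn = sum(not t and not p for t, p in zip(y_true, y_pred))
    fp = sum(not t and p for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy held-out set: invented values for illustration only.
test_set = [
    Isolate(2, True), Isolate(1, True), Isolate(3, True),
    Isolate(1, False), Isolate(1, False),
    Isolate(0, False), Isolate(0, False), Isolate(0, False),
]

y_true = [iso.is_hr for iso in test_set]
y_pred = [baseline_predict(iso) for iso in test_set]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"baseline sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy set the presence-only rule catches every HR isolate but also flags the two susceptible isolates that carry a β-lactamase gene, which illustrates why a model using richer predictors such as gene copy number or flanking IS elements could improve specificity without losing sensitivity.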