Machine learning to improve the understanding of rabies epidemiology in low surveillance settings

Ravikiran Keshavamurthy; Cassandra Boutelle; Yoshinori Nakazawa; Haim Joseph; Dady W. Joseph; Pierre Dilius; Andrew D. Gibson; Ryan M. Wallace

doi:10.1038/s41598-024-76089-3

Scientific Reports (Oct 2024)

Affiliations

Ravikiran Keshavamurthy: Poxvirus and Rabies Branch, Division of High Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention
Cassandra Boutelle: Poxvirus and Rabies Branch, Division of High Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention
Yoshinori Nakazawa: Poxvirus and Rabies Branch, Division of High Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention
Haim Joseph: Ministère de l’Agriculture, des Ressources Naturelles et du Développement Rural
Dady W. Joseph: Ministère de l’Agriculture, des Ressources Naturelles et du Développement Rural
Pierre Dilius: Ministère de l’Agriculture, des Ressources Naturelles et du Développement Rural
Andrew D. Gibson: Mission Rabies
Ryan M. Wallace: Poxvirus and Rabies Branch, Division of High Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention

DOI: https://doi.org/10.1038/s41598-024-76089-3
Journal volume & issue: Vol. 14, no. 1
pp. 1 – 10

Abstract

In low- and middle-income countries, a large proportion of animal rabies investigations end without a conclusive diagnosis, leading to epidemiologic interpretations informed by clinical rather than laboratory data. We compared Extreme Gradient Boosting (XGB) with Logistic Regression (LR) for their ability to estimate the probability of rabies in animals investigated as part of an Integrated Bite Case Management (IBCM) program. To balance our training data, we used Random Oversampling (ROS) and the Synthetic Minority Oversampling Technique (SMOTE). We developed a risk stratification framework based on predicted rabies probabilities. XGB performed better than LR at predicting rabies cases. Oversampling strategies enhanced model sensitivity, making them the preferred technique for predicting rare events such as rabies in a biting animal. XGB-ROS classified most of the confirmed rabies cases and only a small proportion of non-cases as either high risk (confirmed cases = 85.2%, non-cases = 0.01%) or moderate risk (confirmed cases = 8.4%, non-cases = 4.0%). Model-based risk stratification led to a 3.2-fold increase in epidemiologically useful data compared to a routine surveillance strategy using IBCM case definitions. Our study demonstrates the application of machine learning to strengthen zoonotic disease surveillance in resource-limited settings.
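The two generic steps named in the abstract, random oversampling of the minority class and stratification of predicted probabilities into risk tiers, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The function names, the toy data, the probability cut-offs (0.75 and 0.25), and the existence of a "low" tier are all assumptions made for illustration; the abstract does not report the thresholds used in the study, and no classifier is trained here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Random oversampling: duplicate randomly chosen minority-class rows
    (sampling with replacement) until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical cut-offs chosen for illustration only (not from the paper).
HIGH_CUTOFF = 0.75
MODERATE_CUTOFF = 0.25


def risk_tier(probability):
    """Map a predicted rabies probability to a risk tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability >= HIGH_CUTOFF:
        return "high"
    if probability >= MODERATE_CUTOFF:
        return "moderate"
    return "low"  # assumed name for the remaining tier


# Toy, made-up data: five non-cases (label 0) and one case (label 1).
rows = [[0, 1], [1, 0], [0, 0], [1, 1], [0, 1], [1, 1]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels, seed=42)
class_counts = Counter(balanced_labels)

# Made-up predicted probabilities standing in for model output.
tiers = [risk_tier(p) for p in (0.91, 0.40, 0.03)]
```

In practice, oversampling would be applied only to the training split so that duplicated rows do not leak into the evaluation data, and the tiers would be assigned from the probabilities of a fitted model such as XGB or LR.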

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
