MDM Policy & Practice (Dec 2023)
Informed Random Forest to Model Associations of Epidemiological Priors, Government Policies, and Public Mobility
Abstract
Background. Infectious diseases are a significant concern worldwide because of their increasing prevalence, associated health risks, and socioeconomic costs. Machine learning (ML) models and epidemic models formulated as deterministic differential equations are the dominant tools for analyzing and modeling the transmission of infectious diseases. However, ML models can be inconsistent in extracting the dynamics of a disease in the presence of data drift. Likewise, the capability of epidemic models is constrained by parameter dimensions and estimation. We aimed to create an informed ML framework that integrates a random forest (RF) with an adapted susceptible-infectious-recovered (SIR) model to improve accuracy and consistency in capturing the stochastic dynamics of coronavirus disease 2019 (COVID-19).
Methods. An adapted SIR model was used to inform a default RF in predicting new COVID-19 cases (NCCs) at given intervals. We validated the performance of the informed RF (IRF) using real data on Botswana’s pharmaceutical interventions (PIs) and nonpharmaceutical interventions (NPIs) adopted between February 2020 and August 2022. The discrepancy between predictions and observations was modeled using loss functions, which were minimized, interpreted, and used to assess the IRF.
Results. On the real data, the default RF was effective in modeling and predicting NCCs. Informing the RF with the effective reproductive rate raised its predictive power to 84%, compared with 75% for the default RF.
Conclusion. This research has the potential to inform policy and decision makers in developing systems to evaluate interventions for infectious diseases.
Highlights
This framework is initiated by incorporating outputs from an epidemic model into a machine learning model.
An informed random forest (RF) is instantiated to model government and public responses to the COVID-19 pandemic.
This framework does not require data transformations, and the epidemic model is shown to boost the RF’s performance.
This is a baseline knowledge-informed learning framework for assessing public health interventions in Botswana.
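The informing step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a standard discrete-time SIR model rather than the adapted one, and the function names, parameter values (`beta`, `gamma`, population size), and policy indicator columns are all hypothetical. It shows how an effective reproductive rate derived from an epidemic model could be appended to the feature rows passed to an RF, and how a loss such as the root mean squared error compares predictions with observations.

```python
from math import sqrt


def simulate_sir(beta, gamma, s0, i0, r0, days):
    """Standard SIR model stepped forward one day at a time (forward Euler).

    Returns a list of (S, I, R) tuples of length days + 1.
    beta and gamma are assumed, illustrative rates, not fitted values.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    states = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        states.append((s, i, r))
    return states


def effective_reproduction(states, beta, gamma):
    """Effective reproductive rate R_t = (beta / gamma) * S_t / N per day."""
    n = sum(states[0])
    return [(beta / gamma) * s / n for s, _, _ in states]


def add_prior_feature(rows, prior):
    """Append the day-matched epidemiological prior to each feature row."""
    return [list(row) + [p] for row, p in zip(rows, prior)]


def rmse(predicted, observed):
    """Root mean squared error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(observed))


# Hypothetical usage: two binary policy indicators per day, informed by R_t.
states = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=30)
rt = effective_reproduction(states, beta=0.3, gamma=0.1)
policy_rows = [[1, 0] for _ in range(31)]  # made-up indicators, not real data
informed_rows = add_prior_feature(policy_rows, rt)
```

Under these assumptions, `informed_rows` could be passed to any RF regressor (for example, scikit-learn's `RandomForestRegressor`) in place of `policy_rows`, and `rmse` applied to the predictions of both models to compare the informed and default variants.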