BMJ Open (Sep 2023)
Who is most at risk of dying if infected with SARS-CoV-2? A mortality risk factor analysis using machine learning of patients with COVID-19 over time: a large population-based cohort study in Mexico
Abstract
Objective COVID-19 would kill fewer people if health programmes could predict who is at higher risk of mortality, because resources could then be targeted to protect those people from infection. We predict mortality in a very large population in Mexico with machine learning, using demographic variables and pre-existing conditions.
Design Cohort study.
Setting Mexico, nationwide, from March 2020 to November 2021.
Participants 1.4 million patients in Mexico with laboratory-confirmed COVID-19, aged 20 years or older.
Primary and secondary outcome measures Analysis is performed on data from March 2020 to November 2021 as a whole and over three phases: (1) March to October 2020, (2) November 2020 to March 2021 and (3) April to November 2021. We predict mortality using an ensemble machine learning method, super learner, and independently estimate the adjusted mortality relative risk of each pre-existing condition using targeted maximum likelihood estimation.
Results The super learner fit has high predictive performance (C-statistic: 0.907), with age the most predictive factor for mortality. After adjusting for demographic factors, renal disease, hypertension, diabetes and obesity are the most impactful pre-existing conditions. Phase analysis shows that the adjusted mortality risk decreased over time, while the relative risk increased for each pre-existing condition.
Conclusions While age is the most important predictor of mortality, younger individuals with hypertension, diabetes and obesity have a mortality risk comparable to that of individuals who are 20 years older without any of the three conditions. Our model can be continuously updated to identify the individuals who most need protection against infection as the pandemic evolves.
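To make the two reported quantities concrete, the following Python sketch computes a C-statistic and a crude relative risk on invented toy data. It is not the study's analysis: the study fits a super learner and estimates adjusted relative risks with targeted maximum likelihood estimation, whereas the relative risk below is unadjusted. The function names `c_statistic` and `relative_risk` and all numbers are hypothetical.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Share of (death, survivor) pairs in which the death has the higher score.

    Tied scores count as half-concordant. This equals the area under the
    ROC curve for a binary outcome.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


def relative_risk(deaths_exposed, n_exposed, deaths_unexposed, n_unexposed):
    """Crude (unadjusted) mortality risk ratio, exposed versus unexposed."""
    risk_exposed = deaths_exposed / n_exposed
    risk_unexposed = deaths_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Toy data invented for illustration only; not taken from the study.
outcomes = [1, 1, 0, 0, 0, 1]                # 1 = died, 0 = survived
predicted = [0.9, 0.4, 0.3, 0.5, 0.1, 0.8]   # hypothetical predicted risks
auc = c_statistic(outcomes, predicted)

# Hypothetical counts: 30 deaths per 100 with a condition, 10 per 100 without.
rr = relative_risk(30, 100, 10, 100)

print(f"C-statistic: {auc:.3f}")
print(f"Crude relative risk: {rr:.2f}")
```

A C-statistic of 0.5 corresponds to random ranking and 1.0 to perfect ranking of deaths above survivors. Because relative risk is a ratio of two risks, it can rise between phases even when both risks fall, provided the risk among those without the condition falls proportionally more.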