Frontiers in Public Health (Dec 2023)
Machine learning-driven development of a disease risk score for COVID-19 hospitalization and mortality: a Swedish and Norwegian register-based study
Abstract
Aims: To develop a disease risk score for COVID-19-related hospitalization and mortality in Sweden and externally validate it in Norway.
Method: We used linked data from the national health registries of Sweden and Norway. The study cohort comprised individuals in Sweden with SARS-CoV-2 infection confirmed by RT-PCR testing up to August 2022. Within this cohort, we identified hospitalized cases as those admitted to the hospital within 14 days of testing positive for SARS-CoV-2 and matched them with five controls from the same cohort who were not hospitalized due to SARS-CoV-2. Additionally, we identified individuals who died within 30 days after being hospitalized for COVID-19. To develop the disease risk scores, we considered demographics, infectious, somatic, and mental health conditions, recorded diagnoses, and pharmacological treatments. We also conducted age-specific analyses and assessed model performance through 5-fold cross-validation. Finally, we performed external validation using data from the Norwegian population with COVID-19 up to December 2021.
Results: During the study period, a total of 124,560 individuals in Sweden were hospitalized, and 15,877 individuals died within 30 days following COVID-19 hospitalization. Across the entire study period, the disease risk scores for hospitalization and mortality achieved ROC-AUC values of 0.70 and 0.72, respectively. These scores were positively correlated with the likelihood of hospitalization or death. In the external validation using data from the Norwegian COVID-19 population (53,744 individuals), the disease risk score predicted hospitalization with an AUC of 0.47 and death with an AUC of 0.74.
Conclusion: The disease risk score showed moderately good performance in predicting COVID-19-related mortality but performed poorly in predicting hospitalization when externally validated.
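To make the reported ROC-AUC values easier to interpret, the following is a minimal sketch of how an AUC can be computed from outcome labels and risk scores. It is not the authors' code: the function name `roc_auc` and the toy data are assumptions made for illustration only. The AUC is the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, so 0.5 corresponds to chance and a value such as 0.47 indicates ranking no better than chance.

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC (hypothetical helper, not from the study).

    Returns the fraction of (case, non-case) pairs in which the case
    has the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = outcome occurred, 0 = it did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a cross-validated setting, the same quantity would typically be computed on each held-out fold and then summarized; the pairwise loop here is quadratic and is meant for clarity rather than for register-scale data.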
Keywords