Using machine learning to predict COVID-19 infection and severity risk among 4510 aged adults: a UK Biobank cohort study

Auriel A. Willette; Sara A. Willette; Qian Wang; Colleen Pappas; Brandon S. Klinedinst; Scott Le; Brittany Larsen; Amy Pollpeter; Tianqi Li; Jonathan P. Mochel; Karin Allenspach; Nicole Brenner; Tim Waterboer

doi:10.1038/s41598-022-07307-z

Scientific Reports (May 2022)

Affiliations

Auriel A. Willette: Department of Food Science and Human Nutrition, Iowa State University
Sara A. Willette: Iowa COVID-19 Tracker Inc.
Qian Wang: Department of Food Science and Human Nutrition, Iowa State University
Colleen Pappas: Department of Food Science and Human Nutrition, Iowa State University
Brandon S. Klinedinst: Department of Food Science and Human Nutrition, Iowa State University
Scott Le: Department of Food Science and Human Nutrition, Iowa State University
Brittany Larsen: Department of Food Science and Human Nutrition, Iowa State University
Amy Pollpeter: Department of Food Science and Human Nutrition, Iowa State University
Tianqi Li: Department of Food Science and Human Nutrition, Iowa State University
Jonathan P. Mochel: Department of Biomedical Sciences, Iowa State University
Karin Allenspach: Department of Veterinary Clinical Sciences, Iowa State University
Nicole Brenner: Infections and Cancer Epidemiology Division, German Cancer Research Center (DKFZ)
Tim Waterboer: Infections and Cancer Epidemiology Division, German Cancer Research Center (DKFZ)

DOI: https://doi.org/10.1038/s41598-022-07307-z
Journal volume & issue: Vol. 12, no. 1, pp. 1–11

Abstract

Many risk factors have emerged for novel 2019 coronavirus disease (COVID-19). It is relatively unknown how these factors collectively predict COVID-19 infection risk, as well as the risk of a severe infection (i.e., hospitalization). Among aged adults (69.3 ± 8.6 years) in UK Biobank, COVID-19 data were downloaded for 4510 participants with 7539 test cases. We also downloaded baseline data collected 10 to 14 years earlier, including demographics, biochemistry, body mass, and other factors, as well as antibody titers for 20 common to rare infectious diseases in a subset of 80 participants with 124 test cases. Permutation-based linear discriminant analysis was used to predict COVID-19 risk and hospitalization risk. Probability and threshold metrics included receiver operating characteristic curves to derive area under the curve (AUC), specificity, sensitivity, and quadratic mean. Model predictions using the full cohort were marginal. The “best-fit” model for predicting COVID-19 risk was found in the subset of participants with antibody titers, where it achieved excellent discrimination (AUC 0.969, 95% CI 0.934–1.000). Its factors included age, immune markers, lipids, and serology titers to common pathogens such as human cytomegalovirus. The hospitalization “best-fit” model, again in the subset group, was more modest (AUC 0.803, 95% CI 0.663–0.943) and included only serology titers. Accurate risk profiles can be created using standard self-report and biomedical data collected in public health and medical settings. It is also worthwhile to investigate further whether prior host immunity predicts current host immunity to COVID-19.
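For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how AUC, sensitivity, specificity, and a quadratic mean could be computed from a classifier's predicted probabilities. It is not the authors' code. The function names, the 0.5 threshold, and the toy labels and scores are hypothetical, and the quadratic mean is assumed here to be the root mean square of sensitivity and specificity, since the abstract does not define it.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, and their quadratic mean at one cutoff.

    Assumption: quadratic mean = root mean square of the two rates.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    quadratic_mean = sqrt((sensitivity ** 2 + specificity ** 2) / 2)
    return sensitivity, specificity, quadratic_mean


# Hypothetical toy data (1 = test positive), not taken from UK Biobank.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

auc = roc_auc(toy_labels, toy_scores)
sens, spec, qmean = threshold_metrics(toy_labels, toy_scores)
print(auc, sens, spec, qmean)
```

AUC summarizes ranking quality across all cutoffs, whereas sensitivity, specificity, and the quadratic mean depend on the single threshold chosen, which is why the abstract reports both kinds of metric.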

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/