Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network

Ross D. Williams; Aniek F. Markus; Cynthia Yang; Talita Duarte-Salles; Scott L. DuVall; Thomas Falconer; Jitendra Jonnagaddala; Chungsoo Kim; Yeunsook Rho; Andrew E. Williams; Amanda Alberga Machado; Min Ho An; María Aragón; Carlos Areia; Edward Burn; Young Hwa Choi; Iannis Drakos; Maria Tereza Fernandes Abrahão; Sergio Fernández-Bertolín; George Hripcsak; Benjamin Skov Kaas-Hansen; Prasanna L. Kandukuri; Jan A. Kors; Kristin Kostka; Siaw-Teng Liaw; Kristine E. Lynch; Gerardo Machnicki; Michael E. Matheny; Daniel Morales; Fredrik Nyberg; Rae Woong Park; Albert Prats-Uribe; Nicole Pratt; Gowtham Rao; Christian G. Reich; Marcela Rivera; Tom Seinen; Azza Shoaibi; Matthew E. Spotnitz; Ewout W. Steyerberg; Marc A. Suchard; Seng Chan You; Lin Zhang; Lili Zhou; Patrick B. Ryan; Daniel Prieto-Alhambra; Jenna M. Reps; Peter R. Rijnbeek

doi:10.1186/s12874-022-01505-z

BMC Medical Research Methodology (Jan 2022)

Affiliations

Ross D. Williams: Department of Medical Informatics, Erasmus University Medical Center
Aniek F. Markus: Department of Medical Informatics, Erasmus University Medical Center
Cynthia Yang: Department of Medical Informatics, Erasmus University Medical Center
Talita Duarte-Salles: Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol)
Scott L. DuVall: Department of Veterans Affairs, University of Utah
Thomas Falconer: Department of Biomedical Informatics, Columbia University
Jitendra Jonnagaddala: School of Public Health and Community Medicine
Chungsoo Kim: Department of Biomedical Sciences, Ajou University Graduate School of Medicine
Yeunsook Rho: Department of Big Data Strategy, National Health Insurance Service
Andrew E. Williams: Tufts University School of Medicine, Institute for Clinical Research and Health Policy Studies
Amanda Alberga Machado: Independent Epidemiologist, OHDSI
Min Ho An: So Ahn Public Health Center, Wando County Health Center and Hospital
María Aragón: Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol)
Carlos Areia: Nuffield Department of Clinical Neurosciences, University of Oxford
Edward Burn: Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol)
Young Hwa Choi: Department of Infectious Diseases, Ajou University School of Medicine
Iannis Drakos: Center for Surgical Science
Maria Tereza Fernandes Abrahão: Faculty of Medicine, University of Sao Paulo
Sergio Fernández-Bertolín: Fundacio Institut Universitari per a la recerca a l’Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol)
George Hripcsak: Department of Biomedical Informatics, Columbia University
Benjamin Skov Kaas-Hansen: Clinical Pharmacology Unit, Zealand University Hospital
Prasanna L. Kandukuri: Abbvie
Jan A. Kors: Department of Medical Informatics, Erasmus University Medical Center
Kristin Kostka: Real World Solutions, IQVIA
Siaw-Teng Liaw: School of Public Health and Community Medicine
Kristine E. Lynch: Department of Veterans Affairs, University of Utah
Gerardo Machnicki: Janssen Latin America
Michael E. Matheny: Department of Veterans Affairs
Daniel Morales: Division of Population Health and Genomics, University of Dundee
Fredrik Nyberg: School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg
Rae Woong Park: Department of Biomedical Informatics, Ajou University School of Medicine
Albert Prats-Uribe: Centre for Statistics in Medicine, NDORMS, University of Oxford
Nicole Pratt: Quality Use of Medicines and Pharmacy Research Centre, University of South Australia
Gowtham Rao: Janssen Research & Development
Christian G. Reich: Real World Solutions, IQVIA
Marcela Rivera: Bayer Pharmaceuticals, Bayer Hispania, S.L.
Tom Seinen: Department of Medical Informatics, Erasmus University Medical Center
Azza Shoaibi: Janssen Research & Development
Matthew E. Spotnitz: Department of Biomedical Informatics, Columbia University
Ewout W. Steyerberg: Department of Public Health, Erasmus University Medical Center
Marc A. Suchard: Department of Biostatistics, UCLA Fielding School of Public Health, University of California
Seng Chan You: Department of Biomedical Informatics, Ajou University School of Medicine
Lin Zhang: School of Public Health, Peking Union Medical College
Lili Zhou: Abbvie
Patrick B. Ryan: Janssen Research & Development
Daniel Prieto-Alhambra: Centre for Statistics in Medicine, NDORMS, University of Oxford
Jenna M. Reps: Janssen Research & Development
Peter R. Rijnbeek: Department of Medical Informatics, Erasmus University Medical Center

Journal volume & issue: Vol. 22, no. 1
pp. 1 – 13

Abstract

Background: We investigated whether influenza data could be used to develop prediction models for COVID-19, to increase the speed at which prediction models can reliably be developed and validated early in a pandemic. We developed COVID-19 Estimated Risk (COVER) scores that quantify a patient’s risk of hospital admission with pneumonia (COVER-H), hospitalization with pneumonia requiring intensive services or death (COVER-I), or fatality (COVER-F) in the 30 days following COVID-19 diagnosis. The scores were developed using historical data from patients with influenza or flu-like symptoms and then tested in COVID-19 patients.

Methods: We analyzed a federated network of electronic medical records and administrative claims data from 14 data sources in 6 countries, containing data collected on or before 4/27/2020. We used a 2-step process to develop 3 scores using historical data from patients with influenza or flu-like symptoms any time prior to 2020. The first step was to create a data-driven model using LASSO regularized logistic regression. The covariates of that model were used to develop aggregate covariates for the second step, in which the COVER scores were developed using a smaller set of features. These 3 COVER scores were then externally validated on patients with 1) influenza or flu-like symptoms and 2) confirmed or suspected COVID-19 diagnosis across 5 databases from South Korea, Spain, and the United States. Outcomes included i) hospitalization with pneumonia, ii) hospitalization with pneumonia requiring intensive services or death, and iii) death in the 30 days after index date.

Results: Overall, 44,507 COVID-19 patients were included for model validation. We identified 7 predictors (history of cancer, chronic obstructive pulmonary disease, diabetes, heart disease, hypertension, hyperlipidemia, kidney disease) which, combined with age and sex, discriminated which patients would experience any of our three outcomes. The models achieved good performance in influenza and COVID-19 cohorts. For COVID-19, the AUC ranges were COVER-H: 0.69–0.81, COVER-I: 0.73–0.91, and COVER-F: 0.72–0.90. Calibration varied across the validations, with some of the COVID-19 validations being less well calibrated than the influenza validations.

Conclusions: This research demonstrated the utility of using a proxy disease to develop a prediction model. The 3 COVER models, each with 9 predictors and developed using influenza data, perform well in COVID-19 patients for predicting hospitalization, intensive services, and fatality. The scores showed good discriminatory performance, which transferred well to the COVID-19 population. There was some miscalibration in the COVID-19 validations, which is potentially due to the difference in symptom severity between the two diseases. A possible solution is to recalibrate the models in each location before use.
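
The suggestion to recalibrate the models in each location can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes the simplest form of recalibration (shifting the intercept on the logit scale so that the mean predicted risk matches the locally observed event rate), and the risks and outcomes are made-up toy values. The function names `recalibrate_intercept` and `auc` are hypothetical helpers written for this illustration.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def expit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def recalibrate_intercept(predicted_risks, outcomes, tol=1e-10):
    """Find the logit-scale offset that makes the mean predicted risk
    equal to the observed event rate (intercept-only recalibration).
    Assumes the observed event rate is strictly between 0 and 1."""
    target = sum(outcomes) / len(outcomes)
    logits = [logit(p) for p in predicted_risks]
    lo, hi = -20.0, 20.0
    # The mean risk increases monotonically with the offset, so bisection works.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(z + mid) for z in logits) / len(logits)
        if mean_risk < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def auc(risks, outcomes):
    """Probability that a random patient with the outcome has a higher
    risk than a random patient without it (ties count as 0.5)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical toy data: risks from a score developed elsewhere,
# and outcomes observed at the local site (1 = event, 0 = no event).
predicted = [0.02, 0.05, 0.10, 0.20, 0.30, 0.40]
observed = [0, 0, 0, 1, 0, 1]

delta = recalibrate_intercept(predicted, observed)
updated = [expit(logit(p) + delta) for p in predicted]

print("offset on logit scale:", round(delta, 3))
print("mean risk before:", round(sum(predicted) / len(predicted), 3))
print("mean risk after:", round(sum(updated) / len(updated), 3))
print("AUC before and after:", auc(predicted, observed), auc(updated, observed))
```

Because the offset is a monotone transformation of the risks, it changes calibration but leaves the ranking of patients, and therefore the AUC, unchanged. This is consistent with the abstract's observation that discrimination transferred well while calibration varied by site.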

Published in BMC Medical Research Methodology

ISSN: 1471-2288 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General)
Website: http://bmcmedresmethodol.biomedcentral.com
