Clinical Epidemiology (Mar 2022)

Unraveling COVID-19: A Large-Scale Characterization of 4.5 Million COVID-19 Cases Using CHARYBDIS

  • Kostka K,
  • Duarte-Salles T,
  • Prats-Uribe A,
  • Sena AG,
  • Pistillo A,
  • Khalid S,
  • Lai LYH,
  • Golozar A,
  • Alshammari TM,
  • Dawoud DM,
  • Nyberg F,
  • Wilcox AB,
  • Andryc A,
  • Williams A,
  • Ostropolets A,
  • Areia C,
  • Jung CY,
  • Harle CA,
  • Reich CG,
  • Blacketer C,
  • Morales DR,
  • Dorr DA,
  • Burn E,
  • Roel E,
  • Tan EH,
  • Minty E,
  • DeFalco F,
  • de Maeztu G,
  • Lipori G,
  • Alghoul H,
  • Zhu H,
  • Thomas JA,
  • Bian J,
  • Park J,
  • Martínez Roldán J,
  • Posada JD,
  • Banda JM,
  • Horcajada JP,
  • Kohler J,
  • Shah K,
  • Natarajan K,
  • Lynch KE,
  • Liu L,
  • Schilling LM,
  • Recalde M,
  • Spotnitz M,
  • Gong M,
  • Matheny ME,
  • Valveny N,
  • Weiskopf NG,
  • Shah N,
  • Alser O,
  • Casajust P,
  • Park RW,
  • Schuff R,
  • Seager S,
  • DuVall SL,
  • You SC,
  • Song S,
  • Fernández-Bertolín S,
  • Fortin S,
  • Magoc T,
  • Falconer T,
  • Subbian V,
  • Huser V,
  • Ahmed WUR,
  • Carter W,
  • Guan Y,
  • Galvan Y,
  • He X,
  • Rijnbeek PR,
  • Hripcsak G,
  • Ryan PB,
  • Suchard MA,
  • Prieto-Alhambra D

Journal volume & issue
Vol. 14
pp. 369 – 384

Abstract

Kristin Kostka,1,2 Talita Duarte-Salles,3 Albert Prats-Uribe,4 Anthony G Sena,5,6 Andrea Pistillo,3 Sara Khalid,4 Lana YH Lai,7 Asieh Golozar,8,9 Thamir M Alshammari,10 Dalia M Dawoud,11 Fredrik Nyberg,12 Adam B Wilcox,13,14 Alan Andryc,5 Andrew Williams,15 Anna Ostropolets,16 Carlos Areia,17 Chi Young Jung,18 Christopher A Harle,19 Christian G Reich,1,2 Clair Blacketer,5,6 Daniel R Morales,20 David A Dorr,21 Edward Burn,3,4 Elena Roel,3,22 Eng Hooi Tan,4 Evan Minty,23 Frank DeFalco,5 Gabriel de Maeztu,24 Gigi Lipori,19 Hiba Alghoul,25 Hong Zhu,26 Jason A Thomas,13 Jiang Bian,19 Jimyung Park,27 Jordi Martínez Roldán,28 Jose D Posada,29 Juan M Banda,30 Juan P Horcajada,31 Julianna Kohler,32 Karishma Shah,33 Karthik Natarajan,16,34 Kristine E Lynch,35,36 Li Liu,37 Lisa M Schilling,38 Martina Recalde,3,22 Matthew Spotnitz,14 Mengchun Gong,39 Michael E Matheny,40,41 Neus Valveny,42 Nicole G Weiskopf,21 Nigam Shah,29 Osaid Alser,43 Paula Casajust,42 Rae Woong Park,27,44 Robert Schuff,21 Sarah Seager,1 Scott L DuVall,35,36 Seng Chan You,45 Seokyoung Song,46 Sergio Fernández-Bertolín,3 Stephen Fortin,5 Tanja Magoc,19 Thomas Falconer,16 Vignesh Subbian,47 Vojtech Huser,48 Waheed-Ul-Rahman Ahmed,33,49 William Carter,38 Yin Guan,50 Yankuic Galvan,19 Xing He,19 Peter R Rijnbeek,6 George Hripcsak,16,34 Patrick B Ryan,5,16 Marc A Suchard,51 Daniel Prieto-Alhambra4

1IQVIA, Cambridge, MA, USA; 2OHDSI Center at The Roux Institute, Northeastern University, Portland, ME, USA; 3Fundació Institut Universitari per a la recerca a l’Atenció Primària de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain; 4Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK; 5Janssen Research & Development, Titusville, NJ, USA; 6Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands; 7School of Medical Sciences, University of Manchester, Manchester, UK; 8Regeneron Pharmaceuticals, Tarrytown, NY, USA; 9Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; 10College of Pharmacy, Riyadh Elm University, Riyadh, Saudi Arabia; 11National Institute for Health and Care Excellence, London, UK; 12School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; 13Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA; 14University of Washington Medicine, Seattle, WA, USA; 15Tufts Institute for Clinical Research and Health Policy Studies, Boston, MA, USA; 16Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA; 17Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK; 18Division of Respiratory and Critical Care Medicine, Department of Internal Medicine, Daegu Catholic University Medical Center, Daegu, South Korea; 19University of Florida Health, Gainesville, FL, USA; 20Division of Population Health and Genomics, University of Dundee, Dundee, UK; 21Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; 22Universitat Autònoma de Barcelona, Barcelona, Spain; 23O’Brien Institute for Public Health, Faculty of Medicine, University of Calgary, Calgary, Canada; 24IOMED, Barcelona, Spain; 25Faculty of Medicine, Islamic University of Gaza, Gaza, Palestine; 26Nanfang Hospital, Southern Medical University, Guangzhou, People’s Republic of China; 27Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea; 28Director of Innovation and Digital Transformation, Hospital del Mar, Barcelona, Spain; 29Department of Medicine, School of Medicine, Stanford University, Redwood City, CA, USA; 30Georgia State University, Department of Computer Science, Atlanta, GA, USA; 31Department of Infectious Diseases, Hospital del Mar, Institut Hospital del Mar d’Investigació Mèdica (IMIM), Universitat Autònoma de Barcelona, Universitat Pompeu Fabra, Barcelona, Spain; 32United States Agency for International Development, Washington, DC, USA; 33Botnar Research Centre, NDORMS, University of Oxford, Oxford, UK; 34New York-Presbyterian Hospital, New York, NY, USA; 35VA Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT, USA; 36Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, USA; 37Biomedical Big Data Center, Nanfang Hospital, Southern Medical University, Guangzhou, People’s Republic of China; 38Data Science to Patient Value Program, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; 39Institute of Health Management, Southern Medical University, Guangzhou, People’s Republic of China; 40Tennessee Valley Healthcare System, Veterans Affairs Medical Center, Nashville, TN, USA; 41Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; 42Real-World Evidence, TFS, Barcelona, Spain; 43Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; 44Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; 45Department of Preventive Medicine, Yonsei University College of Medicine, Seoul, South Korea; 46Department of Anesthesiology and Pain Medicine, Catholic University of Daegu, School of Medicine, Daegu, South Korea; 47College of Engineering, The University of Arizona, Tucson, AZ, USA; 48National Library of Medicine, National Institutes of Health, Bethesda, MD, USA; 49College of Medicine and Health, University of Exeter, St Luke’s Campus, Exeter, UK; 50DHC Technologies Co. Ltd., Beijing, People’s Republic of China; 51Departments of Biostatistics, Computational Medicine, and Human Genetics, University of California, Los Angeles, CA, USA

Correspondence: Daniel Prieto-Alhambra, Botnar Research Centre, Windmill Road, Oxford, OX3 7LD, UK, Email [email protected]

Purpose: Routinely collected real world data (RWD) have great utility in aiding the novel coronavirus disease (COVID-19) pandemic response. Here we present the international Observational Health Data Sciences and Informatics (OHDSI) Characterizing Health Associated Risks and Your Baseline Disease In SARS-COV-2 (CHARYBDIS) framework for standardisation and analysis of COVID-19 RWD.

Patients and Methods: We conducted a descriptive retrospective database study using a federated network of data partners in the United States, Europe (the Netherlands, Spain, the UK, Germany, France and Italy) and Asia (South Korea and China). The study protocol and analytical package were released on 11th June 2020 and are iteratively updated via GitHub. We identified three non-mutually exclusive cohorts: 4,537,153 individuals with a clinical COVID-19 diagnosis or positive test, 886,193 hospitalized with COVID-19, and 113,627 hospitalized with COVID-19 requiring intensive services.

Results: We aggregated over 22,000 unique characteristics describing patients with COVID-19. All comorbidities, symptoms, medications, and outcomes are described by cohort in aggregate counts and are readily available online. Globally, we observed similarities in the USA and Europe: more women than men were diagnosed but more men than women were hospitalized, and most diagnosed cases were between 25 and 60 years of age whereas most hospitalized cases were between 60 and 80 years of age. South Korea differed, with more women than men hospitalized. Common comorbidities included type 2 diabetes, hypertension, chronic kidney disease and heart disease. Common presenting symptoms were dyspnea, cough and fever. Symptom data were more often available in hospitalized cohorts than in diagnosed cohorts.

Conclusion: We constructed a global, multi-centre view to describe trends in COVID-19 progression, management and evolution over time. By characterising baseline variability in patients and geography, our work provides critical context for differences that may otherwise be misconstrued as data quality issues. This is important as we perform studies on adverse events of special interest in COVID-19 vaccine surveillance.

Keywords: OHDSI, OMOP CDM, descriptive epidemiology, real world data, real world evidence, open science

Keywords