JMIR Medical Informatics (Jul 2023)

Interoperable, Domain-Specific Extensions for the German Corona Consensus (GECCO) COVID-19 Research Data Set Using an Interdisciplinary, Consensus-Based Workflow: Data Set Development Study

  • Gregor Lichtner,
  • Thomas Haese,
  • Sally Brose,
  • Larissa Röhrig,
  • Liudmila Lysyakova,
  • Stefanie Rudolph,
  • Maria Uebe,
  • Julian Sass,
  • Alexander Bartschke,
  • David Hillus,
  • Florian Kurth,
  • Leif Erik Sander,
  • Falk Eckart,
  • Nicole Toepfner,
  • Reinhard Berner,
  • Anna Frey,
  • Marcus Dörr,
  • Jörg Janne Vehreschild,
  • Christof von Kalle,
  • Sylvia Thun

DOI
https://doi.org/10.2196/45496
Journal volume & issue
Vol. 11
p. e45496

Abstract

Background: The COVID-19 pandemic has spurred large-scale, interinstitutional research efforts. To enable these efforts, researchers must agree on data set definitions that not only cover all elements relevant to the respective medical specialty but also are syntactically and semantically interoperable. Therefore, the German Corona Consensus (GECCO) data set was developed as a harmonized, interoperable collection of the most relevant data elements for COVID-19–related patient research. As the GECCO data set is a compact core data set comprising data across all medical fields, the focused research within particular medical domains demands the definition of extension modules that include data elements that are the most relevant to the research performed in those individual medical specialties.

Objective: We aimed to (1) specify a workflow for the development of interoperable data set definitions that involves close collaboration between medical experts and information scientists and (2) apply the workflow to develop data set definitions that include data elements that are the most relevant to COVID-19–related patient research regarding immunization, pediatrics, and cardiology.

Methods: We developed a workflow to create data set definitions that were (1) content-wise as relevant as possible to a specific field of study and (2) universally usable across computer systems, institutions, and countries (ie, interoperable). We then gathered medical experts from 3 specialties (infectious diseases, with a focus on immunization; pediatrics; and cardiology) to select data elements that were the most relevant to COVID-19–related patient research in the respective specialty. We mapped the data elements to international standardized vocabularies and created data exchange specifications, using Health Level Seven International (HL7) Fast Healthcare Interoperability Resources (FHIR). All steps were performed in close interdisciplinary collaboration with medical domain experts and medical information specialists. Profiles and vocabulary mappings were syntactically and semantically validated in a 2-stage process.

Results: We created GECCO extension modules for the immunization, pediatrics, and cardiology domains according to pandemic-related requests. The data elements included in each module were selected, according to the developed consensus-based workflow, by medical experts from these specialties to ensure that the contents aligned with their research needs. We defined data set specifications for 48 immunization, 150 pediatrics, and 52 cardiology data elements that complement the GECCO core data set. We created and published implementation guides, example implementations, and data set annotations for each extension module.

Conclusions: The GECCO extension modules, which contain data elements that are the most relevant to COVID-19–related patient research on infectious diseases (with a focus on immunization), pediatrics, and cardiology, were defined in an interdisciplinary, iterative, consensus-based workflow that may serve as a blueprint for developing further data set definitions. The GECCO extension modules provide standardized and harmonized definitions of specialty-related data sets that can help enable interinstitutional and cross-country COVID-19 research in these specialties.