How to customize common data models for rare diseases: an OMOP-based implementation and lessons learned

Najia Ahmadi; Michele Zoch; Oya Guengoeze; Carlo Facchinello; Antonia Mondorf; Katharina Stratmann; Khader Musleh; Hans-Peter Erasmus; Jana Tchertov; Richard Gebler; Jannik Schaaf; Lena S. Frischen; Azadeh Nasirian; Jiabin Dai; Elisa Henke; Douglas Tremblay; Andrew Srisuwananukorn; Martin Bornhäuser; Christoph Röllig; Jan-Niklas Eckardt; Jan Moritz Middeke; Markus Wolfien; Martin Sedlmayr

doi:10.1186/s13023-024-03312-9

Orphanet Journal of Rare Diseases (Aug 2024)

Affiliations

Najia Ahmadi: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Michele Zoch: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Oya Guengoeze: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Carlo Facchinello: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Antonia Mondorf: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Katharina Stratmann: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Khader Musleh: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Hans-Peter Erasmus: Department of Internal Medicine I, University Hospital Frankfurt, Goethe University
Jana Tchertov: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Richard Gebler: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Jannik Schaaf: Goethe University Frankfurt, University Hospital, Institute of Medical Informatics
Lena S. Frischen: University Hospital Frankfurt, Goethe University, Executive Department for Medical IT-Systems and Digitalization
Azadeh Nasirian: Center of Medical Informatics, University Hospital Carl Gustav Carus, TUD Dresden University of Technology
Jiabin Dai: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Elisa Henke: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Douglas Tremblay: Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai
Andrew Srisuwananukorn: Ohio State Comprehensive Cancer Center
Martin Bornhäuser: Department of Internal Medicine I, University Hospital Carl Gustav Carus, TUD Dresden University of Technology
Christoph Röllig: Department of Internal Medicine I, University Hospital Carl Gustav Carus, TUD Dresden University of Technology
Jan-Niklas Eckardt: Department of Internal Medicine I, University Hospital Carl Gustav Carus, TUD Dresden University of Technology
Jan Moritz Middeke: Department of Internal Medicine I, University Hospital Carl Gustav Carus, TUD Dresden University of Technology
Markus Wolfien: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology
Martin Sedlmayr: Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, TUD Dresden University of Technology

Journal volume & issue: Vol. 19, no. 1, pp. 1–17

Abstract

Background: Because rare diseases (RDs) are geographically sparse, assembling a cohort is often challenging. Common data models (CDMs) can harmonize disparate data sources, which can then serve as the basis for decision support systems and artificial intelligence-based studies, leading to new insights in the field. This work seeks to support the design of large-scale multi-center studies for rare diseases.

Methods: In an interdisciplinary group, we iteratively derived a list of RD data elements in three medical domains (endocrinology, gastroenterology, and pneumonology) based on specialist knowledge and clinical guidelines. We then defined an RD data structure that matched all our data elements and built Extract, Transform, Load (ETL) processes to transfer the structure to a joint CDM. To ensure interoperability of the developed CDM and its subsequent use for further RD domains, we ultimately mapped it to the Observational Medical Outcomes Partnership (OMOP) CDM. We then included a fourth domain, hematology, as a proof of concept and mapped an acute myeloid leukemia (AML) dataset to the developed CDM.

Results: We developed an OMOP-based rare diseases common data model (RD-CDM) using data elements from the three domains (endocrinology, gastroenterology, and pneumonology) and tested the CDM using data from the hematology domain. The total study cohort included 61,697 patients. After aligning our modules with those of the Medical Informatics Initiative (MII) Core Dataset (CDS), we leveraged its ETL process. This facilitated the seamless transfer of the demographic information, diagnosis, procedure, laboratory result, and medication modules from our RD-CDM to the OMOP CDM. For the phenotypes and genotypes, we developed a second ETL process. Finally, we derived lessons learned for customizing our RD-CDM for different RDs.

Discussion: This work can serve as a blueprint for other domains, as its modularized structure could be extended to novel data types. An interdisciplinary group of stakeholders who actively support the project's progress is necessary to arrive at a comprehensive CDM.

Conclusion: The customized data structure of our RD-CDM can be used to perform multi-center studies that test data-driven hypotheses on a larger scale and take advantage of the analytical tools offered by the OHDSI community.

Published in Orphanet Journal of Rare Diseases

ISSN: 1750-1172 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://ojrd.biomedcentral.com
