A methodology for cohort harmonisation in multicentre clinical research

João Rafael Almeida; Luís Bastão Silva; Isabelle Bos; Pieter Jelle Visser; José Luís Oliveira

Informatics in Medicine Unlocked (Jan 2021)

Affiliations

João Rafael Almeida (corresponding author, at DETI/IEETA, University of Aveiro): DETI/IEETA, University of Aveiro, Aveiro, Portugal; Department of Computation, University of A Coruña, A Coruña, Spain
Luís Bastão Silva: DETI/IEETA, University of Aveiro, Aveiro, Portugal; BMD Software, Aveiro, Portugal
Isabelle Bos: Alzheimer Centre, Department of Neurology, VU University Medical Centre, Amsterdam, The Netherlands; Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience (MHeNS), Maastricht University, Maastricht, The Netherlands
Pieter Jelle Visser: Alzheimer Centre, Department of Neurology, VU University Medical Centre, Amsterdam, The Netherlands; Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience (MHeNS), Maastricht University, Maastricht, The Netherlands
José Luís Oliveira (corresponding author): DETI/IEETA, University of Aveiro, Aveiro, Portugal

Journal volume & issue: Vol. 27, p. 100760

Abstract

Many clinical trials and scientific studies have been conducted to better understand specific medical conditions. However, these studies are often based on a small number of participants, because it is difficult to find people who share similar medical characteristics and are available to participate. This is particularly critical in rare diseases, where the reduced number of subjects hinders reliable findings. To generate more substantial clinical evidence by increasing the power of the analyses, researchers have started to perform data harmonisation and multiple cohort analyses. However, the analysis of heterogeneous data sources implies dealing with different data structures, terminologies, concepts, languages and, most importantly, the knowledge behind the data.

In this paper, we present a methodology to harmonise different cohorts into a standard data schema, helping the research community to generate evidence from a wider variety of data sources. Our methodology was inspired by the OHDSI Common Data Model, which aims to harmonise EHR datasets for observational studies, leveraging knowledge and open source tools to perform multicentric disease-specific studies. This proposal was validated using Alzheimer’s Disease cohorts from several countries, ultimately combining 6,669 subjects and 172 clinical concepts. The harmonised datasets now enable multi-cohort querying and analysis, supporting the execution of new research. The methodology was implemented in Python and is available, under the MIT licence, at https://bioinformatics-ua.github.io/CMToolkit/.

Published in Informatics in Medicine Unlocked

ISSN: 2352-9148 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.journals.elsevier.com/informatics-in-medicine-unlocked/