Informatics in Medicine Unlocked (Jan 2024)

MONTRA2: A web platform for profiling distributed databases in the health domain

  • João Rafael Almeida
  • José Luís Oliveira

Journal volume & issue
Vol. 45
p. 101447

Abstract

Background: Data catalogues are used in multiple domains to provide an overview of databases’ characteristics without releasing the actual data. Despite the existence of several web-based catalogues, they do not always meet the needs of certain domains. In the healthcare field, catalogues need to offer multiple, iterative views of the data, from high-level metadata down to low-level samples or patient data. This approach is critical to help researchers find relevant datasets for their studies.

Methods: In this paper, we present MONTRA2, a web platform for profiling distributed databases. The user requirements were defined in the context of the EHDEN European project, in close collaboration with medical researchers, data owners, and pharmaceutical companies, leading to a rich set of functionalities to support database and cohort discovery. The platform was developed with a modular architecture, which simplifies the integration of internal and external services.

Results: MONTRA2 is being used successfully in several European projects and research initiatives focused on the dissemination and sharing of biomedical databases. In this paper, we present three health data catalogues that were built upon the core of this framework. MONTRA2 is publicly available under the MIT license at https://github.com/bioinformatics-ua/montra2.

Conclusions: The execution of large-scale federated studies involving multiple centres is only possible if adequate tools for data management and discovery are available. By providing tools for study management, database characterisation and publishing, among others, MONTRA2 simplifies the process of setting up a workspace for a community to expose the characteristics of datasets and provide multiple strategies for data analysis.

Keywords