FAIR compliant database development for human microbiome data samples

Mathieu Dorst; Nathan Zeevenhooven; Rory Wilding; Daniel Mende; Bernd W. Brandt; Egija Zaura; Alfons Hoekstra; Vivek M. Sheraton

doi:10.3389/fcimb.2024.1384809

Frontiers in Cellular and Infection Microbiology (May 2024)

Affiliations

Mathieu Dorst: Informatics Institute, University of Amsterdam, Amsterdam, Netherlands
Nathan Zeevenhooven: Informatics Institute, University of Amsterdam, Amsterdam, Netherlands
Rory Wilding: Supabase Limited Liability Company (LLC), San Francisco, CA, United States
Daniel Mende: Amsterdam Institute of Infection and Immunity, Amsterdam University Medical Center, Amsterdam, Netherlands
Bernd W. Brandt: Department of Preventive Dentistry, Academic Centre for Dentistry Amsterdam, Vrije Universiteit Amsterdam and University of Amsterdam, Amsterdam, Netherlands
Egija Zaura: Department of Preventive Dentistry, Academic Centre for Dentistry Amsterdam, Vrije Universiteit Amsterdam and University of Amsterdam, Amsterdam, Netherlands
Alfons Hoekstra: Computational Science Lab, Informatics Institute, University of Amsterdam, Amsterdam, Netherlands
Vivek M. Sheraton: Computational Science Lab, Informatics Institute, University of Amsterdam, Amsterdam, Netherlands

DOI: https://doi.org/10.3389/fcimb.2024.1384809
Journal volume & issue: Vol. 14

Abstract

Introduction: Sharing microbiome data among researchers fosters innovation and reduces research costs. In practice, this means that the (meta)data must be standardized, transparent and readily available to researchers. The microbiome data and associated metadata must then be described with regard to composition and origin, in order to maximize the possibilities for application in various research contexts. Here, we propose a set of tools and protocols to develop a real-time FAIR (Findable, Accessible, Interoperable and Reusable) compliant database for the handling and storage of human microbiome and host-associated data.

Methods: The conflicts arising between privacy laws and FAIR implementations are discussed, with respect to metadata and to possible human genome sequences in the metagenome shotgun data. Alternate pathways for achieving compliance in such conflicts are analyzed. Sample-traceable and sensitive microbiome data, such as DNA sequences or geolocalized metadata, are identified, and the role of the General Data Protection Regulation (GDPR) is considered. For the construction of the database, procedures have been realized to make data FAIR compliant while preserving the privacy of the participants providing the data.

Results and discussion: An open-source development platform, Supabase, was used to implement the microbiome database. Researchers can deploy this real-time database to access, upload, download and interact with human microbiome data in a FAIR compliant manner. In addition, a large language model (LLM) powered by ChatGPT is developed and deployed to enable knowledge dissemination and non-expert usage of the database.

Published in Frontiers in Cellular and Infection Microbiology

ISSN: 2235-2988 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Microbiology
Website: http://www.frontiersin.org/Cellular-and-Infection-Microbiology
