BioMedInformatics (Dec 2022)

Trecode: A FAIR Eco-System for the Analysis and Archiving of Omics Data in a Combined Diagnostic and Research Setting

  • Hindrik HD Kerstens,
  • Jayne Y Hehir-Kwa,
  • Ellen van de Geer,
  • Chris van Run,
  • Shashi Badloe,
  • Alex Janse,
  • John Baker-Hernandez,
  • Sam de Vos,
  • Douwe van der Leest,
  • Eugène TP Verwiel,
  • Bastiaan BJ Tops,
  • Patrick Kemmeren

DOI
https://doi.org/10.3390/biomedinformatics3010001
Journal volume & issue
Vol. 3, no. 1
pp. 1 – 16

Abstract

The increase in speed, reliability, and cost-effectiveness of high-throughput sequencing has led to the widespread clinical application of genome (WGS), exome (WXS), and transcriptome analysis. WXS and RNA sequencing are now being implemented as the standard of care, both for patients generally and for patients included in clinical studies. To keep track of sample relationships and analyses, a platform is needed that can unify metadata for diverse sequencing strategies with sample metadata whilst supporting automated and reproducible analyses, in essence ensuring that analyses are conducted consistently and that data are Findable, Accessible, Interoperable, and Reusable (FAIR). We present “Trecode”, a framework that records both clinical and research sample (meta)data and manages the computational genome analysis workflows executed for both settings, thereby achieving tight integration between analysis results and sample metadata. With complete, consistent, and FAIR (meta)data management in a single platform, stacked bioinformatic analyses are performed automatically and tracked by the database, ensuring data provenance, reproducibility, and reusability, all of which are key in worldwide collaborative translational research. The Trecode data model, codebooks, NGS workflows, and client programs are publicly available. In addition, the complete software stack is coded in an Ansible playbook to facilitate automated deployment and adoption of Trecode by other users.

Keywords