Nature Communications (May 2025)
Enabling pan-repository reanalysis for big data science of public metabolomics data
- Yasin El Abiead,
- Michael Strobel,
- Thomas Payne,
- Eoin Fahy,
- Claire O’Donovan,
- Shankar Subramaniam,
- Juan Antonio Vizcaíno,
- Ozgur Yurekten,
- Victoria Deleray,
- Simone Zuffa,
- Shipei Xing,
- Helena Mannochio-Russo,
- Ipsita Mohanty,
- Haoqi Nina Zhao,
- Andres M. Caraballo-Rodriguez,
- Paulo Wender P. Gomes,
- Nicole E. Avalon,
- Trent R. Northen,
- Benjamin P. Bowen,
- Katherine B. Louie,
- Pieter C. Dorrestein,
- Mingxun Wang
Affiliations
- Yasin El Abiead
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Michael Strobel
- Department of Computer Science and Engineering, University of California Riverside
- Thomas Payne
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton
- Eoin Fahy
- Department of Bioengineering, and San Diego Supercomputer Center, University of California, San Diego
- Claire O’Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton
- Shankar Subramaniam
- Department of Bioengineering, and San Diego Supercomputer Center, University of California, San Diego
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton
- Ozgur Yurekten
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton
- Victoria Deleray
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Simone Zuffa
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Helena Mannochio-Russo
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Ipsita Mohanty
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Haoqi Nina Zhao
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Andres M. Caraballo-Rodriguez
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Paulo Wender P. Gomes
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Nicole E. Avalon
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego
- Trent R. Northen
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab
- Benjamin P. Bowen
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab
- Katherine B. Louie
- The DOE Joint Genome Institute, Lawrence Berkeley National Laboratory
- Pieter C. Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego
- Mingxun Wang
- Department of Computer Science and Engineering, University of California Riverside
- DOI
- https://doi.org/10.1038/s41467-025-60067-y
- Journal volume & issue
- Vol. 16, no. 1, pp. 1–7
Abstract
Public untargeted metabolomics data are a growing resource for metabolite and phenotype discovery; however, accessing and utilizing these data across repositories pose significant challenges. Therefore, here we develop pan-repository universal identifiers and harmonized cross-repository metadata. This ecosystem facilitates discovery by integrating diverse data sources from public repositories including MetaboLights, Metabolomics Workbench, and GNPS/MassIVE. Our approach simplifies data handling and unlocks previously inaccessible reanalysis workflows, fostering unmatched research opportunities.