Nature Communications (May 2025)

Enabling pan-repository reanalysis for big data science of public metabolomics data

  • Yasin El Abiead,
  • Michael Strobel,
  • Thomas Payne,
  • Eoin Fahy,
  • Claire O’Donovan,
  • Shankar Subramaniam,
  • Juan Antonio Vizcaíno,
  • Ozgur Yurekten,
  • Victoria Deleray,
  • Simone Zuffa,
  • Shipei Xing,
  • Helena Mannochio-Russo,
  • Ipsita Mohanty,
  • Haoqi Nina Zhao,
  • Andres M. Caraballo-Rodriguez,
  • Paulo Wender P. Gomes,
  • Nicole E. Avalon,
  • Trent R. Northen,
  • Benjamin P. Bowen,
  • Katherine B. Louie,
  • Pieter C. Dorrestein,
  • Mingxun Wang

DOI
https://doi.org/10.1038/s41467-025-60067-y
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 7

Abstract

Public untargeted metabolomics data are a growing resource for metabolite and phenotype discovery; however, accessing and utilizing these data across repositories pose significant challenges. Therefore, here we develop pan-repository universal identifiers and harmonized cross-repository metadata. This ecosystem facilitates discovery by integrating diverse data sources from public repositories including MetaboLights, Metabolomics Workbench, and GNPS/MassIVE. Our approach simplifies data handling and unlocks previously inaccessible reanalysis workflows, fostering unmatched research opportunities.