Frontiers in Bioinformatics (Jan 2022)
Challenges in Bioinformatics Workflows for Processing Microbiome Omics Data at Scale
- Bin Hu,
- Shane Canon,
- Emiley A. Eloe-Fadrosh,
- Anubhav,
- Michal Babinski,
- Yuri Corilo,
- Karen Davenport,
- William D. Duncan,
- Kjiersten Fagnan,
- Mark Flynn,
- Brian Foster,
- David Hays,
- Marcel Huntemann,
- Elais K. Player Jackson,
- Julia Kelliher,
- Po-E. Li,
- Chien-Chi Lo,
- Douglas Mans,
- Lee Ann McCue,
- Nigel Mouncey,
- Christopher J. Mungall,
- Paul D. Piehowski,
- Samuel O. Purvine,
- Montana Smith,
- Neha Jacob Varghese,
- Donald Winston,
- Yan Xu,
- Patrick S. G. Chain
Affiliations
- Bin Hu
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Shane Canon
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Emiley A. Eloe-Fadrosh
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Anubhav
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Michal Babinski
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Yuri Corilo
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Karen Davenport
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- William D. Duncan
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Kjiersten Fagnan
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Mark Flynn
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Brian Foster
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- David Hays
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Marcel Huntemann
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Elais K. Player Jackson
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Julia Kelliher
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Po-E. Li
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Chien-Chi Lo
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Douglas Mans
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Lee Ann McCue
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Nigel Mouncey
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Christopher J. Mungall
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Paul D. Piehowski
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Samuel O. Purvine
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Montana Smith
- Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States
- Neha Jacob Varghese
- Lawrence Berkeley National Laboratory, Berkeley, CA, United States
- Donald Winston
- Polyneme LLC, New York, NY, United States
- Yan Xu
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- Patrick S. G. Chain
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, United States
- DOI
- https://doi.org/10.3389/fbinf.2021.826370
- Journal volume & issue
- Vol. 1
Abstract
The nascent field of microbiome science is transitioning from a descriptive approach of cataloging taxa and functions present in an environment to applying multi-omics methods to investigate microbiome dynamics and function. A large number of new tools and algorithms have been designed and used for very specific purposes on samples collected by individual investigators or groups. While these developments have been quite instructive, the ability to compare microbiome data generated by many groups of researchers is impeded by the lack of standardized application of bioinformatics methods. Additionally, there are few examples of broad bioinformatics workflows that can process metagenome, metatranscriptome, metaproteome and metabolomic data at scale, and no central hub that allows processing, or provides varied omics data that are findable, accessible, interoperable and reusable (FAIR). Here, we review some of the challenges that exist in analyzing omics data within the microbiome research sphere, and provide context on how the National Microbiome Data Collaborative has adopted a standardized and open access approach to address such challenges.
Keywords