mSystems (Aug 2024)
Metagenomic clustering links specific metabolic functions to globally relevant ecosystems
Abstract
Metagenomic sequencing has advanced our understanding of biogeochemical processes by providing an unprecedented view into the microbial composition of different ecosystems. While the amount of metagenomic data has grown rapidly, simple-to-use methods to analyze and compare data across studies have lagged behind. Thus, tools expressing the metabolic traits of a community are needed to broaden the utility of existing data. Gene abundance profiles are a relatively low-dimensional embedding of a metagenome’s functional potential and are, thus, tractable for comparison across many samples. Here, we compare the abundance of KEGG Ortholog Groups (KOs) from 6,539 metagenomes from the Joint Genome Institute’s Integrated Microbial Genomes and Metagenomes (JGI IMG/M) database. We find that samples cluster into terrestrial, aquatic, and anaerobic ecosystems with marker KOs reflecting adaptations to these environments. For instance, functional clusters were differentiated by the metabolism of antibiotics, photosynthesis, methanogenesis, and, surprisingly, GC content. Using this functional gene approach, we reveal the broad-scale patterns shaping microbial communities and demonstrate the utility of ortholog abundance profiles for representing a rapidly expanding body of metagenomic data.

IMPORTANCE Metagenomics, or the sequencing of DNA from complex microbiomes, provides a view into the microbial composition of different environments. Metagenome databases were created to compile sequencing data across studies, but it remains challenging to compare and gain insight from these large data sets. Consequently, there is a need to develop accessible approaches to extract knowledge across metagenomes. The abundance of different orthologs (i.e., genes that perform a similar function across species) provides a simplified representation of a metagenome’s metabolic potential that can easily be compared with others. In this study, we cluster the ortholog abundance profiles of thousands of metagenomes from diverse environments and uncover the traits that distinguish them. This work provides a simple-to-use framework for functional comparison and advances our understanding of how the environment shapes microbial communities.
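
To make the idea of comparing ortholog abundance profiles concrete, the following is a minimal sketch on toy data. It is not the authors' pipeline: the abstract does not name a distance measure or clustering algorithm, so Bray-Curtis dissimilarity and average-linkage agglomerative clustering are used here purely as stand-ins. The sample names, the placeholder ortholog labels (`KO_A`, `KO_B`, `KO_C`), and all counts are hypothetical.

```python
from itertools import combinations


def relative_abundance(counts):
    """Scale raw ortholog counts so each sample's profile sums to 1."""
    total = sum(counts.values())
    return {ko: n / total for ko, n in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    kos = set(p) | set(q)
    num = sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in kos)
    den = sum(p.get(k, 0.0) + q.get(k, 0.0) for k in kos)
    return num / den


def cluster(profiles, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        pairs = [(x, y) for x in a for y in b]
        return sum(bray_curtis(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > k:
        # combinations() yields i < j, so deleting index j leaves index i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical raw counts per sample; KO_A/KO_B/KO_C are placeholder labels,
# not real KEGG identifiers.
raw_counts = {
    "soil_1": {"KO_A": 80, "KO_B": 15, "KO_C": 5},
    "soil_2": {"KO_A": 70, "KO_B": 20, "KO_C": 10},
    "marine_1": {"KO_A": 10, "KO_B": 85, "KO_C": 5},
    "marine_2": {"KO_A": 15, "KO_B": 80, "KO_C": 5},
    "digester_1": {"KO_A": 5, "KO_B": 5, "KO_C": 90},
    "digester_2": {"KO_A": 10, "KO_B": 10, "KO_C": 80},
}

profiles = {name: relative_abundance(c) for name, c in raw_counts.items()}
groups = cluster(profiles, k=3)
print(groups)
```

Normalizing to relative abundance before computing distances keeps sequencing depth from dominating the comparison; at the scale of thousands of metagenomes, an established library implementation would be a more practical choice than this quadratic pure-Python loop.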
Keywords