PeerJ (Sep 2018)
Metabolic marker gene mining provides insight in global mcrA diversity and, coupled with targeted genome reconstruction, sheds further light on metabolic potential of the Methanomassiliicoccales
Abstract
Over the past years, metagenomics has revolutionized our view of microbial diversity. Moreover, extracting near-complete genomes from metagenomes has led to the discovery of known metabolic traits in unsuspected lineages. Genome-resolved metagenomics relies on assembly of the sequencing reads and subsequent binning of assembled contigs, which might be hampered by strain heterogeneity or low abundance of a target organism. Here we present a complementary approach, metagenome marker gene mining, and use it to assess the global diversity of archaeal methane metabolism through the mcrA gene. To this end, we have screened 18,465 metagenomes for the presence of reads matching a database representative of all known mcrA proteins and reconstructed gene sequences from the matching reads. We use our mcrA dataset to assess the environmental distribution of the Methanomassiliicoccales and reconstruct and analyze a draft genome belonging to the ‘Lake Pavin cluster’, an uncultivated environmental clade of the Methanomassiliicoccales. Analysis of the ‘Lake Pavin cluster’ draft genome suggests that this organism has a more restricted capacity for hydrogenotrophic methylotrophic methanogenesis than previously studied Methanomassiliicoccales, with only genes for growth on methanol present. However, the presence of the soluble subunits of methyltetrahydromethanopterin:CoM methyltransferase (mtrAH) provides hypothetical pathways for methanol fermentation and aceticlastic methanogenesis that await experimental verification. Thus, we show that marker gene mining can enhance the discovery power of metagenomics by identifying novel lineages and aiding selection of targets for in-depth analyses. Marker gene mining is less sensitive to strain heterogeneity and has a lower abundance threshold than genome-resolved metagenomics, as it only requires short contigs and there is no binning step. Additionally, it is computationally cheaper than genome-resolved metagenomics, since only a small subset of reads needs to be assembled. It is therefore a suitable approach to extract knowledge from the many publicly available sequencing projects.
Keywords