Advanced Science (Apr 2024)
Efficient Recovery of Complete Gut Viral Genomes by Combined Short‐ and Long‐Read Sequencing
Abstract
Current metagenome‐assembled human gut phage catalogs contain mostly fragmented genomes. Here, a comprehensive gut virome detection procedure is developed, involving virus‐like particle (VLP) enrichment from ≈500 g feces and combined short‐ and long‐read sequencing. Applied to 135 samples, the procedure yields a Chinese Gut Virome Catalog (CHGV) consisting of 21,499 non‐redundant viral operational taxonomic units (vOTUs) that are significantly longer than those obtained by short‐read sequencing and contain ≈35% (7,675) complete genomes, ≈nine times the proportion in the Gut Virome Database (GVD, ≈4%, 1,443). Interestingly, the majority (≈60%, 13,356) of the CHGV vOTUs are obtained by either long‐read or hybrid assemblies, with little overlap with those assembled from the short‐read data alone. With this dataset, the vast diversity of the gut virome is elucidated, including the identification of 32% (6,962) novel vOTUs compared to public gut virome databases, dozens of phages that are more prevalent than the crAssphages and/or Gubaphages, and several viral clades that are more diverse than these two. Finally, the functional capacities of the CHGV‐encoded proteins are characterized and a viral‐host interaction network is constructed to facilitate future research and applications.
Keywords