Scientific Data (Apr 2025)
Genomes of Prochlorococcus, Synechococcus, bacteria, and viruses recovered from marine picocyanobacteria cultures based on Illumina and Qitan nanopore sequencing
Abstract
Prochlorococcus and Synechococcus are key contributors to marine primary production and play essential roles in global biogeochemical cycles. Despite the ecological importance of these two picocyanobacterial genera, current genomic datasets still lack adequate representation of under-sampled ocean regions and of the bacteria and viruses associated with these genera. To address this gap, we combined second- and third-generation sequencing technologies (Illumina and Qitan nanopore) to assemble genomic data from 105 picocyanobacterial enrichment cultures isolated from the Indian Ocean, the South China Sea, and the western Pacific Ocean. The dataset includes 55 Prochlorococcus and 50 Synechococcus genomes with high completeness (>98%) and low contamination (<2%), along with 308 non-redundant associated bacterial genomes derived from 1,457 medium- and high-quality non-cyanobacterial metagenome-assembled genomes (MAGs; completeness ≥50% and contamination ≤10%). Additionally, 2,113 non-redundant viral operational taxonomic units (vOTUs) were derived from a total of 7,632 qualified viral contigs. This dataset provides a valuable resource for improving our understanding of the complex interactions among Prochlorococcus, Synechococcus, and their associated bacteria and viruses in marine ecosystems, and offers a foundation for studying their ecological roles and evolutionary dynamics.
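The quality cut-offs quoted in the abstract can be expressed as a simple filter. The following is a minimal sketch only, not the authors' pipeline: the `GenomeBin` class, the function names, and the example values are hypothetical, and the completeness and contamination percentages are assumed to come from a separate quality-assessment tool that the abstract does not name.

```python
from dataclasses import dataclass


@dataclass
class GenomeBin:
    """Hypothetical record for one assembled genome or MAG."""
    name: str
    completeness: float   # percent, assumed to come from an external quality tool
    contamination: float  # percent, assumed to come from an external quality tool


def is_medium_or_high_quality_mag(b: GenomeBin) -> bool:
    # Threshold stated in the abstract for associated bacterial MAGs:
    # completeness >= 50% and contamination <= 10%.
    return b.completeness >= 50.0 and b.contamination <= 10.0


def meets_picocyanobacterial_quality(b: GenomeBin) -> bool:
    # Quality reported in the abstract for the Prochlorococcus and
    # Synechococcus genomes: completeness > 98% and contamination < 2%.
    return b.completeness > 98.0 and b.contamination < 2.0


# Made-up example values, for illustration only.
bins = [
    GenomeBin("bin_a", 99.2, 0.5),
    GenomeBin("bin_b", 62.0, 8.0),
    GenomeBin("bin_c", 45.0, 3.0),
]
kept_mags = [b.name for b in bins if is_medium_or_high_quality_mag(b)]
```

With these invented values, `bin_a` and `bin_b` pass the MAG threshold, and only `bin_a` also meets the stricter picocyanobacterial criteria.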