PLoS Biology (Mar 2007)

The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific.

  • Douglas B Rusch,
  • Aaron L Halpern,
  • Granger Sutton,
  • Karla B Heidelberg,
  • Shannon Williamson,
  • Shibu Yooseph,
  • Dongying Wu,
  • Jonathan A Eisen,
  • Jeff M Hoffman,
  • Karin Remington,
  • Karen Beeson,
  • Bao Tran,
  • Hamilton Smith,
  • Holly Baden-Tillson,
  • Clare Stewart,
  • Joyce Thorpe,
  • Jason Freeman,
  • Cynthia Andrews-Pfannkoch,
  • Joseph E Venter,
  • Kelvin Li,
  • Saul Kravitz,
  • John F Heidelberg,
  • Terry Utterback,
  • Yu-Hui Rogers,
  • Luisa I Falcón,
  • Valeria Souza,
  • Germán Bonilla-Rosso,
  • Luis E Eguiarte,
  • David M Karl,
  • Shubha Sathyendranath,
  • Trevor Platt,
  • Eldredge Bermingham,
  • Victor Gallardo,
  • Giselle Tamayo-Castillo,
  • Michael R Ferrari,
  • Robert L Strausberg,
  • Kenneth Nealson,
  • Robert Friedman,
  • Marvin Frazier,
  • J Craig Venter

DOI
https://doi.org/10.1371/journal.pbio.0050077
Journal volume & issue
Vol. 5, no. 3
p. e77

Abstract

The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand-km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity, with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed "fragment recruitment," addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed "extreme assembly," made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference.
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.
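
The idea behind "fragment recruitment" can be illustrated with a toy example: each read is placed at its best-matching position on a reference genome, and the position and percent identity of that placement are recorded. The sketch below is not the authors' implementation; it assumes ungapped, same-strand matching by brute-force scan, and the function names `percent_identity` and `recruit` and the `min_identity` parameter are hypothetical, introduced here only for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def recruit(read: str, reference: str, min_identity: float = 0.0):
    """Place a read at its best ungapped position on the reference.

    Returns (position, percent_identity), or None when the best placement
    falls below min_identity (an illustrative cutoff, not a published value).
    Ties are resolved in favour of the leftmost position.
    """
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        ident = percent_identity(read, window)
        if best is None or ident > best[1]:
            best = (pos, ident)
    if best is None or best[1] < min_identity:
        return None
    return best


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCA"  # made-up reference sequence
    for read in ("TTGACC", "TTGTCC"):
        print(read, recruit(read, reference))
```

Plotting the recorded position against the percent identity for many reads gives the kind of scatter from which variation within a population can be read off. A real analysis would use a gapped aligner and consider both strands.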