BMC Genomics (Nov 2011)

Characterization of the genome of bald cypress

  • Liu Wenxuan,
  • Thummasuwan Supaphan,
  • Sehgal Sunish K,
  • Chouvarine Philippe,
  • Peterson Daniel G

DOI
https://doi.org/10.1186/1471-2164-12-553
Journal volume & issue
Vol. 12, no. 1
p. 553

Abstract

Read online

Abstract Background Bald cypress (Taxodium distichum var. distichum) is a coniferous tree of tremendous ecological and economic importance. It is a member of the family Cupressaceae which also includes cypresses, redwoods, sequoias, thujas, and junipers. While the bald cypress genome is more than three times the size of the human genome, its 1C DNA content is amongst the smallest of any conifer. To learn more about the genome of bald cypress and gain insight into the evolution of Cupressaceae genomes, we performed a Cot analysis and used Cot filtration to study Taxodium DNA. Additionally, we constructed a 6.7 genome-equivalent BAC library that we screened with known Taxodium genes and select repeats. Results The bald cypress genome is composed of 90% repetitive DNA with most sequences being found in low to mid copy numbers. The most abundant repeats are found in fewer than 25,000 copies per genome. Approximately 7.4% of the genome is single/low-copy DNA (i.e., sequences found in 1 to 5 copies). Sequencing of highly repetitive Cot clones indicates that most Taxodium repeats are highly diverged from previously characterized plant repeat sequences. The bald cypress BAC library consists of 606,336 clones (average insert size of 113 kb) and collectively provides 6.7-fold genome equivalent coverage of the bald cypress genome. Macroarray screening with known genes produced, on average, about 1.5 positive clones per probe per genome-equivalent. Library screening with Cot-1 DNA revealed that approximately 83% of BAC clones contain repetitive sequences iterated 103 to 104 times per genome. Conclusions The BAC library for bald cypress is the first to be generated for a conifer species outside of the family Pinaceae. The Taxodium BAC library was shown to be useful in gene isolation and genome characterization and should be an important tool in gymnosperm comparative genomics, physical mapping, genome sequencing, and gene/polymorphism discovery. The single/low-copy (SL) component of bald cypress is 4.6 times the size of the Arabidopsis genome. As suggested for other gymnosperms, the large amount of SL DNA in Taxodium is likely the result of divergence among ancient repeat copies and gene/pseudogene duplication.