BMC Plant Biology (Oct 2009)
Sampling nucleotide diversity in cotton
Abstract
Background: Cultivated cotton is an annual fiber crop derived mainly from two perennial species, Gossypium hirsutum L. (upland cotton) and G. barbadense L. (extra-long-staple Pima or Egyptian cotton). These two cultivated species are among five allotetraploid species presumably derived monophyletically from a hybridization between G. arboreum and G. raimondii. Genomics-based approaches have been hindered by the limited variation within species. Yet population-based methods are being used for genome-wide introgression of novel alleles from G. mustelinum and G. tomentosum into G. hirsutum using combinations of backcrossing, selfing, and inter-mating. Recombinant inbred line populations from the genetic standards TM-1 (G. hirsutum) × 3-79 (G. barbadense) have been developed to allow high-density genetic mapping of traits.

Results: This paper describes a strategy to efficiently characterize genomic variation (SNPs and indels) within and among cotton species. Over 1000 SNPs from 270 loci and 279 indels from 92 loci segregating in G. hirsutum and G. barbadense were genotyped across a standard panel of 24 lines, 16 of which are elite cotton breeding lines and 8 of which are mapping parents of populations from six cotton species. Over 200 loci were genetically mapped in a core mapping population derived from TM-1 and 3-79 and in G. hirsutum breeding germplasm.

Conclusion: In this research, SNP and indel diversity is characterized for 270 single-copy polymorphic loci in cotton. A strategy for SNP discovery is defined that pre-screens loci for copy number and polymorphism. Our data indicate that the A and D genomes in both diploid and tetraploid cotton remain distinct from each other, such that paralogs can be distinguished. This research provides mapped DNA markers for intra-specific crosses and introgression of exotic germplasm in cotton.