Frontiers in Ecology and Evolution (Nov 2014)

Hundreds of SNPs versus dozens of SSRs: which dataset better characterizes natural clonal lineages in a self-fertilizing fish?

  • Felix Mesak,
  • Andrey Tatarenkov,
  • Ryan L. Earley,
  • John C. Avise

DOI
https://doi.org/10.3389/fevo.2014.00074
Journal volume & issue
Vol. 2

Abstract

For more than two decades, mitochondrial DNA sequences and simple sequence repeats (SSRs, or microsatellite loci) have served as gold standards in population genetics. More recently, next-generation sequencing (NGS) has enabled researchers to address biological questions that can benefit from hundreds or even thousands of nuclear single-nucleotide polymorphisms (SNPs) generated by restriction-site associated DNA sequencing (RAD-seq). Here we compare the performance of SSR and RAD-seq SNP methods in characterizing clonal patterns in a self-fertilizing and highly inbred killifish, Kryptolebias marmoratus (mangrove rivulus), in Florida. RAD-seq analyses conducted on 18 inbred lineages of mangrove rivulus obtained from western Florida and a distant location in eastern Florida revealed 481 polymorphic RAD loci, of which 129 were homozygous within individuals and 352 were heterozygous in at least one individual. An initial UPGMA phenogram based on 32 microsatellite loci was constructed and used as a benchmark for comparison with SNP-based phenograms built under several different criteria for SNP selection. A phenogram produced by the homozygous SNPs was in excellent agreement with the one generated from the 32 microsatellite loci. However, heterozygous SNP data and RAD loci with more than one polymorphic site contributed more noise than usable signal and were unable to resolve clades consistently. This is likely due to errors in identifying homologous loci in the absence of a reference genome. In summary, although the RAD data were powerful in distinguishing the clonal lineages identified by SSR analyses, they also carried considerable phylogenetic noise. Our results suggest that RAD-seq methods should be used with caution for inferring fine-scale population structure, and that stringent quality controls are necessary to reduce false phylogenetic signals.
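
The filtering-then-clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy genotypes, the function names, and the distance measure (proportion of differing loci) are all assumptions made for illustration, since the abstract does not state which distance was used. The sketch keeps only loci that are homozygous in every individual, then builds a UPGMA (average-linkage) tree from the resulting pairwise distances.

```python
from itertools import combinations

# Hypothetical toy genotypes: one (allele, allele) pair per locus, per fish.
genotypes = {
    "fish_A": [("A", "A"), ("C", "C"), ("G", "T"), ("T", "T")],
    "fish_B": [("A", "A"), ("C", "C"), ("G", "G"), ("T", "T")],
    "fish_C": [("G", "G"), ("T", "T"), ("G", "G"), ("T", "T")],
    "fish_D": [("G", "G"), ("T", "T"), ("T", "T"), ("C", "C")],
}

def homozygous_loci(genos):
    """Indices of loci at which every individual is homozygous."""
    n_loci = len(next(iter(genos.values())))
    return [i for i in range(n_loci)
            if all(g[i][0] == g[i][1] for g in genos.values())]

def distance(g1, g2, loci):
    """Assumed distance: proportion of selected loci where genotypes differ."""
    return sum(g1[i] != g2[i] for i in loci) / len(loci)

def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree."""
    clusters = {n: 1 for n in names}   # cluster -> number of leaves it holds
    d = dict(dist)                     # frozenset({a, b}) -> distance
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        # Distance to the new cluster is the size-weighted mean of the old ones.
        for c in clusters:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))

loci = homozygous_loci(genotypes)
dist = {frozenset(p): distance(genotypes[p[0]], genotypes[p[1]], loci)
        for p in combinations(genotypes, 2)}
tree = upgma(list(genotypes), dist)
print(loci)
print(tree)
```

In this toy data the third locus is dropped because one fish is heterozygous there, mirroring the abstract's observation that the homozygous SNPs carried the cleaner signal. A real analysis would more likely rely on an established clustering library and a distance measure suited to the marker type.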

Keywords