PLoS Computational Biology (Nov 2011)

The statistics of bulk segregant analysis using next generation sequencing.

  • Paul M Magwene,
  • John H Willis,
  • John K Kelly

DOI
https://doi.org/10.1371/journal.pcbi.1002255
Journal volume & issue
Vol. 7, no. 11
p. e1002255

Abstract

We describe a statistical framework for QTL mapping using bulk segregant analysis (BSA) based on high-throughput, short-read sequencing. Our proposed approach is based on a smoothed version of the standard G statistic, and takes into account variation in allele frequency estimates due to sampling of segregants to form bulks as well as variation introduced during the sequencing of bulks. Using simulation, we explore the impact of key experimental variables such as bulk size and sequencing coverage on the ability to detect QTLs. Counterintuitively, we find that relatively large bulks maximize the power to detect QTLs even though this implies weaker selection and less extreme allele frequency differences. Our simulation studies suggest that with large bulks and sufficient sequencing depth, the methods we propose can be used to detect even weak-effect QTLs, and we demonstrate the utility of this framework by application to a BSA experiment in the budding yeast Saccharomyces cerevisiae.
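
To make the idea of a smoothed G statistic concrete, here is a minimal sketch. It assumes each variant site yields a 2x2 table of allele read counts (two alleles in two bulks). It is not the authors' implementation: the function names `g_statistic` and `smooth_along_chromosome`, the tricube weighting, and the `half_width` parameter are illustrative assumptions, since the abstract does not specify the smoothing kernel or window.

```python
import math


def g_statistic(a_low, b_low, a_high, b_high):
    """Standard G (likelihood-ratio) statistic for a 2x2 table of allele
    read counts: alleles a/b observed in the low and high bulks."""
    observed = [[a_low, b_low], [a_high, b_high]]
    row_sums = [a_low + b_low, a_high + b_high]
    col_sums = [a_low + a_high, b_low + b_high]
    total = sum(row_sums)
    if total == 0:
        return 0.0  # no reads at this site, so no evidence either way
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g += o * math.log(o / expected)
    return 2.0 * g


def smooth_along_chromosome(positions, values, half_width):
    """Weighted average of per-site values within +/- half_width of each
    site. The tricube kernel is an assumption made for this sketch."""
    smoothed = []
    for x0 in positions:
        num = 0.0
        den = 0.0
        for x, v in zip(positions, values):
            d = abs(x - x0) / half_width
            if d < 1.0:
                w = (1.0 - d ** 3) ** 3
                num += w * v
                den += w
        smoothed.append(num / den)  # den > 0: the site itself has weight 1
    return smoothed


# Hypothetical read counts at three linked sites, for illustration only.
sites = [1000, 2000, 3000]
g_values = [
    g_statistic(15, 5, 5, 15),
    g_statistic(14, 6, 7, 13),
    g_statistic(10, 10, 10, 10),
]
g_smoothed = smooth_along_chromosome(sites, g_values, half_width=2500)
```

Averaging across neighboring sites borrows strength from linkage: a true QTL shifts allele frequencies at many nearby markers, whereas sampling and sequencing noise at individual sites tends to average out.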