GCB Bioenergy (Jul 2024)

Improving precision and accuracy of genetic mapping with genotyping‐by‐sequencing data in outcrossing species

  • Nicholas R. LaBonte,
  • Dessireé P. Zerpa‐Catanho,
  • Siyao Liu,
  • Liang Xiao,
  • Hongxu Dong,
  • Lindsay V. Clark,
  • Erik J. Sacks

DOI
https://doi.org/10.1111/gcbb.13167
Journal volume & issue
Vol. 16, no. 7

Abstract

Genotyping‐by‐sequencing (GBS) is a widely used strategy for obtaining large numbers of genetic markers in model and non‐model organisms. In crop plants, GBS‐derived marker datasets are frequently used for quantitative trait locus (QTL) mapping. In some plant species, however, high heterozygosity and complex genome structure mean that GBS data must be handled with care for QTL mapping to be effective. Such outbred crops include most of the perennial grass and tree species used for bioenergy. To identify strategies for increasing the accuracy and precision of QTL mapping with GBS data in outbred crops, we conducted an empirical study of SNP‐calling and genetic map‐building pipeline parameters in a Miscanthus sinensis population, together with a complementary simulation study to estimate the relationship between genome‐wide error rate, read depth, and marker number. The bioenergy grass Miscanthus is an obligate outcrossing species with a recent (diploidized) whole‐genome duplication. For the empirical M. sinensis data, we compared two SNP‐calling methods (one non‐reference‐based and one reference‐based), a series of depth filters (12×, 20×, 30×, and 40×), and two map‐construction methods that differ in marker ordering (linkage‐only, and order‐corrected based on a reference genome). We found that correcting the order of markers on a linkage map against a high‐quality reference genome improved QTL precision (shorter confidence intervals). For typical GBS datasets of 1000 to 5000 markers used to build a genetic map of a biparental population, a depth filter of 30× to 40× applied to outbred populations gave a genome‐wide genotype‐calling error rate below 1%, improved the accuracy of QTL point estimates, and minimized type I errors in QTL identification. Based on these results, we recommend using a reference genome to correct the marker order of genetic maps and applying a robust genotype depth filter to improve QTL mapping in outbred crops.
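The depth filters compared in the abstract can be illustrated with a minimal sketch. It assumes a depth filter means setting a genotype call to missing when its read depth falls below a threshold; the abstract does not describe the authors' actual pipeline. The function names `apply_depth_filter` and `missing_rate`, the genotype codes, and the example depths are hypothetical and are not data from the study.

```python
from typing import List, Optional, Sequence


def apply_depth_filter(
    genotypes: Sequence[Optional[str]],
    depths: Sequence[int],
    min_depth: int,
) -> List[Optional[str]]:
    """Return genotype calls with low-depth calls masked as missing (None).

    Assumption: a call is kept only when its read depth is at least
    min_depth; this is an illustration, not the authors' pipeline.
    """
    if len(genotypes) != len(depths):
        raise ValueError("genotypes and depths must have the same length")
    return [g if d >= min_depth else None for g, d in zip(genotypes, depths)]


def missing_rate(calls: Sequence[Optional[str]]) -> float:
    """Fraction of calls that are missing after filtering."""
    if not calls:
        return 0.0
    return sum(c is None for c in calls) / len(calls)


# Hypothetical calls at one marker for five individuals (not study data).
calls = ["AB", "AA", "AB", "BB", "AB"]
depths = [45, 12, 31, 8, 22]

# The four thresholds named in the abstract.
for threshold in (12, 20, 30, 40):
    kept = apply_depth_filter(calls, depths, threshold)
    print(f"{threshold}x: {kept} missing={missing_rate(kept):.2f}")
```

In this toy example, each stricter threshold masks more calls. A higher depth filter therefore trades lower expected genotype-calling error against more missing data.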

Keywords