The Plant Genome (Mar 2015)

Efficient Use of Historical Data for Genomic Selection: A Case Study of Stem Rust Resistance in Wheat

  • J. Rutkoski,
  • R. P. Singh,
  • J. Huerta-Espino,
  • S. Bhavani,
  • J. Poland,
  • J. L. Jannink,
  • M. E. Sorrells

DOI
https://doi.org/10.3835/plantgenome2014.09.0046
Journal volume & issue
Vol. 8, no. 1

Abstract

Genomic selection (GS) is a methodology that can improve crop breeding efficiency. To implement GS, a training population (TP) with phenotypic and genotypic data is required to train a statistical model used to predict genotyped selection candidates (SCs). A key factor impacting prediction accuracy is the relationship between the TP and the SCs. This study used empirical data for quantitative adult plant resistance to stem rust of wheat (Triticum aestivum L.) to investigate the utility of a historical TP (TP_H) compared with a population-specific TP (TP_PS), the potential for TP_H optimization, and the utility of TP_H data when close-relative data are available for training. We found that, depending on the population size, a TP_PS was 1.5 to 4.4 times more accurate than a TP_H, and TP_H optimization based on the mean of the generalized coefficient of determination or prediction error variance enabled the selection of subsets that led to significantly higher accuracy than randomly selected subsets. Retaining historical data when data on close relatives were available led to an 11.9% increase in accuracy, at best, and a 12% decrease in accuracy, at worst, depending on the heritability. We conclude that historical data could be used successfully to initiate a GS program, especially if the dataset is very large and of high heritability. Training population optimization would be useful for the identification of TP_H subsets to phenotype additional traits. However, after model updating, discarding historical data may be warranted. More studies are needed to determine if these observations represent general trends.
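As a rough illustration of the workflow the abstract describes (train a model on a TP, then predict genotyped SCs), the following is a minimal sketch using ridge regression on marker genotypes. It is not the authors' model or data: the genotypes and phenotypes are simulated, the function names (`ridge_marker_effects`, `predict`, `simulate`) are hypothetical, and the shrinkage parameter `lam` is set by hand rather than estimated. Accuracy is computed here as the correlation between predictions and the observed phenotypes of the candidates, which is one common convention.

```python
import numpy as np


def ridge_marker_effects(X_train, y_train, lam):
    """Estimate marker effects by ridge regression.

    Solves (X'X + lam * I) beta = X'(y - mean(y)).
    `lam` is a shrinkage parameter chosen by hand (assumption).
    """
    mu = y_train.mean()
    n_markers = X_train.shape[1]
    lhs = X_train.T @ X_train + lam * np.eye(n_markers)
    rhs = X_train.T @ (y_train - mu)
    beta = np.linalg.solve(lhs, rhs)
    return mu, beta


def predict(X, mu, beta):
    """Predicted value for each genotyped individual."""
    return mu + X @ beta


def simulate(seed=0, n_train=200, n_cand=50, n_markers=300, h2=0.5):
    """Simulated data only: marker scores 0/1/2 and an additive trait."""
    rng = np.random.default_rng(seed)
    n_total = n_train + n_cand
    X = rng.integers(0, 3, size=(n_total, n_markers)).astype(float)
    X -= X[:n_train].mean(axis=0)  # centre on training-set marker means
    effects = rng.normal(0.0, 1.0, n_markers)
    g = X @ effects  # true genetic values
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)
    y = g + rng.normal(0.0, noise_sd, n_total)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


# Train on the training population, then predict the selection candidates.
X_tp, y_tp, X_sc, y_sc = simulate()
mu, beta = ridge_marker_effects(X_tp, y_tp, lam=200.0)
gebv = predict(X_sc, mu, beta)

# Accuracy as the correlation between predictions and observed phenotypes.
accuracy = np.corrcoef(gebv, y_sc)[0, 1]
print(f"prediction accuracy (r): {accuracy:.2f}")
```

In this sketch the TP and SCs are drawn from the same simulated population. Comparing a historical with a population-specific TP would amount to changing which individuals are passed as `X_tp` and `y_tp`.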