Training set design in genomic prediction with multiple biparental families

Xintian Zhu; Willmar L. Leiser; Volker Hahn; Tobias Würschum

doi:10.1002/tpg2.20124

The Plant Genome (Nov 2021)

Affiliations

Xintian Zhu: State Plant Breeding Institute, Univ. of Hohenheim, Stuttgart 70593, Germany
Willmar L. Leiser: State Plant Breeding Institute, Univ. of Hohenheim, Stuttgart 70593, Germany
Volker Hahn: State Plant Breeding Institute, Univ. of Hohenheim, Stuttgart 70593, Germany
Tobias Würschum: Institute of Plant Breeding, Seed Science and Population Genetics, Univ. of Hohenheim, Stuttgart 70593, Germany

Journal volume & issue: Vol. 14, no. 3

Abstract

Genomic selection is a powerful tool to reduce the cycle length and enhance the genetic gain of complex traits in plant breeding. However, questions remain about the optimum design and composition of the training set. In this study, we used 944 soybean [Glycine max (L.) Merr.] recombinant inbred lines from eight families derived through a partial–diallel mating design among five parental lines. The cross‐validated prediction accuracies for the six traits seed yield, 1,000‐seed weight, protein yield, plant height, protein content, and oil content were high, ranging from 0.79 to 0.87. We investigated among‐family predictions, making use of the special mating design with different degrees of relatedness among families. Generally, the prediction accuracy decreased from full‐sibs to half‐sib families to unrelated families. However, half‐sib and unrelated families also showed substantial variation in their prediction accuracy for a given family, which appeared to be caused at least in part by the shared segregation of quantitative trait loci in both the training and prediction sets. Combining several half‐sib families in composite training sets generally led to an increase in the prediction accuracy compared with the best family alone. The prediction accuracy increased with the size of the training set, but for comparable prediction accuracy, substantially more half‐sibs were required than full‐sibs. Collectively, our results highlight the potential of genomic selection for soybean breeding and, in a broader context, illustrate the importance of the targeted design of the training set.

Published in The Plant Genome

ISSN: 1940-3372 (Online)
Publisher: Wiley
Country of publisher: United States
LCC subjects: Agriculture: Plant culture; Science: Biology (General): Genetics
Website: https://acsess.onlinelibrary.wiley.com/journal/19403372
