G3: Genes, Genomes, Genetics (Jul 2020)
Training Population Optimization for Genomic Selection in Miscanthus
Abstract
Miscanthus is a perennial grass with potential for lignocellulosic ethanol production. To ensure its utility for this purpose, breeding efforts should focus on increasing genetic diversity of the nothospecies Miscanthus × giganteus (M×g) beyond the single clone used in many programs. Germplasm from the corresponding parental species M. sinensis (Msi) and M. sacchariflorus (Msa) could theoretically be used as training sets for genomic prediction of M×g clones with optimal genomic estimated breeding values for biofuel traits. To this end, we first showed that subpopulation structure makes a substantial contribution to the genomic selection (GS) prediction accuracies within a 538-member diversity panel of predominately Msi individuals and a 598-member diversity panels of Msa individuals. We then assessed the ability of these two diversity panels to train GS models that predict breeding values in an interspecific diploid 216-member M×g F2 panel. Low and negative prediction accuracies were observed when various subsets of the two diversity panels were used to train these GS models. To overcome the drawback of having only one interspecific M×g F2 panel available, we also evaluated prediction accuracies for traits simulated in 50 simulated interspecific M×g F2 panels derived from different sets of Msi and diploid Msa parents. The results revealed that genetic architectures with common causal mutations across Msi and Msa yielded the highest prediction accuracies. Ultimately, these results suggest that the ideal training set should contain the same causal mutations segregating within interspecific M×g populations, and thus efforts should be undertaken to ensure that individuals in the training and validation sets are as closely related as possible.
Keywords