Frontiers in Plant Science (Dec 2023)
Genomic prediction and allele mining of agronomic and morphological traits in pea (Pisum sativum) germplasm collections
Abstract
Well-performing genomic prediction (GP) models for polygenic traits and molecular marker sets for oligogenic traits could be useful for identifying promising genetic resources in germplasm collections, setting core collections, and establishing molecular variety distinction. This study aimed at (i) defining GP models and key marker sets for predicting 15 agronomic or morphological traits in germplasm collections, (ii) verifying the usefulness of the GP models for selection in breeding programs as well, (iii) investigating the consistency between molecular and phenotypic diversity patterns, and (iv) identifying genomic regions associated with the target traits. The study was based on phenotyping data and over 41,000 genotyping-by-sequencing-generated SNP markers of 220 landraces or old cultivars belonging to a world germplasm collection and 11 modern cultivars. Non-metric multi-dimensional scaling (NMDS) and an analysis of population genetic structure indicated a high level of genetic differentiation of material from Western Asia, a major West-East diversity gradient, and quite limited genetic diversity of the improved germplasm. Mantel’s test revealed a low correlation (r = 0.12) between phenotypic and molecular diversity, which increased (r = 0.45) when considering only the molecular diversity relative to significant SNPs from genome-wide association analyses. These analyses identified, inter alia, several areas of chromosome 6 involved in a largely pleiotropic control of vegetative or reproductive organ pigmentation. We found various significant SNPs for grain and straw yield under severe drought and for onset of flowering, and one SNP on chromosome 5 for grain protein content. GP models displayed moderately high predictive ability (0.43 to 0.61) for protein content, grain and straw yield, and onset of flowering, and high predictive ability (0.76) for individual seed weight, based on intra-population, intra-environment cross-validations.
The models trained on the germplasm collection were then assessed in an inter-population, inter-environment setting on breeding material from three recombinant inbred line (RIL) populations. This assessment was challenged by the much narrower diversity of the material, over eight-fold fewer available markers, and quite different test environments. It led to an overall loss of predictive ability of about 40% for seed weight, 50% for protein content and straw yield, and 60% for onset of flowering, and to no predictive ability for grain yield. Within-RIL population predictive ability differed among populations.
Keywords