Frontiers in Plant Science (Sep 2024)

Training set optimization is a feasible alternative for perennial orphan crop domestication and germplasm management: an Acrocomia aculeata example

  • Evellyn G. O. Couto,
  • Saulo F. S. Chaves,
  • Kaio Olimpio G. Dias,
  • Jonathan A. Morales-Marroquín,
  • Alessandro Alves-Pereira,
  • Sérgio Yoshimitsu Motoike,
  • Carlos Augusto Colombo,
  • Maria Imaculada Zucchi

DOI
https://doi.org/10.3389/fpls.2024.1441683
Journal volume & issue
Vol. 15

Abstract

Read online

Orphan perennial native species are gaining importance as sustainability in agriculture becomes crucial to mitigate climate change. Nevertheless, issues related to the undomesticated status and lack of improved germplasm impede the evolution of formal agricultural initiatives. Acrocomia aculeata - a neotropical palm with potential for oil production - is an example. Breeding efforts can aid the species to reach its full potential and increase market competitiveness. Here, we present genomic information and training set optimization as alternatives to boost orphan perennial native species breeding using Acrocomia aculeata as an example. Furthermore, we compared three SNP calling methods and, for the first time, presented the prediction accuracies of three yield-related traits. We collected data for two years from 201 wild individuals. These trees were genotyped, and three references were used for SNP calling: the oil palm genome, de novo sequencing, and the A. aculeata transcriptome. The traits analyzed were fruit dry mass (FDM), pulp dry mass (PDM), and pulp oil content (OC). We compared the predictive ability of GBLUP and BayesB models in cross- and real validation procedures. Afterwards, we tested several optimization criteria regarding consistency and the ability to provide the optimized training set that yielded less risk in both targeted and untargeted scenarios. Using the oil palm genome as a reference and GBLUP models had better results for the genomic prediction of FDM, OC, and PDM (prediction accuracies of 0.46, 0.45, and 0.39, respectively). Using the criteria PEV, r-score and core collection methodology provides risk-averse decisions. Training set optimization is an alternative to improve decision-making while leveraging genomic information as a cost-saving tool to accelerate plant domestication and breeding. The optimized training set can be used as a reference for the characterization of native species populations, aiding in decisions involving germplasm collection and construction of breeding populations

Keywords