Two-step genomic prediction using artificial neural networks - an effective strategy for reducing computational costs and increasing prediction accuracy

Maurício de Oliveira Celeri; Cynthia Aparecida Valiati Barreto; Wagner Faria Barbosa; Leísa Pires Lima; Lucas Souza da Silveira; Ana Carolina Campana Nascimento; Moyses Nascimento; Camila Ferreira Azevedo

doi:10.4025/actasciagron.v47i1.69089

Acta Scientiarum: Agronomy (Nov 2024)

Affiliations

Maurício de Oliveira Celeri: Universidade Federal de Viçosa
Cynthia Aparecida Valiati Barreto: Universidade Federal de Viçosa
Wagner Faria Barbosa: Universidade Federal de Viçosa
Leísa Pires Lima: Instituto Federal de Educação, Ciência e Tecnologia do Sudeste de Minas Gerais
Lucas Souza da Silveira: Universidade Federal de Viçosa
Ana Carolina Campana Nascimento: Universidade Federal de Viçosa
Moyses Nascimento: Universidade Federal de Viçosa
Camila Ferreira Azevedo: Universidade Federal de Viçosa

DOI: https://doi.org/10.4025/actasciagron.v47i1.69089
Journal volume & issue: Vol. 47, no. 1

Abstract

Artificial neural networks (ANNs) are powerful nonparametric tools for obtaining genomic estimated breeding values (GEBVs) in genetic breeding. One significant advantage of ANNs is their ability to make predictions without requiring prior assumptions about the data distribution or the relationship between genotype and phenotype. However, ANNs come with a high computational cost, and their predictions may be underestimated when all molecular markers are included. This study proposes a two-step genomic prediction procedure using ANNs to address these challenges. In the first step, molecular markers were selected either directly through Multivariate Adaptive Regression Splines (MARS) or indirectly based on their importance, identified through Boosting, retaining the top 5, 20, or 50% of markers with the highest significance. In the second step, the selected markers were used for genomic prediction with ANNs. This approach was applied to two simulated traits: one with ten trait-controlling loci and a heritability of 0.4 (Scenario SC1) and the other with 100 trait-controlling loci and a heritability of 0.2 (Scenario SC2). ANN predictions with marker selection were compared against ANN predictions without any marker selection. Reducing the number of markers proved to be an efficient strategy, resulting in improved accuracy, reduced mean squared error (MSE), and shorter model-fitting times. The best ANN predictions were obtained with ten markers selected by MARS in SC1 and with the top 5% most relevant markers selected using Boosting in SC2. In SC1, predictions using MARS achieved an increase in accuracy of more than 31% and a 90% reduction in MSE. In SC2, predictions using Boosting achieved an increase in accuracy of more than 15% and an 83% reduction in MSE. In both scenarios, computational time was up to ten times shorter with marker selection. Overall, the two-step prediction procedure emerged as an effective strategy for enhancing the computational and predictive performance of ANN models.
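
The two-step idea in the abstract (rank and select markers, then fit an ANN on the selected markers only) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy data are much smaller than any realistic marker panel, componentwise L2 boosting stands in for the Boosting step, the ANN is a single-hidden-layer network trained by plain gradient descent, and accuracy is taken to be the correlation between predictions and the simulated true genetic values. The names boosting_importance, fit_ann, and evaluate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data, far smaller than the simulation in the study.
n, p, n_qtl, h2 = 300, 200, 10, 0.4
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
qtl = rng.choice(p, size=n_qtl, replace=False)      # trait-controlling loci
g = X[:, qtl] @ rng.normal(size=n_qtl)              # true genetic values
y = g + rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)

train, test = np.arange(n) < 210, np.arange(n) >= 210
Z = (X - X[train].mean(0)) / X[train].std(0)        # scaled with training statistics


def boosting_importance(Z, y, n_rounds=200, lr=0.1):
    """Componentwise L2 boosting; importance = squared-error reduction per marker."""
    resid = y - y.mean()
    importance = np.zeros(Z.shape[1])
    for _ in range(n_rounds):
        beta = Z.T @ resid / len(y)      # per-marker slope (columns have unit variance)
        j = np.argmax(np.abs(beta))      # marker giving the largest error reduction
        sse = resid @ resid
        resid = resid - lr * beta[j] * Z[:, j]
        importance[j] += sse - resid @ resid
    return importance


def fit_ann(Z, y, hidden=8, epochs=1000, lr=0.05, seed=1):
    """One-hidden-layer tanh network fitted by full-batch gradient descent on MSE."""
    r = np.random.default_rng(seed)
    ym, ys = y.mean(), y.std()
    t = (y - ym) / ys                    # train on a standardized target
    W1 = r.normal(scale=0.1, size=(Z.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = r.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)
        err = H @ W2 + b2 - t
        dH = np.outer(err, W2) * (1 - H ** 2)   # backpropagated error at the hidden layer
        W2 -= lr * H.T @ err / len(t)
        b2 -= lr * err.mean()
        W1 -= lr * Z.T @ dH / len(t)
        b1 -= lr * dH.mean(0)
    return lambda Znew: (np.tanh(Znew @ W1 + b1) @ W2 + b2) * ys + ym


# Step 1: rank markers on the training set only and keep the top 5%.
importance = boosting_importance(Z[train], y[train])
selected = np.argsort(importance)[::-1][: max(1, p // 20)]


# Step 2: fit the ANN on a marker subset and score it on held-out individuals.
def evaluate(cols):
    predict = fit_ann(Z[train][:, cols], y[train])
    pred = predict(Z[test][:, cols])
    accuracy = np.corrcoef(pred, g[test])[0, 1]     # assumed accuracy definition
    mse = np.mean((pred - g[test]) ** 2)
    return accuracy, mse


results = {"all markers": evaluate(np.arange(p)),
           "top 5% (boosting)": evaluate(selected)}
for name, (acc, mse) in results.items():
    print(f"{name}: accuracy={acc:.2f}, MSE={mse:.2f}")
```

Marker ranking uses only the training individuals so that the held-out scores are not inflated by selection on the test data. The numbers printed by this toy example depend on the random seed and settings and should not be read as a reproduction of the reported results.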

Keywords: multivariate adaptive regression splines; boosting; artificial neural network; genetic breeding.

Published in Acta Scientiarum: Agronomy

ISSN: 1679-9275 (Print); 1807-8621 (Online)
Publisher: Eduem (Editora da Universidade Estadual de Maringá)
Country of publisher: Brazil
LCC subjects: Agriculture: Agriculture (General)
Website: https://periodicos.uem.br/ojs/index.php/ActaSciAgron
