ExAutoGP: Enhancing Genomic Prediction Stability and Interpretability with Automated Machine Learning and SHAP

Yao Rao; Lilian Zhang; Lutao Gao; Shuran Wang; Linnan Yang

doi:10.3390/ani15081172

Animals (Apr 2025)

Affiliations

Yao Rao: College of Big Data, Yunnan Agricultural University, Kunming 650201, China
Lilian Zhang: College of Big Data, Yunnan Agricultural University, Kunming 650201, China
Lutao Gao: College of Big Data, Yunnan Agricultural University, Kunming 650201, China
Shuran Wang: College of Big Data, Yunnan Agricultural University, Kunming 650201, China
Linnan Yang: College of Big Data, Yunnan Agricultural University, Kunming 650201, China

DOI: https://doi.org/10.3390/ani15081172
Journal volume & issue: Vol. 15, no. 8, p. 1172

Abstract

Machine learning has attracted considerable attention in genomic prediction because of its strong predictive capabilities, yet the limited interpretability of model decisions remains a major challenge. In this study, we propose a novel machine learning method, ExAutoGP, which combines automated machine learning (AutoML) with SHapley Additive exPlanations (SHAP) to improve the accuracy of genomic prediction and enhance model transparency. To evaluate ExAutoGP's effectiveness, we designed a comparative experiment on one simulated dataset and two real animal datasets. For each dataset, we applied ExAutoGP and five baseline models: Genomic Best Linear Unbiased Prediction (GBLUP), BayesB, Support Vector Regression (SVR), Kernel Ridge Regression (KRR), and Random Forest (RF). All models were trained and evaluated using five repeats of five-fold cross-validation, and their performance was assessed in terms of both predictive accuracy and computational efficiency. The results show that ExAutoGP delivers robust, strong prediction performance on all datasets. In addition, SHAP not only reveals the decision-making process of ExAutoGP, enhancing its interpretability, but also identifies genetic markers closely related to the traits. This study demonstrates the strong potential of AutoML in genomic prediction, while the introduction of SHAP provides actionable biological insights. The combination of high prediction accuracy and interpretability offers new perspectives for optimizing genomic selection strategies in livestock and poultry breeding.
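
The evaluation protocol summarized in the abstract (five repeats of five-fold cross-validation, followed by SHAP-based marker ranking) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the genotype matrix and phenotype are randomly generated toy data, closed-form ridge regression stands in for the models compared in the paper, predictive accuracy is taken to be the Pearson correlation between predicted and observed phenotypes, and SHAP values use the closed form for a linear model with features treated as independent, w_j * (x_ij - mean_j), rather than the shap library.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def repeated_kfold_accuracy(X, y, n_repeats=5, n_folds=5, lam=1.0, seed=0):
    """Pearson correlation on each held-out fold, for n_repeats reshuffled k-fold splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_repeats):
        # A fresh random permutation per repeat gives a different fold assignment.
        folds = np.array_split(rng.permutation(len(y)), n_folds)
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            w, b = ridge_fit(X[train], y[train], lam)
            pred = X[test] @ w + b
            scores.append(np.corrcoef(pred, y[test])[0, 1])
    return np.array(scores)


def linear_shap(X, w):
    """SHAP values of a linear model (independent-feature assumption)."""
    return (X - X.mean(axis=0)) * w


# Hypothetical toy data: 200 individuals, 50 markers coded 0/1/2,
# with only the first five markers affecting the trait.
rng = np.random.default_rng(42)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)
effects = np.zeros(p)
effects[:5] = [1.0, -0.8, 0.6, 0.5, -0.4]
y = X @ effects + rng.normal(scale=0.5, size=n)

# 5 repeats x 5 folds = 25 accuracy estimates; report mean and spread.
scores = repeated_kfold_accuracy(X, y)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

# Rank markers by mean absolute SHAP value of a model fitted on all data.
w, b = ridge_fit(X, y)
shap_values = linear_shap(X, w)
importance = np.abs(shap_values).mean(axis=0)
top_markers = np.argsort(importance)[::-1][:5]
print("top markers by mean |SHAP|:", top_markers)
```

The spread across the 25 fold scores gives a rough view of prediction stability, and the per-marker mean absolute SHAP value gives a global importance ranking. For non-linear models such as those an AutoML framework might select, SHAP values have no such simple closed form and would need a dedicated explainer.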

Published in Animals

ISSN: 2076-2615 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Agriculture: Animal culture: Veterinary medicine; Science: Zoology
Website: http://www.mdpi.com/journal/animals/
