Animals (Apr 2025)

ExAutoGP: Enhancing Genomic Prediction Stability and Interpretability with Automated Machine Learning and SHAP

  • Yao Rao,
  • Lilian Zhang,
  • Lutao Gao,
  • Shuran Wang,
  • Linnan Yang

DOI
https://doi.org/10.3390/ani15081172
Journal volume & issue
Vol. 15, no. 8
p. 1172

Abstract

Machine learning has attracted considerable attention in genomic prediction because of its strong predictive capabilities, yet the limited interpretability of model decisions remains a major challenge. In this study, we propose a novel machine learning method, ExAutoGP, which aims to improve the accuracy of genomic prediction and enhance model transparency by combining automated machine learning (AutoML) with SHapley Additive exPlanations (SHAP). To evaluate ExAutoGP’s effectiveness, we designed a comparative experiment using one simulated dataset and two real animal datasets. For each dataset, we applied ExAutoGP and five baseline models: Genomic Best Linear Unbiased Prediction (GBLUP), BayesB, Support Vector Regression (SVR), Kernel Ridge Regression (KRR), and Random Forest (RF). All models were trained and evaluated using five repeats of five-fold cross-validation, and their performance was assessed in terms of both predictive accuracy and computational efficiency. The results show that ExAutoGP delivers robust, high prediction performance on all datasets. In addition, SHAP not only reveals the decision-making process of ExAutoGP, enhancing its interpretability, but also identifies genetic markers closely related to the traits. This study demonstrates the strong potential of AutoML in genomic prediction, while the introduction of SHAP provides actionable biological insights. The combination of high prediction accuracy and interpretability offers new perspectives for optimizing genomic selection strategies in livestock and poultry breeding.
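The evaluation protocol named in the abstract, five repeats of five-fold cross-validation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the simulated genotypes and trait, the helper names (`repeated_kfold`, `pearson`, `fit_marginal`, `predict`), the toy per-marker regression that stands in for a real model such as GBLUP or ExAutoGP, and the use of the Pearson correlation between predicted and observed values as the accuracy measure.

```python
import random
from statistics import mean


def repeated_kfold(n, k=5, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        folds = [idx[f::k] for f in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield r, f, train, folds[f]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def fit_marginal(X, y):
    """Toy stand-in model (hypothetical): one least-squares slope per marker."""
    my = mean(y)
    means, slopes = [], []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mj = mean(col)
        var = sum((v - mj) ** 2 for v in col)
        cov = sum((v - mj) * (t - my) for v, t in zip(col, y))
        means.append(mj)
        slopes.append(cov / var if var > 0 else 0.0)
    return my, means, slopes


def predict(model, X):
    """Predict phenotypes as the training mean plus summed marker contributions."""
    my, means, slopes = model
    return [my + sum(s * (v - m) for s, v, m in zip(slopes, row, means))
            for row in X]


# Simulated data (assumed): 200 individuals, 20 markers coded 0/1/2,
# additive marker effects plus Gaussian noise.
rng = random.Random(1)
n, m = 200, 20
X = [[rng.choice((0, 1, 2)) for _ in range(m)] for _ in range(n)]
effects = [rng.gauss(0, 1) for _ in range(m)]
y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0, 1) for row in X]

# Five repeats of five-fold CV give 25 train/test evaluations.
scores = []
for r, f, train, test in repeated_kfold(n, k=5, repeats=5, seed=0):
    model = fit_marginal([X[i] for i in train], [y[i] for i in train])
    pred = predict(model, [X[i] for i in test])
    scores.append(pearson(pred, [y[i] for i in test]))

print(f"{len(scores)} folds, mean accuracy = {mean(scores):.3f}")
```

Reshuffling before each repeat is what distinguishes repeated cross-validation from running a single partition several times: every repeat uses a different split of the individuals into folds, so averaging over the 25 scores reduces the dependence of the reported accuracy on any one partition.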

Keywords