BMC Bioinformatics (Mar 2024)

Ensemble learning for integrative prediction of genetic values with genomic variants

  • Lin-Lin Gu,
  • Run-Qing Yang,
  • Zhi-Yong Wang,
  • Dan Jiang,
  • Ming Fang

DOI
https://doi.org/10.1186/s12859-024-05720-x
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 18

Abstract


Background: Whole-genome variants offer sufficient information for genetic prediction of human disease risk and for prediction of animal and plant breeding values. Many sophisticated statistical methods have been developed to enhance predictive ability. However, each method has its own advantages and disadvantages, and so far no single method outperforms all the others.

Results: We propose an Ensemble Learning method for Prediction of Genetic Values (ELPGV), which combines predictions from several basic methods, such as GBLUP, BayesA, BayesB and BayesCπ, to produce more accurate predictions. We validated ELPGV on a variety of well-known datasets and a series of simulated datasets. In all cases, ELPGV achieved significantly higher predictive ability than any of the basic methods; for instance, the p-values for comparisons of ELPGV against the basic methods ranged from 4.853E−118 to 9.640E−20 for the WTCCC dataset.

Conclusions: ELPGV integrates the merits of the individual methods to produce significantly higher predictive ability than any basic method. It is simple to implement, fast to run, and does not require genotype data. It is promising for wide application in genetic prediction.
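The idea of combining predictions from several basic methods, using only the predictions and no genotype data, can be illustrated with a minimal sketch. The abstract does not say how ELPGV chooses its weights, so the code below is not the authors' algorithm: it assumes an ordinary least-squares combination with an intercept, fitted on simulated data. The function names, noise levels and sample size are all hypothetical.

```python
import numpy as np


def fit_ensemble_weights(base_preds, y):
    """Fit an intercept plus one weight per base method by least squares.

    base_preds: array of shape (n_individuals, n_methods), one column of
                predicted genetic values per base method.
    y:          array of shape (n_individuals,), observed values.
    ASSUMPTION: least squares stands in for whatever weighting ELPGV uses.
    """
    design = np.column_stack([np.ones(len(y)), base_preds])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the method weights


def ensemble_predict(base_preds, coef):
    """Combine base predictions using the fitted intercept and weights."""
    return coef[0] + base_preds @ coef[1:]


def demo(seed=0):
    """Simulate four noisy base predictors and compare them with the ensemble."""
    rng = np.random.default_rng(seed)
    n = 500
    target = rng.normal(size=n)  # simulated values to be predicted
    # Hypothetical stand-ins for four base methods: target plus independent noise.
    noise_sd = (0.8, 1.0, 1.2, 1.5)
    base_preds = np.column_stack(
        [target + rng.normal(scale=sd, size=n) for sd in noise_sd]
    )
    coef = fit_ensemble_weights(base_preds, target)
    combined = ensemble_predict(base_preds, coef)
    # Predictive ability measured as Pearson correlation with the target.
    base_r = [np.corrcoef(base_preds[:, j], target)[0, 1] for j in range(4)]
    ens_r = np.corrcoef(combined, target)[0, 1]
    return coef, base_r, ens_r


if __name__ == "__main__":
    coef, base_r, ens_r = demo()
    print("base correlations:", np.round(base_r, 3))
    print("ensemble correlation:", round(float(ens_r), 3))
```

In this sketch the weights are fitted and evaluated on the same individuals, so the ensemble correlation cannot fall below that of any single base predictor. A realistic evaluation would fit the weights on one set of individuals and measure predictive ability on a held-out set.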

Keywords