Frontiers in Genetics (Jun 2024)

SABO-ILSTSVR: a genomic prediction method based on improved least squares twin support vector regression

  • Rui Li,
  • Jing Gao,
  • Ganghui Zhou,
  • Dongshi Zuo,
  • Yao Sun

DOI
https://doi.org/10.3389/fgene.2024.1415249
Journal volume & issue
Vol. 15

Abstract

In modern breeding practice, genomic prediction (GP) uses high-density single nucleotide polymorphism (SNP) markers to predict genomic estimated breeding values (GEBVs) for crucial phenotypes, thereby speeding up the selective breeding process and shortening generation intervals. However, because genotype data typically contain far fewer samples than SNP markers, overfitting commonly arises during model training. To address this, the present study builds upon the Least Squares Twin Support Vector Regression (LSTSVR) model by incorporating a Lasso regularization term; the resulting model is named ILSTSVR. Because parameter tuning is complex and differs across datasets, the subtraction-average-based optimizer (SABO) is further introduced to optimize ILSTSVR, yielding the GP model named SABO-ILSTSVR. Experiments conducted on four different crop datasets demonstrate that SABO-ILSTSVR outperforms or matches the efficiency of widely used genomic prediction methods. Source code and data are available at: https://github.com/MLBreeding/SABO-ILSTSVR.

Keywords