Frontiers in Plant Science (Jul 2025)

SaGP: identifying plant saline-alkali tolerance genes based on machine learning techniques

  • Baixue Qiao,
  • Wentao Gao,
  • Xudong Zhang,
  • Min Du,
  • Shuda Wang,
  • Xuanrui Liu,
  • Shaozi Pang,
  • Chunxue Yang,
  • Jiang Wang,
  • Yuming Zhao,
  • Linan Xie

DOI
https://doi.org/10.3389/fpls.2025.1629794
Journal volume & issue
Vol. 16

Abstract

Mining novel genes underlying agronomic traits is a central task in plant biology, essential for enhancing crop quality, ensuring food security, and preserving biodiversity. Wet-lab experiments are the main way to uncover genes with target functions, but they are expensive and time-consuming. Machine learning, in contrast, can accelerate gene discovery by learning from accumulated data, making it more efficient and cost-effective. Despite this potential, machine learning tools for mining stress-resistance genes in plants remain scarce. In this study, we developed SaGP (Saline-alkali Genes Prediction), to our knowledge the first machine learning model for identifying plant saline-alkali tolerance genes from sequencing data. It outperformed the traditional computational tool BLAST and correctly identified the latest published genes. Moreover, we used SaGP to evaluate three recently published genes, GhAG2, MdBPR6, and TaCCD1, and it correctly identified the function of all three. Overall, these results suggest that SaGP can be used for the large-scale identification of saline-alkali tolerance genes and can serve as a framework for developing additional automated tools, thus promoting crop breeding and plant conservation. To enable efficient identification of saline-alkali tolerance genes in large-scale data, we developed a user-friendly, freely accessible web service platform based on SaGP (https://www.sagprediction.com/).

Keywords