Scientific Reports (Jul 2024)

Analyzing Medicago spp. seed morphology using GWAS and machine learning

  • Jacob Botkin,
  • Cesar Medina,
  • Sunchung Park,
  • Kabita Poudel,
  • Minhyeok Cha,
  • Yoonjung Lee,
  • Louis K. Prom,
  • Shaun J. Curtin,
  • Zhanyou Xu,
  • Ezekiel Ahn

DOI
https://doi.org/10.1038/s41598-024-67790-4
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 15

Abstract

Alfalfa is widely recognized as an important forage crop. To understand the morphological characteristics and genetic basis of seed morphology in alfalfa, we screened 318 Medicago spp., including 244 Medicago sativa subsp. sativa (alfalfa) and 23 other Medicago spp., for seed area, length, width, length-to-width ratio, perimeter, circularity, the distance between the intersection of length and width (IS) and the center of gravity (CG), and seed darkness and red–green–blue (RGB) intensities. The results revealed phenotypic diversity and correlations among the tested accessions. Based on the phenotypic data of M. sativa subsp. sativa, a genome-wide association study (GWAS) was conducted using single nucleotide polymorphisms (SNPs) called against the Medicago truncatula genome. Genes in proximity to associated markers were detected, including CPR1, MON1, a PPR protein, and Wun1 (threshold of 1E−04). Machine learning models were used to validate the GWAS results and to identify additional marker-trait associations for potentially complex traits. Marker S7_33375673, upstream of Wun1, was the most important predictor variable for red color intensity and highly important for brightness. Fifty-two markers were identified in coding regions. Together with the strong correlations observed among seed morphology traits, these genes will help clarify the genetic basis of seed morphology in Medicago spp.
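
As an illustration of two of the shape traits listed above, the sketch below computes a length-to-width ratio and a circularity value from basic seed measurements. The abstract does not state the formulas used in the study; the circularity definition 4πA/P² is a common image-analysis convention and is an assumption here, as are the function name and the example values.

```python
import math


def shape_descriptors(area, perimeter, length, width):
    """Return two shape descriptors for a single seed outline.

    Assumption: circularity is taken as 4*pi*area / perimeter**2, a common
    image-analysis convention; the study may define it differently.
    All inputs must be positive and in consistent units.
    """
    if min(area, perimeter, length, width) <= 0:
        raise ValueError("all measurements must be positive")
    return {
        "length_to_width_ratio": length / width,
        "circularity": 4 * math.pi * area / perimeter ** 2,
    }


# Hypothetical example: a perfectly circular outline of radius 1.
circle = shape_descriptors(
    area=math.pi, perimeter=2 * math.pi, length=2.0, width=2.0
)
print(circle)
```

Under this definition a perfect circle scores approximately 1, and more elongated or irregular outlines score lower, which is why circularity and length-to-width ratio would be expected to correlate.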

Keywords