Frontiers in Plant Science (Sep 2023)
Chloroplast genome assembly of Serjania erecta Raldk: comparative analysis reveals gene number variation and selection in protein-coding plastid genes of Sapindaceae
Abstract
Serjania erecta Raldk is an important genetic resource due to its anti-inflammatory, gastroprotective, and anti-Alzheimer properties. However, the genetic and evolutionary aspects of the species remain poorly known. Here, we sequenced and assembled the complete chloroplast genome of S. erecta and used it in a comparative analysis within the Sapindaceae family. S. erecta has a chloroplast genome (cpDNA) of 159,297 bp, divided into a Large Single Copy region (LSC) of 84,556 bp and a Small Single Copy region (SSC) of 18,057 bp, separated by two Inverted Repeat regions (IRa and IRb) of 28,342 bp each. Among the 12 species used in the comparative analysis, S. erecta has the fewest long and microsatellite repeats. The genome structure of Sapindaceae species is relatively conserved; the number of genes varies from 128 to 132, and this variation is associated with three main factors: (1) expansion and contraction of the IRs, resulting in variations in the number of rpl22, rps19, and rps3 genes; (2) pseudogenization of the rps2 gene; and (3) loss or duplication of genes encoding tRNAs, associated with the duplication of trnH-GUG in Xanthoceras sorbifolium and the absence of trnT-CGU in the Dodonaeoideae subfamily. We identified 10 and 11 mutational hotspots for Sapindaceae and Sapindoideae, respectively. Six highly diverse regions (tRNA-Lys – rps16, ndhC – tRNA-Val, petA – psbJ, ndhF, rpl32 – ccsA, and ycf1) are found in both groups and show potential for the development of DNA barcode markers for molecular taxonomic identification of Serjania. We found that the psaI gene evolves under neutrality in Sapindaceae, while all other chloroplast genes are under strong negative selection. However, local positive selection exists in the ndhF, rpoC2, ycf1, and ycf2 genes. The genes ndhF and ycf1 combine high nucleotide diversity with local positive selection, indicating significant potential as markers.
This study provides the first chloroplast genome of a member of the Paullinieae tribe. Furthermore, we identified patterns of variation in gene number and of selection in genes that are possibly associated with the family’s evolutionary history.
Keywords