Advanced Science (Jun 2024)

Statistical Genomics Analysis of Simple Sequence Repeats from the Paphiopedilum malipoense Transcriptome Reveals Control Knob Motifs Modulating Gene Expression

  • Yingyi Liang,
  • Jing Hao,
  • Jieyu Wang,
  • Guoqiang Zhang,
  • Yingjuan Su,
  • Zhong‐Jian Liu,
  • Ting Wang

DOI
https://doi.org/10.1002/advs.202304848
Journal volume & issue
Vol. 11, no. 24

Abstract

Simple sequence repeats (SSRs) are nonrandomly distributed in genomes and are thought to influence gene expression. The distribution patterns of 48 295 SSRs of Paphiopedilum malipoense are mined and characterized based on the first full‐length transcriptome and a comprehensive transcriptome dataset from 12 organs. Statistical genomics analyses are used to investigate how SSRs in transcripts affect gene expression. The results demonstrate correlations between SSR distributions, characteristics, and expression levels. Nine expression‐modulating motifs (expMotifs) are identified, and a model is proposed to explain the effect of their key features, potency, and gene function on an intra‐transcribed region scale. The expMotif‐transcribed region combination is the predominant contributor to the expression‐modulating effect of SSRs, and some intra‐transcribed regions are critical for this effect. Genes containing the same type of expMotif‐SSR elements in the same transcribed region are likely linked in function, regulation, or evolution. This study offers novel evidence for understanding how SSRs regulate gene expression and provides potential regulatory elements for plant genetic engineering.

Keywords