Scientific Reports (Apr 2022)

Sequence-based evaluation of promoter context for prediction of transcription start sites in Arabidopsis and rice

  • Tosei Hiratsuka,
  • Yuko Makita,
  • Yoshiharu Y. Yamamoto

DOI
https://doi.org/10.1038/s41598-022-11169-w
Journal volume & issue
Vol. 12, no. 1
pp. 1–12

Abstract

Genes are transcribed from transcription start sites (TSSs), and the position of these sites in a genome is strictly controlled to avoid mis-expression of undesired regions. In this study, we designed and developed a methodology for the evaluation of promoter context, which detects proximal promoter regions from −200 to −60 bp relative to a TSS, in Arabidopsis and rice genomes. The method positively evaluates spacer sequences and Regulatory Element Groups, but not core promoter elements like TATA boxes, and is able to predict the position of a TSS within a width of 200 bp. An important feature of the evaluation/prediction method is its independence from the core promoter elements, which was demonstrated by successful prediction of all the TATA, GA, and coreless types of promoters without notable differences in the accuracy of prediction. The positive relationship identified between the evaluation scores and gene expression levels suggests that this method is useful for the evaluation of promoter maturity.