RNA Biology (Dec 2023)

PacBio full-length transcriptome analysis provides new insights into transcription of chloroplast genomes

  • Jinsong Shi,
  • Shuangyong Yan,
  • Wenjing Li,
  • Xiurong Yang,
  • Zhongqiu Cui,
  • Junling Li,
  • Guangsheng Li,
  • Yuejiao Li,
  • Yanping Hu,
  • Shan Gao

DOI
https://doi.org/10.1080/15476286.2023.2214435
Journal volume & issue
Vol. 20, no. 1
pp. 248 – 256

Abstract

Chloroplast and mitochondrial DNA (cpDNA and mtDNA) are separate from nuclear DNA (nuDNA) in a eukaryotic cell. The transcription system of chloroplasts differs from those of mitochondria and eukaryotes. In contrast to nuDNA and animal mtDNA, the transcription of cpDNA is still not well understood, primarily because transcription initiation sites (TISs) and transcription termination sites (TTSs) have not been identified on the genome scale. In the present study, we characterized the transcription of chloroplast (cp) genes with greater accuracy and more comprehensive information using PacBio full-length transcriptome data from Arabidopsis thaliana. The major findings included the discovery of four types of artifacts, the validation and correction of cp gene annotations, the exact identification of TISs that start with G, and the discovery of polyA-like sites as TTSs. Notably, we proposed a new model to explain cp transcription initiation and termination at the whole-genome level. The four types of artifacts, together with degraded RNAs and splicing intermediates, deserve attention from researchers working with PacBio full-length transcriptome data, as these contaminant sequences can lead to incorrect downstream analysis. Cp transcription initiates at multiple promoters and terminates at polyA-like sites. Our study provides new insights into cp transcription and new clues for studying the evolution of promoters, TISs, TTSs and polyA tails of eukaryotic genes.

Keywords