Communications Biology (Jul 2024)

High-throughput single-molecule long-read RNA sequencing analysis of tissue-specific genes and isoforms in lettuce (Lactuca sativa L.)

  • Zhuo-Xing Shi,
  • Lei Xiang,
  • Hai-Ming Zhao,
  • Lang-Qi Yang,
  • Zhi-Chao Chen,
  • Yu-Qing Pu,
  • Yan-Wen Li,
  • Bei Luo,
  • Quan-Ying Cai,
  • Bai-Lin Liu,
  • Nai-Xian Feng,
  • Hui Li,
  • Qing X. Li,
  • Chong Tang,
  • Ce-Hui Mo

DOI
https://doi.org/10.1038/s42003-024-06598-4
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 12

Abstract

Lettuce is one of the most widely cultivated and consumed dicotyledonous vegetables globally. Despite the availability of its reference genome sequence, lettuce gene annotation remains incomplete, impeding comprehensive research and the broad application of genomic resources. Long-read RNA isoform sequencing (Iso-Seq) offers substantial advantages for analyzing RNA alternative splicing and aiding gene annotation, yet it faces throughput limitations. We present the HIT-ISOseq method, tailored for bulk sample analysis, which significantly enhances RNA sequencing throughput on the PacBio platform by concatenating cDNA. Here we show that HIT-ISOseq generates 3-4 cDNA molecules per CCS read in lettuce, yielding 15.7 million long reads per PacBio Sequel II SMRT Cell 8M. We validate its effectiveness by analyzing six lettuce tissue samples, including roots, stems, and leaves, revealing tissue-specific gene expression patterns and RNA isoforms. Leveraging long-read RNA sequencing across these diverse tissues, we refine the transcript annotation of the lettuce reference genome and expand its GO and KEGG annotation repertoire. Collectively, this study serves as a foundational reference for genome annotation and for the analysis of multi-sample isoform expression using high-throughput long-read transcriptome sequencing.
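
The throughput figures in the abstract can be related with simple arithmetic. The sketch below is illustrative only: it assumes that the 15.7 million long reads are counted after the concatenated cDNA molecules have been split apart, and it treats 3 and 4 molecules per CCS read as bounds. The function name `implied_ccs_reads` is hypothetical and is not part of the HIT-ISOseq method or of any PacBio tool.

```python
def implied_ccs_reads(total_long_reads: float, molecules_per_ccs: float) -> float:
    """Estimate how many CCS reads would carry a given number of cDNA molecules.

    Assumption (not stated in the abstract): total_long_reads counts
    individual cDNA molecules after de-concatenation.
    """
    if molecules_per_ccs <= 0:
        raise ValueError("molecules_per_ccs must be positive")
    return total_long_reads / molecules_per_ccs


# Figures quoted in the abstract.
TOTAL_LONG_READS = 15.7e6       # long reads per SMRT Cell 8M
MOLECULES_PER_CCS = (3, 4)      # cDNA molecules per CCS read (quoted range)

# More molecules per CCS read means fewer CCS reads are needed.
low_estimate = implied_ccs_reads(TOTAL_LONG_READS, max(MOLECULES_PER_CCS))
high_estimate = implied_ccs_reads(TOTAL_LONG_READS, min(MOLECULES_PER_CCS))

print(f"Implied CCS reads per cell: {low_estimate / 1e6:.2f} to {high_estimate / 1e6:.2f} million")
```

Under these assumptions, the quoted yield corresponds to roughly 3.9 to 5.2 million CCS reads per cell, which illustrates why concatenation multiplies the number of transcripts recovered per run.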