Scientific Reports (Oct 2022)

Combination of long-read and short-read sequencing provides comprehensive transcriptome and new insight for Chrysanthemum morifolium ray-floret colorization

  • Mitsuko Kishi-Kaboshi,
  • Tsuyoshi Tanaka,
  • Katsutomo Sasaki,
  • Naonobu Noda,
  • Ryutaro Aida

DOI
https://doi.org/10.1038/s41598-022-22589-z
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 15

Abstract

Read online

Abstract Chrysanthemum morifolium is one of the most popular ornamental plants globally. Owing to its large and complex genome (around 10 Gb, segmental hexaploid), it has been difficult to obtain comprehensive transcriptome, which will promote to perform new breeding technique, such as genome editing, in C. morifolium. In this study, we used single-molecule real-time (SMRT) sequencing and RNA-seq technologies, combined them with an error-correcting process, and obtained high-coverage ray-floret transcriptome. The SMRT-seq data increased the ratio of long mRNAs containing complete open-reading frames, and the combined dataset provided a more complete transcriptomic data than those produced from either SMRT-seq or RNA-seq-derived transcripts. We finally obtained ‘Sei Arabella’ transcripts containing 928,645 non-redundant mRNA, which showed 96.6% Benchmarking Universal Single-Copy Orthologs (BUSCO) score. We also validated the reliability of the dataset by analyzing a mapping rate, annotation and transcript expression. Using the dataset, we searched anthocyanin biosynthesis gene orthologs and performed a qRT-PCR experiment to assess the usability of the dataset. The assessment of the dataset and the following analysis indicated that our dataset is reliable and useful for molecular biology. The combination of sequencing methods provided genetic information and a way to analyze the complicated C. morifolium transcriptome.