Guangxi Zhiwu (Jul 2023)
Full-length transcriptome information for Tibetan medicine “Zangyinchen” of original plant Comastoma polycladum
Abstract
Comastoma polycladum is one of the source plants of the Tibetan medicine “Zangyinchen” and is rich in medicinal components. To characterize the transcriptome of C. polycladum and enrich the genetic information available for its gene annotation and metabolic pathways, full-length transcriptome sequencing of C. polycladum leaves was performed on the PacBio sequencing platform. The results were as follows: (1) A total of 17 Gb of sequencing data was generated, and 87 814 high-quality full-length transcripts were obtained by clustering and redundancy removal from 795 698 circular consensus sequences (CCSs). (2) By comparison against seven databases, 277 451 transcripts were successfully annotated; the NR database annotated the most transcripts (39 214). A total of 26 396 transcripts were annotated in the KOG database, covering 26 subcategories, and 39 104 transcripts were annotated in the KEGG database, covering six major pathways and 40 secondary pathways. A total of 39 102 transcripts were annotated in the GO database and were divided into three major categories: molecular function, biological process and cellular component. (3) Simple sequence repeat (SSR) analysis identified 22 861 SSRs, of which single-base (mononucleotide) repeats were the most abundant. A total of 1 874 transcription factors and 15 166 long non-coding RNAs (lncRNAs) were identified, and the C3H transcription factor family had the most annotated transcripts. (4) A total of 55 transcripts involved in the synthesis of monoterpenes and flavonoids were screened out. These results enrich the transcriptome information of C. polycladum and provide important genetic resources for further screening of candidate genes related to the synthesis of its medicinal components.
Keywords