BMC Genomics (Oct 2017)

An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta

  • Xiaolong Cao,
  • Haobo Jiang

DOI
https://doi.org/10.1186/s12864-017-4147-y
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 14

Abstract

Background: Manduca sexta is a large lepidopteran insect widely used as a model to study the biochemistry of insect physiological processes. As part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data have been used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, basic features of these data, such as the composition of the RNA-seq reads and lists of library-correlated genes, remain unclear. More broadly, clear-cut spatiotemporal expression data are rarely available for insects, the largest group of animals, including Drosophila and mosquitoes, mainly because of their small sizes.

Results: We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes of the mapping failure and reduced the proportion of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels and analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns.

Conclusion: We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect.

Keywords