BMC Genomics (May 2019)

Transcriptome assembly for a colour-polymorphic grasshopper (Gomphocerus sibiricus) with a very large genome size

  • Abhijeet Shah,
  • Joseph I. Hoffman,
  • Holger Schielzeth

DOI
https://doi.org/10.1186/s12864-019-5756-4
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Background The club-legged grasshopper Gomphocerus sibiricus is a Gomphocerinae grasshopper with a promising future as model species for studying the maintenance of colour-polymorphism, the genetics of sexual ornamentation and genome size evolution. However, limited molecular resources are available for this species. Here, we present a de novo transcriptome assembly as reference resource for gene expression studies. We used high-throughput Illumina sequencing to generate 5,070,036 paired-end reads after quality filtering. We then combined the best-assembled contigs from three different de novo transcriptome assemblers (Trinity, SOAPdenovo-trans and Oases/Velvet) into a single assembly. Results This resulted in 82,251 contigs with a N50 of 1357 and a TransRate assembly score of 0.325, which compares favourably with other orthopteran transcriptome assemblies. Around 87% of the transcripts could be annotated using InterProScan 5, BLASTx and the dammit! annotation pipeline. We identified a number of genes involved in pigmentation and green pigment metabolism pathways. Furthermore, we identified 76,221 putative single nucleotide polymorphisms residing in 8400 contigs. We also assembled the mitochondrial genome and investigated levels of sequence divergence with other species from the genus Gomphocerus. Finally, we detected and assembled Wolbachia sequences, which revealed close sequence similarity to the strain pel wPip. Conclusions Our study has generated a significant resource for uncovering genotype-phenotype associations in a species with an extraordinarily large genome, while also providing mitochondrial and Wolbachia sequences that will be useful for comparative studies.

Keywords