Toxins (Jun 2022)
Improving the Annotation of the Venom Gland Transcriptome of <i>Pamphobeteus verdolaga</i>, Prospecting Novel Bioactive Peptides
Abstract
Spider venoms are a trove of novel peptides of biotechnological interest. Because little next-generation sequencing (NGS) data has been generated, less than 1% of these peptides have been described. Increasing evidence indicates that a single transcriptome assembler underestimates the number of genes that can be assembled. Here, the venom gland transcriptome of the spider <i>Pamphobeteus verdolaga</i> was re-assembled using three freely available assemblers, Trinity, SOAPdenovo-Trans, and SPAdes, to obtain a more complete annotation. Assembler performance was evaluated by contig number, N50, read representation in the assembly, and retrieval of BUSCO terms against the arthropod dataset. Of all sequences assembled across the three programs, 39.26% were common to all three assemblers, 27.88% were assembled only by Trinity, and 27.65% were assembled only by SPAdes. The non-redundant merging of the three assemblies' output permitted the annotation of 9232 sequences, 23% more than with each individual assembler and 28% more than in the previous <i>P. verdolaga</i> annotation; moreover, 65 novel theraphotoxins could be described. When generating data for non-model organisms, and when searching for novel peptides of biotechnological interest, employing at least two different transcriptome assemblers is highly recommended.
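For readers unfamiliar with N50, one of the assembly metrics named in the abstract, the following is a minimal sketch of its standard definition: the length of the contig at which the cumulative length, taken from the longest contig downward, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the study's own pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating their lengths.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig where half of the total length is covered.
        if 2 * running >= total:
            return length


# Illustrative lengths only, not data from the paper.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

A higher N50 indicates that more of the assembly sits in longer contigs, which is why it is usually read together with contig number and read representation rather than on its own.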
Keywords