Toxins (Mar 2020)
No More Tears: Mining Sequencing Data for Novel <i>Bt</i> Cry Toxins with CryProcessor
Abstract
Bacillus thuringiensis (Bt) is a natural pathogen of insects and some other groups of invertebrates that produces three-domain Cry (3d-Cry) toxins, which are highly host-specific pesticidal proteins. These proteins are the most widely used bioinsecticides in the world and are sold commercially, in keeping with the paradigm of sustainable growth and ecological development. Emerging pest resistance to known toxins underscores the need to expand the list of known toxins and broaden the range of insecticidal approaches. For this purpose, we have developed a fast and user-friendly tool called CryProcessor, which allows productive and precise mining of 3d-Cry toxins. The only existing tool for mining Cry toxins, BtToxin_scanner, has significant limitations, such as limited query size, lack of accuracy, and an outdated database. To address these problems, we built CryProcessor as a robust pipeline capable of precise 3d-Cry toxin mining. A unique feature of the pipeline is its ability to search for Cry toxin sequences directly on assembly graphs, which makes it possible to analyze raw sequencing data and overcomes the problem of fragmented assemblies. Moreover, CryProcessor precisely predicts the domain layout in arbitrary sequences, allowing sequences of specific domains to be retrieved beyond the limited number of toxins available in CryGetter. Our algorithm proved efficient in all of its working modes and outperformed its analogues on large amounts of data. Here, we describe its main features and report its benchmarking against existing analogues. CryProcessor is a novel, fast, convenient, open-source (https://github.com/lab7arriam/cry_processor), platform-independent, and precise instrument with a console version and an elaborate web interface (https://lab7.arriam.ru/tools/cry_processor). Its major merits could make it possible to carry out massive screening for novel 3d-Cry toxins and to obtain sequences of specific domains for further comprehensive in silico experiments in constructing artificial toxins.
Keywords