BMC Genomics (Nov 2011)
In silico prediction of the granzyme B degradome
Abstract
Background: Granzyme B is a serine protease that cleaves at unique tetrapeptide sequences. It is involved in several signaling cross-talks with caspases and functions as a pivotal mediator in a broad range of cellular processes such as apoptosis and inflammation. The granzyme B degradome comprises proteins from a myriad of functional classes, with many more expected to be discovered. However, the experimental discovery and validation of bona fide granzyme B substrates require time-consuming and laborious effort. As such, computational methods for the prediction of substrates would be immensely helpful.

Results: We compiled a dataset of 580 experimentally verified granzyme B cleavage sites and found distinctive patterns of residue conservation and position-specific residue propensities that could be useful for in silico prediction using machine learning algorithms. We trained a series of support vector machine (SVM) classifiers employing Bayes Feature Extraction to predict cleavage sites using sequence windows of diverse lengths and compositions. On independent test sets, the SVM classifiers achieved accuracy scores of 71.00% to 86.50% and area under the ROC curve (AROC) scores of 0.78 to 0.94. We applied our prediction method to the Chikungunya viral proteome and identified several regulatory domains of viral proteins as potential sites of granzyme B cleavage, suggesting direct antiviral activity of granzyme B during host-viral innate immune responses.

Conclusions: We have compiled a comprehensive dataset of granzyme B cleavage sites and developed an accurate SVM-based prediction method utilizing Bayes Feature Extraction to identify novel substrates of granzyme B in silico. The prediction server is available online, together with reference datasets and supplementary materials.
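The abstract does not spell out how Bayes Feature Extraction turns a sequence window into SVM input. As a rough illustration only, the sketch below assumes a bi-profile style encoding: position-specific residue frequencies are estimated separately from cleaved (positive) and non-cleaved (negative) windows, and each residue in a query window is replaced by its frequency under both profiles. The function names, the pseudocount, the six-residue P4 to P2' window, and the toy sequences are all hypothetical and are not taken from the paper; the only biological fact relied on is granzyme B's known preference for aspartate at P1.

```python
from collections import Counter

# Assumption: windows contain only the 20 standard amino acid letters.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_profile(windows, pseudocount=1.0):
    """Estimate per-position residue frequencies with additive smoothing."""
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def bayes_features(window, pos_profile, neg_profile):
    """Encode a window as its residue frequencies under both profiles.

    The first half of the vector comes from the positive profile and the
    second half from the negative profile, so the length is 2 * len(window).
    """
    pos_part = [pos_profile[i][aa] for i, aa in enumerate(window)]
    neg_part = [neg_profile[i][aa] for i, aa in enumerate(window)]
    return pos_part + neg_part


# Hypothetical toy windows spanning P4..P2' (cleavage after index 3, i.e. P1).
positives = ["IEPDSG", "VEPDAA", "IETDSG", "LEADGS"]
negatives = ["GKLSAV", "PRTQEL", "AKDLGS", "TSVNKR"]

pos_profile = position_profile(positives)
neg_profile = position_profile(negatives)

# Feature vector for an unseen window; this is what a classifier would consume.
features = bayes_features("IEPDTG", pos_profile, neg_profile)
```

In a full pipeline, vectors like `features` would be computed for every labeled window and passed to an SVM implementation of choice. The profiles should be built from training data only, so that the independent test sets stay independent.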