IEEE Access (Jan 2019)
Improved Prediction of Protein-Protein Interactions Using Descriptors Derived From PSSM via Gray Level Co-Occurrence Matrix
Abstract
A better understanding of biological processes, mechanisms, and functions requires reliable information about protein-protein interactions (PPIs). High-throughput technologies have produced a large amount of PPI data for various species; however, they are resource-intensive and often suffer from high error rates. To address the limitations of these traditional methods, this paper proposes a sequence-based computational method for predicting whether two proteins interact. The proposed method divides the prediction of novel PPIs into three stages: first, position-specific scoring matrices (PSSMs) are produced to incorporate evolutionary information; second, a 352-dimensional feature vector is constructed for each protein pair; third, effective parameters are selected for the ensemble learning algorithm rotation forest (RF). In the proposed model, the evolutionary features are extracted from the PSSM of each protein without using any protein annotations. In addition, because the RF algorithm builds accurate and diverse classifiers that avoid making coincident errors, a sample misclassified by one classifier can be corrected by another. The proposed method is evaluated in terms of accuracy, precision, sensitivity, and other metrics on the Yeast, Human, and H. pylori datasets, and its performance is found to be superior to that of the competing methods. Specifically, the average accuracies achieved by the proposed method are 97.06% (Yeast), 98.95% (Human), and 89.69% (H. pylori), improving the accuracy of PPI prediction by 0.54% to 3.89% (Yeast), 1.29% to 3.85% (Human), and 0.22% to 4.85% (H. pylori). The experimental results indicate that the proposed method is an effective alternative approach for predicting novel PPIs.
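As a rough illustration of the feature-extraction idea named in the title, the following Python sketch quantizes a PSSM-like score matrix into gray levels, builds a gray level co-occurrence matrix (GLCM), and derives a few classic texture descriptors from it. The min-max quantization, the 8 gray levels, the single horizontal offset, the chosen descriptors, and all function names are assumptions made for illustration only. The abstract does not state how the 352-dimensional vector is assembled, so this sketch does not attempt to reproduce it.

```python
import numpy as np

def quantize(pssm, levels=8):
    """Min-max scale a score matrix to integer gray levels 0..levels-1.

    Assumed scheme for illustration; the paper may quantize differently.
    """
    pssm = np.asarray(pssm, dtype=float)
    lo, hi = pssm.min(), pssm.max()
    if hi == lo:
        return np.zeros(pssm.shape, dtype=int)
    scaled = (pssm - lo) / (hi - lo)
    # The maximum value maps to `levels`, so clip it into the top bin.
    return np.minimum((scaled * levels).astype(int), levels - 1)

def glcm(gray, levels=8, offset=(0, 1)):
    """Normalized co-occurrence matrix for one non-negative (row, col) offset."""
    dr, dc = offset
    rows, cols = gray.shape
    src = gray[:rows - dr, :cols - dc]   # reference cells
    dst = gray[dr:, dc:]                 # neighbours at the given offset
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    total = counts.sum()
    return counts / total if total > 0 else counts

def glcm_descriptors(p):
    """A few standard GLCM texture descriptors (assumed selection)."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Usage with a random stand-in for a real PSSM (sequence length x 20 residues).
rng = np.random.default_rng(0)
fake_pssm = rng.integers(-10, 11, size=(50, 20))
features = glcm_descriptors(glcm(quantize(fake_pssm)))
```

In a pairwise setting, descriptors computed this way for each of the two proteins could be concatenated into a single vector and passed to a classifier; the exact descriptor set, offsets, and classifier settings would need to follow the full paper.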
Keywords