IEEE Access (Jan 2020)
IoMT-Based Association Rule Mining for the Prediction of Human Protein Complexes
Abstract
The rapid growth of Internet-enabled devices has influenced the health industry, because these devices deliver health-related information quickly. A prominent characteristic of these devices is that they help physicians diagnose sensitive diseases effectively. The Internet of Medical Things (IoMT) connects medical devices to computing nodes over the Internet, enabling real-time communication between patients and clinicians to understand the interaction of human protein complexes. Secure and correct protein complex prediction plays an important role in understanding the principal mechanisms of various cellular processes and in elucidating the functions of un-annotated proteins. Various experimental schemes have been developed to accomplish this task; however, these schemes have high error rates and are inefficient in terms of time, cost, privacy, and security. To tackle these limitations, numerous computational models have been developed that treat a protein complex as a dense sub-graph and use basic topological properties, such as density and degree statistics, as the feature set for protein complex prediction. However, other kinds of sub-graph structures, e.g., ring, star, linear, and hybrid, have also been found in Protein-Protein Interaction Networks (PPINs); therefore, more advanced topological properties may help to predict these structures. Moreover, the amino acid sequence of a protein determines its formation; thus, sequence information is important for predicting the interacting property among proteins in a secure way. In this study, we compute basic as well as advanced topological features by considering the interaction network of human protein complexes in the IoMT environment. In addition, we compute biological features from the amino acid sequences of proteins, i.e., discrete wavelet coefficients, length, and entropy.
Supervised learning methods based on association rules, such as Partial Tree (PART) and Non-Nested Generalized Exemplars (NNGE), are trained to identify human protein complexes on the basis of the integrated topological and biological properties. 10-fold cross-validation is used to evaluate the proposed methods. Experimental results show that association rule learners with integrated features outperform other complex mining algorithms, i.e., the probabilistic Bayesian Network (BN) and Random Forest, in terms of accuracy and efficiency, in addition to providing privacy.
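As a rough illustration of the basic features named in the abstract, the following minimal Python sketch computes sub-graph density, simple degree statistics, and the Shannon entropy of an amino acid sequence. The function names, the edge-list representation, and the exact formulas are assumptions made for illustration; the abstract does not specify the authors' definitions, and the discrete wavelet coefficients and advanced topological features are not covered here.

```python
import math
from collections import Counter


def _internal_edges(nodes, edges):
    """Undirected, de-duplicated edges with both endpoints inside `nodes`."""
    node_set = set(nodes)
    return {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in node_set and v in node_set
    }


def subgraph_density(nodes, edges):
    """Density 2|E| / (|V| (|V| - 1)) of the sub-graph induced by `nodes`."""
    n = len(set(nodes))
    if n < 2:
        return 0.0
    return 2 * len(_internal_edges(nodes, edges)) / (n * (n - 1))


def degree_stats(nodes, edges):
    """Mean, maximum, and minimum degree inside the induced sub-graph."""
    degree = {v: 0 for v in set(nodes)}
    for edge in _internal_edges(nodes, edges):
        for v in edge:
            degree[v] += 1
    values = list(degree.values())
    if not values:
        return {"mean": 0.0, "max": 0, "min": 0}
    return {"mean": sum(values) / len(values), "max": max(values), "min": min(values)}


def sequence_entropy(sequence):
    """Shannon entropy (in bits) of the residue distribution of a sequence."""
    if not sequence:
        return 0.0
    total = len(sequence)
    counts = Counter(sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical star-shaped complex: hub "A" interacting with three partners.
    star_edges = [("A", "B"), ("A", "C"), ("A", "D")]
    members = ["A", "B", "C", "D"]
    print(subgraph_density(members, star_edges))
    print(degree_stats(members, star_edges))
    print(sequence_entropy("MKTAYIAKQR"))  # made-up sequence, for illustration only
```

The star example shows why density alone can be misleading: the sub-graph is sparse (density 0.5) even though it has a clear hub structure, which is the kind of case where additional topological features could help.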
Keywords