IEEE Access (Jan 2021)
A Fully Automatic and Efficient Methodology for Peptide Activity Identification Using Their 3D Conformations
Abstract
Over the past decades, understanding the biological functions of peptides and proteins has been an active research topic. Recent work in this field suggests that protein conformations may be a key feature for gaining insight into protein biological functions. However, analyzing small and highly flexible protein fragments, namely oligopeptides made of a handful of amino acids, remains challenging because of their dynamics and wide range of conformations. In this paper, a methodology based on unsupervised statistical learning is proposed for analyzing the 3D conformations of small and highly flexible elastin-derived peptides. The goal of this study is twofold. First, it aims to identify the most frequent conformations of each peptide and to study their stability. Second, and most importantly, it aims to compare the main conformations of different elastin-derived peptides in order to identify the "signature" that can be linked to a biological activity. The main strength of the present work is a method for conformation recognition that is not affected by peptide rotations or translations and hence avoids the use of complex superposition methods. In addition, the proposed approach uses Kernel PCA to eliminate atypical peptide conformations. Because these peptides are unstable, removing outliers is crucial, since outliers may dramatically affect clustering results. To extract the most frequent conformations, we propose to use a hierarchical clustering method. Finally, a peptide activity detector is defined based on a comparison of the main conformations found in different peptides. The main advantages of the proposed method are threefold: first, it is fully automatic; second, it does not require any additional information or expertise; and third, it can accurately identify the conformations that enable peptides to exhibit a given biological activity.
Experimental results on a large dataset of peptide conformations highlight the relevance and efficiency of the proposed method.
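As an illustration of what a conformation description unaffected by rotations and translations could look like, the following minimal Python sketch represents a conformation by its list of pairwise interatomic distances and checks that the list is unchanged after a rigid motion. The abstract does not state which descriptor the method actually uses, so the distance-based descriptor, the function names `pairwise_distances` and `rotate_z_and_translate`, and the toy coordinates are all assumptions made for this example only.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return all interatomic distances of one conformation.

    Assumed descriptor (not taken from the paper): distances depend only
    on relative positions, so they do not change under rigid motions.
    """
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def rotate_z_and_translate(coords, angle, shift):
    """Apply a rotation about the z axis followed by a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Toy conformation with four atoms (hypothetical coordinates).
conformation = [
    (0.0, 0.0, 0.0),
    (1.5, 0.0, 0.0),
    (1.5, 1.2, 0.3),
    (0.2, 2.0, 1.1),
]

# Same conformation after an arbitrary rigid motion.
moved = rotate_z_and_translate(conformation, angle=0.7, shift=(3.0, -1.0, 2.5))

d_original = pairwise_distances(conformation)
d_moved = pairwise_distances(moved)

# Largest discrepancy between the two descriptors (floating-point noise only).
max_diff = max(abs(p - q) for p, q in zip(d_original, d_moved))
```

With a descriptor of this kind, two conformations can be compared directly without first superposing them, which is the property the abstract emphasizes. Outlier removal with Kernel PCA and hierarchical clustering would then operate on such descriptor vectors.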
Keywords