IEEE Access (Jan 2023)
DTL-NeddSite: A Deep-Transfer Learning Architecture for Prediction of Lysine Neddylation Sites
Abstract
Neddylation, a reversible post-translational modification (PTM), plays a role in various cellular processes, and defects in neddylation are associated with human diseases. Detecting neddylation sites is necessary for revealing the mechanisms of protein neddylation. Because identifying such sites experimentally is expensive and time-consuming, in silico methods for predicting neddylation sites are needed. In this study, we first constructed several classifiers integrating various algorithms and encoding features. However, they performed poorly (AUC $\approx 0.767$), mainly due to the limited number ($\sim$1000) of identified neddylation sites. The large number ($>$100,000) of other lysine PTM sites inspired us to employ a deep transfer learning (DTL) strategy to improve performance. We constructed a predictor, dubbed DTL-NeddSite, which uses a DTL-based convolutional neural network with one-hot encoding. Specifically, the large set of lysine PTM sites was used to build the source model, and the target model was then fine-tuned on neddylation sites. DTL-NeddSite compared favourably with the corresponding model without the DTL strategy in cross-validation and independent tests. For instance, the AUC value increased to 0.818. In contrast to a typical DTL model, which combines frozen and unfrozen layers, all the layers in DTL-NeddSite were unfrozen and re-trained. We expect the DTL strategy to be widely applicable to newly discovered modification types with limited known sites. DTL-NeddSite is freely accessible at https://github.com/XuDeli123/DTL-NeddSite.
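As an illustration of the one-hot encoding mentioned above, the following minimal Python sketch encodes a peptide window centred on a candidate lysine. It is not taken from the DTL-NeddSite code: the function names `extract_window` and `one_hot`, the 21-symbol alphabet with `-` as padding, the flank size and the example sequence are all assumptions made for this example.

```python
# Hypothetical alphabet: 20 standard amino acids plus '-' as a padding symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def extract_window(sequence, site, flank=3):
    """Return the peptide of length 2*flank+1 centred on 0-based position `site`.

    Positions that fall outside the sequence are padded with '-'.
    The flank size is an illustrative assumption, not the paper's setting.
    """
    if sequence[site] != "K":
        raise ValueError("site must point at a lysine (K)")
    left = sequence[max(0, site - flank):site]
    right = sequence[site + 1:site + 1 + flank]
    return "-" * (flank - len(left)) + left + "K" + right + "-" * (flank - len(right))


def one_hot(peptide):
    """Encode each residue as a binary vector with a single 1 at its alphabet index.

    Residues outside ALPHABET (for example 'X') raise a KeyError in this sketch.
    """
    rows = []
    for residue in peptide:
        row = [0] * len(ALPHABET)
        row[INDEX[residue]] = 1
        rows.append(row)
    return rows


# Made-up sequence with a lysine at 0-based position 7.
window = extract_window("MKTAYIAKQR", site=7, flank=3)
encoded = one_hot(window)
```

Each window becomes a matrix with one row per residue and one column per alphabet symbol, which is the kind of input a convolutional network can consume directly.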
Keywords