IEEE Access (Jan 2020)

Predicting Sub-Golgi Apparatus Resident Protein With Primary Sequence Hybrid Features

  • Chunyu Wang,
  • Jialin Li,
  • Xiaoyan Liu,
  • Maozu Guo

DOI
https://doi.org/10.1109/ACCESS.2019.2962821
Journal volume & issue
Vol. 8
pp. 4442 – 4450

Abstract

The Golgi apparatus is a membrane-bound organelle of eukaryotic cells made up of a series of flattened, stacked pouches called cisternae. It modifies proteins and lipids, packages them into membrane-bound vesicles, and transports them to their target destinations, and it is one of the central mediating organelles of eukaryotic cells. Functional defects of the Golgi apparatus are associated with many neurodegenerative diseases, such as Parkinson's and Alzheimer's diseases. Golgi-resident proteins play an important role in the processing carried out by the Golgi apparatus, which includes storing, packaging, and dispatching proteins. Identifying sub-Golgi protein types can help researchers develop more effective therapies and drugs for diseases that result from disorders of Golgi-resident proteins. In this paper, we propose a machine learning model to discriminate cis-Golgi proteins from trans-Golgi proteins. First, we extract features with PseKNC, K-separated Bigrams, and PsePSSM, and then select the optimal features among those identified by PseKNC using the AdaBoost classifier. To balance the imbalanced set of Golgi proteins, we apply the Random-SMOTE oversampling approach. Finally, we employ the SVM algorithm to distinguish cis-Golgi proteins from trans-Golgi proteins. The proposed method achieves accuracies of 96.5%, 96.5%, and 96.9% with jackknife cross-validation, independent testing, and 10-fold cross-validation, respectively, exceeding the performance of previous related work.
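Two of the steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that the K-separated Bigram features are computed from a position-specific scoring matrix (PSSM) as sums of products of scores k positions apart, which is how they are commonly defined, and that Random-SMOTE follows the usual two-stage interpolation scheme (a temporary point between two random minority samples, then a new point between the original sample and that temporary point). The function names, the toy matrix, and the parameters `k`, `n_new`, and `rng` are hypothetical.

```python
import random


def k_separated_bigrams(pssm, k=1):
    """Return the flattened matrix B[m][n] = sum_i pssm[i][m] * pssm[i + k][n].

    `pssm` is a list of rows, one row of scores per sequence position.
    A real PSSM has 20 columns, giving 400 features for each k.
    """
    length = len(pssm)
    width = len(pssm[0])
    features = [0.0] * (width * width)
    for i in range(length - k):
        for m in range(width):
            for n in range(width):
                features[m * width + n] += pssm[i][m] * pssm[i + k][n]
    return features


def random_smote(minority, n_new, rng=None):
    """Generate `n_new` synthetic minority samples (assumed Random-SMOTE scheme).

    For each seed sample x, two minority samples y1 and y2 are drawn at random,
    a temporary point t is placed on the segment y1 -> y2, and the synthetic
    sample is placed on the segment x -> t.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for j in range(n_new):
        x = minority[j % len(minority)]  # cycle through the seed samples
        y1, y2 = rng.sample(minority, 2)
        a = rng.random()
        t = [p + a * (q - p) for p, q in zip(y1, y2)]
        b = rng.random()
        synthetic.append([p + b * (q - p) for p, q in zip(x, t)])
    return synthetic


# Toy 3-position, 2-column "PSSM" (hypothetical values, for illustration only).
toy_pssm = [[1, 0], [0, 1], [1, 1]]
bigram_features = k_separated_bigrams(toy_pssm, k=1)

# Toy minority class with three 2-D feature vectors, oversampled by five points.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = random_smote(toy_minority, n_new=5)
```

Every synthetic sample is a convex combination of existing minority samples, so the oversampled class stays inside the region already covered by the real data. In a full pipeline, the balanced feature vectors would then be passed to a classifier such as an SVM.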

Keywords