IEEE Access (Jan 2020)

Prediction of Cyclin Protein Using Two-Step Feature Selection Technique

  • Jia-Nan Sun,
  • Hua-Yi Yang,
  • Jing Yao,
  • Hui Ding,
  • Shu-Guang Han,
  • Cheng-Yan Wu,
  • Hua Tang

DOI
https://doi.org/10.1109/ACCESS.2020.2999394
Journal volume & issue
Vol. 8
pp. 109535 – 109542

Abstract

Cyclins are a family of proteins that regulate the cell cycle by activating cyclin-dependent kinases, a group of enzymes required in the cell cycle. A model that classifies Cyclins is important for understanding their function. Because Cyclin sequences share low similarity, a machine-learning-based model is urgently needed to identify them. In this study, a method based on a support vector machine (SVM) is developed to recognize Cyclins using only amino acid sequence information. Eighteen feature descriptors with a total of 13,151 feature dimensions were extracted, and the feature dimension was reduced to 8 through a feature selection technique. The retained features show that descriptors such as Autocorrelation, AAC and CTDC are important in the identification of Cyclins. Jackknife cross-validation results indicate that our model classifies Cyclins with an accuracy of 91.9%, which is superior to the result of a recent study using the same data set. Our work provides an important tool for discriminating Cyclins.
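
As a rough illustration of two ideas named in the abstract, the sketch below computes an amino acid composition (AAC) vector from a sequence and estimates accuracy with a jackknife (leave-one-out) loop. It is not the authors' pipeline: the classifier is a simple nearest-centroid stand-in rather than an SVM, no feature selection is performed, and the function names and toy sequences are hypothetical, invented only for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT the paper's SVM): label of the closest class mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out: hold out each sample once and train on all the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy sequences invented for illustration; they are not from the paper's data set.
toy_sequences = ["AAAAAGAAAL", "AAGAAAAAAV", "AAAALAAGAA",
                 "KKKKEKKKKD", "KKEKKKKKDK", "KDKKKKEKKK"]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_features = [aac(s) for s in toy_sequences]
toy_accuracy = jackknife_accuracy(toy_features, toy_labels)
```

The `predict` argument is kept pluggable so that a different classifier, such as an SVM from a machine learning library, could be substituted without changing the jackknife loop.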

Keywords