IEEE Access (Jan 2019)

A Fast and Robust Support Vector Machine With Anti-Noise Convex Hull and its Application in Large-Scale ncRNA Data Classification

  • Xiaoqing Gu,
  • Tongguang Ni,
  • Yiqing Fan

DOI
https://doi.org/10.1109/ACCESS.2019.2941986
Journal volume & issue
Vol. 7
pp. 134730 – 134741

Abstract

Support vector machines (SVMs) achieve strong classification performance on non-coding RNA (ncRNA) data. With the rapid increase in the species and sizes of ncRNA sequences, several fast SVM methods based on data distribution and contour information have been developed to reduce time complexity. However, these methods are sensitive to both noise and class imbalance. In this paper, a fast and robust SVM with an anti-noise convex hull (called FRSVM-ANCH) is proposed for large-scale ncRNA data classification. FRSVM-ANCH discards outliers in the feature space and obtains the convex hull of each class. The convex hull points, together with their weights, are then used as the training data for the SVM. Because the pinball loss is less sensitive to noise, it is adopted in the SVM classifier. Theoretical analysis and experimental results verify the advantages of FRSVM-ANCH in classification performance and training time on large-scale, noisy, and imbalanced ncRNA datasets.
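
To illustrate why the pinball loss is regarded as less noise-sensitive than the hinge loss, the following minimal sketch compares the two on the margin y * f(x). It assumes the form of the pinball loss commonly used in pin-SVM formulations; the abstract does not give the exact definition used in FRSVM-ANCH. The function names, the parameter `tau`, and the per-point `weights` are hypothetical and chosen only for illustration.

```python
def hinge_loss(margin: float) -> float:
    """Standard SVM hinge loss on the margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def pinball_loss(margin: float, tau: float = 0.5) -> float:
    """Pinball loss on u = 1 - y * f(x) (assumed pin-SVM form).

    u >= 0: penalty is u, identical to the hinge loss.
    u <  0: penalty is -tau * u, so points far on the correct side
            of the margin also contribute a small penalty.
    """
    u = 1.0 - margin
    return u if u >= 0 else -tau * u


def weighted_pinball_objective(margins, weights, tau=0.5):
    """Sum of pinball losses scaled by hypothetical per-point weights,
    e.g. weights attached to convex hull points (an assumption here)."""
    return sum(w * pinball_loss(m, tau) for m, w in zip(margins, weights))


if __name__ == "__main__":
    for m in (-1.0, 0.0, 1.0, 3.0):
        print(m, hinge_loss(m), pinball_loss(m, tau=0.5))
```

With `tau = 0` the pinball loss reduces to the hinge loss. A positive `tau` makes the loss depend on all samples rather than only those near the decision boundary, which is the usual argument for its robustness to noise around the boundary.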

Keywords