IEEE Access (Jan 2020)

Robust Multi-Linear Fuzzy SVR Designed With the Aid of Fuzzy C-Means Clustering Based on Insensitive Data Information

  • Zheng Wang
  • Cheng Yang
  • Sung-Kwun Oh
  • Zunwei Fu
  • Witold Pedrycz

DOI
https://doi.org/10.1109/ACCESS.2020.3030083
Journal volume & issue
Vol. 8
pp. 184997 – 185011

Abstract

Ensemble learning with multiple SVRs can improve modeling performance, but that performance depends closely on the initial conditions of the partitioning method, and the resulting models are easily affected by noise and outliers. In this study, a noise-robust multi-linear fuzzy support vector regression (MFSVR) is proposed with the aid of a composite kernel function and $\varepsilon$-fuzzy c-means (FCM) clustering based on insensitive data information. Here, insensitive data information refers to the interval data information defined by $\varepsilon$, the insensitive loss parameter of the $\varepsilon$-insensitive loss function. The objective of this study is to reduce the effect of noise and to alleviate the overfitting problem through the synergistic effect of two methods. First, $\varepsilon$-FCM clustering based on insensitive data information is used to take greater account of the impact on the decision boundary and to reduce the effect of noise. Second, a composite kernel based on multiple linear kernel expressions is proposed to implement a multi-linear decision boundary that alleviates the overfitting problem. In more detail, each training data point is assigned membership degrees by the $\varepsilon$-FCM clustering. Data that are potentially noise or outliers are assigned lower membership degrees and therefore make a small contribution (compensation) to the composite kernel function. Then, the composite kernel function for multiple local SVRs is constructed according to the distribution characteristics of the $\varepsilon$-FCM clustering. The proposed MFSVR is tested on both synthetic and UCI data sets to verify its effectiveness and its performance improvement. Experimental results demonstrate that the proposed method performs better than several previously studied methods.
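
To make the two building blocks named in the abstract concrete, the following minimal Python sketch shows the standard $\varepsilon$-insensitive loss and the standard FCM membership update for fixed cluster centers. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify the $\varepsilon$-FCM variant or the composite kernel, so neither is reproduced here, and the function names, the fuzzifier `m = 2`, and the sample points are hypothetical choices.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss: zero inside the eps-tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - eps)


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership degrees for fixed centers (not the paper's eps-FCM).

    u[i][k] = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)), with m > 1 the fuzzifier.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with at least one center: share full
            # membership among the coinciding centers to avoid division by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        memberships.append(row)
    return memberships


if __name__ == "__main__":
    # Hypothetical data: one point near the first center, one far from both.
    centers = [(0.0, 0.0), (10.0, 0.0)]
    points = [(0.5, 0.0), (5.0, 50.0)]
    for point, row in zip(points, fcm_memberships(points, centers)):
        print(point, [round(u, 4) for u in row])
    print(eps_insensitive_loss(1.0, 1.5, eps=0.1))
```

In this sketch the point far from both centers receives no dominant membership, because standard FCM memberships sum to one per point. That gives a rough intuition for why likely outliers can be down-weighted; how the proposed MFSVR uses the memberships inside its composite kernel is described only in the full paper.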

Keywords