EURASIP Journal on Image and Video Processing (Feb 2018)

SIP-FS: a novel feature selection for data representation

  • Yiyou Guo,
  • Jinsheng Ji,
  • Hong Huo,
  • Tao Fang,
  • Deren Li

DOI
https://doi.org/10.1186/s13640-018-0252-3
Journal volume & issue
Vol. 2018, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Multiple features are widely used to characterize real-world datasets. It is desirable to select leading features with stability and interpretability from a set of distinct features for a comprehensive data description. However, most of existing feature selection methods focus on the predictability (e.g., prediction accuracy) of selected results yet neglect stability. To obtain compact data representation, a novel feature selection method is proposed to improve stability, and interpretability without sacrificing predictability (SIP-FS). Instead of mutual information, generalized correlation is adopted in minimal redundancy maximal relevance to measure the relation between different feature types. Several feature types (each contains a certain number of features) can then be selected and evaluated quantitatively to determine what types contribute to a specific class, thereby enhancing the so-called interpretability of features. Moreover, stability is introduced in the criterion of SIP-FS to obtain consistent results of ranking. We conduct experiments on three publicly available datasets using one-versus-all strategy to select class-specific features. The experiments illustrate that SIP-FS achieves significant performance improvements in terms of stability and interpretability with desirable prediction accuracy and indicates advantages over several state-of-the-art approaches.

Keywords