International Journal of Computational Intelligence Systems (May 2020)

Online Streaming Feature Selection via Multi-Conditional Independence and Mutual Information Entropy†

  • Hongyi Wang,
  • Dianlong You

DOI
https://doi.org/10.2991/ijcis.d.200423.002
Journal volume & issue
Vol. 13, no. 1

Abstract

Read online

The goals of feature selection are to remove redundant and irrelevant features from high-dimensional data, extract the “optimal feature subset” of the original feature space to improve the classification accuracy, and reduce the time complexity. Traditional feature selection algorithms are based on static feature spaces that are difficult to apply in dynamic streaming data environments. Existing works, such as Alpha-investing and Online Streaming Feature Selection (OSFS), and Scalable and Accurate OnLine Approach (SAOLA), have been proposed to serve the feature selection with streaming feature, but they have drawbacks, including low prediction accuracy and a large number of selected features if the streaming features exhibit characteristics such as low redundancy and high relevance. To address the limitations of the abovementioned works, we propose the algorithm of Online Streaming Feature Selection via Conditional dependence and Mutual information (OSFSCM) for streaming feature, which is found to be superior to Alpha-investing and OSFS for datasets with low redundancy and high relevance. The efficiency of the proposed OSFSCM algorithm is validated through a performance test on widely used datasets, e.g., NIPS 2003 and Causality Workbench. Through extensive experimental results, we demonstrate that OSFSCM significantly improves the prediction accuracy and requires fewer selected features compared with Alpha-investing and OSFS.

Keywords