IEEE Access (Jan 2019)

Online Feature Selection for Streaming Features Using Self-Adaption Sliding-Window Sampling

  • Dianlong You,
  • Xindong Wu,
  • Limin Shen,
  • Song Deng,
  • Zhen Chen,
  • Chuan Ma,
  • Qiusheng Lian

DOI
https://doi.org/10.1109/ACCESS.2019.2894121
Journal volume & issue
Vol. 7
pp. 16088 – 16100

Abstract


In recent years, online feature selection has been a research topic in streaming feature mining, as it can reduce the dimensionality of streaming features by removing irrelevant and redundant features in real time. There are many representative research efforts on online feature selection with streaming features, e.g., alpha-investing, online streaming feature selection (OSFS), and the scalable and accurate online approach (SAOLA) for feature selection. Among these, alpha-investing has limited prediction accuracy and selects a large number of features. SAOLA sometimes offers outstanding running time and prediction accuracy but also selects a large number of features. OSFS offers high prediction accuracy on many datasets, but its running time increases exponentially as the number of features with low redundancy and high relevance grows. To address the limitations of these works, we propose an online learning algorithm named OSFASW, which samples streaming features in real time using a self-adaption sliding window and discards irrelevant and redundant features by conditional independence. OSFASW obtains an approximate Markov blanket with high prediction accuracy while reducing the number of selected features. The efficiency of the proposed OSFASW algorithm was validated in a performance test on widely used datasets, e.g., NIPS2003 and the Causality Workbench. Extensive experimental results demonstrate that OSFASW significantly improves prediction accuracy and requires fewer selected features than alpha-investing, OSFS, and SAOLA.
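
The general idea of discarding irrelevant and redundant streaming features can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a fixed-size window (no self-adaption), swaps in Pearson correlation and first-order partial correlation with a simple threshold in place of a proper conditional independence test, and all names (`select_streaming`, `window`, `threshold`) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    prod = (1 - rxz ** 2) * (1 - ryz ** 2)
    if prod <= 1e-12:
        return 0.0  # x or y is (almost) fully explained by z
    return (rxy - rxz * ryz) / sqrt(prod)


def select_streaming(features, target, window=5, threshold=0.3):
    """Process (name, values) pairs one at a time; return names of kept features.

    Assumption: a crude threshold on (partial) correlation stands in for a
    conditional independence test, and the window size is fixed.
    """
    selected = []
    for name, values in features:
        # Relevance: drop features only weakly correlated with the target.
        if abs(pearson(values, target)) < threshold:
            continue
        # Redundancy: drop the feature if conditioning on one of the most
        # recent `window` kept features removes its association with the target.
        recent = selected[-window:]
        if any(abs(partial_corr(values, target, kept)) < threshold
               for _, kept in recent):
            continue
        selected.append((name, values))
    return [name for name, _ in selected]
```

Calling `select_streaming` with an iterable of `(name, values)` pairs and a target vector returns the names of the features that survive both checks; conditioning on a single windowed feature at a time keeps the cost per arriving feature linear in the window size.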

Keywords