IEEE Access (Jan 2018)

Feature Selection Based on Term Frequency Reordering of Document Level

  • Hongfang Zhou,
  • Yingjie Zhang,
  • Hongjiang Liu,
  • Yao Zhang

DOI
https://doi.org/10.1109/ACCESS.2018.2868844
Journal volume & issue
Vol. 6
pp. 51655 – 51668

Abstract

In this paper, we propose a new feature selection algorithm based on document-level term frequency reordering. The proposed algorithm uses document frequency to weigh the imbalance factors of the data sets and takes into account the effect of term frequency on the ordering of features by importance. In the experiments, the proposed algorithm is compared with Normalized Difference Measure, Chi-squared, Odds Ratio, Gini Index, and Balanced Accuracy on the WAP, K1a, K1b, RE0, RE1, 20 Newsgroups, Reuters-21578, and RCV1-v2 data sets. The experimental results show that the proposed algorithm outperforms the other five algorithms.
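For orientation, the sketch below illustrates one of the baselines named in the abstract, Chi-squared, computed from document-frequency counts for a two-class problem. It is not the proposed algorithm, whose details the abstract does not give. The function names `chi_squared` and `rank_terms` and the toy corpus are assumptions made for illustration only.

```python
def chi_squared(tp, fp, fn, tn):
    """Chi-squared statistic of a 2x2 term/class contingency table.

    tp: positive-class documents containing the term
    fp: negative-class documents containing the term
    fn: positive-class documents without the term
    tn: negative-class documents without the term
    """
    n = tp + fp + fn + tn
    denom = (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (tp * tn - fp * fn) ** 2 / denom


def rank_terms(docs, labels, positive):
    """Rank terms by Chi-squared score, highest first (hypothetical helper).

    docs is a list of token lists; labels is the parallel list of class labels.
    Only document frequency is used: a term counts once per document.
    """
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    df_pos, df_neg = {}, {}
    for tokens, y in zip(docs, labels):
        target = df_pos if y == positive else df_neg
        for term in set(tokens):
            target[term] = target.get(term, 0) + 1
    scores = {}
    for term in set(df_pos) | set(df_neg):
        tp = df_pos.get(term, 0)
        fp = df_neg.get(term, 0)
        scores[term] = chi_squared(tp, fp, n_pos - tp, n_neg - fp)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Toy corpus (invented for illustration, not taken from the paper's data sets).
docs = [["ball", "goal"], ["ball", "team"], ["vote", "law"], ["vote", "team"]]
labels = ["sport", "sport", "politics", "politics"]
ranking = rank_terms(docs, labels, positive="sport")
```

Because this baseline counts a term once per document, it ignores how often the term occurs within a document, which is the kind of term frequency information the abstract says the proposed algorithm additionally considers.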

Keywords