IEEE Access (Jan 2023)

A More Flexible and Robust Feature Selection Algorithm

  • Tianyi Tu,
  • Ye Su,
  • Yayuan Tang,
  • Wenxue Tan,
  • Sheng Ren

DOI
https://doi.org/10.1109/ACCESS.2023.3342044
Journal volume & issue
Vol. 11
pp. 141512–141522

Abstract


With the increasing volume of real-world data, the high cost of operating large-scale models and their poor generalization capacity make the selection of an appropriate feature set a significant concern. This study proposes ImprovedRFECV, an enhanced approach to cross-validated recursive feature elimination (RFECV). The algorithm first improves the robustness of the optimal feature subset by randomly sampling different portions of the data, building multiple models, and comparing their scores. At the same time, L1 and L2 regularization terms are introduced to evaluate the value of each feature more comprehensively, reducing the impact of interference and further improving the algorithm's accuracy and stability. Furthermore, a multi-model ensemble learning framework is employed to enhance generalization ability and effectively prevent overfitting. Lastly, a both-end expansion removal strategy is adopted to address strong covariance among features while increasing the algorithm's flexibility. Experimental results show that, compared with the RFECV algorithm, ImprovedRFECV selects fewer features on average, and its optimal feature subset performs better across five datasets from five different domains, demonstrating the algorithm's high level of robustness and generalization ability.
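The abstract describes ImprovedRFECV only at a high level; the authors' implementation is not reproduced here. The following minimal Python sketch illustrates the baseline RFECV procedure the work builds on, together with rough stand-ins for two of the ideas mentioned above: repeated random resampling of the data for robustness and an estimator that combines L1 and L2 penalties (elastic net). The dataset, estimator, thresholds, and all parameter values are illustrative assumptions, not the paper's method.

```python
# Sketch (not the authors' code): baseline RFECV plus bootstrap resampling
# and an elastic-net (L1 + L2) estimator, loosely mirroring ideas from the
# abstract. All data and parameter choices below are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.utils import resample

# Synthetic data standing in for a real dataset.
X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=0)

# Elastic-net logistic regression mixes L1 and L2 regularization.
estimator = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, C=1.0, max_iter=5000)

# Run cross-validated recursive feature elimination on several bootstrap
# resamples and keep the features selected most often, a rough stand-in
# for the robustness step described in the abstract.
n_rounds = 5
votes = np.zeros(X.shape[1])
for seed in range(n_rounds):
    X_b, y_b = resample(X, y, random_state=seed)
    selector = RFECV(estimator, step=1,
                     cv=StratifiedKFold(5, shuffle=True, random_state=seed),
                     scoring="accuracy")
    selector.fit(X_b, y_b)
    votes += selector.support_.astype(int)

stable_features = np.where(votes >= n_rounds * 0.6)[0]
print("Features selected in >=60% of rounds:", stable_features)
```

The voting threshold and number of resampling rounds are arbitrary here; the paper's multi-model ensemble and both-end expansion removal strategy are not represented in this sketch.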

Keywords