Jisuanji kexue yu tansuo (Feb 2024)
Ensemble Feature Selection Method with Fast Transfer Model
Abstract
Compared with traditional ensemble feature selection methods, the recently developed ensemble feature selection with block-regularized m×2 cross-validation (EFSBCV) yields an estimator with lower variance than random m×2 cross-validation, and it also raises the selection probability of important features while lowering that of noise features. However, EFSBCV uses a linear regression model without a bias term, which can easily lead to underfitting. Moreover, EFSBCV does not consider the importance of each feature subset. To address these two problems, this paper proposes an ensemble feature selection method called EFSFT (ensemble feature selection method using fast transfer model). The basic idea is to use the fast transfer model proposed in this paper as the base feature selector in EFSBCV, thereby introducing a bias term. EFSFT transfers the 2m feature subsets as source knowledge and then recalculates the weight of each feature subset; with the bias term added, the linear model also fits the data better. Results on real datasets show that, compared with EFSBCV, EFSFT reduces the average FP value by up to 58%, indicating that EFSFT is better at removing noise features. Compared with the least-squares support vector machine (LSSVM), EFSFT increases the average TP value by up to 5%, indicating that EFSFT is superior to LSSVM in choosing important features.
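The underfitting caused by a missing bias term can be illustrated with a minimal sketch. This is not the EFSFT or EFSBCV implementation: the toy data, the single-feature setting, and the helper names `fit_through_origin`, `fit_with_bias`, and `mse` are all assumptions made for illustration. It only shows that a least-squares line forced through the origin cannot fit data with a nonzero offset, whereas a model with a bias term can.

```python
from statistics import mean

def fit_through_origin(x, y):
    """Least-squares slope for y ~ w*x (hypothetical helper, no bias term)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def fit_with_bias(x, y):
    """Least-squares slope and intercept for y ~ w*x + b (hypothetical helper)."""
    mx, my = mean(x), mean(y)
    w = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    return w, my - w * mx

def mse(y, y_hat):
    """Mean squared error between targets and predictions."""
    return mean((a - b) ** 2 for a, b in zip(y, y_hat))

# Toy data (an assumption, not from the paper): a linear target with offset 10.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * a + 10.0 for a in x]

w0 = fit_through_origin(x, y)      # model without bias term
w1, b1 = fit_with_bias(x, y)       # model with bias term

mse_no_bias = mse(y, [w0 * a for a in x])
mse_bias = mse(y, [w1 * a + b1 for a in x])

print("MSE without bias:", mse_no_bias)
print("MSE with bias:   ", mse_bias)
```

On this toy data the model with a bias term recovers the line exactly, while the through-origin model is left with a large residual error. Underfitting of this kind is the motivation the abstract gives for adding a bias term to the base feature selector.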
Keywords