Alexandria Engineering Journal (Jun 2023)
Optimal model selection for k-nearest neighbours ensemble via sub-bagging and sub-sampling with feature weighting
Abstract
This paper proposes two novel approaches, based on feature weighting and model selection, for building more accurate kNN ensembles. The first approach identifies the nearest observations using a feature weighting scheme that weights features with respect to the response variable via support vectors. A randomly selected subset of features is used for both the feature weighting and the model construction. After a sufficiently large number of base models have been built on bootstrap samples, a subset of these models is selected for the final ensemble based on out-of-bag prediction error. The second approach builds the base learners on random subsamples instead of bootstrap samples, again with a random subset of features and with feature weighting applied while the models are built. The observations left out of each subsample are used to assess the corresponding base learner and to select a subset of the models for the final ensemble. The proposed ensemble methods are assessed on 12 benchmark datasets against other classical methods, including kNN-based models. The analyses reveal that the proposed methods often outperform the competing methods.
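As a rough illustration of the pipeline described in the abstract, the following is a minimal pure-Python sketch, not the authors' implementation. It builds many weighted kNN base models on bootstrap samples (or on random subsamples when `bootstrap=False`), scores each on its held-out observations, and keeps the best-scoring models. All function names, parameters and defaults (`n_models`, `n_feats`, `keep`, `frac`, `k`) are assumptions made for illustration. In particular, the abstract derives feature weights from support vectors; here a much simpler stand-in (the gap between per-class feature means) is swapped in, so the weighting step should be read as a placeholder only.

```python
import random
from collections import Counter

def feature_weights(X, y, feats):
    # Stand-in weighting (assumption): gap between per-class means of each
    # feature. The paper uses support vectors instead.
    classes = sorted(set(y))
    w = {}
    for j in feats:
        means = []
        for c in classes:
            vals = [x[j] for x, label in zip(X, y) if label == c]
            means.append(sum(vals) / len(vals))
        w[j] = max(means) - min(means) + 1e-9
    total = sum(w.values())
    return {j: v / total for j, v in w.items()}

def knn_predict(model, x, k=3):
    # model = (training rows, training labels, weights keyed by feature index)
    Xtr, ytr, w = model
    dists = sorted(
        (sum(w[j] * (a[j] - x[j]) ** 2 for j in w), label)
        for a, label in zip(Xtr, ytr)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def build_ensemble(X, y, n_models=50, n_feats=2, keep=10,
                   bootstrap=True, frac=0.7, k=3, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    scored = []
    for _ in range(n_models):
        if bootstrap:
            # Approach 1: sample n rows with replacement
            idx = [rng.randrange(n) for _ in range(n)]
        else:
            # Approach 2: sample a fraction of rows without replacement
            idx = rng.sample(range(n), int(frac * n))
        chosen = set(idx)
        held_out = [i for i in range(n) if i not in chosen]
        feats = rng.sample(range(p), n_feats)  # random feature subset
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(ys)) < 2 or not held_out:
            continue  # skip degenerate samples
        model = (Xs, ys, feature_weights(Xs, ys, feats))
        # Error on the observations this base model never saw
        err = sum(knn_predict(model, X[i], k) != y[i]
                  for i in held_out) / len(held_out)
        scored.append((err, model))
    scored.sort(key=lambda t: t[0])
    return [m for _, m in scored[:keep]]  # keep the lowest-error models

def ensemble_predict(models, x, k=3):
    # Majority vote over the selected base models
    votes = Counter(knn_predict(m, x, k) for m in models)
    return votes.most_common(1)[0][0]

def make_toy(n_per_class=30, seed=0):
    # Hypothetical toy data: features 0-1 are informative, 2-3 are noise
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5),
                      rng.random(), rng.random()])
            y.append(label)
    return X, y
```

Calling `build_ensemble(X, y, bootstrap=True)` corresponds loosely to the first approach and `bootstrap=False` to the second. In this sketch the only difference between the two is how the training rows are drawn and, consequently, which rows remain for assessing each base model.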