Journal of ICT (Feb 2021)
ENSEMBLE META CLASSIFIER WITH SAMPLING AND FEATURE SELECTION FOR DATA WITH IMBALANCE MULTICLASS PROBLEM
Abstract
Ensemble learning, which combines several single classifiers or other ensemble classifiers, is one approach to solving the imbalance problem in multiclass data. However, it remains an open question how ensemble methods achieve their higher performance. This paper investigates the design of an ensemble meta classifier with sampling and feature selection for imbalanced multiclass data. The specific objectives are 1) to improve the ensemble classifier through a data-level approach (sampling and feature selection); 2) to perform experiments on sampling, feature selection, and the ensemble classifier model; and 3) to evaluate the performance of the ensemble classifier. To fulfill these objectives, a preliminary dataset of Malaysian plant leaf images was prepared and used in the experiments, and the results were compared. The ensemble design was also tested on three further benchmark datasets with high imbalance ratios. The design that combines sampling, feature selection, and an ensemble classifier, namely AdaboostM1 with Random Forest (itself an ensemble classifier), provided improved performance throughout the investigation. The results are relevant to the ongoing problem of multiclass imbalance, where a specific ensemble structure can improve performance in terms of processing time and accuracy.
Keywords