IEEE Access (Jan 2019)

A Novel Diversity Measure and Classifier Selection Approach for Generating Ensemble Classifiers

  • Muhammad Zohaib Jan
  • Brijesh Verma

DOI
https://doi.org/10.1109/ACCESS.2019.2949059
Journal volume & issue
Vol. 7
pp. 156360 – 156373

Abstract

Accuracy and diversity are considered to be the two driving factors when it comes to generating an ensemble classifier. Focusing only on accuracy causes the ensemble classifier to suffer from “diminishing returns”, and the ensemble accuracy tends to plateau, whereas focusing only on diversity causes the ensemble classifier to suffer in accuracy. Therefore, a balance must be maintained between the two for the ensemble classifier to achieve high classification accuracy. In this paper, we propose a novel diversity measure known as Misclassification Diversity (MD) and an Incremental Layered Classifier Selection (ILCS) approach to generate an ensemble classifier. The proposed approach, ILCS-MD, generates an ensemble classifier by incrementally selecting classifiers from the base classifier pool based on increasing accuracy and diversity. The benefits are twofold: 1) the generated ensemble classifier contains only those classifiers from the pool that maximize accuracy while maintaining or increasing diversity, and 2) the generated ensemble classifier selects only a few classifiers from the base classifier pool, thus also reducing the number of ensemble components. The proposed approach is evaluated on 55 benchmark datasets taken from the UCI and KEEL dataset repositories. The results are compared with five existing pairwise diversity measures and with existing state-of-the-art ensemble classifier approaches. A significance test is also conducted to verify the significance of the results.
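
The general idea of incremental selection on accuracy and diversity can be illustrated with a minimal sketch. This is not the ILCS-MD procedure: the abstract does not define MD or the layering scheme, so the sketch substitutes the well-known pairwise disagreement measure for MD, uses plain majority voting, and adopts an assumed acceptance rule (keep a candidate if neither ensemble accuracy nor diversity drops and at least one of them rises). All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.
    Ties go to the label of the earliest-listed classifier."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def accuracy(predicted, truth):
    """Fraction of samples labelled correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def mean_disagreement(predictions):
    """Mean pairwise disagreement: the average fraction of samples on which
    two classifiers differ. A stand-in for MD, which is not defined here."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 0.0
    rates = [sum(a != b for a, b in zip(p, q)) / len(p) for p, q in pairs]
    return sum(rates) / len(rates)


def incremental_select(pool, truth):
    """Greedy selection over `pool`, a dict mapping a classifier name to its
    predictions on a validation set. Assumed rule, not the paper's."""
    order = sorted(pool, key=lambda name: accuracy(pool[name], truth), reverse=True)
    selected = [order[0]]  # seed with the most accurate single classifier
    best_acc, best_div = accuracy(pool[order[0]], truth), 0.0
    for name in order[1:]:
        trial = selected + [name]
        preds = [pool[n] for n in trial]
        acc = accuracy(majority_vote(preds), truth)
        div = mean_disagreement(preds)
        no_loss = acc >= best_acc and div >= best_div
        some_gain = acc > best_acc or div > best_div
        if no_loss and some_gain:
            selected, best_acc, best_div = trial, acc, div
    return selected, best_acc, best_div


# Toy validation set: A, B and C each err on different samples; D duplicates A.
y_true = [0, 1, 0, 1, 0, 1, 0, 1]
pool = {
    "A": [0, 1, 0, 1, 0, 1, 1, 0],
    "B": [0, 1, 0, 1, 1, 0, 0, 1],
    "C": [1, 0, 0, 1, 0, 1, 0, 1],
    "D": [0, 1, 0, 1, 0, 1, 1, 0],
}
selected, acc, div = incremental_select(pool, y_true)
print(selected, acc, div)
```

In this toy pool the three classifiers with complementary errors are kept and the duplicate is rejected, because adding it lowers both the voted accuracy and the mean disagreement. This mirrors the two benefits claimed above: only useful members are retained, and the ensemble stays small.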

Keywords