A Hybrid MultiLayer Perceptron Under-Sampling with Bagging Dealing with a Real-Life Imbalanced Rice Dataset

Moussa Diallo; Shengwu Xiong; Eshete Derb Emiru; Awet Fesseha; Aminu Onimisi Abdulsalami; Mohamed Abd Elaziz

doi:10.3390/info12080291

Information (Jul 2021)

Affiliations

Moussa Diallo: School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Shengwu Xiong: School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Eshete Derb Emiru: School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Awet Fesseha: School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Aminu Onimisi Abdulsalami: School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
Mohamed Abd Elaziz: School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

DOI: https://doi.org/10.3390/info12080291
Journal volume & issue: Vol. 12, no. 8
p. 291

Abstract

Classification algorithms have shown exceptional prediction results in supervised learning. However, these algorithms are not always effective on real-life datasets, which are generally imbalanced in their class distributions. Several methods have been proposed to solve the problem of class imbalance. In this paper, we propose a hybrid method that combines preprocessing techniques with ensemble learning. The original training set is undersampled by evaluating the samples with a stochastic measurement (SM) and then training a Multilayer Perceptron (MLP) on the selected samples to return a balanced training set. The balanced training set produced by MLPUS (Multilayer Perceptron Under-Sampling) is then aggregated using the bagging ensemble method. We applied our method to the real-life Niger_Rice dataset and to forty-four other imbalanced datasets from the KEEL repository. We also compared our method with six existing methods from the literature: the MLP classifier on the original imbalanced dataset, MLPUS, UnderBagging (random under-sampling combined with bagging), RUSBoost, SMOTEBagging (Synthetic Minority Oversampling Technique combined with bagging), and SMOTEBoost. The results show that our method is competitive with the other methods. On the real-life Niger_Rice dataset, our proposed method achieves 75.6, 0.73, 0.76, and 0.86 for accuracy, F-measure, G-mean, and ROC, respectively. In contrast, the MLP classifier on the original imbalanced Niger_Rice dataset achieves 72.44, 0.82, 0.59, and 0.76 for accuracy, F-measure, G-mean, and ROC, respectively.
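
The abstract reports accuracy, F-measure, and G-mean side by side. The minimal Python sketch below shows one common way to compute these three metrics from binary confusion counts, and why accuracy alone can look good on imbalanced data. It assumes the minority class is the positive class and that G-mean is the square root of recall times specificity; the abstract does not state which definitions the authors used. The function name `imbalance_metrics` and all counts are hypothetical and are not taken from the paper.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """Return (accuracy, F-measure, G-mean) from binary confusion counts.

    Assumption: the positive class is the minority class.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return accuracy, f_measure, g_mean

# Hypothetical counts (not from the paper): 10 minority and 90 majority samples.
# A classifier that labels every sample as the majority class:
always_majority = imbalance_metrics(tp=0, fn=10, fp=0, tn=90)
# A classifier that gives up some majority accuracy to gain minority recall:
balanced = imbalance_metrics(tp=8, fn=2, fp=18, tn=72)

print("always majority:", always_majority)
print("balanced:       ", balanced)
```

In this made-up example the first classifier has the higher accuracy but a G-mean of zero, because it never detects the minority class. The second has lower accuracy but a much higher G-mean.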

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
