Adaptive Data Balancing Method Using Stacking Ensemble Model and Its Application to Non-Technical Loss Detection in Smart Grids

Ashraf Ullah; Nadeem Javaid; Muhammad Umar Javed; Pamir; Byung-Seo Kim; Saeed Ali Bahaj

doi:10.1109/ACCESS.2022.3230952

IEEE Access (Jan 2022)

Affiliations

Ashraf Ullah: Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
Nadeem Javaid: Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
Muhammad Umar Javed: Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
Pamir: Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
Byung-Seo Kim: Department of Computer and Information Communications Engineering, Hongik University, Sejong, South Korea
Saeed Ali Bahaj: Department of Management Information Systems, College of Business Administration, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia

Journal volume & issue: Vol. 10, pp. 133244-133255

Abstract

This paper proposes a stacking ensemble model (SEM) for identifying non-technical losses. The proposed model has three layers. The first layer performs data pre-processing, handling missing values, data normalization, and data imbalance. A simple imputer fills in missing values and min-max scaling normalizes the data, while ADASYN and TomekLink are used in combination to address the data imbalance. The second layer employs three machine learning models as base classifiers: random forest (RF), extra tree (ET), and extreme gradient boosting (XGBoost). At the third layer, the outputs of the base classifiers are ensembled and passed to a ridge classifier, which serves as the meta classifier and performs the final classification. The model is trained and tested on real-time data from the State Grid Corporation of China (SGCC). Its performance is validated through multiple simulations using various performance indicators, and it is found to surpass the standalone classifiers in electricity theft detection (ETD).

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
