IEEE Access (Jan 2022)
Adaptive Data Balancing Method Using Stacking Ensemble Model and Its Application to Non-Technical Loss Detection in Smart Grids
Abstract
This paper proposes a stacking ensemble model (SEM) for detecting non-technical losses in smart grids. The model has three layers. The first layer performs data pre-processing: missing values are filled with a simple imputer, the data are normalized with min-max scaling, and class imbalance is addressed by combining ADASYN oversampling with Tomek links undersampling. The second layer trains three machine learning models as base classifiers: random forest (RF), extra trees (ET), and extreme gradient boosting (XGBoost). The third layer ensembles the outputs of the base classifiers and passes them to a ridge classifier, which serves as the meta classifier and produces the final classification. The model is trained and tested on real-time data from the State Grid Corporation of China (SGCC). Multiple simulations with various performance indicators show that the proposed model outperforms the standalone classifiers in electricity theft detection (ETD).
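
As a rough illustration of the first-layer imputation and normalization steps named in the abstract, the sketch below fills missing readings and then rescales each feature to [0, 1] using x' = (x - min) / (max - min). It is not the authors' implementation. The function names, the toy readings, the choice of mean imputation (the abstract does not state which strategy the simple imputer uses), and the column-wise scaling are all assumptions made for this example. The resampling step (ADASYN with Tomek links) and the classifiers are not covered.

```python
import math


def impute_mean(rows):
    """Replace NaN entries with the mean of the observed values in the same column.

    Assumption: mean imputation; the abstract only says "a simple imputer".
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Assumption: a column with no observed values is filled with 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)] for r in rows]


def min_max_scale(rows):
    """Rescale each column to [0, 1] via (x - min) / (max - min)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [
            # A constant column has zero range; map it to 0.0 to avoid dividing by zero.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(r, lows, highs)
        ]
        for r in rows
    ]


# Hypothetical toy readings (rows = consumers, columns = days); NaN marks a missing value.
readings = [
    [2.0, float("nan"), 6.0],
    [4.0, 10.0, 6.0],
    [6.0, 20.0, float("nan")],
]

# Impute first so that the min and max are computed on complete columns.
clean = min_max_scale(impute_mean(readings))
print(clean)
```

Imputing before scaling keeps NaN values out of the min and max computations. In a train/test setting, the column means, minima, and maxima would normally be computed on the training split only and then reused on the test split.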
Keywords