Energy Exploration & Exploitation (Sep 2024)
Improving Electricity Theft Detection Using Electricity Information Collection System and Customers’ Consumption Patterns
Abstract
Electricity theft detection (ETD) techniques used to identify fraudulent consumers often fail to pinpoint electricity thieves accurately in real time. Such techniques typically analyze electricity consumption patterns to identify anomalies indicative of theft. However, benchmark ETD methods suffer from overfitting and a high incidence of false positives (FPs), because they build usage profiles from consumption data alone and do not account for the external factors that cause normal consumption to vary. Further investigation is therefore required to detect electricity theft precisely, thereby minimizing nontechnical losses and supporting forecasts of future electricity demand to maintain a stable supply. This study employs a master energy meter located on the transformer side to monitor the amount of energy distributed to a region. Zones with a high likelihood of energy theft are detected by comparing the sum of the readings from all smart meters with the reading of the master energy meter installed on the high-voltage (HV) side of the substation transformer on the same electric feeder. The ensemble XGBoost machine-learning algorithm is used to classify malicious and nonmalicious samples, and the K-Means algorithm is used to group similar types of consumers. This combination makes the proposed model robust to false positives caused by unintentional changes in usage patterns. Furthermore, energy thieves are identified by detecting anomalies in consumption behavior while internal and external environmental variables are held constant. The proposed model reduces the FP rate reported in existing research on electricity usage data. In simulations, our approach outperforms support vector machines, convolutional neural networks, and logistic regression.
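The feeder-level comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `flag_suspect_zones`, the per-zone dictionaries, and the 5% `loss_tolerance` (a stand-in for expected technical losses) are all assumptions introduced here for illustration.

```python
from typing import Dict, List


def flag_suspect_zones(
    master_kwh: Dict[str, float],
    smart_meter_kwh: Dict[str, List[float]],
    loss_tolerance: float = 0.05,  # assumed allowance for technical losses
) -> Dict[str, float]:
    """Return {zone: unaccounted fraction} for zones above the tolerance.

    master_kwh:      energy recorded by the master meter for each zone
    smart_meter_kwh: readings of the individual smart meters in each zone
    """
    suspects: Dict[str, float] = {}
    for zone, supplied in master_kwh.items():
        if supplied <= 0:
            continue  # nothing was distributed, so no ratio can be formed
        billed = sum(smart_meter_kwh.get(zone, []))
        unaccounted = (supplied - billed) / supplied
        if unaccounted > loss_tolerance:
            suspects[zone] = unaccounted
    return suspects


# Hypothetical readings for two zones over the same interval (kWh).
master = {"A": 1000.0, "B": 500.0}
meters = {"A": [300.0, 300.0, 380.0], "B": [100.0, 150.0, 150.0]}
suspect_zones = flag_suspect_zones(master, meters)
```

In this made-up example, zone A has 2% of its energy unaccounted for and stays below the tolerance, while zone B has 20% unaccounted for and is flagged for closer, per-consumer analysis.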
The model is validated using precision, recall, F1-score, the Matthews correlation coefficient (MCC), the area under the receiver operating characteristic curve (ROC-AUC), and the area under the precision-recall curve (PR-AUC). The simulation results show that the proposed K-Means-LSTM-XGBoost model achieves an F1-score, precision, and recall of 93.75%, 95.16%, and 92.38%, respectively. Our model classifies large volumes of time-series data with high precision and can be used by utilities for real-time electricity theft detection.
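For reference, the threshold-based metrics named above follow the standard confusion-matrix definitions. The sketch below uses those textbook formulas, with helper names chosen here for illustration. It is not the authors' evaluation code, and it omits ROC-AUC and PR-AUC, which require ranked classifier scores rather than counts.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of flagged consumers that are actual thieves."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual thieves that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_score(0.9516, 0.9238)
```

Evaluating `f1_score(0.9516, 0.9238)` gives approximately 0.9375, which agrees with the reported F1-score of 93.75%.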