Energies (Jul 2023)

Enhancing Electricity Theft Detection through K-Nearest Neighbors and Logistic Regression Algorithms with Synthetic Minority Oversampling Technique: A Case Study on State Electricity Company (PLN) Customer Data

  • Yan Maraden,
  • Gunawan Wibisono,
  • I Gde Dharma Nugraha,
  • Budi Sudiarto,
  • Fauzan Hanif Jufri,
  • Kazutaka,
  • Anton Satria Prabuwono

DOI
https://doi.org/10.3390/en16145405
Journal volume & issue
Vol. 16, no. 14
p. 5405

Abstract

Read online

Electricity theft has caused massive losses and damage to electricity utilities. The damage affects the electricity supply’s quality and increases the generation load. The losses happen not only for the electricity utilities but also affect the legitimate users who have to pay excessive electricity bills. That is why the method to detect electricity theft is indispensable. Recently, machine learning algorithms have been used to develop a model for detecting electricity theft. However, most algorithms have problems due to imbalanced data, overfitting issues, and lack of data. Therefore, this paper proposes a solution that implements the oversampling technique to address the problems and increase the developed model’s accuracy. It is used to perform oversampling on the imbalanced dataset. Our proposed method consists of a pre-processing step to remove empty values and extract several parameters. After that, the oversampling technique is performed on the result of the pre-processing step. The logistic regression model combined with the oversampling techniques shows the best performance results on the developed model of electricity theft detection based on the state electricity company customers. The experiment shows that the proposed method, logistic regression combined with the synthetic minority oversampling technique, shows superior performance in terms of the accuracy of the training data and data testing, precision, recall, and F1-scores of 98.97%, 98.7%, 95%, 99%, and 97%, respectively. Moreover, the experiment also shows that the proposed solution outperforms existing methods.

Keywords