Accurate Detection of Electricity Theft Using Classification Algorithms and Internet of Things in Smart Grid

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Electricity theft is one of the largest contributors to non-technical losses. It raises costs for legitimate consumers, degrades supply quality, and increases the load on generation. Advances in Internet of Things (IoT)-based sensors have changed how consumers' electricity consumption patterns are monitored, and the resulting consumption data can be processed by classification algorithms to identify theft. These data are, however, imbalanced by nature. The objective of this study is to design a machine learning model in which the data imbalance is addressed using six data balancing techniques: Synthetic Minority Over-sampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Random Over Sampler, Support Vector Machine-SMOTE (SVM-SMOTE), SMOTE with Edited Nearest Neighbor (SMOTEENN), and SMOTE with Tomek Links. The designed model consists of two stages. In the first stage, twelve classification algorithms (Decision Tree, AdaBoost, Extra Tree, Logistic Regression, XGBoost, LightGBM, Multi-Layer Perceptron, Bagging, Random Forest, Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor) are applied to the data balanced by each of the six techniques. In the second stage, two ensemble techniques, maximum voting and stacking, are applied to the five best-performing algorithms. The experiments are implemented in Python. A dataset from the State Grid Corporation of China (SGCC) is used, and the algorithms are compared on accuracy, Matthews Correlation Coefficient (MCC), F1-score, and log-loss. Among all experiments performed, SMOTEENN combined with the stacking ensemble gives the highest performance: an accuracy of 97.67%, an MCC of 0.9434, a log-loss of 1.01, and an F1-score of 97.88%. The proposed model performs around 3% better than the results reported in the literature. Moreover, the effectiveness of the model is validated with a one-way analysis of variance (ANOVA) statistical test.
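For readers unfamiliar with the four comparison metrics and with maximum voting, the sketch below shows the textbook definitions in plain Python. It is an illustration only, not the authors' implementation: the function names are hypothetical, labels are assumed to be binary with 1 marking theft, and the abstract does not state how the reported log-loss was computed, so the standard probability-based definition is assumed here.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks theft (assumed encoding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient in [-1, 1]; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def log_loss(y_true, p_theft, eps=1e-15):
    """Mean binary cross-entropy; p_theft holds predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_theft):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


def max_vote(predictions):
    """Majority vote over several classifiers' label lists (one inner list per classifier).

    A tie resolves to 0; with an odd number of classifiers, such as five, ties cannot occur.
    """
    return [1 if 2 * sum(labels) > len(labels) else 0 for labels in zip(*predictions)]
```

A typical use would be to pass the output of `confusion_counts` to the three count-based metrics, for example `mcc(*confusion_counts(y_true, y_pred))`, and to call `max_vote` on the label lists produced by the selected base classifiers before scoring the voted labels.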

Abbreviations

AdaBoost: Adaptive Boosting
ANOVA: Analysis of Variance
ADASYN: Adaptive Synthetic Sampling
ANN: Artificial Neural Network
AUC: Area Under Curve
CNN: Convolutional Neural Network
DANN: Deep Artificial Neural Network
DT: Decision Tree
EMD: Empirical Mode Decomposition
ET: Extra Tree
ETD: Electricity Theft Detection
FA: Firefly Algorithm
FIS: Fuzzy Inference System
GA: Genetic Algorithm
KNN: K-Nearest Neighbor
LightGBM: Light Gradient Boosting Machine
LR: Logistic Regression
LSTM: Long Short-Term Memory
MAP: Mean Average Precision
MIN–MAX: Minimum–Maximum
MCC: Matthews Correlation Coefficient
ML: Machine Learning
MLP: Multi-Layer Perceptron
NB: Naïve Bayes
OPF: Optimum Path Forest
PSO: Particle Swarm Optimization
RF: Random Forest
ROC: Receiver Operating Characteristic
RUSBoost: Random Undersampling Boosting
SGCC: State Grid Corporation of China
SVM: Support Vector Machine
SMOTE: Synthetic Minority Over-sampling Technique
SMOTEENN: Synthetic Minority Over-sampling Technique with Edited Nearest Neighbor
SVM-SMOTE: Support Vector Machine-Synthetic Minority Over-sampling Technique
VGG: Visual Geometry Group
XGBoost: eXtreme Gradient Boosting

Author information

Corresponding author

Correspondence to Alisha Banga.

Cite this article

Banga, A., Ahuja, R. & Sharma, S.C. Accurate Detection of Electricity Theft Using Classification Algorithms and Internet of Things in Smart Grid. Arab J Sci Eng 47, 9583–9599 (2022). https://doi.org/10.1007/s13369-021-06313-z

  • DOI: https://doi.org/10.1007/s13369-021-06313-z
