Selecting Electrical Billing Attributes: Big Data Preprocessing Improvements

Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 643)


Attribute selection is an important data preprocessing step in knowledge discovery in databases. Its main objective is to eliminate irrelevant and/or redundant attributes so that the problem becomes computationally tractable without affecting the quality of the solution. Various techniques have been proposed, mainly following two approaches: wrapper and ranking. This article evaluates a novel approach proposed by Bradley and Mangasarian (Machine learning ICML. Morgan Kaufmann, San Francisco, CA, pp. 82–90, 1998 [1]), which uses concave programming to minimize both the classification error and the number of attributes required to perform the task. The technique is evaluated on an electric service billing database from Colombia. The results are compared against those of traditional techniques in terms of attribute reduction, processing time, size of the discovered knowledge, and solution quality.


Electric billing; Concave programming; Data mining; Electric service billing


  1. Bradley P, Mangasarian O (1998) Feature selection via concave minimization and support vector machines. In: Shavlik J (ed) Machine learning ICML. Morgan Kaufmann, San Francisco, CA, pp 82–90
  2. Hu C, Du S, Su J et al (2016) Discussion on the ways of purchasing and selling electricity and the mode of operation in China’s electricity sales companies under the background of new electric power reform. Power Netw Technol 40(11):3293–3299
  3. Xue Y, Lai Y (2016) The integration of great energy thinking and big datas thinking: big data and electricity big data. Power Syst Autom 40(1):1–8
  4. Wang Y, Chen Q, Kang C et al (2017) Clustering of electricity consumption behaviour dynamics toward big data applications. IEEE Trans Smart Grid 7(5):2437–2447
  5. Liu R, Feng G, Ding W (2011) Statistical analysis and application of SAS. China Machine Press, China
  6. Ozger M, Cetinkaya O, Akan OB (2017) Energy harvesting cognitive radio networking for IoT-enabled smart grid. Mob Netw Appl 23(4):956–966
  7. Isasi P, Galván I (2004) Redes de Neuronas Artificiales. Un enfoque Práctico. Pearson, London. ISBN 8420540250
  8. Mangasarian O (1997) Arbitrary-norm separating plane. Technical report 97-07, Computer Science Dept., Univ. Wisconsin Madison
  9. Bradley P, Fayyad U, Mangasarian O (1999) Mathematical programming for data mining: formulations and challenges. INFORMS J Comput 11:217–238
  10. Rahmani AM, Liljeberg P, Preden J, Jantsch A (2018) Fog computing in the internet of things. Springer, New York. ISBN 978-3-319-57638-1, ISBN 978-3-319-57639-8 (eBook)
  11. Gangurde HD (2014) Feature selection using clustering approach for big data. Int J Comput Appl Innov Trends Comput Commun Eng (ITCCE):1–3
  12. Abualigah LM, Khader AT, Al-Beta MA, Alomari OA (2017) Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering. Expert Syst Appl 84:24–36
  13. Sanchez L, Vásquez C, Viloria A, Cmeza-estrada (2018) Conglomerates of Latin American countries and public policies for the sustainable development of the electric power generation sector. In: Tan Y, Shi Y, Tang Q (eds) Data mining and big data. DMBD 2018. Lecture notes in computer science, vol 10943. Springer, Cham
  14. Sánchez L, Vásquez C, Viloria A, Rodríguez Potes L (2018) Greenhouse gases emissions and electric power generation in Latin American countries in the period 2006–2013. In: Tan Y, Shi Y, Tang Q (eds) Data mining and big data. DMBD 2018. Lecture notes in computer science, vol 10943. Springer, Cham
  15. Perez R et al (2018) Fault diagnosis on electrical distribution systems based on fuzzy logic. In: Tan Y, Shi Y, Tang Q (eds) Advances in swarm intelligence. ICSI 2018. Lecture notes in computer science, vol 10942. Springer, Cham
  16. Silva V, Jesús A (2013) Indicators systems for evaluating the efficiency of political awareness of rational use of electricity. In: Advanced materials research, vol 601. Trans Tech Publications, Switzerland, pp 618–625
  17. Perez R, Inga E, Aguila A, Vásquez C, Lima L, Viloria A, Henry MA (2018) Fault diagnosis on electrical distribution systems based on fuzzy logic. In: International conference on sensing and imaging, June. Springer, Cham, pp 174–185
  18. Perez R, Vásquez C, Viloria A (2019) An intelligent strategy for faults location in distribution networks with distributed generation. J Intell Fuzzy Syst 36(2):1627–1637 (Preprint)

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Universidad de la Costa, Barranquilla, Colombia
  2. Universidad Simón Bolívar, Barranquilla, Colombia
  3. Corporación Universitaria Minuto de Dios (UNIMINUTO), Barranquilla, Colombia
  4. Corporación Universitaria Latinoamericana, Barranquilla, Colombia
  5. Universidad Tecnológica Centroamericana (UNITEC), San Pedro Sula, Honduras
