A comparative study of optimization models in genetic programming-based rule extraction problems

  • Marconi de Arruda Pereira
  • Eduardo Gontijo Carrano
  • Clodoveu Augusto Davis Júnior
  • João Antônio de Vasconcelos
Methodologies and Application

Abstract

In this manuscript, we identify and evaluate some of the most widely used optimization models for rule extraction with genetic programming-based algorithms. Six models, which combine the most common fitness functions, were tested. These functions employ well-known metrics such as support, confidence, sensitivity, specificity, and accuracy. The models were then used to assess the performance of a single algorithm on several real classification problems. Results were compared using two criteria: accuracy and sensitivity/specificity. This comparison, supported by statistical analysis, indicated that the product of sensitivity and specificity provides a more realistic estimate of classifier performance. It also showed that the accuracy metric can bias the classifier, especially on unbalanced databases.
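As a rough illustration of the last two claims, the sketch below computes the five metrics named above from a confusion matrix and compares accuracy with the product of sensitivity and specificity on an unbalanced data set. The definitions are the standard textbook ones and are assumed here, not taken from the paper; the class `Confusion`, the function `metrics`, and the record counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # positives correctly covered by the rule
    fp: int  # negatives wrongly covered by the rule
    tn: int  # negatives correctly left uncovered
    fn: int  # positives wrongly left uncovered


def _ratio(num: float, den: float) -> float:
    # Assumed convention: an undefined ratio (zero denominator) counts as 0.
    return num / den if den else 0.0


def metrics(c: Confusion) -> dict:
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = _ratio(c.tp, c.tp + c.fn)
    specificity = _ratio(c.tn, c.tn + c.fp)
    return {
        "support": _ratio(c.tp, total),            # records matching rule and class
        "confidence": _ratio(c.tp, c.tp + c.fp),   # precision of the rule
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": _ratio(c.tp + c.tn, total),
        "sens_x_spec": sensitivity * specificity,
    }


# Hypothetical unbalanced data set: 10 positive and 990 negative records.
# A rule that never fires labels every record as negative.
never_fires = metrics(Confusion(tp=0, fp=0, tn=990, fn=10))
# A rule that finds most positives at the cost of some false alarms.
useful_rule = metrics(Confusion(tp=8, fp=99, tn=891, fn=2))

for name, m in (("never fires", never_fires), ("useful rule", useful_rule)):
    print(f"{name}: accuracy={m['accuracy']:.3f}, sens*spec={m['sens_x_spec']:.3f}")
```

With these made-up counts, the rule that never fires gets the higher accuracy even though it detects no positive record, while the sensitivity-specificity product ranks the two rules the other way round. This is the kind of bias the abstract attributes to the accuracy metric.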

Keywords

  • Classification rules
  • Genetic programming
  • Multi-objective optimization
  • Optimization model assessment

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Department of Technologies of Civil Engineering, Computation and Humanities, UFSJ/CAP, Ouro Branco, Brazil
  2. Electrical Engineering Department (DEE/UFMG), UFMG, Belo Horizonte, Brazil
  3. Computer Science Department (DCC/UFMG), UFMG, Belo Horizonte, Brazil
  4. Evolutionary Computation Laboratory (LCE/UFMG), Electrical Engineering Department (DEE/UFMG), UFMG, Belo Horizonte, Brazil
