In this manuscript, we identify and evaluate some of the most widely used optimization models for rule extraction with genetic programming-based algorithms. Six models, which combine the most common fitness functions, were tested. These functions employ well-known metrics such as support, confidence, sensitivity, specificity, and accuracy. The models were then used to assess the performance of a single algorithm on several real classification problems. Results were compared using two criteria: accuracy and sensitivity/specificity. This comparison, supported by statistical analysis, indicated that the product of sensitivity and specificity provides a more realistic estimate of classifier performance. It was also shown that the accuracy metric can bias the classifier, especially on unbalanced databases.
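The contrast between accuracy and the product of sensitivity and specificity can be illustrated with a minimal sketch. It is not the authors' implementation: the `Confusion` class, the function names, and the two example confusion matrices are hypothetical, and the formulas are the usual textbook definitions for a binary rule, which may differ in detail from the fitness functions tested in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one rule (hypothetical container)."""
    tp: int  # positives covered by the rule
    fp: int  # negatives covered by the rule
    tn: int  # negatives not covered by the rule
    fn: int  # positives not covered by the rule


def _ratio(num: int, den: int) -> float:
    # Return 0.0 for an empty denominator instead of raising.
    return num / den if den else 0.0


def accuracy(c: Confusion) -> float:
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: Confusion) -> float:
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    return _ratio(c.tn, c.tn + c.fp)


def sens_spec_product(c: Confusion) -> float:
    return sensitivity(c) * specificity(c)


def support(c: Confusion) -> float:
    # Assumed definition: share of all examples that the rule covers correctly.
    return _ratio(c.tp, c.tp + c.fp + c.tn + c.fn)


def confidence(c: Confusion) -> float:
    # Assumed definition: share of covered examples that are truly positive.
    return _ratio(c.tp, c.tp + c.fp)


# Illustrative unbalanced problem: 50 positives among 1000 examples.
never_fires = Confusion(tp=0, fp=0, tn=950, fn=50)    # rule that covers nothing
useful_rule = Confusion(tp=40, fp=95, tn=855, fn=10)  # rule that finds most positives

if __name__ == "__main__":
    for name, c in [("never_fires", never_fires), ("useful_rule", useful_rule)]:
        print(name, round(accuracy(c), 3), round(sens_spec_product(c), 3))
```

In this made-up case the rule that never fires reaches an accuracy of 0.95 against 0.895 for the useful rule, so an accuracy-driven fitness would favor it. The product of sensitivity and specificity is 0 for the first rule and about 0.72 for the second, which is the kind of bias on unbalanced data that the abstract describes.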
Keywords: Classification rules, Genetic programming, Multi-objective optimization, Optimization model assessment
We thank the Laboratory of Evolutionary Computation (UFMG) and the UCI Machine Learning Repository for providing the datasets used in the experiments. This work has been supported in part by CNPq, CAPES and FAPEMIG, Brazilian agencies in charge of fostering scientific and technological development. They are governmental agencies of Brazil and of the state of Minas Gerais whose only interest is in generating knowledge relevant to society at large, so any conflict of interest involving them is ruled out.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.