Abstract
In the context of data mining, classification rule discovery is the task of designing accurate rule-based systems that model the useful knowledge present in large data sets, namely the knowledge that differentiates some data classes from others.
Iterated greedy search is a powerful metaheuristic that has been successfully applied to different optimisation problems but, to our knowledge, has not previously been used for classification rule mining.
In this work, we analyse the suitability of iterated greedy algorithms for the design of rule classification systems. We present and study different alternatives and compare the results with state-of-the-art methodologies from the literature. The results show that iterated greedy search may generate accurate rule classification systems with acceptable interpretability levels.
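As a rough illustration of the general idea, the sketch below applies a generic iterated greedy loop (partially destroy the current solution, then greedily rebuild it) to a toy ordered rule list. It is not the algorithm studied in the chapter: the data set, the single-condition rule encoding, the accuracy-only objective, and all function names and parameters are assumptions made for this example.

```python
import random

# Hypothetical toy data: (attribute values, class label).
DATA = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
DEFAULT = "stay"  # class predicted when no rule fires (assumption)


def candidate_rules(data):
    """All single-condition rules (attribute, value, class) seen in the data."""
    classes = {y for _, y in data}
    return sorted({(a, v, c) for x, _ in data for a, v in x.items() for c in classes})


def predict(rules, x):
    """Ordered rule list: the first rule whose condition matches decides."""
    for attr, val, label in rules:
        if x.get(attr) == val:
            return label
    return DEFAULT


def accuracy(rules, data):
    return sum(predict(rules, x) == y for x, y in data) / len(data)


def greedy_construct(rules, data, max_rules=3):
    """Append the rule that most improves accuracy until nothing improves."""
    rules = list(rules)
    pool = candidate_rules(data)
    while len(rules) < max_rules:
        best = max(pool, key=lambda r: accuracy(rules + [r], data))
        if accuracy(rules + [best], data) <= accuracy(rules, data):
            break
        rules.append(best)
    return rules


def iterated_greedy(data, iterations=20, destroy=1, seed=0):
    """Repeat: remove `destroy` random rules, rebuild greedily, keep if not worse."""
    rng = random.Random(seed)
    current = greedy_construct([], data)
    best = current
    for _ in range(iterations):
        partial = list(current)
        for _ in range(min(destroy, len(partial))):
            partial.pop(rng.randrange(len(partial)))  # destruction phase
        candidate = greedy_construct(partial, data)   # construction phase
        if accuracy(candidate, data) >= accuracy(current, data):
            current = candidate                       # acceptance criterion
        if accuracy(current, data) > accuracy(best, data):
            best = current
    return best


best_rules = iterated_greedy(DATA)
print(best_rules, accuracy(best_rules, DATA))
```

Capping the rule list at `max_rules` is one simple way to keep the resulting classifier small, which loosely mirrors the interpretability concern mentioned above. A fuller treatment would measure interpretability explicitly rather than through a size limit.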
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Pedraza, J.A., García-Martínez, C., Cano, A., Ventura, S. (2014). Classification Rule Mining with Iterated Greedy. In: Polycarpou, M., de Carvalho, A.C.P.L.F., Pan, JS., Woźniak, M., Quintian, H., Corchado, E. (eds) Hybrid Artificial Intelligence Systems. HAIS 2014. Lecture Notes in Computer Science, vol 8480. Springer, Cham. https://doi.org/10.1007/978-3-319-07617-1_51
DOI: https://doi.org/10.1007/978-3-319-07617-1_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07616-4
Online ISBN: 978-3-319-07617-1
eBook Packages: Computer Science, Computer Science (R0)