Classification Rule Mining with Iterated Greedy

  • Juan A. Pedraza
  • Carlos García-Martínez
  • Alberto Cano
  • Sebastián Ventura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8480)

Abstract

In the context of data mining, classification rule discovery is the task of designing accurate rule-based systems that model the useful knowledge present in large data sets, namely the knowledge that differentiates some data classes from others.

Iterated greedy search is a powerful metaheuristic that has been successfully applied to different optimisation problems but, to our knowledge, has not previously been used for classification rule mining.

In this work, we analyse the suitability of iterated greedy algorithms for the design of rule-based classification systems. We present and study different alternatives and compare the results with state-of-the-art methodologies from the literature. The results show that iterated greedy search may generate accurate rule-based classification systems with acceptable interpretability levels.
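To make the general idea concrete, the following is a minimal sketch of a generic iterated greedy loop (destruction, greedy reconstruction, acceptance) applied to a toy rule list. It is not the method studied in the paper: the rule representation (single attribute-value tests in a first-match list), the accuracy objective, the acceptance criterion, the toy data, and all function names are assumptions made only for illustration.

```python
import random

# Hypothetical toy data: (attribute dict, class label) pairs.
DATA = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]


def predict(rules, default, x):
    """Return the label of the first rule matching x, else the default class."""
    for attr, value, label in rules:
        if x.get(attr) == value:
            return label
    return default


def accuracy(rules, default, data):
    """Fraction of instances in data that the rule list classifies correctly."""
    hits = sum(predict(rules, default, x) == y for x, y in data)
    return hits / len(data)


def greedy_construct(rules, candidates, default, data):
    """Append the most accuracy-improving rule until no candidate improves."""
    rules = list(rules)
    best = accuracy(rules, default, data)
    while True:
        scored = [(accuracy(rules + [c], default, data), c)
                  for c in candidates if c not in rules]
        if not scored:
            return rules
        top_score, top_rule = max(scored, key=lambda s: s[0])
        if top_score <= best:
            return rules
        rules.append(top_rule)
        best = top_score


def iterated_greedy(data, iterations=50, destroy=1, seed=0):
    """Assumed generic scheme: destroy part of the solution, rebuild greedily."""
    rng = random.Random(seed)
    labels = [y for _, y in data]
    # Default class: the most frequent label (ties broken alphabetically).
    default = max(sorted(set(labels)), key=labels.count)
    # Candidate rules: every (attribute, value, label) triple seen in the data.
    candidates = sorted({(a, v, y) for x, y in data for a, v in x.items()})
    current = greedy_construct([], candidates, default, data)
    for _ in range(iterations):
        # Destruction: remove `destroy` randomly chosen rules.
        partial = list(current)
        for k in range(min(destroy, len(partial))):
            partial.pop(rng.randrange(len(partial)))
        # Construction: greedily rebuild from the partial rule list.
        rebuilt = greedy_construct(partial, candidates, default, data)
        # Acceptance: keep the rebuilt list if it is at least as accurate.
        if accuracy(rebuilt, default, data) >= accuracy(current, default, data):
            current = rebuilt
    return current, default


if __name__ == "__main__":
    rules, default = iterated_greedy(DATA)
    print(rules, default, accuracy(rules, default, DATA))
```

The number of rules removed per iteration (`destroy`) and the non-worsening acceptance test are the two knobs that usually distinguish iterated greedy variants. Other choices, such as accepting slightly worse solutions, fit the same skeleton.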

Keywords

Classification rule mining; iterated greedy; interpretability

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Juan A. Pedraza (1)
  • Carlos García-Martínez (2)
  • Alberto Cano (2)
  • Sebastián Ventura (2)
  1. I+D Dpt., Yerbabuena Software, Málaga, España
  2. Dpt. of Computing and Numerical Analysis, University of Córdoba, Córdoba, España
