Soft Computing

, Volume 16, Issue 12, pp 2143–2163 | Cite as

Classification rule mining using ant programming guided by grammar with multiple Pareto fronts

Original Paper

Abstract

This paper proposes a multi-objective ant programming algorithm for mining classification rules, MOGBAP, which focuses on optimizing sensitivity, specificity, and comprehensibility. It defines a context-free grammar that restricts the search space and ensures the creation of valid individuals, and its heuristic function presents two complementary components. Moreover, the algorithm addresses the classification problem from a new multi-objective perspective specifically suited for this task, which finds an independent Pareto front of individuals per class, so that it avoids the overlapping problem that appears when measuring the fitness of individuals from different classes. A comparative analysis of MOGBAP using two and three objectives is performed, and then its performance is experimentally evaluated throughout 15 varied benchmark data sets and compared to those obtained using another eight relevant rule extraction algorithms. The results prove that MOGBAP outperforms the other algorithms in predictive accuracy, also achieving a good trade-off between accuracy and comprehensibility.

Keywords

Ant programming (AP) Grammar-based automatic programming Multi-objective ant colony optimization (MOACO) Classification Data mining (DM) 

Notes

Acknowledgments

This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, projects P08-TIC-3720 and TIN-2011-22408, and FEDER funds.

References

  1. Abbass HA, Hoai X, Mckay RI (2002) AntTAG: a new method to compose computer programs using colonies of ants. In: IEEE Congress on evolutionary computation (IEEE CEC), pp 1654–1659Google Scholar
  2. Alatas B, Akin E (2009) Multi-objective rule mining using a chaotic particle swarm optimization algorithm. Knowledge-Based Syst 22(6):455–460Google Scholar
  3. Angus D, Woodward C (2009) Multiple objective ant colony optimisation. Swarm Intell 3(1):69–85CrossRefGoogle Scholar
  4. Barron A, Rissanen J, Yu B (1998) The minimum description length principle in coding and modeling. IEEE Trans Inf Theory 44(6):2743–2760MathSciNetMATHCrossRefGoogle Scholar
  5. Blum C, Puchinger J, Raidl GR, Roli A (2011) Hybrid metaheuristics in combinatorial optimization: a survey. Appl Soft Comput 11(6):4135–4151CrossRefGoogle Scholar
  6. Blum C, Roli A (2003) Metaheuristics in combinatorial optimization: overview and conceptual comparison. ACM Comput Surv 35(3):268–308CrossRefGoogle Scholar
  7. Bojarczuk CC, Lopes HS, Freitas AA, Michalkiewicz EL (2004) A constrained-syntax genetic programming system for discovering classification rules: application to medical data sets. Artif Intell Med 30:27–48CrossRefGoogle Scholar
  8. Bonabeu E, Eric T, Dorigo M (1999) Swarm intelligence: from natural to artificial systems. Oxford University, OxfordGoogle Scholar
  9. Boryczka M (2005) Eliminating introns in ant colony programming. Fundamenta Informaticae 68(1–2):1–19MathSciNetMATHGoogle Scholar
  10. Boryczka M (2008) Ant colony programming with the candidate list. In: KES International conference on agent and multi-agent systems: technologies and applications (KES-AMSTA). LNAI, vol 4953, pp 302–311Google Scholar
  11. Boryczka M, Czech ZJ (2002) Solving approximation problems by ant colony programming. In: Genetic and evolutionary computing late breaking papers, pp 39–46Google Scholar
  12. Boryczka M, Czech ZJ, Wieczorek W (2003) Ant colony programming for approximation problems. In: Genetic and evolutionary computation conference (GECCO), pp 142–143Google Scholar
  13. Chen Y, Yang B, Dong J (2004) Evolving flexible neural networks using ant programming and PSO algorithms. In: Advances in neural networks ISSN. LNCS, vol 3173Google Scholar
  14. Cios K, Pedrycz W, Swiniarski R, Kurgan L (2010) Data mining: a knowledge discovery approach. Springer, BerlinGoogle Scholar
  15. Cohen W (1995) Fast effective rule induction. In: International conference on machine learning (ICML), pp 115–123 Google Scholar
  16. Crepinšek M, Kosar T, Mernik M, Cervelle J, Forax R, Roussel G (2010) On automata and language based grammar metrics. Comput Sci Inf Syst 7(2):309–329CrossRefGoogle Scholar
  17. Deb K, Agrawal S, Pratap A, Meyarivan T (2000) A fast elitist nondominated sorting genetic algorithm for multi-objective optimisation: NSGA-II. In: International conference on parallel problem solving from nature (PPSN). Springer, Berlin, pp 849–858Google Scholar
  18. Dehuri S, Patnaik S, Ghosh A, Mall R (2008) Application of elitist multi-objective genetic algorithm for classification rule generation. Appl Soft Comput 8:477–487CrossRefGoogle Scholar
  19. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetMATHGoogle Scholar
  20. Dorigo M, Maniezzo V, Colorni A (1996) The ant system: Optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern Part B 26:29–41CrossRefGoogle Scholar
  21. Dorigo M, Stützle T (2002) The ant colony optimization metaheuristic: algorithms, applications and advances. International series in operations research and management science. Kluwer Academic Publishers, DordrechtGoogle Scholar
  22. Espejo P, Ventura S, Herrera F (2010) A survey on the application of genetic programming to classification. IEEE Trans Syst Man Cybern Part C Appl Rev 40(2):121–144CrossRefGoogle Scholar
  23. Fayyad UM, Irani KB (1993) Multi-interval discretization of continuous-valued attributes for classification learning. In: International joint conference on uncertainly in artificial intelligence (IJCAI), pp 1022–1029 Google Scholar
  24. Floreano D, Drr P, Mattiussi C (2008) Neuroevolution: from architectures to learning. Evol Intell 1:47–62CrossRefGoogle Scholar
  25. Frank A, Asuncion A UCI machine learning repository (2010). URL http://archive.ics.uci.edu/ml
  26. Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: International conference on machine learning (ICML), pp 144–151Google Scholar
  27. García-Martínez C, Cordón O, Herrera F (2007) A taxonomy and an empirical analysis of multiple objective ant colony optimization algorithms for the bi-criteria TSP. Eur J Oper Res 180(1):116–148MATHCrossRefGoogle Scholar
  28. Geyer-Schulz A (1995) Fuzzy rule-based expert systems and genetic machine learning, studies in fuzziness, vol 3. Physica, Heidelberg Google Scholar
  29. Green J, Whalley J, Johnson C (2004) Automatic programming with ant colony optimization. In: UK workshop on computational intelligence, pp 70–77Google Scholar
  30. Holden N, Freitas AA (2008) A hybrid PSO/ACO algorithm for discovering classification rules in data mining. J Artif Evol Appl 2008:2:1–2:11Google Scholar
  31. Huang TM, Kecman V, Kopriva I (2006) Support vector machines in classification and regression—an introduction. In: Kernel based algorithms for mining huge data sets: supervised, semi-supervised, and unsupervised learning, studies in computational intelligence. Springer, New YorkGoogle Scholar
  32. Ishibuchi H, Nojima Y (2007) Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning. Int J Approx Reason 44(1):4–31MathSciNetMATHCrossRefGoogle Scholar
  33. Kapočite-Dzikiene J, Raškinis A (2008) Hierarchical classificator: a cognitive approach to decision tree building. Inf Technol Control 37:43–51Google Scholar
  34. Keber C, Schuster MG (2002) Option valuation with generalized ant programming. In: Genetic and evolutionary computation conference (GECCO), pp 74–81Google Scholar
  35. Kosar T, Oliveira N, Mernik M, Pereira MJV, Črepinšek M, da Cruz DC, Henriques PR (2010) Comparing general-purpose and domain-specific languages: An empirical study. Comput Sci Inf Syst 7(2):247–264CrossRefGoogle Scholar
  36. Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. The MIT Press, CambridgeGoogle Scholar
  37. Kumaresan N (2011) Optimal control for stochastic linear quadratic singular periodic neuro Takagi-Sugeno (T-S) fuzzy system with singular cost using ant colony programming. Appl Math Model 35(8):3797–3808MathSciNetMATHCrossRefGoogle Scholar
  38. Lanzi P (2008) Learning classifier systems: then and now. Evol Intell 1:63–82CrossRefGoogle Scholar
  39. Martens D, Baesens B, Fawcett T (2011) Editorial survey: swarm intelligence for data mining. Mach Learn 82:1–42CrossRefGoogle Scholar
  40. Martens D, De Backer M, Vanthienen J, Snoeck M, Baesens B (2007) Classification with ant colony optimization. IEEE Trans Evol Comput 11:651–665CrossRefGoogle Scholar
  41. Mernik M, Heering J, Sloane AM (2005) When and how to develop domain-specific languages. ACM Comput Surv 37(4):316–344CrossRefGoogle Scholar
  42. Mernik M, Črepinšek M, Kosar T, Rebernak D, Žumer V (2004) Grammar-based systems: definition and examples. Informatica 28(3):245–255MATHGoogle Scholar
  43. Mullen RJ, Monekosso D, Barman S, Remagnino P (2009) A review of ant algorithms. Expert Syst Appl 36:9608–9617CrossRefGoogle Scholar
  44. Olmo JL, Romero JR, Ventura S (2010) A grammar based ant programming algorithm for mining classification rules. In: IEEE congress on evolutionary computation (IEEE CEC), pp 225–232Google Scholar
  45. Olmo JL, Romero JR, Ventura S (2011) Using ant programming guided by grammar for building rule-based classifiers. IEEE Trans Syst Man Cybern Part B Cybern 41(6):1585–1599CrossRefGoogle Scholar
  46. Otero F, Freitas AA, Johnson C (2008) cAnt-Miner: an ant colony classification algorithm to cope with continuous attributes. In: International conference on Swarm Intelligence (ANTS). LNCS, vol 5217, pp 48–59Google Scholar
  47. Otero FEB, Freitas AA, Johnson CG (2009) Handling continuous attributes in ant colony classification algorithms. In: IEEE symposium on computational intelligence and data mining (IEEE CIDM), pp 225–231Google Scholar
  48. Parpinelli R, Freitas AA, Lopes HS (2002) Data mining with an ant colony optimization algorithm. IEEE Transactions Evol Comput 6:321–332CrossRefGoogle Scholar
  49. Roux O, Fonlupt C (2000) Ant programming: or how to use ants for automatic programming. In: International conference on Swarm Intelligence (ANTS), pp 121–129Google Scholar
  50. Rudokaite-Margelevičiene D, Pranevičius H, Margelevičius M (2006) Data classification using Dirichlet mixtures. Inf Technol Control 35:157–166Google Scholar
  51. Salama K, Abdelbar A, Freitas A (2011) Multiple pheromone types and other extensions to the ant-miner classification rule discovery algorithm. Swarm Intell 5:149–182 Google Scholar
  52. Salehi-Abari A, White T (2008) Enhanced generalized ant programming (EGAP). In: Genetic and evolutionary computation conference (GECCO), pp 111–118Google Scholar
  53. Salehi-Abari A, White T (2009) The uphill battle of ant programming vs. genetic programming. In: International joint conference on computational intelligence (IJCCI), pp 171–176Google Scholar
  54. Shirakawa S, Ogino S, Nagao T (2008) Dynamic ant programming for automatic construction of programs. IEEE Trans Electr Electron Eng 3(5):540–548CrossRefGoogle Scholar
  55. Stützle T, Hoos HH (2000) MAX–MIN ant system. Future Gener Comput Syst 16:889–914CrossRefGoogle Scholar
  56. Torácio A (2009) Multiobjective particle swarm optimization in classification-rule learning, chap. 3. Springer, Berlin, pp 37–64Google Scholar
  57. Ventura S, Romero C, Zafra A, Delgado JA, Hervás C (2007) JCLEC: a java framework for evolutionary computation. Soft Comput 12(4):381–392CrossRefGoogle Scholar

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. 1.Department of Computer Science and Numerical AnalysisUniversity of CordobaCordobaSpain

Personalised recommendations