Improving a multi-objective evolutionary algorithm to discover quantitative association rules

Abstract

This work aims to correct flaws in multi-objective evolutionary schemes for discovering quantitative association rules, specifically those based on the well-known non-dominated sorting genetic algorithm II (NSGA-II). In particular, a methodology is proposed to find the most suitable configurations in terms of the set of objectives to optimize and the distance measures used to rank the non-dominated solutions. First, several quality measures are analyzed to select the best subset to optimize. Second, different strategies are applied to replace the crowding distance that NSGA-II uses to sort the solutions within each Pareto front, since this distance is not suitable for many-objective problems. The proposed enhancements have been integrated into the multi-objective algorithm MOQAR. Several experiments have been carried out to assess the algorithm's performance under different configuration settings, and the best configurations have been compared with other existing algorithms. The results show a remarkable performance of MOQAR in terms of quality measures.
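For readers unfamiliar with the crowding distance mentioned in the abstract, the following is a minimal sketch of the standard NSGA-II computation for a single Pareto front. It is not MOQAR's replacement strategy, which the abstract does not specify. The function name `crowding_distance` and the sample objective values are hypothetical and chosen only for illustration.

```python
from math import inf


def crowding_distance(front):
    """Standard NSGA-II crowding distance for one front.

    `front` is a list of objective vectors (tuples of floats), all of the
    same length. Returns one distance per vector, in the same order.
    Illustrative sketch only; not the distance used by MOQAR.
    """
    n = len(front)
    if n == 0:
        return []
    distance = [0.0] * n
    n_obj = len(front[0])
    for m in range(n_obj):
        # Indices of the solutions sorted by the m-th objective.
        order = sorted(range(n), key=lambda i: front[i][m])
        lo = front[order[0]][m]
        hi = front[order[-1]][m]
        # Boundary solutions are always kept: give them infinite distance.
        distance[order[0]] = inf
        distance[order[-1]] = inf
        if hi == lo:
            continue  # All values equal: this objective adds no spread.
        # Interior solutions accumulate the normalized gap between neighbors.
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            distance[order[k]] += gap / (hi - lo)
    return distance


# Hypothetical two-objective front, used only to exercise the function.
example_front = [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (1.0, 0.0)]
print(crowding_distance(example_front))
```

A larger distance marks a solution in a less crowded region of the front, so NSGA-II prefers it when truncating a front. Each objective contributes one term to the sum, and this per-objective accumulation is the part the abstract describes as unsuitable when the number of objectives grows.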

Acknowledgments

The authors would like to thank the Spanish Ministry of Science and Technology, the Junta de Andalucia, and Pablo de Olavide University for their support under Projects TIN2011-28956-C02, TIN2014-55894-C2-R, P12-TIC-1728 and APPB813097, respectively.

Author information

Corresponding author

Correspondence to M. Martínez-Ballesteros.

Cite this article

Martínez-Ballesteros, M., Troncoso, A., Martínez-Álvarez, F. et al. Improving a multi-objective evolutionary algorithm to discover quantitative association rules. Knowl Inf Syst 49, 481–509 (2016). https://doi.org/10.1007/s10115-015-0911-y

Keywords

  • Association rules
  • Data mining
  • Evolutionary computation
  • Pareto-optimization