Genetic Programming for Mining Association Rules in Relational Database Environments

  • J. M. Luna
  • A. Cano
  • S. Ventura

Abstract

Most approaches for the extraction of association rules look for associations in a dataset that takes the form of a single table. However, with the growing interest in the storage of information, relational databases comprising a series of relations (tables) and relationships have become essential. We present the first grammar-guided genetic programming approach for mining association rules directly from relational databases. We represent the relational databases as trees by means of genetic programming, preserving the original database structure and enabling rules to be defined in an expressive and very flexible way. The proposed model deals with both positive and negative items, and with both discrete and quantitative attributes. We exemplify the utility of the proposed approach with an artificially generated database having different characteristics. We also analyse a real case study, discovering interesting students’ behaviors from a Moodle database.
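As a rough illustration of the kind of rule the abstract describes, the following minimal sketch evaluates one hand-written rule over two related tables without flattening them into a single table. It is not the authors' algorithm: there is no grammar and no evolutionary search here, and the tables, attribute names, thresholds, and the choice to count support per student are all assumptions made for the example.

```python
# Hypothetical toy relational database (not the paper's data):
# a "students" relation and a one-to-many "submissions" relation.
students = {
    1: {"group": "A", "final_mark": 8.5},
    2: {"group": "A", "final_mark": 4.0},
    3: {"group": "B", "final_mark": 4.5},
    4: {"group": "B", "final_mark": 3.5},
}
submissions = [
    {"student_id": 1, "task": "quiz", "score": 9.0},
    {"student_id": 1, "task": "forum", "score": 6.0},
    {"student_id": 2, "task": "forum", "score": 5.0},
    {"student_id": 3, "task": "quiz", "score": 7.5},
    {"student_id": 4, "task": "quiz", "score": 2.0},
]


def antecedent(sid):
    """Quiz score >= 7 in the related table AND NOT group = "C"."""
    # Quantitative condition checked against the related relation.
    has_good_quiz = any(
        s["student_id"] == sid and s["task"] == "quiz" and s["score"] >= 7.0
        for s in submissions
    )
    # Negative item on a discrete attribute of the student relation.
    return has_good_quiz and not students[sid]["group"] == "C"


def consequent(sid):
    """Quantitative condition on the student relation: final_mark >= 5."""
    return students[sid]["final_mark"] >= 5.0


def support_and_confidence(antecedent, consequent):
    # Assumption: each student counts once, however many related rows match,
    # so the one-to-many join does not inflate the counts.
    covered = [sid for sid in students if antecedent(sid)]
    both = [sid for sid in covered if consequent(sid)]
    support = len(both) / len(students)
    confidence = len(both) / len(covered) if covered else 0.0
    return support, confidence


support, confidence = support_and_confidence(antecedent, consequent)
print(support, confidence)
```

In this toy data two of the four students satisfy the antecedent and one of them also satisfies the consequent, so the rule has support 0.25 and confidence 0.5.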

Keywords

Genetic Programming, Association Rule, Leaf Node, Relational Database, Association Rule Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This research was supported by the Spanish Ministry of Science and Technology, project TIN-2011-22408, and by FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU grants AP2010-0041 and AP2010-0042.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science and Numerical Analysis, University of Córdoba, Córdoba, Spain
