A Query-Driven Exploration of Discovered Association Rules

  • Krzysztof Świder
  • Bartosz Jędrzejec
  • Marian Wysocki
Part of the Studies in Computational Intelligence book series (SCI, volume 102)

Summary

The paper concerns the presentation phase of a knowledge discovery process that uses association rules. Once obtained, the rules normally have to be explained and interpreted before they can be put to use. The authors propose an approach that employs the Predictive Model Markup Language (PMML) to provide an environment for the systematic examination of complex mining models. PMML is an XML application developed by the Data Mining Group for describing data analysis models. The paper starts with a short description of PMML and shows an example of an automatically encoded mining model. The XQuery language is then used to demonstrate how a model can be explored by querying its PMML structure. Preliminary results for a real association rule model are presented in the final part of the paper. Three approaches are considered: (1) simple direct querying of the PMML structure of the discovered model, (2) interactive browsing of the rule base using set-theoretic operations, and (3) automatic query formulation with genetic programming.
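
The first two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the paper works with XQuery, whereas this example uses Python's standard xml.etree.ElementTree as a stand-in. The PMML fragment is hand-written in the style of an AssociationModel, with invented items and numbers and with the PMML namespace omitted for brevity. The helper names load_rules and select are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Toy fragment in the style of a PMML AssociationModel (assumption:
# invented items and numbers, namespace declaration left out).
PMML = """
<PMML version="3.2">
  <AssociationModel functionName="associationRules"
                    numberOfTransactions="4" minimumSupport="0.25"
                    minimumConfidence="0.5" numberOfItems="3"
                    numberOfItemsets="4" numberOfRules="3">
    <Item id="1" value="bread"/>
    <Item id="2" value="butter"/>
    <Item id="3" value="milk"/>
    <Itemset id="1"><ItemRef itemRef="1"/></Itemset>
    <Itemset id="2"><ItemRef itemRef="2"/></Itemset>
    <Itemset id="3"><ItemRef itemRef="3"/></Itemset>
    <Itemset id="4"><ItemRef itemRef="1"/><ItemRef itemRef="2"/></Itemset>
    <AssociationRule support="0.50" confidence="0.80" antecedent="1" consequent="2"/>
    <AssociationRule support="0.25" confidence="0.60" antecedent="3" consequent="1"/>
    <AssociationRule support="0.25" confidence="0.90" antecedent="4" consequent="3"/>
  </AssociationModel>
</PMML>
"""

Rule = namedtuple("Rule", "antecedent consequent support confidence")


def load_rules(pmml_text):
    """Resolve item and itemset references and return a list of Rule tuples."""
    model = ET.fromstring(pmml_text).find("AssociationModel")
    items = {i.get("id"): i.get("value") for i in model.findall("Item")}
    itemsets = {
        s.get("id"): frozenset(items[r.get("itemRef")] for r in s.findall("ItemRef"))
        for s in model.findall("Itemset")
    }
    return [
        Rule(
            itemsets[r.get("antecedent")],
            itemsets[r.get("consequent")],
            float(r.get("support")),
            float(r.get("confidence")),
        )
        for r in model.findall("AssociationRule")
    ]


def select(rules, predicate):
    """Direct query: the set of rules that satisfy a predicate."""
    return {r for r in rules if predicate(r)}


rules = load_rules(PMML)

# (1) Direct querying of the model structure.
confident = select(rules, lambda r: r.confidence >= 0.8)
about_bread = select(rules, lambda r: "bread" in r.antecedent | r.consequent)

# (2) Browsing by combining query results with set-theoretic operations.
both = confident & about_bread          # intersection
bread_only = about_bread - confident    # difference

for r in sorted(both, key=lambda r: -r.confidence):
    print(sorted(r.antecedent), "->", sorted(r.consequent), r.support, r.confidence)
```

Because each rule is stored as a hashable tuple, query results are ordinary Python sets, so union, intersection, and difference come for free. With a real PMML file the element lookups would also need the namespace declared by the exporting tool.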

Keywords

Association Rule; Rule Base; Frequent Itemsets; Association Rule Mining; Minimum Support Threshold

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Krzysztof Świder (1)
  • Bartosz Jędrzejec (1)
  • Marian Wysocki (1)
  1. Rzeszów University of Technology, Rzeszów, Poland
