Methodology and Computing in Applied Probability, Volume 9, Issue 3, pp 447–463

A Probabilistic Framework Towards the Parameterization of Association Rule Interestingness Measures

  • Stéphane Lallich
  • Benoît Vaillant
  • Philippe Lenca
Article

Abstract

In this paper, we first present an original and synthetic overview of the most commonly used association rule interestingness measures. These measures usually relate the confidence of a rule to an independence reference situation. Yet, some relate it to indetermination, or impose a minimum confidence threshold. We propose a systematic generalization of these measures, taking into account a reference point chosen by an expert in order to appreciate the confidence of a rule. This generalization introduces new connections between measures, and leads to the enhancement of some of them. Finally, we propose new parameterized possibilities.
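
The idea of comparing a rule's confidence with a chosen reference point can be illustrated with a minimal Python sketch. It is not the paper's formulation: the helper `centered_confidence`, the symbol `theta`, and the toy counts are assumptions made up for this illustration. The sketch only shows the three reference situations named in the abstract: independence (reference P(B)), indetermination (commonly taken as 0.5), and a threshold set by an expert.

```python
def confidence(n_a: int, n_ab: int) -> float:
    """Confidence of the rule A -> B: share of A-transactions that also contain B."""
    if n_a <= 0:
        raise ValueError("n_a must be positive")
    return n_ab / n_a


def centered_confidence(n_a: int, n_ab: int, theta: float) -> float:
    """Deviation of the confidence from a reference point theta (hypothetical helper)."""
    return confidence(n_a, n_ab) - theta


# Toy counts, invented for illustration: 1000 transactions,
# 200 contain A, 400 contain B, 120 contain both.
n, n_a, n_b, n_ab = 1000, 200, 400, 120

conf = confidence(n_a, n_ab)
vs_independence = centered_confidence(n_a, n_ab, n_b / n)  # theta = P(B)
vs_indetermination = centered_confidence(n_a, n_ab, 0.5)   # theta = 0.5
vs_expert = centered_confidence(n_a, n_ab, 0.7)            # theta chosen by an expert
```

With these counts the rule looks favourable against independence and indetermination, but falls short of the stricter expert threshold, which is the kind of distinction a reference-point parameter is meant to expose.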

Keywords

Interestingness measure, Association rule, Independence, Indetermination, Probabilistic models

AMS 2000 Subject Classification

62H15, 62H17, 62H20, 68T10

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Stéphane Lallich (1)
  • Benoît Vaillant (2, 3)
  • Philippe Lenca (2)

  1. Laboratoire ERIC, Université Lyon 2, Bron Cedex, France
  2. GET–ENST Bretagne–Département LUSSI, CNRS UMR 2872 TAMCIC, Brest Cedex, France
  3. UBS–IUT de Vannes–Département STID, Laboratoire VALORIA, Vannes, France
