Data Mining pp 75-98

Mining Interesting Rules Without Support Requirement: A General Universal Existential Upward Closure Property

  • Yannick Le Bras
  • Philippe Lenca
  • Stéphane Lallich
Chapter
Part of the Annals of Information Systems book series (AOIS, volume 8)

Abstract

Many studies have shown the limits of the support/confidence framework used in Apriori-like algorithms to mine association rules. Many efficient implementations exploit the antimonotonicity of support, but candidate set generation is still costly, and many of the resulting rules are uninteresting or redundant. In addition, interesting rules such as nuggets can be missed. We thus face both a complexity issue and a quality issue.

One solution is to dispense with frequent itemset mining and to focus as early as possible on interesting rules. To that end, algorithmic properties of measures were first studied, especially for confidence. They make it possible to find all confident rules without preliminary support pruning.

Recently, in the case of class association rules, the universal existential upward closure property of confidence has been exploited efficiently. It enables a pruning strategy for an Apriori-like, but top-down, algorithm that mines associative classification rules.
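As an illustration of the property mentioned above, the following is a minimal sketch, not the chapter's algorithm. It assumes the usual statement of the property for confidence: when a class rule is specialized by an attribute absent from its antecedent, the confidence of the general rule is a weighted average of the confidences of its specializations, so at least one specialization is at least as confident. The toy dataset and the names `ROWS`, `confidence`, and `best_specialization` are hypothetical and introduced only for this example.

```python
from fractions import Fraction

# Hypothetical toy dataset: each row is (attribute values, class label).
ROWS = [
    ({"color": "red", "size": "small"}, "yes"),
    ({"color": "red", "size": "large"}, "yes"),
    ({"color": "red", "size": "large"}, "no"),
    ({"color": "blue", "size": "small"}, "no"),
    ({"color": "blue", "size": "large"}, "yes"),
    ({"color": "red", "size": "small"}, "yes"),
]


def confidence(antecedent, target, rows=ROWS):
    """Exact confidence of the rule antecedent -> target.

    Returns None when no row matches the antecedent.
    """
    covered = [cls for attrs, cls in rows
               if all(attrs.get(k) == v for k, v in antecedent.items())]
    if not covered:
        return None
    hits = sum(1 for cls in covered if cls == target)
    return Fraction(hits, len(covered))


def best_specialization(antecedent, attribute, target, rows=ROWS):
    """Most confident specialization of the rule by one value of `attribute`.

    Assumes `attribute` is not in the antecedent and that the antecedent
    matches at least one row, so at least one specialization is defined.
    """
    values = sorted({attrs[attribute] for attrs, _ in rows})
    scored = []
    for value in values:
        rule = {**antecedent, attribute: value}
        conf = confidence(rule, target, rows)
        if conf is not None:  # skip value combinations that never occur
            scored.append((conf, rule))
    return max(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    general = {"color": "red"}
    general_conf = confidence(general, "yes")
    best_conf, best_rule = best_specialization(general, "size", "yes")
    print("general rule:", general, "confidence", general_conf)
    print("best specialization:", best_rule, "confidence", best_conf)
    # The existential part of the property: some specialization is no worse.
    assert best_conf >= general_conf
```

Read in the other direction, this is what makes top-down pruning plausible: if no specialization by a given attribute reaches the confidence threshold, the more general rule cannot reach it either.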

We present a new formal framework that links the analytic and algorithmic properties of measures. We then apply this framework to propose a general universal existential upward closure (GUEUC) property. We establish a necessary condition and a sufficient condition for the existence of this property. These results are then applied to 32 measures, and we show that 13 of them have the GUEUC property.

Keywords

Data Mining, Association Rule, Knowledge Discovery, Descriptor System, Frequent Itemsets
These keywords were added by machine and not by the authors.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Institut Telecom; Telecom Bretagne, UMR CNRS 3192 Lab-STICC, Brest Cedex 3, France
  2. Université Européenne de Bretagne, Rennes, France
  3. Laboratoire ERIC, Université de Lyon, Lyon 2, France
