Objectively Evaluating Interestingness Measures for Frequent Itemset Mining

Zimmermann, Albrecht

doi:10.1007/978-3-642-40319-4_31

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7867)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Itemset mining approaches, while having been studied for more than 15 years, have been evaluated only on a handful of data sets. In particular, they have never been evaluated on data sets for which the ground truth was known. Thus, it is currently unknown whether itemset mining techniques actually recover underlying patterns. Since the weakness of the algorithmically attractive support/confidence framework became apparent early on, a number of interestingness measures have been proposed. Their utility, however, has not been evaluated, except for attempts to establish congruence with expert opinions. Using an extension of the Quest generator proposed in the original itemset mining paper, we propose to evaluate these measures objectively for the first time, showing how many non-relevant patterns slip through the cracks.
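The idea of scoring mined itemsets against a known ground truth can be illustrated with a minimal sketch. This is not the paper's procedure or its extended Quest generator: the toy transactions, the single "planted" pattern, the brute-force miner, and the helper names `support`, `frequent_itemsets`, and `precision_recall` are all assumptions made for illustration.

```python
from itertools import combinations

# Toy transactions (hypothetical data, not taken from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

# Assumed ground truth: the only pattern deliberately "planted" in the data.
planted = {frozenset({"bread", "milk"})}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all itemsets meeting `min_support`.

    Exponential in the number of items; only suitable for toy data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


def precision_recall(mined, planted):
    """Compare mined itemsets with the planted ground-truth patterns."""
    mined, planted = set(mined), set(planted)
    hits = mined & planted
    precision = len(hits) / len(mined) if mined else 0.0
    recall = len(hits) / len(planted) if planted else 0.0
    return precision, recall


mined = frequent_itemsets(transactions, min_support=0.4)
precision, recall = precision_recall(mined, planted)
print(f"{len(mined)} frequent itemsets, precision={precision:.2f}, recall={recall:.2f}")
```

With a minimum support of 0.4, the planted pattern is recovered, but five other itemsets also pass the threshold. This is the kind of non-relevant output that a ground-truth evaluation makes measurable.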

Author information

Authors and Affiliations

KU Leuven, Celestijnenlaan 200A, B-3001, Leuven, Belgium
Albrecht Zimmermann

Editor information

Editors and Affiliations

School of Information Technology and Mathematical Sciences, University of South Australia, 1 Mawson Lakes Boulevard, 5095, Adelaide, SA, Australia
Jiuyong Li
Advanced Analytics Institute, University of Technology, 2-12 Blackfriars Street, Chippendale, Blackfriars Campus, 2008, Sydney, NSW, Australia
Longbing Cao & Can Wang
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117576, Singapore, Singapore
Kay Chen Tan
School of Automation, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, 510006, Guangzhou, China
Bo Liu
School of Computing Science, Simon Fraser University, 8888 University Drive, V5A 1S6, Burnaby, BC, Canada
Jian Pei
Department of Computer Science and Information Engineering, National Cheng Kung University, No.1, University Road, 701, Tainan, Taiwan
Vincent S. Tseng

About this paper

Cite this paper

Zimmermann, A. (2013). Objectively Evaluating Interestingness Measures for Frequent Itemset Mining. In: Li, J., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science, vol 7867. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40319-4_31

DOI: https://doi.org/10.1007/978-3-642-40319-4_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40318-7
Online ISBN: 978-3-642-40319-4
eBook Packages: Computer Science, Computer Science (R0)
