Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 4265)

Abstract

The purpose of this paper is two-fold: First, we give efficient algorithms for answering itemset support queries for collections of itemsets from various representations of the frequency information. As index structures we use itemset tries of transaction databases, frequent itemsets and their condensed representations. Second, we evaluate the usefulness of condensed representations of frequent itemsets to answer itemset support queries using the proposed query algorithms and index structures. We study analytically the worst-case time complexities of querying condensed representations and evaluate experimentally the query efficiency with random itemset queries to several benchmark transaction databases.
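
To make the notion of an itemset support query concrete, the following is a minimal sketch, not the paper's algorithms (which are not visible in this preview). It assumes items are integers, stores a collection of frequent itemsets in a trie keyed by sorted items, and answers a query by walking one path. It also shows the standard way to recover a support from closed frequent itemsets: the support of an itemset is the largest support among its closed supersets. The names `ItemsetTrie`, `support_from_closed`, and the toy database are hypothetical and chosen only for illustration; the closed-itemset lookup here is a plain linear scan rather than an indexed query.

```python
from typing import Dict, FrozenSet, Iterable, Optional


class ItemsetTrie:
    """Trie over itemsets; a node stores the support of the itemset on its path."""

    def __init__(self) -> None:
        self.children: Dict[int, "ItemsetTrie"] = {}
        self.support: Optional[int] = None  # None: itemset not stored

    def insert(self, itemset: Iterable[int], support: int) -> None:
        node = self
        for item in sorted(set(itemset)):  # canonical item order
            node = node.children.setdefault(item, ItemsetTrie())
        node.support = support

    def query(self, itemset: Iterable[int]) -> Optional[int]:
        """Return the stored support, or None if the itemset is not stored."""
        node = self
        for item in sorted(set(itemset)):
            node = node.children.get(item)
            if node is None:
                return None
        return node.support


def support_from_closed(
    closed: Dict[FrozenSet[int], int], itemset: Iterable[int]
) -> Optional[int]:
    """Support of an itemset = maximum support over its closed supersets.

    Returns None when no closed frequent superset exists, i.e. the itemset
    is not frequent with respect to the collection.
    """
    query = frozenset(itemset)
    supports = [s for c, s in closed.items() if query <= c]
    return max(supports) if supports else None


# Toy database (an assumption): {1,2}, {1,2,3}, {1,3}, {2}; minimum support 2.
frequent = {
    frozenset(): 4,
    frozenset({1}): 3,
    frozenset({2}): 3,
    frozenset({3}): 2,
    frozenset({1, 2}): 2,
    frozenset({1, 3}): 2,
}
# {3} is not closed because its superset {1,3} has the same support.
closed = {s: c for s, c in frequent.items() if s != frozenset({3})}

trie = ItemsetTrie()
for items, count in frequent.items():
    trie.insert(items, count)

print(trie.query([3]), support_from_closed(closed, [3]))
print(trie.query([2, 3]), support_from_closed(closed, [2, 3]))
```

In this sketch the trie lookup costs time proportional to the query size, while the closed-itemset lookup trades a smaller stored collection for a scan over it; that trade-off between representation size and query cost is the kind of question the abstract says the paper evaluates.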

  • Chapter length: 12 pages

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mielikäinen, T., Panov, P., Džeroski, S. (2006). Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations. In: Todorovski, L., Lavrač, N., Jantke, K.P. (eds) Discovery Science. DS 2006. Lecture Notes in Computer Science, vol 4265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893318_18

  • DOI: https://doi.org/10.1007/11893318_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-46491-4

  • Online ISBN: 978-3-540-46493-8

  • eBook Packages: Computer Science, Computer Science (R0)