Itemset Support Queries Using Frequent Itemsets and Their Condensed Representations

  • Taneli Mielikäinen
  • Panče Panov
  • Sašo Džeroski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4265)

Abstract

The purpose of this paper is twofold. First, we give efficient algorithms for answering itemset support queries for collections of itemsets from various representations of the frequency information. As index structures, we use itemset tries of transaction databases, frequent itemsets, and their condensed representations. Second, we evaluate how useful condensed representations of frequent itemsets are for answering itemset support queries, using the proposed query algorithms and index structures. We analyze the worst-case time complexity of querying condensed representations and experimentally evaluate query efficiency with random itemset queries on several benchmark transaction databases.
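To make the idea of an itemset trie as an index structure concrete, here is a minimal Python sketch. It is not the paper's algorithm, which the abstract does not spell out. It assumes closed itemsets as the condensed representation and relies on the standard fact that the support of an itemset equals the largest support among its closed supersets. The names `ItemsetTrie`, `exact_support`, `max_superset_support`, `count_supports`, and `closed_itemsets`, as well as the toy database, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


class ItemsetTrie:
    """Trie over itemsets stored as sorted item sequences (illustrative only)."""

    def __init__(self):
        self.children = {}   # item -> child node
        self.support = None  # support of the itemset ending here, if stored

    def insert(self, itemset, support):
        node = self
        for item in sorted(itemset):
            node = node.children.setdefault(item, ItemsetTrie())
        node.support = support

    def exact_support(self, itemset):
        """Support of a stored itemset, or None if it is not in the trie."""
        node = self
        for item in sorted(itemset):
            node = node.children.get(item)
            if node is None:
                return None
        return node.support

    def max_superset_support(self, itemset):
        """Largest support among stored supersets of itemset, or None."""
        return self._search(sorted(itemset), 0)

    def _search(self, query, i):
        done = i == len(query)  # all query items matched on this path
        best = self.support if done else None
        for item, child in self.children.items():
            if not done and item > query[i]:
                continue  # query[i] cannot occur deeper on a sorted path
            nxt = i + 1 if (not done and item == query[i]) else i
            cand = child._search(query, nxt)
            if cand is not None and (best is None or cand > best):
                best = cand
        return best


def count_supports(transactions):
    """Brute force: support of every itemset that occurs in some transaction."""
    supports = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(len(items) + 1):
            for subset in combinations(items, k):
                supports[subset] = supports.get(subset, 0) + 1
    return supports


def closed_itemsets(supports):
    """Itemsets with no proper superset of equal support."""
    return {
        x: s for x, s in supports.items()
        if not any(s == s2 and set(x) < set(y) for y, s2 in supports.items())
    }


# Toy database (an assumption for this example): each string is a transaction.
transactions = ["abc", "abc", "ab", "a", "bc"]
supports = count_supports(transactions)
closed = closed_itemsets(supports)

full_trie = ItemsetTrie()
for itemset, s in supports.items():
    full_trie.insert(itemset, s)

closed_trie = ItemsetTrie()
for itemset, s in closed.items():
    closed_trie.insert(itemset, s)

print(full_trie.exact_support("ac"))           # lookup among all itemsets
print(closed_trie.max_superset_support("ac"))  # same answer from closed sets
```

The sketch shows the trade-off the abstract alludes to: the trie of all itemsets answers a query by following a single path, while the trie of closed itemsets stores fewer sets but may have to explore several branches to find the best superset.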

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Taneli Mielikäinen (1)
  • Panče Panov (2)
  • Sašo Džeroski (2)

  1. HIIT BRU, Department of Computer Science, University of Helsinki, Finland
  2. Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
