The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining

  • Jana Schmidt
  • Stefan Kramer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6926)

Abstract

This paper introduces an approach for incremental maximal frequent pattern (MFP) mining in sparse binary data, where instances are observed one by one. For this purpose, we propose the Augmented Itemset Tree (AIST), a data structure that incorporates features of the FP-tree into the itemset tree. In the given setting, we assume that only the data structure is maintained in main memory and that each instance is observed only once. The AIST not only stores observed frequent patterns, but also allows for quick frequency updates of relevant subpatterns. To quickly identify the current set of exact MFPs, potential candidates are extracted from former MFPs and from patterns that occur in the new instance. The approach is evaluated with respect to runtime and memory requirements as a function of the number of instances, the minimum support, and different settings of pattern properties. The results suggest that AISTs are useful for mining maximal frequent itemsets in an online setting whenever larger patterns can be expected.
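
To make the problem setting concrete, the following is a minimal brute-force sketch of online maximal frequent itemset mining. It is not the AIST and does not reproduce the paper's algorithm: the class name `NaiveOnlineMFP`, the absolute-count `min_support` threshold, and the toy instances are all assumptions made for illustration. It counts every subset of each incoming instance, which is exponential in the instance length and only workable for very small, sparse instances.

```python
from itertools import combinations
from collections import Counter


class NaiveOnlineMFP:
    """Brute-force baseline (hypothetical, NOT the AIST from the paper)."""

    def __init__(self, min_support):
        # Assumption: min_support is an absolute instance count.
        self.min_support = min_support
        # Maps each observed itemset to the number of instances containing it.
        self.counts = Counter()

    def observe(self, instance):
        """Process one instance exactly once, updating all subset counts."""
        items = sorted(set(instance))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                self.counts[frozenset(subset)] += 1

    def maximal_frequent(self):
        """Return the frequent itemsets that have no frequent proper superset."""
        frequent = [s for s, c in self.counts.items() if c >= self.min_support]
        return {s for s in frequent if not any(s < t for t in frequent)}


# Toy stream: instances arrive one by one and are never revisited.
miner = NaiveOnlineMFP(min_support=2)
for instance in [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"d"}]:
    miner.observe(instance)

result = miner.maximal_frequent()
print(sorted(sorted(s) for s in result))
```

With a minimum support of 2, the frequent itemsets in this toy stream are {a}, {b}, {c}, {a, b} and {b, c}; only {a, b} and {b, c} are maximal, since every other frequent itemset is contained in one of them.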

Keywords

Incremental pattern mining, maximal frequent itemsets

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jana Schmidt (1)
  • Stefan Kramer (1)
  1. Institut für Informatik/I12, TU München, Garching b. München, Germany
