Abstract

In this paper we present extensions to continuous pattern mining. Our previous continuous pattern mining algorithm mines the set of all frequent sequences that satisfy the minimum support (minSup) condition. However, every frequent sequence brings with it an explosive number of frequent subsequences, which makes the result set very difficult to analyze and understand. To overcome this difficulty, we propose four new algorithms for mining maximal and closed continuous patterns. These algorithms return a superset of the result patterns, and a post-pruning algorithm is then applied to eliminate redundant sequences. For each type of pattern (maximal or closed), two algorithms are presented: one with and one without some improvements. The key idea is to omit as many redundant sequences as possible during the exploration. The proposed algorithms reduce the size of the result set when the input sequences have low uniqueness.
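
The difference between maximal and closed continuous patterns, and the role of a post-pruning pass, can be illustrated with a small brute-force sketch. This is not one of the four algorithms proposed in the paper. It assumes that a continuous pattern is a contiguous subsequence, that the support of a pattern is the number of input sequences containing it, that a maximal pattern has no frequent proper super-pattern, and that a closed pattern has no proper super-pattern with the same support. The function names are hypothetical.

```python
from collections import defaultdict


def frequent_continuous(sequences, min_sup):
    """Brute force (assumed definition): support of a pattern is the number
    of input sequences that contain it as a contiguous run."""
    support = defaultdict(int)
    for seq in sequences:
        seen = set()  # count each pattern at most once per input sequence
        for i in range(len(seq)):
            for j in range(i + 1, len(seq) + 1):
                seen.add(tuple(seq[i:j]))
        for pattern in seen:
            support[pattern] += 1
    return {p: s for p, s in support.items() if s >= min_sup}


def is_continuous_sub(short, long):
    """True if `short` occurs in `long` as a contiguous run."""
    n = len(short)
    return any(long[k:k + n] == short for k in range(len(long) - n + 1))


def post_prune(frequent, closed):
    """Drop redundant patterns from a set of frequent patterns.

    closed=False keeps maximal patterns (no frequent proper super-pattern).
    closed=True keeps closed patterns (no proper super-pattern with equal support).
    """
    result = {}
    for p, s in frequent.items():
        redundant = any(
            len(q) > len(p)
            and is_continuous_sub(p, q)
            and (not closed or t == s)
            for q, t in frequent.items()
        )
        if not redundant:
            result[p] = s
    return result


sequences = ["abcd", "abc", "xbcd"]  # toy input, invented for illustration
frequent = frequent_continuous(sequences, min_sup=2)
maximal = post_prune(frequent, closed=False)
closed_patterns = post_prune(frequent, closed=True)
print(len(frequent), sorted(maximal), sorted(closed_patterns))
```

On this toy input, nine continuous patterns are frequent, but only `abc` and `bcd` are maximal. The closed set additionally keeps `bc`, whose support (3) is higher than that of either super-pattern. The sketch enumerates every frequent pattern before pruning, which is exactly the cost the paper aims to reduce by omitting redundant sequences during exploration.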

Keywords

Sequential Pattern, Pattern Mining, Continuous Sequence, Redundant Sequence, Continuous Pattern

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Marcin Gorawski (1, 2)
  • Pawel Jureczek (1)

  1. Institute of Computer Science, Silesian University of Technology, Gliwice, Poland
  2. Institute of Computer Science, Wroclaw University of Technology, Wrocław, Poland
