Fast Discovery of Time-Constrained Sequential Patterns Using Time-Indexes

  • Ming-Yen Lin
  • Sue-Chen Hsueh
  • Chia-Wen Chang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)

Abstract

Sequential pattern mining finds all frequent sub-sequences in a sequence database. To obtain more accurate results, constraints beyond the support threshold need to be specified in the mining. Time constraints cannot be handled by filtering patterns after retrieval, because computing the support of a pattern requires validating the time attributes of every data sequence during the mining process. In this paper, we propose a memory time-indexing approach, called METISP, to discover sequential patterns with time constraints including minimum/maximum/exact gaps, sliding window, and duration. METISP scans the database into memory and constructs time-index sets for effective processing. Using the index sets and the pattern-growth strategy, METISP efficiently mines the desired patterns without generating any candidates or sub-databases. Comprehensive experiments show that METISP outperforms GSP and DELISP in discovering time-constrained sequential patterns, even with low support thresholds and very large databases.
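To make concrete why support counting must inspect timestamps in every data sequence, the following is a minimal Python sketch of a time-constrained containment check. It is not the METISP algorithm and builds no time-index sets; it is a naive backtracking check written only for illustration. The names `supports` and `support_count`, the `(timestamp, itemset)` representation, and the exact inequality conventions (gap strictly greater than `mingap`, at most `maxgap`, total span at most `duration`) are assumptions, and the sliding-window and exact-gap constraints are left out for brevity.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical representation: a data sequence is a time-ordered list of
# (timestamp, itemset) pairs; a pattern is a list of itemsets.
DataSequence = List[Tuple[int, FrozenSet[str]]]
Pattern = List[FrozenSet[str]]


def supports(seq: DataSequence, pattern: Pattern, mingap: int = 0,
             maxgap: Optional[int] = None,
             duration: Optional[int] = None) -> bool:
    """Return True if seq contains pattern under the given time constraints.

    Assumed conventions: consecutive matched elements satisfy
    mingap < gap <= maxgap, and last_time - first_time <= duration.
    """
    def search(k: int, prev_time: int, first_time: int) -> bool:
        if k == len(pattern):
            return True  # every pattern element has been matched
        for t, items in seq:
            if not pattern[k] <= items:
                continue  # this transaction does not contain the element
            if k > 0:
                gap = t - prev_time
                if gap <= mingap:
                    continue  # too close to (or before) the previous match
                if maxgap is not None and gap > maxgap:
                    continue  # too far from the previous match
                if duration is not None and t - first_time > duration:
                    continue  # whole occurrence would span too long
            if search(k + 1, t, t if k == 0 else first_time):
                return True
        return False

    return search(0, 0, 0)


def support_count(db: List[DataSequence], pattern: Pattern,
                  **constraints) -> int:
    """Count the data sequences in db that support pattern."""
    return sum(supports(seq, pattern, **constraints) for seq in db)


# Toy database, invented for illustration only.
db = [
    [(1, frozenset("a")), (3, frozenset("b")), (10, frozenset("c"))],
    [(1, frozenset("a")), (8, frozenset("b"))],
    [(2, frozenset("ab")), (4, frozenset("b"))],
]
ab = [frozenset("a"), frozenset("b")]
print(support_count(db, ab))            # no gap limits beyond ordering
print(support_count(db, ab, maxgap=3))  # drops the sequence with gap 7
```

In this toy database the pattern is supported by all three sequences when only ordering matters, but tightening `maxgap` removes one of them. The set of frequent patterns therefore depends on the timestamps themselves, which is why each candidate's support has to be validated against the time attributes rather than derived from an unconstrained result.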

Keywords

Sequential Pattern; Time Index; Minimum Support; Frequent Item; Support Threshold
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ming-Yen Lin (1)
  • Sue-Chen Hsueh (2)
  • Chia-Wen Chang (1)
  1. Department of Information Engineering and Computer Science, Feng-Chia University, Taiwan
  2. Department of Information Management, Chaoyang University of Technology, Taiwan