Path Tree: Mining Sequential Patterns Efficiently in Data Streams Environments
Although data streams have been widely studied and applied, mining sequential patterns over them remains challenging. In this paper, we assume that each user's transactions arrive piecemeal and that no auxiliary mechanism is available for buffering and integrating them. We adopt the Path Tree to mine frequent sequential patterns over data streams and to integrate each user's sequences efficiently. Two algorithms are proposed to meet different user needs: PAlgorithm, which favors accuracy, and PSAlgorithm, which favors space. Several pruning properties are used to further reduce the space usage and improve the accuracy of our algorithms. We also prove that PAlgorithm mines frequent sequential patterns with an approximate support that carries an error guarantee. Experiments on a synthetic dataset verify the feasibility of our algorithms.
Keywords: Data Mining, Sequential Patterns, Frequent Patterns