AI 2003: Advances in Artificial Intelligence, pp. 621-623
A Quick Look at Methods for Mining Long Subsequences
Abstract
Pattern discovery, the search for frequently occurring subsequences (called sequential patterns) in sequences, is a well-known data-mining task. Sequences of events occur naturally in many domains. We address an abstract version of the problem of finding frequent sequences of page accesses in a log file: finding frequent subsequences in a sequence dataset. In the abstract problem, the 26 uppercase letters represent the possible web pages, and we look for frequently occurring subsequences of items in a very long sequence. The particular problem studied is to find all frequently occurring substrings of length K or less in a very long string. We explain the advantage of the Heuristic Depth-First (HDF) algorithm, which is based on the Depth-First (DF) algorithm, by comparing it with the Breadth-First (BF) algorithm.
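The problem statement above (all frequent substrings of length K or less in a long string of uppercase letters) can be illustrated with a minimal level-wise sketch. This is not the BF, DF, or HDF algorithm from the paper, whose details the abstract does not give. The function name `frequent_substrings`, the parameter names, and the use of an absolute occurrence count `min_count` as the definition of "frequent" are all assumptions made for illustration.

```python
from collections import Counter


def frequent_substrings(text, max_len, min_count):
    """Return {substring: count} for substrings of length 1..max_len that
    occur at least min_count times in text (overlapping occurrences count).

    Illustrative level-wise search; min_count is an assumed threshold.
    """
    frequent = {}
    # Level 1: count single letters and keep the frequent ones.
    level = {s: c for s, c in Counter(text).items() if c >= min_count}
    length = 1
    while level and length <= max_len:
        frequent.update(level)
        length += 1
        if length > max_len:
            break
        counts = Counter()
        for i in range(len(text) - length + 1):
            cand = text[i:i + length]
            # Pruning: a substring can only be frequent if both its
            # prefix and its suffix of length-1 are frequent.
            if cand[:-1] in level and cand[1:] in level:
                counts[cand] += 1
        level = {s: c for s, c in counts.items() if c >= min_count}
    return frequent


if __name__ == "__main__":
    # Toy input standing in for a long page-access string.
    print(frequent_substrings("ABABABC", max_len=3, min_count=2))
```

Each pass over the string handles one substring length, so the sketch makes up to K passes and stops early once a level has no frequent substrings.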
Keywords
Sequential Pattern, Pattern Discovery, Uppercase Letter, Frequent Sequence, Abstract Problem
References
- [AS1995] Agrawal, R., and Srikant, R., “Mining Sequential Patterns.” Proceedings IEEE International Conference on Data Engineering, Taipei, Taiwan, 1995.
- [Jia2003] Jiang, L., and Hamilton, H.J., “Methods for Mining Frequent Sequential Patterns.” Proceedings AI 2003, this volume.
- [PCY1995] Park, J.S., Chen, M.S., and Yu, P.S., “An Effective Hash-Based Algorithm for Mining Association Rules.” Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, May 1995.
- [Vil1998] Vilo, J., Discovering Frequent Patterns from Strings, Technical Report C-1998-9, Department of Computer Science, University of Helsinki, FIN-00014, University of Helsinki, May 1998.