Mining Maximal Flexible Patterns in a Sequence

  • Hiroki Arimura
  • Takeaki Uno
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4914)

Abstract

We consider the problem of enumerating all maximal flexible patterns in an input sequence database, where a maximal pattern (also called a closed pattern) is the most specific pattern in the equivalence class of patterns that have the same list of occurrences in the input. Since our notion of maximal patterns is based on position occurrences, it is weaker than the traditional notion of maximal patterns based on document occurrences. Based on the framework of reverse search, we present an efficient depth-first search algorithm, MaxFlex, that enumerates all maximal flexible patterns in a given sequence database without duplicates in \(O(\|\mathcal{T}\| \times |\Sigma|)\) time per pattern and \(O(\|\mathcal{T}\|)\) space, where \(\|\mathcal{T}\|\) is the size of the input sequence database \(\mathcal{T}\) and \(|\Sigma|\) is the size of the alphabet over which the sequences are defined. Hence the enumeration problem for maximal flexible patterns is solvable with polynomial delay and in polynomial space.
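The difference between position occurrences and document occurrences can be illustrated with a small sketch. It is not the MaxFlex algorithm and does not reproduce the paper's formal definitions. It assumes, for illustration only, that a flexible pattern is a non-empty tuple of constant segments with an implicit variable-length gap between consecutive segments, and that a position occurrence is a pair (sequence index, start position of the first segment). The names `position_occurrences` and `document_occurrences` are hypothetical.

```python
from typing import List, Set, Tuple

# Assumption: a pattern is a non-empty tuple of constant segments; a gap that
# matches any (possibly empty) string is implied between consecutive segments.
Pattern = Tuple[str, ...]


def position_occurrences(pattern: Pattern, database: List[str]) -> Set[Tuple[int, int]]:
    """Return (sequence index, start position) pairs where the pattern matches."""
    found: Set[Tuple[int, int]] = set()
    first, rest = pattern[0], pattern[1:]
    for index, text in enumerate(database):
        start = text.find(first)
        while start != -1:
            # Matching each remaining segment at its leftmost feasible position
            # suffices, because a gap can absorb any number of symbols.
            cursor = start + len(first)
            matched = True
            for segment in rest:
                hit = text.find(segment, cursor)
                if hit == -1:
                    matched = False
                    break
                cursor = hit + len(segment)
            if matched:
                found.add((index, start))
            start = text.find(first, start + 1)
    return found


def document_occurrences(pattern: Pattern, database: List[str]) -> Set[int]:
    """Return the indices of the sequences that contain the pattern at all."""
    return {index for index, _ in position_occurrences(pattern, database)}


database = ["ABAB"]
for pattern in [("A",), ("A", "B"), ("AB",), ("B",)]:
    print(pattern,
          sorted(position_occurrences(pattern, database)),
          sorted(document_occurrences(pattern, database)))
```

Under these assumptions, the patterns `("A",)`, `("A", "B")` and `("AB",)` share the position occurrence list `[(0, 0), (0, 2)]`, so they fall into one equivalence class, and `("AB",)` is the most specific of the three. The pattern `("B",)` has the same document occurrences as the others but a different position occurrence list, so grouping by position separates it from them.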

Keywords

Sequence Database, Pattern Discovery, Polynomial Space, Constant Symbol, Minimum Support Threshold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Hiroki Arimura (1)
  • Takeaki Uno (2)
  1. Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
  2. National Institute of Informatics, Chiyoda-ku, Japan
