Flexible Framework for Time-Series Pattern Matching over Multi-dimension Data Stream

  • Takuya Kida
  • Tomoya Saito
  • Hiroki Arimura
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5433)

Abstract

In this paper, we study a complex time-series pattern matching problem over a multi-dimension continuous data stream. For each data stream, a pattern is given as a sequence of predicates, each of which specifies a set of elements on the stream. The pattern matching problem over such a multi-dimension data stream is to find all occurrences where all predicates in the patterns are satisfied. We propose a flexible and extensible framework to solve the problem, based on a bit-parallel pattern matching method that efficiently simulates NFAs for the pattern matching using a few logical bit operations. We consider four types of data streams in particular: textual, categorical, ordered, and numeric, that is, sequences of strings, concepts with taxonomic information, small integers, and real numbers (or large integers), respectively. We also present the time complexities of pattern matching for these data types.
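The bit-parallel idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is a plain Shift-And simulation of a linear NFA in which each pattern position is an arbitrary predicate. The names match_predicates and match_multi are hypothetical. Evaluating every predicate on each element, and combining dimensions by intersecting the end positions of the per-stream matches, are simplifying assumptions made only for this example.

```python
from typing import Callable, Iterable, List, Sequence

Predicate = Callable[[object], bool]


def match_predicates(stream: Iterable[object],
                     pattern: Sequence[Predicate]) -> List[int]:
    """Return the 0-based end positions where m consecutive stream
    elements satisfy pattern[0], ..., pattern[m-1] in order.

    Bit i of `state` is set iff the last i+1 elements satisfy the
    first i+1 predicates (the usual Shift-And invariant).
    """
    m = len(pattern)
    if m == 0:
        return []
    accept = 1 << (m - 1)  # bit of the final NFA state
    state = 0
    hits: List[int] = []
    for pos, x in enumerate(stream):
        # Assumption: the mask is built by testing every predicate on x.
        # For small alphabets it could be precomputed in a table instead.
        mask = 0
        for i, pred in enumerate(pattern):
            if pred(x):
                mask |= 1 << i
        # One Shift-And step: advance all active states, restart at state 0.
        state = ((state << 1) | 1) & mask
        if state & accept:
            hits.append(pos)
    return hits


def match_multi(streams: Sequence[Sequence[object]],
                patterns: Sequence[Sequence[Predicate]]) -> List[int]:
    """Assumption: an occurrence is a position at which every
    dimension's pattern ends simultaneously."""
    per_dim = [set(match_predicates(s, p)) for s, p in zip(streams, patterns)]
    return sorted(set.intersection(*per_dim)) if per_dim else []


# Illustrative data (invented for this example).
temps = [18.5, 21.0, 25.3, 19.0, 22.1, 26.0]            # numeric stream
status = ["ok", "ok", "warn", "ok", "ok", "ok"]          # categorical stream

rising = [lambda v: v < 20, lambda v: 20 <= v < 25, lambda v: v >= 25]
alarm = [lambda s: s == "ok", lambda s: s == "ok",
         lambda s: s in {"warn", "error"}]

print(match_predicates(temps, rising))                   # [2, 5]
print(match_multi([temps, status], [rising, alarm]))     # [2]
```

Each stream element costs one shift, one OR, and one AND on the state word once its mask is known, so the per-element cost is dominated by how the mask is obtained for the given data type.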

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Takuya Kida (1)
  • Tomoya Saito (1)
  • Hiroki Arimura (1)
  1. Hokkaido University, Sapporo, Japan
