Non-contiguous Sequence Pattern Queries

  • Nikos Mamoulis
  • Man Lung Yiu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2992)

Abstract

Non-contiguous subsequence pattern queries search for symbol instances in a long sequence that satisfy some soft temporal constraints. In this paper, we propose a methodology that indexes long sequences in order to process such queries efficiently. The sequence data are decomposed into tables, and queries are evaluated as multiway joins between them. We describe non-blocking join operators and provide query preprocessing and optimization techniques that tighten the join predicates and suggest a good join order plan. As opposed to previous approaches, our method can efficiently handle a broader range of queries and can be easily supported by existing DBMSs. Its efficiency is evaluated by experimentation on synthetic and real data.
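
The decomposition-and-join idea in the abstract can be illustrated with a minimal sketch. It is not the paper's operators or optimizer: the names `decompose` and `match`, the representation of each table as a sorted list of positions, and the encoding of each temporal constraint as a (min gap, max gap) pair between consecutive query symbols are all assumptions made for illustration. The generator yields each match as soon as it is found, which is only loosely analogous to a non-blocking join.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def decompose(sequence):
    """Build one hypothetical 'table' per symbol: the sorted positions where it occurs."""
    tables = defaultdict(list)
    for pos, symbol in enumerate(sequence):
        tables[symbol].append(pos)  # positions arrive in increasing order
    return tables


def match(tables, symbols, gaps):
    """Yield tuples (p0, ..., pk) with one position per query symbol such that
    gaps[i][0] <= p[i+1] - p[i] <= gaps[i][1] for every consecutive pair.

    Assumption: constraints exist only between consecutive query symbols.
    """
    def extend(prefix, i):
        if i == len(symbols):
            yield tuple(prefix)
            return
        lo, hi = gaps[i - 1]
        positions = tables.get(symbols[i], [])
        # Binary search restricts the join to positions inside the allowed gap.
        start = bisect_left(positions, prefix[-1] + lo)
        end = bisect_right(positions, prefix[-1] + hi)
        for p in positions[start:end]:
            yield from extend(prefix + [p], i + 1)

    for p0 in tables.get(symbols[0], []):
        yield from extend([p0], 1)


tables = decompose("abcabdc")
# 'a', then 'b' 1 to 2 positions later, then 'c' 1 to 3 positions after the 'b'
results = list(match(tables, ["a", "b", "c"], [(1, 2), (1, 3)]))
```

In this sketch the join order is fixed to the order of the query symbols; choosing a better order and tightening the gap bounds beforehand are the kind of preprocessing steps the abstract refers to, and they are not attempted here.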

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Nikos Mamoulis (1)
  • Man Lung Yiu (1)
  1. Department of Computer Science and Information Systems, University of Hong Kong, Hong Kong
