Efficient Algorithms for Pattern Matching with General Gaps and Character Classes

  • Kimmo Fredriksson
  • Szymon Grabowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4209)

Abstract

We develop efficient dynamic programming algorithms for pattern matching with general gaps and character classes. We consider patterns of the form \(p_0\, g(a_0,b_0)\, p_1\, g(a_1,b_1) \ldots p_{m-1}\), where \(p_i \subset \Sigma\) for some finite alphabet \(\Sigma\), and \(g(a_i,b_i)\) denotes a gap of length \(a_i \ldots b_i\) between symbols \(p_i\) and \(p_{i+1}\). The text symbol \(t_j\) matches \(p_i\) iff \(t_j \in p_i\). Moreover, we require that if \(p_i\) matches \(t_j\), then \(p_{i+1}\) should match one of the text symbols \(t_{j+a_i+1} \ldots t_{j+b_i+1}\). Either or both of \(a_i\) and \(b_i\) can be negative. We give algorithms that have efficient average and worst case running times. The algorithms have important applications in music information retrieval and computational biology. We give experimental results showing that the algorithms work well in practice.
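
To make the matching rule above concrete, here is a minimal Python sketch of a naive position-set computation. It is not the paper's algorithm and makes no claim about its running time; it only applies the definition directly. The function name `gapped_match_ends` and the representation of the pattern as a list of character sets plus a list of `(a, b)` gap pairs are assumptions made for illustration. Positions are 0-based.

```python
from typing import List, Sequence, Set, Tuple


def gapped_match_ends(
    text: str,
    classes: Sequence[Set[str]],
    gaps: Sequence[Tuple[int, int]],
) -> List[int]:
    """Return the text positions at which the last pattern class can match.

    classes[i] plays the role of p_i; gaps[i] = (a_i, b_i) is the gap
    between p_i and p_{i+1}. Hypothetical helper, naive by design.
    """
    if not classes or len(gaps) != len(classes) - 1:
        raise ValueError("need m >= 1 classes and exactly m - 1 gaps")

    n = len(text)
    # Positions j where t_j is in p_0.
    ends = {j for j in range(n) if text[j] in classes[0]}

    for i, (a, b) in enumerate(gaps):
        nxt = set()
        for j in ends:
            # p_{i+1} must match one of t_{j+a+1} .. t_{j+b+1};
            # a and b may be negative, so clamp to the text bounds.
            lo = max(0, j + a + 1)
            hi = min(n - 1, j + b + 1)
            for k in range(lo, hi + 1):
                if text[k] in classes[i + 1]:
                    nxt.add(k)
        ends = nxt

    return sorted(ends)
```

For instance, `gapped_match_ends("abcab", [{"a"}, {"b", "c"}], [(0, 1)])` looks for an `a` followed, after a gap of 0 or 1 symbols, by a `b` or `c`. A negative gap such as `(-2, -2)` lets the next class match before the previous one in the text.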

Keywords

Pattern Match, Binary Search, Character Class, Text Character, Pattern Character
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kimmo Fredriksson (1)
  • Szymon Grabowski (2)
  1. Department of Computer Science, University of Joensuu, Joensuu, Finland
  2. Computer Engineering Department, Technical University of Łódź, Łódź, Poland