Comparison of Exact String Matching Algorithms for Biological Sequences

  • Petri Kalsi
  • Hannu Peltola
  • Jorma Tarhio
Part of the Communications in Computer and Information Science book series (CCIS, volume 13)

Abstract

Exact matching of single patterns in DNA and amino acid sequences is studied. We performed an extensive experimental comparison of algorithms presented in the literature. In addition, we introduce new variations of earlier algorithms. The results of the comparison show that the new algorithms are efficient in practice.

Keywords

Search Time, Biological Sequence, String Match, Multiple Reading, Pattern Length
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Petri Kalsi (1)
  • Hannu Peltola (1)
  • Jorma Tarhio (1)
  1. Department of Computer Science and Engineering, Helsinki University of Technology, Finland
