An Efficient Matching Algorithm for Encoded DNA Sequences and Binary Strings

  • Simone Faro
  • Thierry Lecroq
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5577)

Abstract

We present a new efficient algorithm for exact matching on encoded DNA sequences and binary strings. The algorithm combines a multi-pattern version of the BNDM algorithm with a simplified version of the Commentz-Walter algorithm. We also compared it experimentally with the most efficient algorithms in the literature. The results show that the new algorithm outperforms existing solutions in most cases.
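For orientation, the following is a minimal sketch of the classical single-pattern BNDM search (as introduced by Navarro and Raffinot), run here on a binary string. It is not the multi-pattern variant or the combined algorithm described in the abstract, whose details are not given on this page. The function name `bndm_search` is hypothetical, and the sketch relies on Python's unbounded integers instead of a fixed machine word.

```python
def bndm_search(pattern, text):
    """Return the start positions of all occurrences of pattern in text.

    Classical single-pattern BNDM; an illustrative sketch only, not the
    algorithm of the paper.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Bit mask per character: bit (m - 1 - i) is set when pattern[i] == c,
    # so the highest bit corresponds to the first pattern character.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << (m - 1 - i))

    full = (1 << m) - 1   # m ones: every pattern factor is still possible
    high = 1 << (m - 1)   # set when the scanned suffix is a pattern prefix

    occurrences = []
    pos = 0
    while pos <= n - m:
        j = m             # number of window characters not yet read
        last = m          # shift to apply once the window is abandoned
        state = full
        # Scan the window right to left while some pattern factor matches.
        while state != 0:
            state &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if state & high:
                if j > 0:
                    last = j                 # a prefix starts at pos + j
                else:
                    occurrences.append(pos)  # the whole window matched
            state = (state << 1) & full
        pos += last
    return occurrences
```

Calling `bndm_search("101", "1101011")` scans three windows and reports the matches at positions 1 and 3. The same routine works unchanged on a DNA alphabet, since the masks are built per character.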

Keywords

string matching, binary strings, DNA sequences, experimental algorithms, compressed text processing

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Simone Faro (1)
  • Thierry Lecroq (2)
  1. Dipartimento di Matematica e Informatica, Università di Catania, Italy
  2. University of Rouen, Mont-Saint-Aignan Cedex, France
