Approximate String Matching Using a Bidirectional Index

  • Gregory Kucherov
  • Kamil Salikhov
  • Dekel Tsur
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8486)

Abstract

We study strategies for approximate pattern matching that exploit bidirectional text indexes, extending and generalizing the ideas of [5]. We introduce a formalism, called search schemes, for specifying search strategies of this type, then develop a probabilistic measure of the efficiency of a search scheme, prove several combinatorial results on efficient search schemes, and finally provide experimental computations supporting the superiority of our strategies.

References

  1. Belazzougui, D., Cunial, F., Kärkkäinen, J., Mäkinen, V.: Versatile succinct representations of the bidirectional Burrows-Wheeler transform. In: Bodlaender, H.L., Italiano, G.F. (eds.) ESA 2013. LNCS, vol. 8125, pp. 133–144. Springer, Heidelberg (2013)
  2. Burrows, M., Wheeler, D.: A block-sorting lossless data compression algorithm. Technical report 124, Digital Equipment Corporation, California (1994)
  3. Chen, L.H.Y.: Poisson approximation for dependent trials. The Annals of Probability, 534–545 (1975)
  4. Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: Proc. 41st Symposium on Foundations of Computer Science (FOCS), pp. 390–398 (2000)
  5. Lam, T.W., Li, R., Tam, A., Wong, S.C.K., Wu, E., Yiu, S.-M.: High throughput short read alignment via bi-directional BWT. In: Proc. IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 31–36 (2009)
  6. Lam, T.-W., Sung, W.-K., Wong, S.-S.: Improved approximate string matching using compressed suffix data structures. In: Deng, X., Du, D.-Z. (eds.) ISAAC 2005. LNCS, vol. 3827, pp. 339–348. Springer, Heidelberg (2005)
  7. Langmead, B., Trapnell, C., Pop, M., Salzberg, S.: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10(3), R25 (2009)
  8. Li, H., Durbin, R.: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14), 1754–1760 (2009)
  9. Li, H., Homer, N.: A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics 11(5), 473–483 (2010)
  10. Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Computing Surveys 39(1) (2007)
  11. Russo, L.M.S., Navarro, G., Oliveira, A.L., Morales, P.: Approximate string matching with compressed indexes. Algorithms 2(3), 1105–1136 (2009)
  12. Schnattinger, T., Ohlebusch, E., Gog, S.: Bidirectional search in a string with wavelet trees and bidirectional matching statistics. Information and Computation 213, 13–22 (2012)
  13. Simpson, J.T., Durbin, R.: Efficient de novo assembly of large genomes using compressed data structures. Genome Research 22(3), 549–556 (2012)
  14. Sung, W.-K.: Indexed approximate string matching. In: Kao, M.-Y. (ed.) Encyclopedia of Algorithms, pp. 1–99. Springer, US (2008)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Gregory Kucherov (1, 2)
  • Kamil Salikhov (1, 3)
  • Dekel Tsur (2)

  1. CNRS/LIGM, Université Paris-Est Marne-la-Vallée, France
  2. Department of Computer Science, Ben-Gurion University of the Negev, Israel
  3. Mechanics and Mathematics Department, Lomonosov Moscow State University, Russia
