The Gapped Suffix Array: A New Index Structure for Fast Approximate Matching

  • Maxime Crochemore
  • German Tischler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6393)

Abstract

Approximate searching using an index is an important application in many fields. In this paper we introduce a new data structure called the gapped suffix array for approximate searching in the Hamming distance model. Building on the well-known filtration approach for approximate searching, the use of the gapped suffix array can improve search speed by avoiding the merging of position lists.
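
For orientation, the classical filtration baseline that the abstract builds on can be sketched as follows. This is not the gapped suffix array itself, whose construction is not described on this page; it is a minimal illustration of the standard pigeonhole filter for k mismatches, assuming a plain suffix array built naively. All function names are hypothetical. The pattern is cut into k + 1 pieces, at least one of which must occur exactly in any match with at most k mismatches. The exact occurrences of each piece are looked up, the resulting position lists are merged, and every candidate is verified. The merge step is the cost that, according to the abstract, the gapped suffix array is designed to avoid.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; adequate for a small illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def exact_occurrences(text, sa, piece):
    # Binary search for the block of suffixes whose first len(piece)
    # characters equal `piece`; returns their (unsorted) start positions.
    n = len(piece)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose prefix is >= piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] < piece:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose prefix is > piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] <= piece:
            lo = mid + 1
        else:
            hi = mid
    return sa[start:lo]


def hamming(a, b):
    # Number of mismatching positions between two equal-length strings.
    return sum(x != y for x, y in zip(a, b))


def search_k_mismatches(text, pattern, k):
    # Pigeonhole filtration: split the pattern into k + 1 pieces, collect
    # candidate start positions from exact piece hits, then verify each one.
    m = len(pattern)
    sa = build_suffix_array(text)
    pieces = k + 1
    bounds = [j * m // pieces for j in range(pieces + 1)]
    candidates = set()  # merged position lists from all pieces
    for j in range(pieces):
        lo, hi = bounds[j], bounds[j + 1]
        for pos in exact_occurrences(text, sa, pattern[lo:hi]):
            start = pos - lo  # implied start of the whole pattern
            if 0 <= start <= len(text) - m:
                candidates.add(start)
    return sorted(s for s in candidates
                  if hamming(text[s:s + m], pattern) <= k)


print(search_k_mismatches("abracadabra", "abda", 1))
```

Here the set `candidates` plays the role of the merged position lists. With a long text and short pieces, each list can be large, which is why avoiding the merge can matter in practice.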

Keywords

Linear Time; Index Structure; Lexicographical Order; String Match; Position List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Maxime Crochemore (1, 2)
  • German Tischler (1, 3)

  1. Dept. of Computer Science, King’s College London, London, UK
  2. Université Paris-Est, France
  3. Newton Fellow, UK
