Dotted Suffix Trees: A Structure for Approximate Text Indexing

  • Luís Pedro Coelho
  • Arlindo L. Oliveira
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4209)


In this work, we address the problem of text indexing for approximate matching. A text \(\mathcal{T}\) of length n is preprocessed to generate an index, which can later be queried to identify the places where a string occurs with up to a certain number of errors k (edit distance). The indexing structure occupies \(\mathcal{O}(n \log^k n)\) space in the average case, independent of alphabet size. The structure can be used to report the existence of a match with k errors in \(\mathcal{O}(3^k m^{k+1})\) time and to report the occurrences in \(\mathcal{O}(3^k m^{k+1} + \mathit{ed})\) time, where m is the length of the pattern and ed is the number of matching edit scripts. Construction of the structure takes time bounded by \(\mathcal{O}(kN|\Sigma|)\), where N is the number of nodes in the index and |Σ| is the alphabet size.
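To make the query in the abstract concrete, the following is a minimal sketch of matching with up to k errors under edit distance. It is not the dotted suffix tree described above: it is an index-free baseline (the classic column-by-column dynamic programme, often attributed to Sellers) that scans the whole text in O(nm) time, which is the cost an index is meant to avoid. The function name `approx_occurrences` and the choice to report end positions are assumptions made for illustration only.

```python
def approx_occurrences(text, pattern, k):
    """Return every end position j (exclusive) such that some substring
    of text ending at j is within edit distance k of pattern.

    Illustrative baseline only; it builds no index.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    if col[m] <= k:
        ends.append(0)  # the empty substring already matches
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value of row i-1 in the previous column
        # col[0] stays 0 because a match may start at any text position.
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new = min(
                prev_diag + cost,  # match or substitution
                col[i] + 1,        # extra character in the text
                col[i - 1] + 1,    # pattern character missing from the text
            )
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # "cde" differs from "cxe" by one substitution and ends at position 5.
    print(approx_occurrences("abcdefg", "cxe", 1))
```

With k = 0 the routine degenerates to exact matching, which is the case a plain suffix tree already answers in time proportional to the pattern length.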


Keywords: string algorithms, suffix trees, approximate text matching, text indexing





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Luís Pedro Coelho (INESC-ID/IST)
  • Arlindo L. Oliveira (INESC-ID/IST)
