On the Suffix Automaton with Mismatches

  • Maxime Crochemore
  • Chiara Epifanio
  • Alessandra Gabriele
  • Filippo Mignosi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4783)

Abstract

This paper focuses on the construction of the minimal deterministic finite automaton S_k that recognizes the set of suffixes of a word w up to k errors. We present an algorithm that uses S_k to efficiently accept the language of all suffixes of w up to k errors in every window of size r, where r is the repetition index of w. We also report experimental results on some well-known words, such as prefixes of the Fibonacci and Thue-Morse words, and state a conjecture on the size of the suffix automaton with mismatches.

Keywords

combinatorics on words, suffix automata, languages with mismatches, approximate string matching

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Maxime Crochemore (1)
  • Chiara Epifanio (2)
  • Alessandra Gabriele (2)
  • Filippo Mignosi (3)
  1. Institut Gaspard-Monge, Université de Marne-la-Vallée, France, and King’s College London, UK
  2. Dipartimento di Matematica e Applicazioni, Università di Palermo, Italy
  3. Dipartimento di Informatica, Università dell’Aquila, Italy
