Factor Oracle: A New Structure for Pattern Matching

  • Cyril Allauzen
  • Maxime Crochemore
  • Mathieu Raffinot
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1725)

Abstract

We introduce a new automaton on a word p, a sequence of letters over an alphabet Σ, which we call the factor oracle. This automaton is acyclic, recognizes at least the factors of p, and has m+1 states, where m is the length of p, and a linear number of transitions. We give an on-line construction to build it. We use this new structure in string matching algorithms that we conjecture to be optimal on the basis of experimental results. These algorithms are as efficient as existing ones while using less memory and being easier to implement.
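The construction and its use in string matching can be illustrated with a minimal Python sketch. It assumes the usual formulation of the on-line construction, in which each state keeps one fallback link to an earlier state. The names `build_factor_oracle`, `accepts`, `oracle_search`, and `supply` are hypothetical and chosen for this illustration only. The search loop is one simple backward-scanning way to use such an oracle with a basic shift rule; it is not necessarily the exact algorithm evaluated in the paper.

```python
def build_factor_oracle(p):
    """Build the oracle of p left to right; returns (trans, supply)."""
    m = len(p)
    trans = [dict() for _ in range(m + 1)]  # m + 1 states, all accepting
    supply = [-1] * (m + 1)                 # fallback link per state (assumed name)
    for i, c in enumerate(p):
        trans[i][c] = i + 1                 # transition along the word itself
        k = supply[i]
        # Walk the fallback links, adding shortcut transitions to the new state.
        while k != -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, w):
    """True if w labels a path from state 0 (every state is accepting)."""
    state = 0
    for c in w:
        state = trans[state].get(c)
        if state is None:
            return False
    return True


def oracle_search(text, p):
    """Report start positions of p in text by scanning windows right to left."""
    m = len(p)
    trans, _ = build_factor_oracle(p[::-1])  # oracle of the reversed pattern
    hits = []
    pos = 0
    while pos <= len(text) - m:
        state, j = 0, m
        while j > 0 and state is not None:
            state = trans[state].get(text[pos + j - 1])
            j -= 1
        if state is not None:
            hits.append(pos)                 # whole window read: an occurrence
        pos += j + 1                         # restart just after the failing letter
    return hits


trans, _ = build_factor_oracle("abbbaab")
print(len(trans))                                     # 8
print(accepts(trans, "bba"), accepts(trans, "aba"))   # True True
print(oracle_search("abbbaababbbaab", "abbbaab"))     # [0, 7]
```

In this run, "aba" is accepted although it is not a factor of "abbbaab", which is what "recognizes at least the factors of p" allows. The search remains exact under the assumption that the only accepted word of length m is the pattern itself, so a window that is read completely is a true occurrence.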

Keywords

indexing, finite automaton, pattern matching, algorithm design

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Cyril Allauzen (1)
  • Maxime Crochemore (1)
  • Mathieu Raffinot (1)
  1. Institut Gaspard-Monge, Université de Marne-la-Vallée, France
