A Compact Representation of Nondeterministic (Suffix) Automata for the Bit-Parallel Approach

  • Domenico Cantone
  • Simone Faro
  • Emanuele Giaquinta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6129)

Abstract

We present a novel technique, suitable for bit-parallelism, for representing both the nondeterministic automaton and the nondeterministic suffix automaton of a given string in a more compact way. Our approach is based on a particular factorization of strings which, on average, makes it possible to pack into a machine word of w bits the automata state configurations for strings of length greater than w. We adapted the Shift-And and BNDM algorithms to use our encoding and compared them with the original algorithms. Experimental results show that the new variants are generally faster for long patterns.
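
For orientation, the following is a minimal sketch of the classical Shift-And algorithm that the abstract takes as a baseline. It is not the compact encoding proposed in the paper. The function name `shift_and` and its return convention are assumptions made for this illustration. Each bit of `state` stands for one state of the nondeterministic automaton of the pattern, so a word-based implementation in a language such as C handles patterns of length at most w per machine word, which is the limit the paper aims to relax. Python integers are arbitrary-precision, so that limit does not show up here.

```python
def shift_and(pattern: str, text: str) -> list[int]:
    """Return the start positions of all occurrences of pattern in text.

    Classical Shift-And sketch: bit i of `state` is set when the
    prefix pattern[0..i] matches a suffix of the text read so far.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set if and only if pattern[i] == c.
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit of the final automaton state
    state = 0
    hits: list[int] = []
    for j, c in enumerate(text):
        # Advance every active state by one position, re-activate the
        # initial state, then keep only states consistent with c.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits
```

Calling `shift_and("aba", "ababa")` reports the two overlapping occurrences, starting at positions 0 and 2.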

Keywords

Compact Representation, String Match, Shift Length, Computer Word, Word Variant
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Domenico Cantone (1)
  • Simone Faro (1)
  • Emanuele Giaquinta (1)
  1. Dipartimento di Matematica e Informatica, Università di Catania, Catania, Italy
