Direct construction of compact directed acyclic word graphs

  • Conference paper
Combinatorial Pattern Matching (CPM 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1264)

Abstract

The Directed Acyclic Word Graph (DAWG) is an efficient data structure for processing and analyzing repetitions in a text, especially in DNA genomic sequences. Here, we consider the Compact Directed Acyclic Word Graph (CDAWG) of a word. We give the first direct algorithm to construct it. It runs in time linear in the length of the string on a fixed alphabet. Our implementation requires half the memory space used by DAWGs.
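
For orientation, the sketch below builds the plain (non-compact) DAWG of a word with the well-known online suffix-automaton construction. It is not the direct CDAWG algorithm announced in the abstract, which is not reproduced on this page. The names `State`, `build_dawg`, `is_factor`, and `surviving_after_compaction` are hypothetical and chosen for illustration only. The last function assumes the usual description of compaction, in which states with exactly one outgoing transition that are neither initial nor terminal are merged into their neighbours; it only counts the states that would remain and does not build the compact graph.

```python
class State:
    """One DAWG state: outgoing transitions, suffix link, longest-word length."""
    __slots__ = ("next", "link", "length", "terminal")

    def __init__(self, length):
        self.next = {}         # letter -> target State
        self.link = None       # suffix link (None for the initial state)
        self.length = length   # length of the longest word reaching this state
        self.terminal = False  # True if some suffix of the word ends here


def build_dawg(word):
    """Build the DAWG (suffix automaton) of `word`, one letter at a time."""
    root = State(0)
    states = [root]
    last = root
    for ch in word:
        cur = State(last.length + 1)
        states.append(cur)
        p = last
        # Add the new transition along the suffix-link chain.
        while p is not None and ch not in p.next:
            p.next[ch] = cur
            p = p.link
        if p is None:
            cur.link = root
        else:
            q = p.next[ch]
            if q.length == p.length + 1:
                cur.link = q
            else:
                # Split q: the clone takes over the shorter words.
                clone = State(p.length + 1)
                clone.next = dict(q.next)
                clone.link = q.link
                states.append(clone)
                while p is not None and p.next.get(ch) is q:
                    p.next[ch] = clone
                    p = p.link
                q.link = clone
                cur.link = clone
        last = cur
    # States on the suffix-link chain from `last` accept the suffixes.
    p = last
    while p is not None:
        p.terminal = True
        p = p.link
    return states


def is_factor(states, pattern):
    """Return True if `pattern` occurs in the word the DAWG was built from."""
    node = states[0]
    for ch in pattern:
        node = node.next.get(ch)
        if node is None:
            return False
    return True


def surviving_after_compaction(states):
    """States kept under the assumed compaction rule (see the note above)."""
    root = states[0]
    return [s for s in states if s is root or s.terminal or len(s.next) != 1]
```

Calling `build_dawg("gattaca")` and then `is_factor(states, "tta")` walks one transition per letter of the pattern. Comparing `len(states)` with `len(surviving_after_compaction(states))` gives a rough feel for how many states compaction can save on a given input.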

Editor information

Alberto Apostolico, Jotun Hein

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crochemore, M., Vérin, R. (1997). Direct construction of compact directed acyclic word graphs. In: Apostolico, A., Hein, J. (eds) Combinatorial Pattern Matching. CPM 1997. Lecture Notes in Computer Science, vol 1264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63220-4_55

  • DOI: https://doi.org/10.1007/3-540-63220-4_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63220-7

  • Online ISBN: 978-3-540-69214-0

  • eBook Packages: Springer Book Archive
