Constructing suffix arrays for multi-dimensional matrices

  • Dong Kyue Kim
  • Yoo Ah Kim
  • Kunsoo Park
Session III
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1448)

Abstract

We propose multi-dimensional index data structures that generalize suffix arrays to square matrices and cubic matrices. Giancarlo proposed a two-dimensional index data structure, the Lsuffix tree, that generalizes suffix trees to square matrices. However, the construction algorithm for Lsuffix trees maintains complicated data structures and uses a large amount of space. We present simple and practical construction algorithms for multi-dimensional suffix arrays by applying a new partitioning technique to lexicographic sorting. Our contributions are the following:
  1. We present the first algorithm for constructing two-dimensional suffix arrays directly. Our algorithm is ten times faster and five times more space-efficient than Giancarlo's algorithm for Lsuffix trees.
  2. We present an efficient algorithm for three-dimensional suffix arrays, which is the first algorithm for constructing three-dimensional index data structures.
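
To make the notion of a two-dimensional suffix array concrete, the following is a minimal, naive sketch and not the construction algorithm proposed in the paper. It assumes one L-shaped linearization in the spirit of Giancarlo's Lstrings: the suffix at position (i, j) is the largest square submatrix starting there, and its k-th L-shaped character is the part of row k left of the diagonal followed by column k down to the diagonal. The exact character order and the function names (`lstring`, `naive_2d_suffix_array`, `occurrences`) are illustrative assumptions. The sketch simply sorts all suffixes by comparison, so it is far slower than a partitioning-based construction.

```python
def lstring(matrix, i, j):
    """Linearize the largest square submatrix starting at (i, j).

    The k-th L-shaped character is row k (columns 0..k-1) followed by
    column k (rows 0..k), both relative to (i, j). This ordering is an
    assumption made for illustration.
    """
    n = len(matrix)
    size = n - max(i, j)
    chars = []
    for k in range(size):
        row_part = tuple(matrix[i + k][j + c] for c in range(k))
        col_part = tuple(matrix[i + r][j + k] for r in range(k + 1))
        chars.append(row_part + col_part)
    return tuple(chars)


def naive_2d_suffix_array(matrix):
    """Return all positions (i, j) sorted by the Lstring of their suffix.

    Ties between identical Lstrings are broken by position. This is a
    plain comparison sort, not the paper's algorithm.
    """
    n = len(matrix)
    positions = [(i, j) for i in range(n) for j in range(n)]
    return sorted(positions, key=lambda p: (lstring(matrix, p[0], p[1]), p))


def occurrences(matrix, sa, pattern):
    """Find all positions where the square `pattern` occurs in `matrix`.

    Suffixes that share the pattern's Lstring as a prefix are contiguous
    in `sa`, so a binary search locates the start of that run.
    """
    m = len(pattern)
    target = lstring(pattern, 0, 0)

    def key(pos):
        # Prefix of the suffix's Lstring, truncated to the pattern size.
        return lstring(matrix, pos[0], pos[1])[:m]

    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound: first index whose key is >= target
        mid = (lo + hi) // 2
        if key(sa[mid]) < target:
            lo = mid + 1
        else:
            hi = mid
    found = []
    while lo < len(sa) and key(sa[lo]) == target:
        found.append(sa[lo])
        lo += 1
    return found
```

Because the first k L-shaped characters determine the top-left k × k submatrix, all suffixes that begin with a given square pattern sit next to each other in the sorted order. This is the same property that makes binary search work on one-dimensional suffix arrays.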

     

Keywords

Equivalence Class, Reference Class, Suffix Tree, Binary Search Tree, Suffix Array


References

  1. A.V. Aho, J.E. Hopcroft and J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, 1974.
  2. A. Amir and M. Farach, Two-dimensional dictionary matching, Inform. Processing Letters 21 (1992), 233–239.
  3. A. Amir, G. Benson and M. Farach, An alphabet independent approach to two-dimensional pattern matching, SIAM J. Comput. 23 (1994), 313–323.
  4. A. Apostolico, The myriad virtues of subword trees, Combinatorial Algorithms on Words, Springer-Verlag (1985), 85–95.
  5. M. Crochemore, An optimal algorithm for computing the repetitions in a word, Inform. Processing Letters 12 (1981), 244–250.
  6. Z. Galil and K. Park, Alphabet-independent two-dimensional witness computation, SIAM J. Comput. 25 (1996), 907–935.
  7. R. Giancarlo, A generalization of the suffix tree to square matrices with application, SIAM J. Comput. 24 (1995), 520–562.
  8. D. Gusfield, Algorithms on Strings, Trees, and Sequences, Cambridge Univ. Press, 1997.
  9. J.E. Hopcroft, An n log n algorithm for minimizing states in a finite automaton, in Kohavi and Paz, eds., Theory of Machines and Computations, Academic Press, New York, 1971.
  10. C.S. Iliopoulos, D.W.G. Moore and K. Park, Covering a string, Algorithmica 16 (1996), 288–297.
  11. J. Kärkkäinen, A cross between suffix tree and suffix array, Symp. Combinatorial Pattern Matching (1995), 191–204.
  12. G.M. Landau and U. Vishkin, Fast parallel and serial approximate string matching, J. Algorithms 10 (1989), 157–169.
  13. S.E. Lee and K. Park, A new algorithm for constructing suffix arrays, J. Korea Information Science Society 24 (1997), 697–703 (written in Korean).
  14. U. Manber and G. Myers, Suffix arrays: A new method for on-line string searches, SIAM J. Comput. 22 (1993), 935–938.
  15. E.M. McCreight, A space-economical suffix tree construction algorithm, J. ACM 23 (1976), 262–272.
  16. R. Paige and R.E. Tarjan, Three partition refinement algorithms, SIAM J. Comput. 16 (1987), 973–989.
  17. R. Paige, R.E. Tarjan and R. Bonic, A linear time solution to the single function coarsest partition problem, Theoret. Comput. Sci. 40 (1985), 67–84.
  18. B. Schieber and U. Vishkin, On finding lowest common ancestors: simplification and parallelization, SIAM J. Comput. 17 (1988), 1253–1262.
  19. D.D. Sleator and R.E. Tarjan, A data structure for dynamic trees, J. Comput. System Sci. 26 (1983), 362–391.
  20. D.D. Sleator and R.E. Tarjan, Self-adjusting binary search trees, J. ACM 32 (1985), 652–686.
  21. E. Ukkonen and D. Wood, Approximate string matching with suffix automata, Algorithmica 10 (1993), 353–364.
  22. P. Weiner, Linear pattern matching algorithms, Proc. 14th IEEE Symp. Switching and Automata Theory (1973), 1–11.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Dong Kyue Kim (1)
  • Yoo Ah Kim (1)
  • Kunsoo Park (1)
  1. Department of Computer Engineering, Seoul National University, Seoul, Korea
