Counting Suffix Arrays and Strings

  • Klaus-Bernd Schürmann
  • Jens Stoye
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3772)

Abstract

Suffix arrays are used in various application and research areas such as data compression and computational biology. In this work, our goal is to characterize the combinatorial properties of suffix arrays and their enumeration. For fixed alphabet size and string length, we count the number of strings sharing the same suffix array and the number of such suffix arrays. Our methods have applications to succinct suffix arrays and build the foundation for the efficient generation of appropriate test data sets for suffix array based algorithms. We also show that summing the numbers of strings over all suffix arrays yields a particular instance of some summation identities of Eulerian numbers.
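The two quantities counted in the abstract can be illustrated for very small inputs by plain enumeration. The following Python sketch is not the counting method of the paper; it is a brute-force check under stated assumptions: the alphabet is taken to be the integers 0 to sigma - 1, no end-of-string sentinel is appended, and a suffix that is a proper prefix of another suffix is ordered before it. The names `suffix_array` and `count_by_suffix_array` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter
from itertools import product


def suffix_array(s):
    """Return the start positions (0-based) of the suffixes of s in
    lexicographic order. Assumption: no sentinel; a shorter suffix that
    is a prefix of a longer one sorts first."""
    return tuple(sorted(range(len(s)), key=lambda i: s[i:]))


def count_by_suffix_array(n, sigma):
    """Map each suffix array to the number of strings of length n over
    an alphabet of size sigma that share it (brute force, sigma**n strings)."""
    alphabet = range(sigma)  # assumed alphabet: integers 0 .. sigma-1
    return Counter(suffix_array(w) for w in product(alphabet, repeat=n))


counts = count_by_suffix_array(n=3, sigma=2)
print(len(counts))           # number of distinct suffix arrays
print(sum(counts.values()))  # total number of strings, i.e. sigma ** n
```

Because every string has exactly one suffix array, the per-array counts always sum to sigma ** n. The enumeration grows exponentially in n, so it is only useful as a sanity check for small n and sigma.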

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Klaus-Bernd Schürmann (1)
  • Jens Stoye (1)

  1. AG Genominformatik, Technische Fakultät, Universität Bielefeld, Germany
