Position-Restricted Substring Searching over Small Alphabets

  • Sudip Biswas
  • Tsung-Han Ku
  • Rahul Shah
  • Sharma V. Thankachan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8214)

Abstract

We consider the problem of indexing a given text T[0...n − 1] of n characters over an alphabet set Σ of size σ, in order to answer position-restricted substring searching queries. The query input consists of a pattern P (of length p) and two indices ℓ and r, and the output is the set of all occ_{ℓ,r} occurrences of P in T[ℓ...r]. In this paper, we propose an O(n log σ)-word space index with O(p + occ_{ℓ,r} log log n) query time. Our solution is interesting when the alphabet size is small. For example, when the alphabet set is of constant size, we achieve an exponential time improvement over the previously best-known linear space index by Navarro and Nekrich [SWAT 2012], which has O(p + occ_{ℓ,r} log^ε n) query time, where ε > 0 is any positive constant. We also study the property matching problem and provide an improved index for handling semi-dynamic (insertion-only) properties, where we use position-restricted substring queries as the main technique.
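To make the query semantics concrete, the following is a minimal Python sketch of a naive baseline. It is not the index proposed in the paper: it builds a plain suffix array, finds all occurrences of P by binary search, and then filters them by position, so its query cost grows with the total number of occurrences rather than with occ_{ℓ,r}. The function names are hypothetical, and the sketch assumes that an occurrence starting at position i counts as lying in T[ℓ...r] when ℓ ≤ i and i + p − 1 ≤ r.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction by sorting the suffixes directly; adequate for a
    small illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def position_restricted_search(text, sa, pattern, lo, hi):
    """Return sorted start positions i of pattern with lo <= i and i + p - 1 <= hi.

    Assumption: an occurrence must lie entirely inside text[lo..hi] (inclusive).
    """
    p = len(pattern)
    # Length-p prefixes of the sorted suffixes are themselves in
    # non-decreasing order, so binary search applies. They are
    # materialised here for clarity, not for efficiency.
    prefixes = [text[i:i + p] for i in sa]
    left = bisect_left(prefixes, pattern)
    right = bisect_right(prefixes, pattern)
    # sa[left:right] holds every occurrence of the pattern in the text;
    # keep only those that fall inside the requested window.
    return sorted(i for i in sa[left:right] if lo <= i and i + p - 1 <= hi)


text = "abracadabra"
sa = build_suffix_array(text)
print(position_restricted_search(text, sa, "abra", 0, 10))  # [0, 7]
print(position_restricted_search(text, sa, "abra", 1, 10))  # [7]
print(position_restricted_search(text, sa, "a", 3, 7))      # [3, 5, 7]
```

The filtering step is where this baseline wastes work: a pattern may occur many times in T but only rarely inside T[ℓ...r], which is the gap that an index with query time depending on occ_{ℓ,r} is meant to close.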

References

  1. Amir, A., Chencinski, E., Iliopoulos, C.S., Kopelowitz, T., Zhang, H.: Property matching and weighted matching. Theoretical Computer Science 395, 298–310 (2008)
  2. Amir, A., Farach, M., Idury, R.M., La Poutré, J.A., Schäffer, A.A.: Improved Dynamic Dictionary Matching. Information and Computation 119(2), 258–282 (1995)
  3. Bille, P., Gørtz, I.L.: Substring Range Reporting. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 299–308. Springer, Heidelberg (2011)
  4. Chan, T.M., Larsen, K.G., Patrascu, M.: Orthogonal range searching on the RAM, revisited. In: SoCG, pp. 1–10 (2011)
  5. Chien, Y.-F., Hon, W.K., Shah, R., Thankachan, S.V., Vitter, J.S.: Geometric BWT: Compressed Text Indexing via Sparse Suffixes and Range Searching. Algorithmica, 1–21 (2013)
  6. Crochemore, M., Iliopoulos, C.S., Kubica, M., Rahman, M.S., Walen, T.: Improved Algorithms for the Range Next Value Problem and Applications. In: STACS, pp. 205–216 (2008)
  7. Gagie, T., Gawrychowski, P.: Linear-Space Substring Range Counting over Polylogarithmic Alphabets. CoRR, arXiv: 1202.3208 (2012)
  8. Golynski, A., Munro, J.I., Rao, S.S.: Rank/Select Operations on Large Alphabets: A Tool for Text Indexing. In: SODA, pp. 368–373 (2006)
  9. Hon, W.K., Patil, M., Shah, R., Thankachan, S.V.: Compressed Property Suffix Tree. In: IEEE Data Compression Conference, pp. 123–132 (2011)
  10. Hon, W.K., Shah, R., Thankachan, S.V., Vitter, J.S.: On position restricted substring searching in succinct space. Journal of Discrete Algorithms (2012); Hon, W.-K., Ku, T.-H., Shah, R., Thankachan, S.V., Vitter, J.S.: Compressed text indexing with wildcards. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds.) SPIRE 2011. LNCS, vol. 7024, pp. 267–277. Springer, Heidelberg (2011)
  11. Juan, M.T., Liu, J.J., Wang, Y.L.: Errata for “Faster index for property matching”. Information Processing Letters 109(18), 1027–1029 (2009)
  12. Kopelowitz, T.: The Property Suffix Tree with Dynamic Properties. In: Amir, A., Parida, L. (eds.) CPM 2010. LNCS, vol. 6129, pp. 63–75. Springer, Heidelberg (2010)
  13. Kopelowitz, T., Lewenstein, M., Porat, E.: Persistency in Suffix Trees with Applications to String Interval Problems. In: Grossi, R., Sebastiani, F., Silvestri, F. (eds.) SPIRE 2011. LNCS, vol. 7024, pp. 67–80. Springer, Heidelberg (2011)
  14. Mäkinen, V., Navarro, G.: Position-Restricted Substring Searching. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 703–714. Springer, Heidelberg (2006)
  15. Manber, U., Myers, G.: Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing 22(5), 935–948 (1993)
  16. McCreight, E.M.: A Space-Economical Suffix Tree Construction Algorithm. Journal of the ACM 23(2), 262–272 (1976)
  17. Nekrich, Y., Navarro, G.: Sorted Range Reporting. In: Fomin, F.V., Kaski, P. (eds.) SWAT 2012. LNCS, vol. 7357, pp. 271–282. Springer, Heidelberg (2012)
  18. Weiner, P.: Linear Pattern Matching Algorithms. In: SWAT (1973)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sudip Biswas (1)
  • Tsung-Han Ku (2)
  • Rahul Shah (1)
  • Sharma V. Thankachan (1)

  1. Louisiana State University, USA
  2. National Tsing Hua University, Hsinchu, Taiwan
