Sublinear Space Algorithms for the Longest Common Substring Problem

  • Conference paper
Algorithms - ESA 2014 (ESA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8737)

Abstract

Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic \(\mathcal{O}(n)\) space and \(\mathcal{O}(n)\) time solution (Weiner [FOCS’73], Hui [CPM’92]). However, the use of linear space is impractical in many applications. In this paper we show that for any trade-off parameter 1 ≤ τ ≤ n, the LCS problem can be solved in \(\mathcal{O}(\tau)\) space and \(\mathcal{O}(n^2/\tau)\) time, thus providing the first smooth deterministic time-space trade-off from constant to linear space. The result uses a new and very simple algorithm, which computes a τ-additive approximation to the LCS in \(\mathcal{O}(n^2/\tau)\) time and \(\mathcal{O}(1)\) space. We also show a time-space trade-off lower bound for deterministic branching programs, which implies that any deterministic RAM algorithm solving the LCS problem on documents from a sufficiently large alphabet in \(\mathcal{O}(\tau)\) space must use \(\Omega\bigl(n\sqrt{\log(n/(\tau\log n))/\log\log(n/(\tau\log n))}\bigr)\) time.
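
To make the problem statement concrete, the following is a minimal brute-force sketch in Python of what "a longest string common to at least d of the documents" means. It is not the algorithm of the paper and makes no attempt to meet the stated time or space bounds; the function names `occurs_in_at_least` and `longest_common_substring` are hypothetical and chosen only for illustration.

```python
from typing import List


def occurs_in_at_least(candidate: str, docs: List[str], d: int) -> bool:
    """Return True if `candidate` is a substring of at least `d` documents."""
    return sum(1 for doc in docs if candidate in doc) >= d


def longest_common_substring(docs: List[str], d: int = 2) -> str:
    """Brute-force reference (illustrative only, not the paper's method).

    Tries candidate lengths from longest to shortest, so the first
    candidate found in at least `d` documents is a longest one.
    Returns the empty string if no non-empty common substring exists.
    """
    longest = max((len(doc) for doc in docs), default=0)
    for length in range(longest, 0, -1):
        for doc in docs:
            for start in range(len(doc) - length + 1):
                candidate = doc[start:start + length]
                if occurs_in_at_least(candidate, docs, d):
                    return candidate
    return ""
```

For example, with the documents `"abcde"`, `"xbcdy"` and `"zzcdz"`, the call with d = 2 returns `"bcd"`, while d = 3 returns `"cd"`, since only `"cd"` (and its substrings) occurs in all three documents.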

References

  1. Afek, Y., Bremler-Barr, A., Landau Feibish, S.: Automated signature extraction for high volume attacks. In: Proc. 9th ANCS, pp. 147–156 (2013)

  2. Beame, P., Clifford, R., Machmouchi, W.: Element Distinctness, Frequency Moments, and Sliding Windows. In: Proc. 54th FOCS, pp. 290–299 (2013)

  3. Beame, P., Saks, M., Sun, X., Vee, E.: Time-Space Trade-Off Lower Bounds for Randomized Computation of Decision Problems. Journal of the ACM 50(2), 154–195 (2003)

  4. Borodin, A., Cook, S.A.: A Time-Space Tradeoff for Sorting on a General Sequential Model of Computation. SIAM Journal on Computing 11(2), 287–297 (1982)

  5. Breslauer, D., Grossi, R., Mignosi, F.: Simple Real-Time Constant-Space String Matching. Theor. Comput. Sci. 483, 2–9 (2013)

  6. Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press (2007)

  7. Farach-Colton, M.: Optimal Suffix Tree Construction with Large Alphabets. In: Proc. 38th FOCS, pp. 137–143 (1997)

  8. Grossi, R., Vitter, J.S.: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching. SIAM Journal on Computing 35(2), 378–407 (2005)

  9. Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press (1997)

  10. Han, Y.: Deterministic sorting in O(n log log n) time and linear space. Journal of Algorithms 50(1), 96–105 (2004)

  11. Hui, L.C.K.: Color Set Size Problem with Applications to String Matching. In: Apostolico, A., Galil, Z., Manber, U., Crochemore, M. (eds.) CPM 1992. LNCS, vol. 644, pp. 230–243. Springer, Heidelberg (1992)

  12. Kreibich, C., Crowcroft, J.: Honeycomb: Creating Intrusion Detection Signatures Using Honeypots. ACM SIGCOMM Comput. Commun. Rev. 34(1), 51–56 (2004)

  13. Navarro, G., Mäkinen, V.: Compressed Full-Text Indexes. ACM Computing Surveys (CSUR) 39(1), 2 (2007)

  14. Ružić, M.: Constructing Efficient Dictionaries in Close to Sorting Time. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125, pp. 84–95. Springer, Heidelberg (2008)

  15. Starikovskaya, T., Vildhøj, H.W.: Time-Space Trade-Offs for the Longest Common Substring Problem. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 223–234. Springer, Heidelberg (2013)

  16. Wang, K., Cretu, G.F., Stolfo, S.J.: Anomalous Payload-Based Worm Detection and Signature Generation. In: Valdes, A., Zamboni, D. (eds.) RAID 2005. LNCS, vol. 3858, pp. 227–246. Springer, Heidelberg (2006)

  17. Weiner, P.: Linear Pattern Matching Algorithms. In: Proc. 14th FOCS (SWAT), pp. 1–11 (1973)


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kociumaka, T., Starikovskaya, T., Vildhøj, H.W. (2014). Sublinear Space Algorithms for the Longest Common Substring Problem. In: Schulz, A.S., Wagner, D. (eds) Algorithms - ESA 2014. ESA 2014. Lecture Notes in Computer Science, vol 8737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44777-2_50

  • DOI: https://doi.org/10.1007/978-3-662-44777-2_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44776-5

  • Online ISBN: 978-3-662-44777-2

  • eBook Packages: Computer Science, Computer Science (R0)
