Time-Space Trade-Offs for Longest Common Extensions

  • Philip Bille
  • Inge Li Gørtz
  • Benjamin Sach
  • Hjalte Wedel Vildhøj
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7354)

Abstract

We revisit the longest common extension (LCE) problem, that is, preprocess a string T into a compact data structure that supports fast LCE queries. An LCE query takes a pair (i,j) of indices in T and returns the length of the longest common prefix of the suffixes of T starting at positions i and j. We study the time-space trade-offs for the problem, that is, the space used for the data structure vs. the worst-case time for answering an LCE query. Let n be the length of T. Given a parameter τ, 1 ≤ τ ≤ n, we show how to achieve either \(O(n/\sqrt{\tau})\) space and O(τ) query time, or O(n/τ) space and \(O(\tau \log(|\mathrm{LCE}(i,j)|/\tau))\) query time, where \(|\mathrm{LCE}(i,j)|\) denotes the length of the LCE returned by the query. These bounds provide the first smooth trade-offs for the LCE problem and almost match the previously known bounds at the extremes when τ = 1 or τ = n. We apply the result to obtain improved bounds for several applications where the LCE problem is the computational bottleneck, including approximate string matching and computing palindromes. Finally, we also present an efficient technique to reduce LCE queries on two strings to one string.
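To make the query definition concrete, the following is a minimal sketch of the naive baseline only, not the data structures of the paper. It answers an LCE query by direct character comparison, using no preprocessing and time proportional to the length of the answer. The function name `lce` and the 0-indexed positions are assumptions made for illustration.

```python
def lce(t: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of t[i:] and t[j:].

    Naive baseline: no preprocessing, O(1) extra space, and query time
    proportional to the returned length. Positions are 0-indexed here,
    which is an illustrative choice rather than the paper's convention.
    """
    n = len(t)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie in [0, len(t)]")
    length = 0
    # Extend the match one character at a time until a mismatch
    # or until one of the two suffixes runs out.
    while i + length < n and j + length < n and t[i + length] == t[j + length]:
        length += 1
    return length


if __name__ == "__main__":
    # Suffixes "anana" and "ana" of "banana" share the prefix "ana".
    print(lce("banana", 1, 3))  # prints 3
```

A structure with the trade-offs stated in the abstract would replace the character-by-character loop with precomputed information, spending more space to reduce the number of comparisons per query.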

Keywords

Query Time, String Match, Approximate String Match, Common Extension, Outermost Loop
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Philip Bille (1)
  • Inge Li Gørtz (1)
  • Benjamin Sach (2)
  • Hjalte Wedel Vildhøj (1)

  1. Technical University of Denmark, DTU Informatics, Denmark
  2. Department of Computer Science, University of Warwick, UK
