Computing and Combinatorics

Volume 3595 of the series Lecture Notes in Computer Science pp 263-272

Finding Longest Increasing and Common Subsequences in Streaming Data

  • David Liben-Nowell, Department of Mathematics and Computer Science, Carleton College
  • Erik Vee, IBM Almaden Research Center
  • An Zhu, Google, Inc.

Abstract

We present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data-streaming model. To decide if the LIS of a given stream of elements drawn from an alphabet Σ has length at least k, we discuss a one-pass algorithm using O(k log |Σ|) space, with update time either O(log k) or O(log log |Σ|); for |Σ| = O(1), we can achieve O(log k) space and constant-time updates. We also prove a lower bound of Ω(k) on the space requirement for this problem for general alphabets Σ, even when the input stream is a permutation of Σ. For finding the actual LIS, we give a ⌈log(1 + 1/ε)⌉-pass algorithm using O(k^(1+ε) log |Σ|) space, for any ε > 0. For LCS, there is a trivial Θ(1)-approximate O(log n)-space streaming algorithm when |Σ| = O(1). For general alphabet Σ, the problem is much harder. We prove several lower bounds on the LCS problem, of which the strongest is the following: it is necessary to use Ω(n/ρ^2) space to approximate the LCS of two n-element streams to within a factor of ρ, even if the streams are permutations of each other.
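
To make the one-pass LIS decision problem concrete, the following is a minimal sketch of one standard approach that fits the stated bounds: keep, for each length i, the smallest element that can end an increasing subsequence of that length, and binary-search this array on every update. The abstract does not spell out the authors' construction, so this should be read as an illustration under assumptions rather than as their algorithm. The function name lis_at_least, the use of Python integers for alphabet symbols, and the choice of strictly increasing order are all assumptions made here.

```python
from bisect import bisect_left
from typing import Iterable, List


def lis_at_least(stream: Iterable[int], k: int) -> bool:
    """Decide in one pass whether the stream contains a strictly
    increasing subsequence of length at least k (hypothetical helper)."""
    if k <= 0:
        return True
    # tails[i] is the smallest value seen so far that can end a strictly
    # increasing subsequence of length i + 1. The list stays sorted and
    # never holds more than k symbols, i.e. O(k log|Σ|) bits.
    tails: List[int] = []
    for x in stream:
        # Binary search over at most k entries: O(log k) per update.
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)  # x extends the longest subsequence found so far
            if len(tails) >= k:
                return True  # stop early; nothing more needs to be stored
        else:
            tails[i] = x  # x is a smaller end value for length i + 1
    return False
```

For example, lis_at_least([3, 1, 4, 1, 5, 9, 2, 6], 4) holds because 3, 4, 5, 9 is increasing, while the same stream has no increasing subsequence of length 5. The O(log log |Σ|) update time mentioned in the abstract would need a different search structure than a sorted array; this sketch only covers the O(log k) variant.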