Abstract
We consider the problem of identifying periodic trends in data streams. We say a signal \(\mathbf{a} \in \mathbb{R}^n\) is \(p\)-periodic if \(a_i = a_{i+p}\) for all \(i \in [n-p]\). Recently, Ergün et al. [4] presented a one-pass, \(O(\mathrm{polylog}\, n)\)-space algorithm for identifying the smallest period of a signal. Their algorithm required \(\mathbf{a}\) to be presented in the time-series model, i.e., \(a_i\) is the \(i\)th element in the stream. We present a more general linear sketch algorithm that has the advantages of being applicable to a) the turnstile stream model, where coordinates can be incremented/decremented in an arbitrary fashion, and b) the parallel or distributed setting, where the signal is distributed over multiple locations/machines. We also present sketches for \((1+\epsilon)\)-approximating the \(\ell_2\) distance between \(\mathbf{a}\) and the nearest \(p\)-periodic signal for a given \(p\). Our algorithm uses \(O(\epsilon^{-2}\, \mathrm{polylog}\, n)\) space, comparing favorably to an earlier time-series result that used \(O(\epsilon^{-5.5} \sqrt{p}\, \mathrm{polylog}\, n)\) space for estimating the Hamming distance to the nearest \(p\)-periodic signal. Our last periodicity result is an algorithm for estimating the periodicity of a sequence in the presence of noise. We conclude with a small-space algorithm for identifying when two signals are exactly (or nearly) cyclic shifts of one another. Our algorithms are based on bilinear sketches [10] and on combining Fourier transforms with stream processing techniques such as \(\ell_p\) sampling and sketching [13, 11].
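To make the definitions concrete, the following is a minimal offline sketch in Python. It is not the paper's linear sketch algorithm: it stores the whole signal and computes the quantities exactly, which is what the small-space sketches are designed to approximate. The function names are hypothetical, indices are 0-based, and the distance computation relies on the standard least-squares fact that the nearest \(p\)-periodic signal in \(\ell_2\) replaces each residue class modulo \(p\) by its mean.

```python
from math import sqrt


def is_periodic(a, p):
    """True if a[i] == a[i + p] for every valid 0-based index i."""
    return all(a[i] == a[i + p] for i in range(len(a) - p))


def smallest_period(a):
    """Smallest p in 1..n for which a is p-periodic.

    Under the definition used here, p = n always qualifies trivially.
    """
    n = len(a)
    return next(p for p in range(1, n + 1) if is_periodic(a, p))


def l2_distance_to_periodic(a, p):
    """Exact l2 distance from a to the nearest p-periodic signal.

    Entries whose indices agree modulo p must share one value in a
    p-periodic signal; the least-squares choice is the class mean.
    """
    total = 0.0
    for r in range(p):
        residue_class = a[r::p]
        if not residue_class:  # only possible when p > len(a)
            continue
        mean = sum(residue_class) / len(residue_class)
        total += sum((x - mean) ** 2 for x in residue_class)
    return sqrt(total)


if __name__ == "__main__":
    signal = [1, 2, 3, 1, 2, 3, 1, 2]
    print(smallest_period(signal))
    print(l2_distance_to_periodic([0, 0, 2, 0], 2))
```

This reference computation uses space linear in \(n\), so it is mainly useful as a ground truth when checking a small-space estimator on test inputs.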
References
1. Alon, N., Matias, Y., Szegedy, M.: The space complexity of approximating the frequency moments. J. Comput. Syst. Sci. 58(1), 137–147 (1999)
2. Cormode, G., Muthukrishnan, S.: An improved data stream summary: The count-min sketch and its applications. J. Algorithms 55, 58–75 (2005)
3. Czumaj, A., Gąsieniec, L.: On the complexity of determining the period of a string. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 412–422. Springer, Heidelberg (2000)
4. Ergün, F., Jowhari, H., Saglam, M.: Periodicity in streams. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX 2010. LNCS, vol. 6302, pp. 545–559. Springer, Heidelberg (2010)
5. Ergün, F., Muthukrishnan, S., Sahinalp, S.C.: Periodicity testing with sublinear samples and space. ACM Transactions on Algorithms 6(2) (2010)
6. Gilbert, A.C., Guha, S., Indyk, P., Muthukrishnan, S., Strauss, M.: Near-optimal sparse Fourier representations via sampling. In: STOC, pp. 152–161 (2002)
7. Hardy, G.H., Wright, E.M.: An Introduction to the Theory of Numbers (Fourth Edition). Oxford University Press, Oxford (1960)
8. Indyk, P.: Stable distributions, pseudorandom generators, embeddings, and data stream computation. J. ACM 53(3), 307–323 (2006)
9. Indyk, P., Koudas, N., Muthukrishnan, S.: Identifying representative trends in massive time series data sets using sketches. In: VLDB, pp. 363–372 (2000)
10. Indyk, P., McGregor, A.: Declaring independence via the sketching of sketches. In: SODA, pp. 737–745 (2008)
11. Kane, D.M., Nelson, J., Woodruff, D.P.: On the exact space complexity of sketching and streaming small norms. In: SODA, pp. 1161–1178 (2010)
12. Kane, D.M., Nelson, J., Woodruff, D.P.: An optimal algorithm for the distinct elements problem. In: PODS, pp. 41–52 (2010)
13. Monemizadeh, M., Woodruff, D.P.: 1-pass relative-error \(\text{L}_p\)-sampling with applications. In: SODA (2010)
14. Muthukrishnan, S.: Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science 1(2) (2005)
15. Nisan, N.: Pseudorandom generators for space-bounded computation. Combinatorica 12, 449–461 (1992)
16. Porat, B., Porat, E.: Exact and approximate pattern matching in the streaming model. In: FOCS, pp. 315–323 (2009)
17. Shor, P.W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26(5), 1484–1509 (1997)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Crouch, M.S., McGregor, A. (2011). Periodicity and Cyclic Shifts via Linear Sketches. In: Goldberg, L.A., Jansen, K., Ravi, R., Rolim, J.D.P. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX/RANDOM 2011. Lecture Notes in Computer Science, vol 6845. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22935-0_14
DOI: https://doi.org/10.1007/978-3-642-22935-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22934-3
Online ISBN: 978-3-642-22935-0
eBook Packages: Computer Science, Computer Science (R0)