Abstract
We show that, given a string s of length n, with constant memory and logarithmic passes over a constant number of streams we can build a context-free grammar that generates s and only s and whose size is within an \({\mathcal O}\left({\min \left( g \log g, \sqrt{n / \log n} \right)}\right)\)-factor of the minimum g. This stands in contrast to our previous result that, with polylogarithmic memory and polylogarithmic passes over a single stream, we cannot build such a grammar whose size is within any polynomial of g.
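To make the object in the abstract concrete, the following is a minimal sketch of a context-free grammar that generates one string and only that string (often called a straight-line grammar). It is not the streaming algorithm of the paper. The names `expand`, `grammar_size`, and `EXAMPLE` are hypothetical, and the sketch assumes the common convention that grammar size is the total number of symbols on the right-hand sides of all rules.

```python
from typing import Dict, List

# Assumption: a grammar is a mapping from each nonterminal to its single
# right-hand side. Any symbol that is not a key is treated as a terminal.
Grammar = Dict[str, List[str]]


def expand(grammar: Grammar, symbol: str) -> str:
    """Return the unique string derived from `symbol`."""
    if symbol not in grammar:
        return symbol  # terminal symbol: it derives itself
    # Each nonterminal has exactly one rule, so the derivation is deterministic.
    return "".join(expand(grammar, child) for child in grammar[symbol])


def grammar_size(grammar: Grammar) -> int:
    """Total number of symbols on all right-hand sides (assumed size measure)."""
    return sum(len(rhs) for rhs in grammar.values())


# Hypothetical toy grammar for the string "abababab" (n = 8).
EXAMPLE: Grammar = {
    "S": ["B", "B"],
    "B": ["A", "A"],
    "A": ["a", "b"],
}

if __name__ == "__main__":
    print(expand(EXAMPLE, "S"))   # the single string the grammar generates
    print(grammar_size(EXAMPLE))  # size under the assumed measure
```

In this toy case the grammar has size 6 while the string has length 8. The approximation factor in the abstract compares the size of the grammar that is built against the minimum size g over all such grammars for s.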
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gagie, T., Gawrychowski, P. (2010). Grammar-Based Compression in a Streaming Model. In: Dediu, AH., Fernau, H., Martín-Vide, C. (eds) Language and Automata Theory and Applications. LATA 2010. Lecture Notes in Computer Science, vol 6031. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13089-2_23
DOI: https://doi.org/10.1007/978-3-642-13089-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13088-5
Online ISBN: 978-3-642-13089-2
eBook Packages: Computer Science, Computer Science (R0)