WEBIST 2008: Web Information Systems and Technologies, pp. 65-79
Compressing XML Data Streams with DAG+BSBC
Abstract
Whenever the growing amount of XML data that has to be stored, processed, exchanged, or transmitted becomes a major cost driver or performance bottleneck, XML compression is an important way to reduce these problems. However, many applications, e.g. those exchanging XML data streams, also require efficient path query processing on the structure of compressed XML data streams. We present an XML compression technique called DAG+BSBC, which extends Bit-Stream-Based-Compression (BSBC) [3] by a sparse index to compressed constants that reflects DAG pointers. Furthermore, DAG+BSBC supports XML stream compression, queries on compressed data, and provides a compression ratio that not only significantly outperforms that of other queriable XML compression techniques, like XGrind, but is also very competitive compared to non-queriable compression techniques like gzip and XMill.
Keywords
Compression Ratio, Query Processing, Query Evaluation, Path Query, Inverted List
References
- 1. Arion, A., Bonifati, A., Manolescu, I., Pugliese, A.: XQueC: A Query-Conscious Compressed XML Database. ACM Transactions on Internet Technology (to appear)
- 2. Bayardo, R.J., Gruhl, D., Josifovski, V., Myllymaki, J.: An evaluation of binary XML encoding optimizations for fast stream based XML processing. In: Proc. of the 13th international conference on World Wide Web (2004)
- 3. Böttcher, S., Hartel, R., Heinzemann, C.: Towards a succinct data format for XML streams. In: International Conference on Web Information Systems (WEBIST) (2008)
- 4. Böttcher, S., Steinmetz, R., Klein, N.: XML Index Compression by DTD Subtraction. In: International Conference on Enterprise Information Systems (ICEIS) (2007)
- 5. Böttcher, S., Steinmetz, R.: Data Management for Mobile Ajax Web 2.0 Applications. In: Wagner, R., Revell, N., Pernul, G. (eds.) DEXA 2007. LNCS, vol. 4653, pp. 424–433. Springer, Heidelberg (2007)
- 6. Buneman, P., Grohe, M., Koch, C.: Path Queries on Compressed XML. In: VLDB (2003)
- 7. Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)
- 8. Busatto, G., Lohrey, M., Maneth, S.: Efficient Memory Representation of XML Documents. In: Bierman, G., Koch, C. (eds.) DBPL 2005. LNCS, vol. 3774, pp. 199–216. Springer, Heidelberg (2005)
- 9. Candan, K.S., Hsiung, W.-P., Chen, S., Tatemura, J., Agrawal, D.: AFilter: Adaptable XML Filtering with Prefix-Caching and Suffix-Clustering. In: VLDB (2006)
- 10. Cheney, J.: Compressing XML with multiplexed hierarchical models. In: Proceedings of the 2001 IEEE Data Compression Conference (DCC 2001) (2001)
- 11. Cheng, J., Ng, W.: XQzip, Querying Compressed XML Using Structural Indexing. In: EDBT (2004)
- 12. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and Searching XML Data Via Two Zips. In: Proceedings of the Fifteenth International World Wide Web Conference (2006)
- 13. Franceschet, M.: XPathMark: an XPath benchmark for XMark generated data. In: Bressan, S., Ceri, S., Hunt, E., Ives, Z.G., Bellahsène, Z., Rys, M., Unland, R. (eds.) XSym 2005. LNCS, vol. 3671, pp. 129–143. Springer, Heidelberg (2005)
- 14. Girardot, M., Sundaresan, N.: Millau: An Encoding Format for Efficient Representation and Exchange of XML over the Web. In: Proceedings of the 9th International WWW Conference (2000)
- 15. Huffman, D.A.: A method for the construction of minimum-redundancy codes. In: Proc. of the I.R.E. (1952)
- 16. Liefke, H., Suciu, D.: XMill: An Efficient Compressor for XML Data. In: Proc. of ACM SIGMOD (2000)
- 17. Min, J.K., Park, M.J., Chung, C.W.: XPRESS: A Queriable Compression for XML Data. In: Proceedings of SIGMOD (2003)
- 18. Ng, W., Lam, W.Y., Wood, P.T., Levene, M.: XCQ: A queriable XML compression system. Knowledge and Information Systems (2006)
- 19. Olteanu, D., Meuss, H., Furche, T., Bry, F.: XPath: Looking Forward. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds.) EDBT 2002. LNCS, vol. 2490, pp. 109–127. Springer, Heidelberg (2002)
- 20. Schmidt, A., Waas, F., Kersten, M., Carey, M., Manolescu, I., Busse, R.: XMark: A benchmark for XML data management, Hong Kong, China (2002)
- 21. Tolani, P.M., Haritsa, J.R.: XGRIND: A query-friendly XML compressor. In: Proc. ICDE (2002)
- 22. Yao, B.B., Özsu, M.T.: XBench - A family of benchmarks for XML DBMS (2002)
- 23. Zhang, N., Kacholia, V., Özsu, M.T.: A Succinct Physical Storage Scheme for Efficient Evaluation of Path Queries in XML. In: ICDE (2004)