Abstract
Big data brings not only constantly growing data volumes, dynamic and elastic storage demands, and diversified data structures, but also different data features. Alongside traditional dense data, more and more “sparse” data has emerged and now accounts for the majority of massive data. Adapting to the characteristics of sparse data without losing sight of the traits of dense data is a challenge. This paper studies how to integrate row and column data layouts for both dense and sparse datasets in the cloud. A new NF2 (non-first-normal-form) scalable storage structure named “Dynamic Table”, based on key-value storage, is proposed. The formal definition of Dynamic Table and its implementation on HDFS are also introduced.
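To make the row-versus-column distinction concrete, the following is a minimal sketch of how the same sparse, nested records could be laid out by row or by column on top of a key-value map. It is an illustration only: the abstract does not give the formal definition of Dynamic Table, so the function names, key formats, and sample records here are all assumptions and should not be read as the authors' design.

```python
from collections import defaultdict

# Hypothetical illustration: names, keys, and data are assumptions,
# not the Dynamic Table definition from the paper.

def to_row_layout(records):
    """One key per row; the value holds only the attributes that are present."""
    return {row_id: dict(attrs) for row_id, attrs in records.items()}

def to_column_layout(records):
    """One key per attribute; the value maps row ids to that attribute's values."""
    columns = defaultdict(dict)
    for row_id, attrs in records.items():
        for name, value in attrs.items():
            columns[name][row_id] = value
    return dict(columns)

# Sparse sample data: no row carries every attribute.
records = {
    "r1": {"title": "a.jpg", "tags": ["sky", "sea"]},  # nested (non-1NF) value
    "r2": {"title": "b.jpg", "width": 640},
    "r3": {"width": 800},
}

rows = to_row_layout(records)
cols = to_column_layout(records)
```

In both layouts absent attributes take no space, which is the property that matters for sparse data; the row layout favours reading a whole record, while the column layout favours scanning one attribute across records.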
This work is supported by the Natural Science Foundation of China (NSFC) under grant numbers 60973002 and 61170003, and by the National Science and Technology Major Program (No. 2010ZX01042-001-003-05, 2010ZX01042-002-002-02).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Su, H., Li, H., Cheng, X., Liu, Z. (2013). Dynamic Table: A Scalable Storage Structure in the Cloud. In: Gao, Y., et al. Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7901. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39527-7_32
DOI: https://doi.org/10.1007/978-3-642-39527-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39526-0
Online ISBN: 978-3-642-39527-7
eBook Packages: Computer Science, Computer Science (R0)