Performance Evaluation of Range Queries in Key Value Stores
Abstract
Recently there has been a considerable increase in the number of Key-Value stores that support data storage and applications in cloud environments. While all of these solutions try to offer highly available and scalable services on the cloud, they differ significantly from each other in their architecture and in the types of applications they try to support. In this paper we consider three widely used systems, Cassandra, HBase and Voldemort, and compare them in terms of their support for different types of query workloads. We focus mainly on range queries. Unlike HBase and Cassandra, which have built-in support for range queries, Voldemort does not support this type of query through its available API. We therefore present practical techniques on top of Voldemort to support range queries. Our performance evaluation is based on mixed query workloads, in the sense that they contain a combination of short and long range queries, besides other typical queries on key-value stores such as lookup and update. We show that there are trade-offs between the performance of the selected system and scheme and the types of query workloads that can be processed efficiently.
Keywords
Key-value store, Range query, Range index, Performance study