An Adaptive Load-Balanced Partitioning Module in Cassandra Using Rendezvous Hashing

  • Conference paper
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016 (AISI 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 533)

Abstract

With the rapid growth of social networks, advances in Internet technology, and the advent of cloud computing, new tools and algorithms are needed to handle the challenges of big data. One of the key advances in addressing these challenges is the introduction of scalable storage systems. NoSQL databases are regarded as efficient big-data storage management systems that provide horizontal scalability. To ensure scalability, data partitioning strategies must be implemented in these databases. In this paper, an Adaptive Rendezvous Hashing Partitioning Module (ARHPM) is proposed for Cassandra NoSQL databases. The main goal of this module is to partition the data in Cassandra using rendezvous hashing, together with a proposed Load Balancing based Rendezvous Hashing (LBRH) algorithm that guarantees load balancing in the partitioning process. To evaluate the proposed module, Cassandra is modified by embedding the ARHPM partitioning module in it, and a number of experiments are conducted with the Yahoo Cloud Serving Benchmark to validate the load balancing of the proposed module.
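
For readers unfamiliar with the underlying technique, the following is a minimal sketch of plain rendezvous (highest-random-weight) hashing. It is not the paper's LBRH algorithm or the ARHPM module, whose details are not given in the abstract. The node names, the key format, the choice of SHA-256, and the helper names `score` and `pick_node` are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def score(node: str, key: str) -> int:
    """Pseudo-random weight for a (node, key) pair (hypothetical helper)."""
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned 64-bit weight.
    return int.from_bytes(digest[:8], "big")


def pick_node(nodes, key):
    """Rendezvous hashing: the key goes to the node with the highest weight."""
    return max(nodes, key=lambda node: score(node, key))


# Illustrative cluster and keys (assumed names, not taken from the paper).
nodes = ["node-a", "node-b", "node-c", "node-d"]
keys = [f"user{i}" for i in range(10000)]

# Place every key and count how many keys each node owns.
placement = {k: pick_node(nodes, k) for k in keys}
load = Counter(placement.values())

# Remove one node and recompute ownership among the survivors.
survivors = [n for n in nodes if n != "node-c"]
moved = [k for k in keys if pick_node(survivors, k) != placement[k]]

for node in nodes:
    print(node, load[node])
print("keys moved after removing node-c:", len(moved))
```

In this sketch, only the keys that were owned by the removed node are reassigned; every other key keeps its node, because its highest-weight node is unchanged. How evenly the keys spread depends on the hash function and the key set, which is the aspect a load-balancing variant such as the one proposed in the paper would aim to control.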

Author information

Corresponding author

Correspondence to Sally M. Elghamrawy.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Elghamrawy, S.M. (2017). An Adaptive Load-Balanced Partitioning Module in Cassandra Using Rendezvous Hashing. In: Hassanien, A., Shaalan, K., Gaber, T., Azar, A., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016. AISI 2016. Advances in Intelligent Systems and Computing, vol 533. Springer, Cham. https://doi.org/10.1007/978-3-319-48308-5_56

  • DOI: https://doi.org/10.1007/978-3-319-48308-5_56

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48307-8

  • Online ISBN: 978-3-319-48308-5

  • eBook Packages: Engineering, Engineering (R0)
