A YCSB Workload for Benchmarking Hotspot Object Behaviour in NoSQL Databases

  • Conference paper
Performance Evaluation and Benchmarking (TPCTC 2021)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 13169)

Abstract

Many contemporary applications have to deal with unexpected spikes in demand for specific data objects, so-called hotspot objects. In social networks, for example, specific media items can go viral quickly and unexpectedly, so provisioning properly for such behaviour is not trivial.

NoSQL databases are specifically designed for enhanced scalability, high availability, and elasticity to deal with increasing data volumes. Although existing performance benchmarking systems such as the Yahoo! Cloud Serving Benchmark (YCSB) provide support to test the performance properties of different databases under identical workloads, they lack support for testing how well these databases can cope with the above-mentioned unexpected hotspot object behaviour.

To address this shortcoming, we present the design and implementation of a new YCSB workload that is rooted in a formal characterization of hotspot-based spikes. The proposed workload implements the Pitman-Yor distribution and is configurable through a number of parameters, such as spike probability and data locality. As such, it allows for more extensive experimental validation of database systems.

Our functional validation illustrates how the workload can be used to effectively stress-test different types of databases. We also present comparative results from benchmarking two popular NoSQL databases, Cassandra and MongoDB, in terms of their response to spiked workloads.
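
To make the idea of a Pitman-Yor-distributed access pattern more concrete, the following is a minimal sketch of one common way to sample from a Pitman-Yor process, the two-parameter Chinese restaurant construction. It is an illustration under stated assumptions, not the implementation from the paper: the class name PitmanYorKeySampler, its methods, and the parameter values are hypothetical, and the spike probability and data locality parameters mentioned in the abstract are not modelled here. With discount d and concentration theta, after n accesses spread over K distinct keys, an existing key i with n_i accesses is chosen with probability (n_i - d) / (n + theta) and a new key with probability (theta + K d) / (n + theta), so a few keys tend to accumulate most of the accesses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: NOT the SpikesWorkload class from the paper.
class PitmanYorKeySampler {
    private final double discount;       // d, with 0 <= d < 1
    private final double concentration;  // theta, with theta > -d
    private final Random random;
    private final List<Long> counts = new ArrayList<>(); // accesses per key index
    private long total = 0;              // accesses drawn so far

    PitmanYorKeySampler(double discount, double concentration, long seed) {
        if (discount < 0.0 || discount >= 1.0 || concentration <= -discount) {
            throw new IllegalArgumentException("need 0 <= d < 1 and theta > -d");
        }
        this.discount = discount;
        this.concentration = concentration;
        this.random = new Random(seed);
    }

    // Draws the index of the next key to access and updates the counts.
    int nextKey() {
        int k = counts.size();
        // Unnormalised weight for opening a new key: theta + K * d.
        double newKeyMass = concentration + k * discount;
        // All weights sum to n + theta.
        double u = random.nextDouble() * (total + concentration);
        if (k == 0 || u < newKeyMass) {
            counts.add(1L);
            total++;
            return k;
        }
        u -= newKeyMass;
        int chosen = k - 1; // fallback guards against floating-point round-off
        for (int i = 0; i < k; i++) {
            u -= counts.get(i) - discount; // weight of existing key i: n_i - d
            if (u < 0) {
                chosen = i;
                break;
            }
        }
        counts.set(chosen, counts.get(chosen) + 1);
        total++;
        return chosen;
    }

    int distinctKeys() { return counts.size(); }

    long totalAccesses() { return total; }

    long accessCount(int key) { return counts.get(key); }

    public static void main(String[] args) {
        // Parameter values are arbitrary and chosen only for the demonstration.
        PitmanYorKeySampler sampler = new PitmanYorKeySampler(0.5, 1.0, 42L);
        for (int i = 0; i < 10_000; i++) {
            sampler.nextKey();
        }
        System.out.println("distinct keys: " + sampler.distinctKeys());
        System.out.println("accesses to key 0: " + sampler.accessCount(0));
    }
}
```

In a benchmark setting, each returned index would be mapped to a record key. This sketch scans all keys on every draw, which is acceptable for illustration but would likely need a faster sampling structure for large key spaces.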

Notes

  1. In this paper, we mainly focus on YCSB. However, a more extensive discussion of other benchmark systems is covered in Sect. 5.

  2. In the YCSB config, the workload is used when the parameter workload is set to site.ycsb.workloads.SpikesWorkload.

  3. The experiments for a multi-node setup will be considered in future work.

Acknowledgements

This research is partially funded by the Research Fund KU Leuven and the Cybersecurity Initiative Flanders (CIF) project.

Author information

Corresponding author

Correspondence to Ansar Rafique.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Claesen, C., Rafique, A., Van Landuyt, D., Joosen, W. (2022). A YCSB Workload for Benchmarking Hotspot Object Behaviour in NoSQL Databases. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2021. Lecture Notes in Computer Science, vol 13169. Springer, Cham. https://doi.org/10.1007/978-3-030-94437-7_1

  • DOI: https://doi.org/10.1007/978-3-030-94437-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94436-0

  • Online ISBN: 978-3-030-94437-7

  • eBook Packages: Computer Science (R0)
