Abstract
Many contemporary applications must cope with sudden, unforeseen spikes in demand for specific data objects, so-called hotspot objects. In social networks, for example, individual media items can go viral quickly and unexpectedly, and provisioning for such behaviour in advance is therefore non-trivial.
NoSQL databases are specifically designed for scalability, high availability, and elasticity in the face of growing data volumes. Although existing benchmarking systems such as the Yahoo! Cloud Serving Benchmark (YCSB) support testing the performance properties of different databases under identical workloads, they lack support for testing how well these databases cope with such unexpected hotspot object behaviour.
To address this shortcoming, we present the design and implementation of a new YCSB workload rooted in a formal characterization of hotspot-based spikes. The proposed workload implements the Pitman-Yor distribution and is configurable through a number of parameters, such as spike probability and data locality. As such, it allows for more extensive experimental validation of database systems.
Our functional validation illustrates how the workload can be used to effectively stress-test different types of databases, and we present comparative results of benchmarking two popular NoSQL databases, Cassandra and MongoDB, in terms of their response to spiked workloads.
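To make the role of the Pitman-Yor distribution concrete, the following is a minimal sketch of a two-parameter Pitman-Yor key chooser, implemented as a Chinese restaurant process. It is illustrative only: the class name, fields, and method are our own and not taken from the paper's SpikesWorkload implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/**
 * Illustrative two-parameter Pitman-Yor key chooser, modelled as a
 * Chinese restaurant process. NOT the paper's SpikesWorkload code.
 */
public class PitmanYorKeyChooser {
  private final double discount;      // d in [0, 1): larger d gives heavier tails
  private final double concentration; // theta > -d: controls number of distinct keys
  private final List<Integer> counts = new ArrayList<>(); // accesses per key so far
  private int total = 0;              // total accesses so far
  private final Random rng;

  public PitmanYorKeyChooser(double discount, double concentration, long seed) {
    this.discount = discount;
    this.concentration = concentration;
    this.rng = new Random(seed);
  }

  /** Returns the index of the next key to access; may create a new key. */
  public int nextKey() {
    int k = counts.size();
    double u = rng.nextDouble() * (total + concentration);
    double acc = 0.0;
    for (int i = 0; i < k; i++) {
      acc += counts.get(i) - discount; // P(existing key i) proportional to c_i - d
      if (u < acc) {
        counts.set(i, counts.get(i) + 1);
        total++;
        return i;
      }
    }
    counts.add(1); // remaining mass (theta + k*d) starts a new key
    total++;
    return k;
  }
}
```

Larger discount values produce heavier-tailed key popularity, i.e. a few keys absorb most of the accesses, which is precisely the hotspot pattern such a workload targets.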
Notes
1. In this paper, we mainly focus on YCSB; a more extensive discussion of other benchmark systems follows in Sect. 5.
2. In the YCSB configuration, the workload is selected by setting the workload parameter to site.ycsb.workloads.SpikesWorkload (see the configuration sketch after these notes).
3. Experiments with a multi-node setup are left for future work.
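For illustration, a hypothetical YCSB properties file for this workload might look as follows. Only the workload class name is taken from note 2 above; the spike-related property names are invented placeholders for the parameters mentioned in the abstract (spike probability and data locality), and the remaining keys are standard YCSB core properties.

```
# Standard YCSB core properties
workload=site.ycsb.workloads.SpikesWorkload
recordcount=1000000
operationcount=1000000
readproportion=0.95
updateproportion=0.05

# Hypothetical SpikesWorkload parameters (placeholder names)
spike.probability=0.05
spike.datalocality=0.8
```

Such a file could then be passed to a YCSB run in the usual way, e.g. bin/ycsb run cassandra-cql -P path/to/spikes.properties for the Cassandra binding.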
References
Arasu, A., et al.: Linear Road: A Stream Data Management Benchmark (2004). https://doi.org/10.1016/B978-012088469-8/50044-9
Armstrong, T., Ponnekanti, V., Borthakur, D., Callaghan, M.: LinkBench: a database benchmark based on the Facebook social graph, pp. 1185–1196 (2013)
Barahmand, S., Ghandeharizadeh, S.: BG: a benchmark to evaluate interactive social networking actions. Citeseer (2013)
Bodik, P., Fox, A., Franklin, M., Jordan, M., Patterson, D.: Characterizing, modeling, and generating workload spikes for stateful services, pp. 241–252 (2010). https://doi.org/10.1145/1807128.1807166
Chen, J., et al.: HotRing: a hotspot-aware in-memory key-value store. In: 18th USENIX Conference on File and Storage Technologies (FAST 20), Santa Clara, CA, pp. 239–252. USENIX Association, February 2020. https://www.usenix.org/conference/fast20/presentation/chen-jiqiang
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 143–154 (2010)
Dayarathna, M., Suzumura, T.: XGDBench: a benchmarking platform for graph stores in exascale clouds, pp. 363–370 (2012)
Dayarathna, M., Suzumura, T.: Benchmarking Graph Data Management and Processing Systems: A Survey. arXiv preprint arXiv:2005.12873 (2020)
Dey, A., Fekete, A., Nambiar, R., Rohm, U.: YCSB+T: benchmarking web-scale transactional databases, pp. 223–230 (2014)
Difallah, D.E., Pavlo, A., Curino, C., Cudre-Mauroux, P.: OLTP-Bench: an extensible testbed for benchmarking relational databases. Proc. VLDB Endow. 7(4), 277–288 (2013)
Gao, W., et al.: BigDataBench: a scalable and unified big data and AI benchmark suite. arXiv preprint arXiv:1802.08254 (2018)
Ghazal, A., et al.: BigBench V2: the new and improved BigBench. In: 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pp. 1225–1236. IEEE (2017)
Ghazal, A., et al.: BigBench: towards an industry standard benchmark for big data analytics. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pp. 1197–1208 (2013)
Gupta, P., Carey, M.J., Mehrotra, S., Yus, R.: SmartBench: a benchmark for data management in smart spaces. Proc. VLDB Endow. 13(12), 1807–1820 (2020)
Kumar, S.P., Lefebvre, S., Chiky, R., Soudan, E.G.: Evaluating consistency on the fly using YCSB. In: 2014 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), pp. 1–6, November 2014. https://doi.org/10.1109/IWCIM.2014.7008801
Leutenegger, S.T., Dias, D.: A modeling study of the TPC-C benchmark. ACM SIGMOD Rec. 22(2), 22–31 (1993)
Lu, P., Yuan, L., Zhang, Y., Cao, H., Li, K.: AutoFlow: Hotspot-Aware, Dynamic Load Balancing for Distributed Stream Processing. arXiv preprint arXiv:2103.08888 (2021)
Nambiar, R.O., Poess, M.: The making of TPC-DS. In: VLDB, vol. 6, pp. 1049–1058 (2006)
Patil, S., et al.: YCSB++: benchmarking and performance debugging advanced features in scalable table stores, pp. 1–14 (2011)
PilHo, K.: Transaction Processing Performance Council (TPC). Installation guide (2014)
Pirzadeh, P., Carey, M.J., Westmann, T.: BigFUN: a performance study of big data management system functionality. In: 2015 IEEE International Conference on Big Data (Big Data), pp. 507–514. IEEE (2015)
Pitman, J., Yor, M.: The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. Ann. Probab. 25(2), 855–900 (1997)
Poess, M., Smith, B., Kollar, L., Larson, P.: TPC-DS, taking decision support benchmarking to the next level (2002)
Sidhanta, S., Mukhopadhyay, S., Golab, W.: DYN-YCSB: benchmarking adaptive frameworks. In: 2019 IEEE World Congress on Services (SERVICES), vol. 2642–939X, pp. 392–393, July 2019. https://doi.org/10.1109/SERVICES.2019.00119
TPC: Transaction Processing Performance Council. http://www.tpc.org/. Accessed 14 Feb 2020
TPC-E: An On-Line Transaction Processing Benchmark. http://www.tpc.org/tpce/ (2020). Accessed 20 Feb 2021
Waudby, J., Steer, B.A., Karimov, K., Marton, J., Boncz, P., Szárnyas, G.: Towards testing ACID compliance in the LDBC social network benchmark. In: Nambiar, R., Poess, M. (eds.) TPCTC 2020. LNCS, vol. 12752, pp. 1–17. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-84924-5_1
Wu, Z., Butkiewicz, M., Perkins, D., Katz-Bassett, E., Madhyastha, H.V.: SPANStore: cost-effective geo-replicated storage spanning multiple cloud services. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pp. 292–308 (2013)
Xia, F., Li, Y., Yu, C., Ma, H., Qian, W.: BSMA: a benchmark for analytical queries over social media data. Proc. VLDB Endow. 7(13), 1573–1576 (2014)
Acknowledgements
This research is partially funded by the Research Fund KU Leuven and the Cybersecurity Initiative Flanders (CIF) project.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Claesen, C., Rafique, A., Van Landuyt, D., Joosen, W. (2022). A YCSB Workload for Benchmarking Hotspot Object Behaviour in NoSQL Databases. In: Nambiar, R., Poess, M. (eds.) Performance Evaluation and Benchmarking. TPCTC 2021. Lecture Notes in Computer Science, vol. 13169. Springer, Cham. https://doi.org/10.1007/978-3-030-94437-7_1
DOI: https://doi.org/10.1007/978-3-030-94437-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94436-0
Online ISBN: 978-3-030-94437-7
eBook Packages: Computer Science (R0)