Evaluating Cassandra Scalability with YCSB

  • Veronika Abramova
  • Jorge Bernardino
  • Pedro Furtado
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8645)

Abstract

NoSQL data stores appeared to fill a gap in the database market: highly scalable data storage that supports simple storage and retrieval of key-indexed data while allowing easy data distribution over a possibly large number of servers. Cassandra has been identified as one of the most efficient and scalable of the existing NoSQL engines. Scalability of these engines means that, by adding nodes, we could serve more requests with the same performance, and that more nodes could reduce the execution time of requests. However, we will see that adding nodes does not always result in a performance increase, and we investigate how the workload, database size and level of concurrency relate to the achieved scaling level. We give an overview of the Cassandra data store engine and then evaluate experimentally how it behaves with respect to scaling and request time speedup. We use the Yahoo! Cloud Serving Benchmark (YCSB) for these experiments.

Keywords

NoSQL, YCSB, Cassandra, Scalability

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Veronika Abramova (1)
  • Jorge Bernardino (1, 2)
  • Pedro Furtado (1)
  1. CISUC – Centre of Informatics and Systems of the University of Coimbra, FCTUC – University of Coimbra, Coimbra, Portugal
  2. Polytechnic Institute of Coimbra, ISEC – Superior Institute of Engineering of Coimbra, Coimbra, Portugal
