Capturing Anomalies of Cassandra Performance with Increase in Data Volume: A NoSQL Analytical Approach

  • Conference paper
Advances in Data Science and Management

Abstract

NoSQL database technology has been around since the early 1990s, but it was the exponential growth of the internet and the rise of web applications that led to a surge in the popularity of NoSQL databases. The BigTable research by Google (2006) and the Dynamo research by Amazon (2007) paved the way for databases that could be developed with agility and operated at any scale. Cassandra and MongoDB have emerged as the two most widely used NoSQL databases, and the choice between them depends on the data problem the user is attempting to solve. This paper describes the underlying principles of both databases as well as the differences between them. We focus on showing the anomaly in Cassandra's performance as the data volume increases, and at the same time we compare its performance with that of MongoDB. We establish how important a factor data volume is in choosing either database for an application. Extensive experiments have been carried out to measure performance in terms of anomaly similarities, and the future scope is outlined.

Acknowledgements

This work is partially supported by the Indian Institute of Technology (ISM), Dhanbad, which is under the administrative and financial control of the Ministry of Human Resource Development (MHRD), Government of India. The authors express their gratitude to the Department of Computer Science and Engineering at IIT (ISM) for providing all the necessary support to carry out the research work.

Author information

Corresponding author

Correspondence to Ramesh Dharavath.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dharavath, R., Kumar, A., Dharavath, V.K. (2020). Capturing Anomalies of Cassandra Performance with Increase in Data Volume: A NoSQL Analytical Approach. In: Borah, S., Emilia Balas, V., Polkowski, Z. (eds) Advances in Data Science and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 37. Springer, Singapore. https://doi.org/10.1007/978-981-15-0978-0_1
