A Framework to Benchmark NoSQL Data Stores for Large-Scale Model Persistence

  • Seyyed M. Shah
  • Ran Wei
  • Dimitrios S. Kolovos
  • Louis M. Rose
  • Richard F. Paige
  • Konstantinos Barmpis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8767)


We present a framework and methodology for benchmarking NoSQL stores for large-scale model persistence. NoSQL technologies potentially improve the performance of some applications and provide schema-less data structures, so they are particularly suited to persisting large and heterogeneous models. Recent studies consider only a narrow set of NoSQL stores for large-scale modelling. Benchmarking many technologies requires substantial effort because each store provides a different interface. Our experiments compare a broad range of NoSQL stores in terms of processor time and disc space used. The framework and methodology are evaluated through a case study that involves persisting large reverse-engineered models of open source projects. The results give tool engineers and practitioners a basis for selecting a store to persist large models.
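
The idea of hiding each store's interface behind a common adapter, then measuring processor time and disc space for the same workload, can be illustrated with a minimal Python sketch. This is not the framework described in the paper: the names `ModelStore`, `JsonFileStore`, `benchmark`, and `synthetic_model` are hypothetical, and the file-backed adapter is only a stand-in for a real NoSQL driver.

```python
import json
import os
import tempfile
import time
from abc import ABC, abstractmethod


class ModelStore(ABC):
    """Hypothetical uniform interface that each store adapter would implement."""

    @abstractmethod
    def put(self, element_id: str, properties: dict) -> None: ...

    @abstractmethod
    def close(self) -> None: ...

    @abstractmethod
    def size_on_disk(self) -> int: ...


class JsonFileStore(ModelStore):
    """Stand-in adapter (assumption): writes one JSON line per model element."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._file = open(path, "w", encoding="utf-8")

    def put(self, element_id: str, properties: dict) -> None:
        self._file.write(json.dumps({"id": element_id, **properties}) + "\n")

    def close(self) -> None:
        self._file.close()

    def size_on_disk(self) -> int:
        return os.path.getsize(self.path)


def benchmark(store: ModelStore, elements) -> dict:
    """Persist every element, then report processor time and bytes on disc."""
    start = time.process_time()  # CPU time of this process, not wall-clock time
    count = 0
    for element_id, properties in elements:
        store.put(element_id, properties)
        count += 1
    store.close()
    cpu_seconds = time.process_time() - start
    return {"elements": count, "cpu_seconds": cpu_seconds, "bytes": store.size_on_disk()}


def synthetic_model(n: int):
    """Yield n made-up model elements; a real run would load a reverse-engineered model."""
    for i in range(n):
        yield f"e{i}", {"type": "Class", "name": f"C{i}"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = JsonFileStore(os.path.join(tmp, "model.jsonl"))
        print(benchmark(store, synthetic_model(1000)))
```

Under this design, supporting another store would mean writing one more `ModelStore` subclass while the measurement code stays unchanged, which is the effort a shared benchmarking framework is meant to save.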


Keywords: Large Model · Property Graph · Open Source Project · Graph Database · Model Driven Engineering




Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Seyyed M. Shah (1)
  • Ran Wei (1)
  • Dimitrios S. Kolovos (1)
  • Louis M. Rose (1)
  • Richard F. Paige (1)
  • Konstantinos Barmpis (1)

  1. Department of Computer Science, University of York, UK
