Big data storage technologies: a survey

Review

Abstract

There is a strong push in industry toward more feasible and viable tools for storing data of fast-growing volume, velocity, and diversity, termed ‘big data’. The structural shift of storage mechanisms from traditional data management systems to NoSQL technology is driven by the need to meet big data storage requirements. However, the available big data storage technologies do not efficiently provide consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the preliminary process of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are among the industry standards in providing big data storage solutions, yet the literature does not report an in-depth survey of the storage technologies available for big data that investigates the performance and magnitude gains of these technologies. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to assist data analysts and researchers in understanding and selecting the storage mechanism that best fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the existing approaches using Brewer’s CAP theorem. The significance and applications of storage technologies, and their support for other categories, are discussed. Several future research challenges are highlighted with the intention of expediting the deployment of a reliable and scalable storage system.

Key words

Big data; Big data storage; NoSQL databases; Distributed databases; CAP theorem; Scalability; Consistency-partition resilience; Availability-partition resilience

CLC number

TP311.13 

Copyright information

© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  1. Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
  2. Department of Information Technology, Bahauddin Zakariya University, Multan, Pakistan
