
Distributed and Parallel Databases, Volume 34, Issue 2, pp 179–215

Performance analysis of data intensive cloud systems based on data management and replication: a survey

  • Saif Ur Rehman Malik
  • Samee U. Khan
  • Sam J. Ewen
  • Nikos Tziritas
  • Joanna Kolodziej
  • Albert Y. Zomaya
  • Sajjad A. Madani
  • Nasro Min-Allah
  • Lizhe Wang
  • Cheng-Zhong Xu
  • Qutaibah Marwan Malluhi
  • Johnatan E. Pecero
  • Pavan Balaji
  • Abhinav Vishnu
  • Rajiv Ranjan
  • Sherali Zeadally
  • Hongxiang Li

Abstract

As we delve deeper into the ‘Digital Age’, we witness explosive growth in the volume, velocity, and variety of the data available on the Internet. For example, in 2012 about 2.5 quintillion bytes of data were created daily, originating from a myriad of sources and applications, including mobile devices, sensors, individual archives, social networks, the Internet of Things, enterprises, cameras, and software logs. Such ‘data explosions’ have led to one of the most challenging research issues of the current Information and Communication Technology era: how to optimally manage (e.g., store, replicate, and filter) such large amounts of data, and how to identify new ways to analyze them to unlock information. Such large data streams cannot be managed by setting up on-premises enterprise database systems, because doing so incurs a large up-front cost in buying and administering the hardware and software systems. Therefore, next-generation data management systems must be deployed on the cloud. The cloud computing paradigm provides scalable and elastic resources, such as data and services accessible over the Internet. Every Cloud Service Provider must ensure that data is efficiently processed and distributed in a way that does not compromise end-users’ Quality of Service (QoS) in terms of data availability, data search delay, data analysis delay, and the like. From this perspective, data replication is used in the cloud to improve the performance (e.g., read and write delay) of applications that access data. Through replication, a data-intensive application or system can achieve high availability, better fault tolerance, and data recovery. In this paper, we survey data management and replication approaches (from 2007 to 2011) developed by both industrial and research communities.
The focus of the survey is to discuss and characterize the existing data replication and management approaches that tackle resource usage and QoS provisioning with different levels of efficiency. Moreover, we deliberate on the breakdown of both influential expressions (data replication and management) in providing different QoS attributes. Furthermore, the performance advantages and disadvantages of data replication and management approaches in cloud computing environments are analyzed. Open issues and future challenges related to data consistency, scalability, load balancing, processing, and placement are also reported.

Keywords

Replication, Data management, Cloud computing systems, Performance gradation, Data intensive computing

Notes

Acknowledgments

The authors are thankful to Kashif Bilal and Osman Khalid for the valuable reviews, suggestions, and comments.


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Saif Ur Rehman Malik (1)
  • Samee U. Khan (2)
  • Sam J. Ewen (2)
  • Nikos Tziritas (3)
  • Joanna Kolodziej (4)
  • Albert Y. Zomaya (5)
  • Sajjad A. Madani (1)
  • Nasro Min-Allah (6)
  • Lizhe Wang (7)
  • Cheng-Zhong Xu (8)
  • Qutaibah Marwan Malluhi (9)
  • Johnatan E. Pecero (10)
  • Pavan Balaji (11)
  • Abhinav Vishnu (12)
  • Rajiv Ranjan (13)
  • Sherali Zeadally (14)
  • Hongxiang Li (15)
  1. COMSATS Institute of Information Technology, Islamabad, Pakistan
  2. North Dakota State University, Fargo, USA
  3. Shenzhen Institute of Advanced Technology, Shenzhen, China
  4. University of Bielsko-Biala, Bielsko-Biala, Poland
  5. University of Sydney, Sydney, Australia
  6. University of Dammam, Dammam, Saudi Arabia
  7. Chinese Academy of Sciences, Beijing, China
  8. Wayne State University, Detroit, USA
  9. Qatar University, Doha, Qatar
  10. University of Luxembourg, Walferdange, Luxembourg
  11. Argonne National Laboratory, Lemont, USA
  12. Pacific Northwest National Laboratory, Richland, USA
  13. CSIRO ICT Center, Marsfield, NSW, Australia
  14. University of the District of Columbia, Washington, USA
  15. University of Louisville, Louisville, Kentucky
