DAIS: dynamic access and integration services framework for cloud-oriented storage systems

Abstract

This paper proposes the Dynamic Access and Integration Services (DAIS) framework, which provides on-demand, large-scale integrated data access in cloud computing environments. The motivation of this work is to enhance the performance of Cloud-Oriented Storage (COS) systems. The DAIS framework consists of two basic modules: (1) Adjacency COS Overlay Topology (ACOT) and (2) Cloud Storage Resource Discovery (CSRD). ACOT introduces a self-organizing, well-balanced m-way tree that removes the need for centralized approaches and supports dynamic and scalable data access in cloud computing environments. The CSRD module provides a data integration mechanism and efficient resource discovery across geographically distributed COS systems. A Hadoop test-bed was deployed for the experiments and the analysis of results, and all input and output data were stored in the Hadoop Distributed File System (HDFS). The experiments demonstrate the performance, bandwidth, and energy efficiency of the proposed DAIS framework and compare it with three other approaches: Static DAIS, P2P, and DSSM. To evaluate energy efficiency, a measurement unit and its mathematical model are defined. The energy efficiency results are compared with two popular approaches, HBase and Hadoop-MR, and DAIS is found to perform better in most cases.

Author information

Corresponding author

Correspondence to Ajay Kumar.

About this article

Cite this article

Kumar, A., Bawa, S. DAIS: dynamic access and integration services framework for cloud-oriented storage systems. Cluster Comput 23, 3289–3308 (2020). https://doi.org/10.1007/s10586-020-03088-0

Keywords

  • Cloud computing
  • Cloud storage cluster
  • Self-organizing
  • Dynamicity
  • Scalability
  • Storage services
  • Integration services