Abstract
This paper proposes the Dynamic Access and Integration Services (DAIS) framework, which provides on-demand, large-scale integrated data access in cloud computing environments. The motivation of this research is to enhance the performance of Cloud-Oriented Storage (COS) systems. The DAIS framework consists of two basic modules: (1) Adjacency COS Overlay Topology (ACOT) and (2) Cloud Storage Resource Discovery (CSRD). ACOT introduces a self-organizing, well-balanced m-way tree that removes the need for centralized approaches and supports dynamic, scalable data access in cloud computing environments. The CSRD module provides a data integration mechanism and efficient resource discovery across geographically distributed COS systems. A Hadoop test-bed was deployed for the experiments and result analysis, and all input and output data were stored in the Hadoop Distributed File System (HDFS). The experiments demonstrate the performance, bandwidth, and energy efficiency of the proposed DAIS framework and compare it with three other approaches: Static DAIS, P2P, and DSSM. For energy efficiency, a measurement unit and its mathematical model are defined. The energy efficiency results are compared with two popular approaches, HBase and Hadoop-MR, and DAIS is found to perform better in most cases.
Cite this article
Kumar, A., Bawa, S. DAIS: dynamic access and integration services framework for cloud-oriented storage systems. Cluster Comput 23, 3289–3308 (2020). https://doi.org/10.1007/s10586-020-03088-0
DOI: https://doi.org/10.1007/s10586-020-03088-0