CAD: An Efficient Data Management and Migration Scheme across Clouds for Data-Intensive Scientific Applications
Data management and migration are important research challenges of novel Cloud environments. While moving data among different geographical domains, it is important to lower the transmission cost for performance purposes. Efficient scheduling methods allow us to manage data transmissions with lower number of steps and shorter transmission time. In previous research efforts, several methods have been proposed in literature in order to manage data and minimize transmission cost for the case of Single Cluster environments. Unfortunately, these methods are not suitable to large-scale and complicated environments such as Clouds, with particular regard to the case of scheduling policies. Starting from these motivations, in this paper we propose an efficient data transmission method for data-intensive scientific applications over Clouds, called Cloud Adaptive Dispatching (CAD). This method adapts to specialized characteristics of Cloud systems and successfully shortens the transmission cost, while also avoiding node contention during moving data from sites to sites. We conduct an extensive campaign of experiments focused to test the effective performance of CAD. Results clearly demonstrate the improvements offered by CAD in supporting data transmissions across Clouds for data-intensive scientific applications.
KeywordsSchedule Algorithm Cloud Environment Migration Scheme Transmission Cost Transmission Schedule
Unable to display preview. Download preview PDF.
- 1.de Assuncao, M.D., di Costanzo, A., Buyya, R.: Evaluating the Cost-Benefit of Using Cloud Computing to Extend the Capacity of Clusters. In: Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, pp. 141–150 (June 2009)Google Scholar
- 5.Castillo, C., Rouskas, G.N., Harfoush, K.: Efficient Resource Management Using Advance Reservations for Heterogeneous Grids. In: Proceedings of 21st IEEE International Parallel and Distributed Processing, pp. 1–12 (April 2008)Google Scholar
- 19.Liu, H., Orban, D.: GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications. In: Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid, May 2008, pp. 295–304 (2008)Google Scholar
- 23.Tu, M., Li, P., Ma, Q., Yen, I.-L., Bastani, F.B.: On the Optimal Placement of Secure Data Objects over Internet. In: Proceedings of 19th IEEE International Parallel and Distributed Processing, pp. 14–14 (April 2005)Google Scholar
- 26.Wee, S., Liu, H.: Client-Side Load Balancer using Cloud. In: Proceedings of the 2010 ACM Symposium on Applied Computing, pp. 399–405 (March 2010)Google Scholar
- 28.Yang, Y., Liu, K., Chen, J., Liu, X., Yuan, D., Jin, H.: An Algorithm in SwinDeW-C for Scheduling Transaction-Intensive Cost-Constrained Cloud Workflows. In: Proceedings of the 4th IEEE International Conference on eScience, pp. 374–375 (December 2008)Google Scholar