Security-driven scheduling for data-intensive applications on grids
Security-sensitive applications that access and generate large data sets are emerging in various areas, including bioinformatics and high energy physics. Data grids provide such data-intensive applications with a large virtual storage framework with unlimited power. However, conventional scheduling algorithms for data grids are unable to meet the security needs of data-intensive applications. In this paper, we address the problem of scheduling data-intensive jobs on data grids subject to security constraints. Using a security- and data-aware technique, we propose a dynamic scheduling strategy to improve the quality of security for data-intensive applications running on data grids. To incorporate security into job scheduling, we introduce a new performance metric, degree of security deficiency, to quantitatively measure the quality of security provided by a data grid. Results based on a real-world trace confirm that the proposed scheduling strategy significantly improves security and performance over four existing scheduling algorithms, by up to 810% and 1478%, respectively.
Keywords: Scheduling, Security-sensitive application, Data grid, Degree of security deficiency