DECO: Data Replication and Execution CO-scheduling for Utility Grids

  • Vikas Agarwal
  • Gargi Dasgupta
  • Koustuv Dasgupta
  • Amit Purohit
  • Balaji Viswanathan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4294)


Vendor strategies to standardize grid computing as the IT backbone for service-oriented architectures have created business opportunities to offer grid as a utility service for compute- and data-intensive applications. With this shift in focus, there is an emerging need to incorporate agreements that represent the QoS expectations (e.g. response time) of customer applications and the prices they are willing to pay. We consider a utility model where each grid application (job) is associated with a function that captures the revenue accrued by the provider on servicing it within a specified deadline. The function also specifies the penalty incurred on failing to meet the deadline. Scheduled execution of jobs on appropriate sites, together with timely transfer of data closer to compute sites, works towards meeting these deadlines. To this end, we present DECO, a grid meta-scheduler that tightly integrates the compute and data transfer times of each job. A unique feature of DECO is that it enables differentiated QoS by assigning profitable jobs to more powerful sites and transferring the datasets associated with them at a higher priority. Further, it employs replication of popular datasets to save on transfer times. Experimental studies demonstrate that DECO earns significantly higher revenue for the grid provider than alternative scheduling methodologies.
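To make the utility model concrete, the following is a minimal sketch of one possible per-job revenue function. It is an illustration only, not the formulation used by DECO: the step shape (full revenue on or before the deadline, a flat penalty after it), the assumption that a job completes after its data transfer and compute times are added in sequence, and all names (`Job`, `accrued_revenue`, `total_revenue`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; field names and units are assumptions."""
    name: str
    deadline: float       # time by which the job must finish
    revenue: float        # earned by the provider if the deadline is met
    penalty: float        # paid by the provider if the deadline is missed
    transfer_time: float  # time to move the job's datasets to the site
    compute_time: float   # execution time on the assigned site


def accrued_revenue(job: Job, start: float = 0.0) -> float:
    """Step-shaped revenue: +revenue if on time, -penalty otherwise.

    Assumes (for illustration) that data transfer and computation
    happen back to back, starting at `start`.
    """
    completion = start + job.transfer_time + job.compute_time
    return job.revenue if completion <= job.deadline else -job.penalty


def total_revenue(jobs: Iterable[Job]) -> float:
    """Sum of per-job revenue, each job starting at time 0."""
    return sum(accrued_revenue(job) for job in jobs)
```

Under these assumptions, shortening a job's transfer time (for example by reading from a nearby replica) can move it from the penalty branch to the revenue branch, which is the effect the abstract attributes to replication and prioritized transfers.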


Keywords: Data Replication; Maximum Revenue; Revenue Function; Cluster Site; Utility Grid
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Vikas Agarwal (1)
  • Gargi Dasgupta (1)
  • Koustuv Dasgupta (1)
  • Amit Purohit (1)
  • Balaji Viswanathan (1)

  1. IBM, India Research Lab
