Internet Computing of Tasks with Dependencies Using Unreliable Workers
This paper studies the problem of improving the effectiveness of computing dependent tasks over the Internet. The distributed system is composed of a reliable server that coordinates the computation of a massive number of unreliable workers. It is known that the server cannot always ensure that the result of a task is correct without computing the task itself. This fact has a significant impact on computing interdependent tasks. Since both the computational capacity of the server and the time available to complete the computation may be limited, the server may be able to compute only selected tasks, without knowing whether the workers computed the remaining tasks correctly. But an incorrectly computed task may render the results of all dependent tasks incorrect. Thus it may become important for the server to compute judiciously selected tasks, so as to maximize the number of correct results.
In this work we assume that any worker computes correctly with probability p<1. Any incorrectly computed task corrupts all dependent tasks. The goal is to determine which tasks should be computed by the (reliable) server and which by the (unreliable) workers, and when, so as to maximize the expected number of correct results, under a constraint d on the computation time. We show that this optimization problem is NP-hard. Then we study optimal scheduling algorithms for the mesh with the tightest deadline. We present combinatorial arguments that completely describe optimal solutions for two ranges of values of worker reliability p, when p is close to zero and when p is close to one.
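To make the objective concrete, the following is a minimal sketch of how the expected number of correct results could be evaluated for one fixed assignment of tasks to the server and to workers. It is not the paper's algorithm. It assumes that workers err independently, that the server never errs, and that a result is correct exactly when every worker-computed task among the task and its transitive predecessors was computed correctly. It ignores the deadline d and the timing of the schedule. The names `ancestors_inclusive`, `expected_correct`, `preds`, and `server_tasks` are hypothetical, introduced only for illustration.

```python
def ancestors_inclusive(preds, task):
    """Return the task together with every task it transitively depends on."""
    seen = set()
    stack = [task]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(preds.get(current, ()))
    return seen


def expected_correct(preds, server_tasks, p):
    """Expected number of correct results for a fixed assignment.

    preds        -- dict mapping each task to the list of tasks it depends on
    server_tasks -- set of tasks computed by the reliable server (assumption:
                    the server is always correct)
    p            -- probability that a worker computes a task correctly
                    (assumption: workers err independently)
    """
    total = 0.0
    for task in preds:
        # Only worker-computed tasks on the dependency chain can introduce errors.
        risky = sum(
            1 for t in ancestors_inclusive(preds, task) if t not in server_tasks
        )
        # The result is correct only if all of those tasks were computed correctly.
        total += p ** risky
    return total


# A 2x2 mesh: "a" is the source, "b" and "c" depend on "a", "d" depends on both.
mesh = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}

all_workers = expected_correct(mesh, set(), 0.5)
server_does_source = expected_correct(mesh, {"a"}, 0.5)
print(all_workers, server_does_source)
```

Under these assumptions, with p = 0.5 the expectation for this small mesh rises from 1.0625 when workers compute everything to 2.125 when the server computes the source task "a", which illustrates why the choice of server-computed tasks matters.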
Keywords: Directed Acyclic Graph; Correct Result; Leftmost Column; Deadline Constraint; Distribute Processing Symposium