Performing Dynamically Injected Tasks on Processes Prone to Crashes and Restarts

  • Chryssis Georgiou
  • Dariusz R. Kowalski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6950)

Abstract

To identify the tradeoffs between efficiency and fault-tolerance in dynamic cooperative computing, we initiate the study of a task-performing problem under dynamic process crashes and restarts and dynamic task injections. The system consists of n message-passing processes which, subject to dynamic crashes and restarts, cooperate in performing independent tasks that are continuously and dynamically injected into the system. The task specifications are not known a priori to the processes. This problem abstracts today’s Internet-based computations, such as Grid computing and cloud services, where tasks are generated dynamically and different tasks may be known to different processes. We measure performance in terms of the number of pending tasks, so that it can be directly compared with the optimum number obtained under the same crash-restart-injection pattern by the best off-line algorithm. We propose several deterministic algorithmic solutions to the considered problem under different information models and correctness criteria, and we argue that their performance is close to that of the best possible off-line solutions.

Keywords

Performing tasks; Dynamic task injection; Crashes and restarts; Competitive analysis; Distributed algorithms

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Chryssis Georgiou (1)
  • Dariusz R. Kowalski (2)
  1. Department of Computer Science, University of Cyprus, Cyprus
  2. Department of Computer Science, University of Liverpool, UK
