Workload Balancing and Throughput Optimization for Heterogeneous Systems Subject to Failures

  • Anne Benoit
  • Alexandru Dobrila
  • Jean-Marc Nicod
  • Laurent Philippe
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6852)


In this paper, we study the problem of optimizing the throughput of streaming applications on heterogeneous platforms subject to failures. The applications are linear graphs of tasks (pipelines), and a type is associated with each task. The challenge is to map tasks onto the machines of a target platform, under the constraint that each machine is specialized to process only one task type, in order to avoid costly context or setup changes. The objective is to maximize the throughput, i.e., the rate at which jobs can be processed when accounting for failures. For identical machines, we prove that an optimal solution can be computed in polynomial time. However, the problem becomes NP-hard when two machines can compute the same task type at different speeds. Several polynomial-time heuristics are designed, and simulation results demonstrate their efficiency.
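To make the objective concrete, the following is a minimal sketch of how the throughput of one given mapping might be evaluated. The abstract does not spell out the failure model, so the sketch assumes one: each task loses a job with a fixed probability on its machine, lost jobs must be compensated by starting more jobs upstream, and the busiest machine determines the period. The names `Assignment` and `throughput` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    task_type: int   # type of the pipeline task
    machine: int     # machine that processes this task
    time: float      # time to process one job of this task on that machine
    failure: float   # assumed probability (< 1) that a job is lost at this task


def throughput(pipeline):
    """Return finished jobs per time unit for a mapped linear pipeline.

    `pipeline` lists the tasks in execution order (assumed model, see lead-in).
    """
    # Specialization constraint: a machine may serve only one task type.
    machine_type = {}
    for a in pipeline:
        if machine_type.setdefault(a.machine, a.task_type) != a.task_type:
            raise ValueError(f"machine {a.machine} is assigned two task types")

    # Walk the pipeline backwards: to deliver one finished job, each task must
    # start `needed` jobs, because it and every later task lose a fraction.
    load = {}
    needed = 1.0
    for a in reversed(pipeline):
        needed /= (1.0 - a.failure)
        load[a.machine] = load.get(a.machine, 0.0) + needed * a.time

    # The busiest machine sets the period; throughput is its inverse.
    return 1.0 / max(load.values())
```

For example, with a first task that loses half of its jobs and takes 1 time unit, followed by a reliable task that takes 2 time units on another machine, both machines carry a load of 2 per finished job under this assumed model. Maximizing the throughput then amounts to choosing the mapping that minimizes the largest machine load.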


Keywords (added by machine, not by the authors): Failure Rate; Polynomial Time; Greedy Algorithm; Integer Linear Program; Task Type





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Anne Benoit (1)
  • Alexandru Dobrila (2)
  • Jean-Marc Nicod (2)
  • Laurent Philippe (2)

  1. LIP laboratory (ENS, CNRS, INRIA, UCBL), ENS Lyon, Université de Lyon, France
  2. Université de Franche-Comté, LIFC laboratory (UFC), France
