Workload Balancing and Throughput Optimization for Heterogeneous Systems Subject to Failures
In this paper, we study the problem of optimizing the throughput of streaming applications on heterogeneous platforms subject to failures. The applications are linear graphs of tasks (pipelines), and a type is associated with each task. The challenge is to map tasks onto the machines of a target platform, where each machine must be specialized to process only one task type in order to avoid costly context or setup changes. The objective is to maximize the throughput, i.e., the rate at which jobs can be processed when accounting for failures. For identical machines, we prove that an optimal solution can be computed in polynomial time. However, the problem becomes NP-hard when two machines can compute the same task type at different speeds. Several polynomial-time heuristics are designed, and simulation results demonstrate their efficiency.
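To make the objective concrete, the following is a minimal sketch of how the throughput of one candidate mapping might be evaluated. It is not taken from the paper: the function name `throughput`, the per-task failure rate, and the rule that a machine's period is the sum of its tasks' work scaled by the number of jobs lost downstream are all illustrative assumptions.

```python
def throughput(tasks, mapping, speed):
    """Evaluate one mapping under an assumed (hypothetical) failure model.

    tasks:   list of (task_type, work, failure_rate), in pipeline order
    mapping: mapping[i] is the index of the machine running task i
    speed:   speed[u] is the speed of machine u
    """
    n = len(tasks)

    # Specialization constraint: a machine may process only one task type.
    machine_type = {}
    for (task_type, _, _), u in zip(tasks, mapping):
        if machine_type.setdefault(u, task_type) != task_type:
            raise ValueError(f"machine {u} would process two task types")

    # Assumption: a fraction failure_rate of the jobs entering a task is lost.
    # x[i] is the number of jobs task i must process so that one job
    # survives to the end of the pipeline.
    x = [0.0] * n
    survive = 1.0
    for i in range(n - 1, -1, -1):
        survive *= 1.0 - tasks[i][2]
        x[i] = 1.0 / survive

    # Assumption: the period of a machine is the time it spends per finished
    # job, summed over the tasks mapped to it.
    period = {}
    for i, ((_, work, _), u) in enumerate(zip(tasks, mapping)):
        period[u] = period.get(u, 0.0) + x[i] * work / speed[u]

    # The slowest machine bounds the rate at which finished jobs are produced.
    return 1.0 / max(period.values())


# Two-task pipeline: the first task loses half of its jobs, so it must
# process two jobs for every job that leaves the pipeline.
pipeline = [("A", 1.0, 0.5), ("B", 1.0, 0.0)]
rate = throughput(pipeline, mapping=[0, 1], speed=[1.0, 1.0])
```

Under these assumptions the first machine is the bottleneck, which illustrates why failures near the start of a pipeline inflate the load of every machine that precedes them.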
Keywords: Failure Rate, Polynomial Time, Greedy Algorithm, Integer Linear Program, Task Type