Abstract
Mapping a pipelined application onto a distributed and parallel platform is a challenging problem. It becomes even more difficult when multiple optimization criteria are involved and when the target resources (processors and communication links) are heterogeneous and subject to failures. This paper investigates the problem of mapping pipelined applications, which consist of a linear chain of stages executed in a pipelined fashion, onto such platforms. The objective is to optimize reliability under a performance constraint, i.e., while guaranteeing a threshold throughput. To increase reliability, we replicate the execution of stages on multiple processors. We compare interval mappings, where the application is partitioned into intervals of consecutive stages, with general mappings, where stages may be partitioned without any constraint, thereby allowing better use of processor and communication network capabilities. However, the price to pay for general mappings is a dramatic increase in problem complexity. We show that computing the period of a given general mapping is an NP-complete problem, and we give polynomial bounds that determine a (conservative) approximate value. In contrast, the period of an interval mapping obeys a simple formula, and we provide an optimal dynamic programming algorithm for the bi-criteria interval mapping problem on homogeneous platforms. On the more practical side, we design a set of efficient heuristics, and we compare the performance of interval and general mapping strategies through extensive simulations.
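To make the two criteria concrete, the following is a minimal sketch of how reliability and period could be evaluated for an interval mapping with replication. It is not the paper's model: it assumes independent processor failures, ignores communication links entirely, and assumes every replica of an interval processes every data set, so the slowest replica bounds the throughput. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
from math import prod


def interval_reliability(failure_probs):
    """Probability that at least one replica of an interval survives.

    Assumption: replicas fail independently, each with the given probability.
    """
    return 1.0 - prod(failure_probs)


def mapping_reliability(intervals):
    """Reliability of the whole chain.

    The application succeeds only if every interval keeps at least one
    working replica, so the per-interval reliabilities are multiplied.
    """
    return prod(interval_reliability(fp) for fp in intervals)


def mapping_period(interval_work, interval_speeds):
    """Simplified period of an interval mapping (communication ignored).

    Assumption: all replicas of an interval execute every data set, so the
    slowest replica of the most loaded interval is the bottleneck.
    """
    return max(w / min(speeds) for w, speeds in zip(interval_work, interval_speeds))


# Hypothetical instance: two intervals, the first replicated on two processors.
failure_probs = [[0.1, 0.2], [0.05]]
work = [6.0, 4.0]
speeds = [[2.0, 3.0], [4.0]]

reliability = mapping_reliability(failure_probs)
period = mapping_period(work, speeds)

# Keep the mapping only if it meets the throughput threshold (period bound).
period_bound = 3.5  # hypothetical threshold
feasible = period <= period_bound
```

In this sketch, adding a replica to an interval raises its reliability but can lengthen the period if the new processor is slower, which is the trade-off the bi-criteria problem has to balance.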
Additional information
Part of this work has appeared in ISPDC’2010 and in ICPADS’2010.
Cite this article
Benoit, A., Bouziane, H.L. & Robert, Y. Optimizing the Reliability of Streaming Applications Under Throughput Constraints. Int J Parallel Prog 39, 584–614 (2011). https://doi.org/10.1007/s10766-011-0165-6
DOI: https://doi.org/10.1007/s10766-011-0165-6