Complexity Results for Throughput and Latency Optimization of Replicated and Data-parallel Workflows

Benoit, Anne; Robert, Yves

doi:10.1007/s00453-008-9229-4

Published: 04 October 2008

Volume 57, pages 689–724, (2010)

Algorithmica

Anne Benoit¹ &
Yves Robert¹

Abstract

Mapping applications onto parallel platforms is a challenging problem, even for simple application patterns such as pipeline or fork graphs. Several antagonistic criteria should be optimized for workflow applications, such as throughput and latency (or a combination of the two). In this paper, we consider a simplified model with no communication cost, and we provide an exhaustive list of complexity results for different problem instances. Pipeline or fork stages can be replicated in order to increase the throughput by sending consecutive data sets onto different processors. In some cases, stages can also be data-parallelized, i.e., the computation of one single data set is shared between several processors. Data-parallelism both decreases the latency and increases the throughput. Some instances of this simple model are shown to be NP-hard, thereby exposing the inherent complexity of the mapping problem. We provide polynomial algorithms for other problem instances. Altogether, we provide solid theoretical foundations for the study of mono-criterion or bi-criteria mapping optimization problems.
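
As an illustration of the two mechanisms, the following Python sketch compares replication and data-parallelism for a single stage. The cost formulas are assumptions made for this sketch and are not quoted from the abstract: with no communication cost, a stage of work w replicated round-robin on k processors is assumed to be paced by the slowest of them (period w/(k·s_min), latency w/s_min), while a data-parallelized stage is assumed to share each data set perfectly among the processors (period and latency both w/Σs). The function names and sample values are hypothetical; throughput is the inverse of the period.

```python
from fractions import Fraction


def replicated(work, speeds):
    """Assumed model: consecutive data sets go round-robin to the processors.

    Each data set is processed entirely by one processor, so the latency is
    that of the slowest processor; a new data set completes every
    work / (k * slowest) time units on average.
    """
    slowest = min(speeds)
    k = len(speeds)
    period = Fraction(work, k * slowest)
    latency = Fraction(work, slowest)
    return period, latency


def data_parallel(work, speeds):
    """Assumed model: one data set is shared perfectly among all processors.

    The processors behave like a single processor whose speed is the sum of
    the individual speeds, so period and latency coincide.
    """
    t = Fraction(work, sum(speeds))
    return t, t


if __name__ == "__main__":
    work = 12  # hypothetical amount of computation per data set
    for speeds in ([2], [2, 2, 2], [1, 2, 3]):
        rp, rl = replicated(work, speeds)
        dp, dl = data_parallel(work, speeds)
        print(f"speeds={speeds}: replicated period={rp} latency={rl}; "
              f"data-parallel period={dp} latency={dl}")
```

Under these assumptions, replication improves the period but leaves the latency unchanged, whereas data-parallelism improves both, which matches the qualitative claims of the abstract.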

Author information

Authors and Affiliations

LIP, ENS Lyon, 46 Allée d’Italie, 69364, Lyon Cedex 07, France
Anne Benoit & Yves Robert

Corresponding author

Correspondence to Anne Benoit.

About this article

Cite this article

Benoit, A., Robert, Y. Complexity Results for Throughput and Latency Optimization of Replicated and Data-parallel Workflows. Algorithmica 57, 689–724 (2010). https://doi.org/10.1007/s00453-008-9229-4

Received: 06 April 2007
Accepted: 19 September 2008
Published: 04 October 2008
Issue Date: August 2010
DOI: https://doi.org/10.1007/s00453-008-9229-4
