Quantifying the Potential Task-Based Dataflow Parallelism in MPI Applications
Task-based parallel programming languages require the programmer to partition traditional sequential code into smaller tasks in order to exploit the dataflow parallelism inherent in the application. However, obtaining a partitioning that achieves optimal parallelism is not trivial, because it depends on many parameters, such as the underlying data dependencies and the global problem partitioning. To help find a partitioning that achieves high parallelism, this paper introduces a framework that a programmer can use to: 1) estimate how much an application could benefit from dataflow parallelism; and 2) find the best strategy for exposing dataflow parallelism in that application. Our framework automatically detects data dependencies among tasks in order to estimate the potential parallelism in the application. Furthermore, based on the framework, we develop an interactive approach to finding the optimal partitioning of code. To illustrate this approach, we present a case study of porting High Performance Linpack from MPI to MPI/SMPSs. The presented approach requires only superficial knowledge of the studied code and iteratively leads to the optimal partitioning strategy. Finally, the environment provides visualization of the simulated MPI/SMPSs execution, allowing the developer to qualitatively inspect potential parallelization bottlenecks.
Keywords: Parallel Machine, High Parallelism, Target Machine, Input Code, Potential Parallelism