A Fast Parallel Graph Partitioner for Shared-Memory Inspector/Executor Strategies

Krieger, Christopher D.; Strout, Michelle Mills

doi:10.1007/978-3-642-37658-0_13

Christopher D. Krieger¹⁷ &
Michelle Mills Strout¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7760))

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

1091 Accesses
1 Citations

Abstract

Graph partitioners play an important role in many parallel work distribution and locality optimization approaches. Surprisingly, however, to our knowledge there is no freely available parallel graph partitioner designed for execution on a shared memory multicore system. This paper presents a shared memory parallel graph partitioner, ParCubed, for use in the context of sparse tiling run-time data and computation reordering. Sparse tiling is a run-time scheduling technique that schedules groups of iterations across loops together when they access the same data and one or more of the loops contains indirect array accesses. For sparse tiling, which is implemented with an inspector/executor strategy, the inspector needs to find an initial seed partitioning of adequate quality very quickly. We compare our presented hierarchical clustering partitioner, ParCubed, with GPart and METIS in terms of partitioning speed, partitioning quality, and the effect the generated seed partitions have on executor speed. We find that the presented partitioner is 25 to 100 times faster than METIS on a 16 core machine. The total edge cut of the partitioning generated by ParCubed was found not to exceed 1.27x that of the partitioning found by METIS.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 49.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Anderson, R.J., Woll, H.: Wait-free parallel algorithms for the union-find problem. In: Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing, STOC 1991, pp. 370–380. ACM, New York (1991)
Chapter Google Scholar
I. Berman. Multicore programming in the face of metamorphosis: Union-find as an example. Master’s thesis, Tel-Aviv University, July 2010.
Google Scholar
Chevalier, C., Pellegrini, F.: PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 34(6-8), 318–331 (2008)
Article MathSciNet Google Scholar
Cybenko, G., Allen, T.G., Polito, J.E.: Practical parallel union-find algorithms for transitive closure and clustering. Int. J. Parallel Program 17(5), 403–423 (1989)
Article MathSciNet Google Scholar
Douglas, C.C., Hu, J., Kowarschik, M., Rüde, U., Weiss, C.: Cache optimization for structured and unstructured grid multigrid. Electronic Tranactions on Numerical Analysis 10, 21–40 (2000)
MATH Google Scholar
Han, H., Tseng, C.-W.: A Comparison of Locality Transformations for Irregular Codes. In: Dwarkadas, S. (ed.) LCR 2000. LNCS, vol. 1915, pp. 70–84. Springer, Heidelberg (2000)
Chapter Google Scholar
Han, H., Tseng, C.-W.: Improving Locality for Adaptive Irregular Scientific Codes. In: Midkiff, S.P., Moreira, J.E., Gupta, M., Chatterjee, S., Ferrante, J., Prins, J.F., Pugh, B., Tseng, C.-W. (eds.) LCPC 2000. LNCS, vol. 2017, pp. 173–188. Springer, Heidelberg (2001)
Chapter Google Scholar
Karypis, G., Kumar, V.: Parallel multilevel k-way partitioning scheme for irregular graphs. In: Proceedings of the 1996 ACM/IEEE Conference on Supercomputing (CDROM), Supercomputing 1996, IEEE Computer Society, Washington, DC (1996)
Google Scholar
Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)
Article MathSciNet Google Scholar
Mohiyuddin, M., Hoemmen, M., Demmel, J., Yelick, K.: Minimizing communication in sparse matrix solvers. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC 2009, pp. 36:1–36:12. ACM, New York (2009)
Google Scholar
Strout, M.M., Carter, L., Ferrante, J., Freeman, J., Kreaseck, B.: Combining Performance Aspects of Irregular Gauss-Seidel Via Sparse Tiling. In: Pugh, B., Tseng, C.-W. (eds.) LCPC 2002. LNCS, vol. 2481, pp. 90–110. Springer, Heidelberg (2005)
Chapter Google Scholar
Strout, M.M., Carter, L., Ferrante, J., Kreaseck, B.: Sparse tiling for stationary iterative methods. International Journal of High Performance Computing Applications 18(1), 95–114 (2004)
Article Google Scholar
Sui, X., Nguyen, D., Burtscher, M., Pingali, K.: Parallel Graph Partitioning on Multicore Architectures. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds.) LCPC 2010. LNCS, vol. 6548, pp. 246–260. Springer, Heidelberg (2011)
Chapter Google Scholar
Walshaw, C., Cross, M.: Parallel optimisation algorithms for multilevel mesh partitioning. Parallel Comput. 26(12), 1635–1660 (2000)
Article MathSciNet MATH Google Scholar
B. Wu, E. Z. Zhang, and X. Shen. Enhancing data locality for dynamic simulations through asynchronous data transformations and adaptive control. In Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, PACT 2011, pp. 243–252. IEEE Computer Society, Washington, DC (2011)
Google Scholar

Download references

Author information

Authors and Affiliations

Colorado State University, Fort Collins, CO, 80523, USA
Christopher D. Krieger & Michelle Mills Strout

Authors

Christopher D. Krieger
View author publications
You can also search for this author in PubMed Google Scholar
Michelle Mills Strout
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Department of Computer Science and Engineering, Waseda University, 27 Waseda-machi, 162-0042, Shinjuku-ku, Tokyo, Japan
Hironori Kasahara & Keiji Kimura &

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Krieger, C.D., Strout, M.M. (2013). A Fast Parallel Graph Partitioner for Shared-Memory Inspector/Executor Strategies. In: Kasahara, H., Kimura, K. (eds) Languages and Compilers for Parallel Computing. LCPC 2012. Lecture Notes in Computer Science, vol 7760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37658-0_13

Download citation

DOI: https://doi.org/10.1007/978-3-642-37658-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37657-3
Online ISBN: 978-3-642-37658-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics