
Multidimensional Blocking in UPC

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2007)

Abstract

Partitioned Global Address Space (PGAS) languages offer an attractive, high-productivity programming model for large-scale parallel machines. PGAS languages, such as Unified Parallel C (UPC), combine the simplicity of shared-memory programming with the efficiency of the message-passing paradigm by giving users control over data layout. PGAS languages distinguish between private, shared-local, and shared-remote memory; shared-remote accesses are typically much more expensive than shared-local and private accesses, especially on distributed-memory machines, where a shared-remote access implies communication over a network.

In this paper we present a simple extension to the UPC language that allows the programmer to block shared arrays in multiple dimensions. We claim that this extension allows for better control of locality, and therefore performance, in the language.

We describe an analysis that allows the compiler to distinguish between local shared array accesses and remote shared array accesses. Local shared array accesses are then transformed into direct memory accesses by the compiler, saving the overhead of a locality check at runtime. We present results to show that locality analysis is able to significantly reduce the number of shared accesses.





Editor information

Vikram Adve, María Jesús Garzarán, Paul Petersen


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barton, C., Caşcaval, C., Almasi, G., Garg, R., Amaral, J.N., Farreras, M. (2008). Multidimensional Blocking in UPC. In: Adve, V., Garzarán, M.J., Petersen, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2007. Lecture Notes in Computer Science, vol 5234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85261-2_4


  • DOI: https://doi.org/10.1007/978-3-540-85261-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85260-5

  • Online ISBN: 978-3-540-85261-2

  • eBook Packages: Computer Science, Computer Science (R0)
