Abstract
High-performance scientific computing relies increasingly on high-level, large-scale, object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks—a parallel or serial array class library—provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes two optimizing transformations suitable for certain classes of numerical algorithms, one for reducing the cost of inter-processor communication, and one for improving cache utilization; demonstrates and analyzes the resulting performance gains; and indicates how these transformations are being automated.
This work was performed under the auspices of the U.S. Department of Energy by Los Alamos National Laboratory under Contract No. W-7405-Eng-36.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bassetti, F., Davis, K., Quinlan, D. (1998). Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures. In: Caromel, D., Oldehoeft, R.R., Tholburn, M. (eds) Computing in Object-Oriented Parallel Environments. ISCOPE 1998. Lecture Notes in Computer Science, vol 1505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49372-7_10
DOI: https://doi.org/10.1007/3-540-49372-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65387-5
Online ISBN: 978-3-540-49372-3
eBook Packages: Springer Book Archive