Abstract
Block-recursive codes for dense numerical linear algebra computations appear to be well suited for execution on machines with deep memory hierarchies because they are effectively blocked for all levels of the hierarchy. In this paper, we describe compiler technology to translate iterative versions of a number of numerical kernels into block-recursive form. We also study the cache behavior and performance of these compiler-generated block-recursive codes.
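To illustrate the distinction the abstract draws, the following is a minimal sketch of an iterative matrix multiplication next to a hand-written block-recursive version of the same kernel. It is not the output of the compiler described in the paper; the function names, the `base` cutoff, and the assumption that the matrix dimension is a power of two are all choices made for this illustration only.

```python
def matmul_iterative(A, B, n):
    """Textbook triple loop: C = A * B for n x n matrices (lists of lists)."""
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_block_recursive(A, B, C, i0, j0, k0, size, base=2):
    """Accumulate the product of one size x size block of A and B into C.

    The iteration space (i, j, k) is split in half along each dimension,
    giving eight sub-problems per level. Assumes size is a power of two
    (an assumption of this sketch), so halving always reaches the base case.
    """
    if size <= base:
        # Base case: a small block handled with the ordinary loops.
        for i in range(i0, i0 + size):
            for j in range(j0, j0 + size):
                for k in range(k0, k0 + size):
                    C[i][j] += A[i][k] * B[k][j]
        return
    half = size // 2
    for di in (0, half):
        for dj in (0, half):
            for dk in (0, half):
                matmul_block_recursive(
                    A, B, C, i0 + di, j0 + dj, k0 + dk, half, base
                )


if __name__ == "__main__":
    n = 8
    A = [[i + j for j in range(n)] for i in range(n)]
    B = [[i - 2 * j for j in range(n)] for i in range(n)]
    C = [[0] * n for _ in range(n)]
    matmul_block_recursive(A, B, C, 0, 0, 0, n)
    print(C == matmul_iterative(A, B, n))
```

Because each recursive call works on a block half the size of its parent, the computation passes through block sizes at every scale, which is the sense in which such codes are blocked for all levels of a memory hierarchy without a tile size being tuned per level.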
This work was supported by NSF grants CCR-9720211, EIA-9726388, ACI-9870687, EIA-9972853.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahmed, N., Pingali, K. (2000). Automatic Generation of Block-Recursive Codes. In: Bode, A., Ludwig, T., Karl, W., Wismüller, R. (eds) Euro-Par 2000 Parallel Processing. Euro-Par 2000. Lecture Notes in Computer Science, vol 1900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44520-X_48
DOI: https://doi.org/10.1007/3-540-44520-X_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67956-1
Online ISBN: 978-3-540-44520-3
eBook Packages: Springer Book Archive