A Matrix-Type for Performance–Portability
When matrix computations are expressed in conventional programming languages, matrices are almost exclusively represented by arrays; arrays, however, are also used to represent many other kinds of entities, such as grids, lists, and hash tables. The responsibility for achieving efficient matrix computations is usually seen as resting on compilers, which apply loop restructuring and reordering transformations to adapt programs and program fragments to different target architectures. Unfortunately, compilers are often unable to restructure conventional matrix-computation algorithms into their block or block-recursive counterparts, which are required to obtain acceptable levels of performance on most current (and future) hardware systems.
We present a datatype dedicated to the representation of dense matrices. In contrast to arrays, for which index-based element reference is the basic (primitive) operation, the primitive operations of our specialized matrix type are composition of submatrices and decomposition into submatrices. Decomposition of a matrix into submatrices of unspecified sizes is a key operation in the development of block algorithms for matrix computations. By expressing such (ambiguous) decompositions directly and explicitly, block algorithms can be stated explicitly, while the task of finding good decomposition parameters (i.e., block sizes) for each specific target system is exposed to, and made tractable for, compilers.
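The idea of a matrix type whose primitives are composition and decomposition, rather than indexing, can be sketched as a block-recursive (quadtree) datatype. The sketch below is illustrative only: the class and function names are hypothetical and not the paper's actual interface, and for simplicity it fixes a 2×2 decomposition where the paper leaves block sizes unspecified.

```python
# Hypothetical sketch of a matrix type built by composition of submatrices.
# A matrix is either a scalar Leaf or a Quad composing four submatrices;
# algorithms are written against the decomposition, not element indices.

class Leaf:
    """A 1x1 matrix holding a single scalar."""
    def __init__(self, value):
        self.value = value

class Quad:
    """Composition of four equally sized submatrices (2x2 block layout)."""
    def __init__(self, a11, a12, a21, a22):
        self.a11, self.a12, self.a21, self.a22 = a11, a12, a21, a22

def add(x, y):
    # Block addition: recurse into corresponding submatrices.
    if isinstance(x, Leaf):
        return Leaf(x.value + y.value)
    return Quad(add(x.a11, y.a11), add(x.a12, y.a12),
                add(x.a21, y.a21), add(x.a22, y.a22))

def mul(x, y):
    # Block-recursive multiplication, expressed directly on the
    # decomposition; the recursion bottoms out at scalar leaves.
    if isinstance(x, Leaf):
        return Leaf(x.value * y.value)
    return Quad(add(mul(x.a11, y.a11), mul(x.a12, y.a21)),
                add(mul(x.a11, y.a12), mul(x.a12, y.a22)),
                add(mul(x.a21, y.a11), mul(x.a22, y.a21)),
                add(mul(x.a21, y.a12), mul(x.a22, y.a22)))
```

Because the algorithm never mentions element indices or block sizes, a compiler (or runtime) is free to choose the cut points of each decomposition to suit a given memory hierarchy, which is exactly the degree of freedom the abstract argues should be exposed.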
Keywords: Dense Matrices · Matrix Computation · Cholesky Factorization · Decomposition Pattern · Program Fragment