Fundamental Kernels

Gallopoulos, Efstratios; Philippe, Bernard; Sameh, Ahmed H.

doi:10.1007/978-94-017-7188-7_2

Chapter
First Online: 01 January 2015

Part of the book series: Scientific Computation (SCIENTCOMP)

Abstract

In this chapter we discuss the fundamental operations that are the building blocks of dense and sparse matrix computations. They are termed kernels because, in most cases, they account for most of the computational effort; as a result, their implementation directly affects the overall efficiency of the computation. They often occur at the lowest level at which parallelism is expressed.

Author information

Authors and Affiliations

Computer Engineering and Informatics Department, University of Patras, Patras, Greece
Efstratios Gallopoulos
Campus de Beaulieu, INRIA/IRISA, Rennes Cedex, France
Bernard Philippe
Department of Computer Science, Purdue University, West Lafayette, IN, USA
Ahmed H. Sameh

Corresponding author

Correspondence to Efstratios Gallopoulos.

Cite this chapter

Gallopoulos, E., Philippe, B., Sameh, A.H. (2016). Fundamental Kernels. In: Parallelism in Matrix Computations. Scientific Computation. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7188-7_2

DOI: https://doi.org/10.1007/978-94-017-7188-7_2
Published: 26 July 2015
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-7187-0
Online ISBN: 978-94-017-7188-7
eBook Packages: Engineering, Engineering (R0)
