Performance Tuning of Matrix Triple Products Based on Matrix Structure

  • Eun-Jin Im
  • Ismail Bustany
  • Cleve Ashcraft
  • James W. Demmel
  • Katherine A. Yelick
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3732)

Abstract

Sparse matrix computations arise in many scientific and engineering applications, but their performance is limited by the growing gap between processor and memory speed. In this paper, we present a case study of an important sparse matrix triple product problem that commonly arises in primal-dual optimization methods.

Instead of a generic two-phase algorithm, we devise and implement a single-pass algorithm that exploits the block diagonal structure of the matrix. Our algorithm uses fewer floating point operations and roughly half the memory of the two-phase algorithm. The speed-up of the one-phase scheme over the two-phase scheme is 2.04 on a 900 MHz Intel Itanium-2, 1.63 on a 1 GHz Power-4, and 1.99 on a 900 MHz Sun Ultra-3.
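To illustrate the idea, the following is a minimal sketch (not the authors' implementation) of a one-pass triple product AᵀBA when B is block diagonal: each block's contribution Aₖᵀ Bₖ Aₖ is accumulated directly, so the intermediate product BA is never materialized, in contrast to a generic two-phase scheme that forms T = BA first and then computes AᵀT. The function name and the use of dense NumPy blocks are illustrative assumptions; the paper operates on sparse matrices.

```python
import numpy as np

def triple_product_block_diag(A_blocks, B_blocks):
    """One-pass A^T B A for block diagonal B.

    A_blocks[k] is the row block of A aligned with the k-th
    diagonal block B_blocks[k].  Each block contributes
    A_k^T B_k A_k to the result; the intermediate product B A
    is never formed, roughly halving the working storage.
    (Illustrative dense-matrix sketch only.)
    """
    n = A_blocks[0].shape[1]
    C = np.zeros((n, n))
    for A_k, B_k in zip(A_blocks, B_blocks):
        C += A_k.T @ (B_k @ A_k)   # one pass per block
    return C
```

The two-phase reference computation would instead assemble B, compute `T = B @ A`, and then `A.T @ T`, touching the intermediate `T` twice through the memory hierarchy.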


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Eun-Jin Im (1)
  • Ismail Bustany (2)
  • Cleve Ashcraft (3)
  • James W. Demmel (4)
  • Katherine A. Yelick (4)
  1. Kookmin University, Seoul, Korea
  2. Barcelona Design Inc., USA
  3. Livermore Software Technology Corporation, USA
  4. U.C. Berkeley, USA
