Hardware mapping of a parallel algorithm for matrix-vector multiplication overlapping communications and computations

  • C. N. Ojeda-Guerra
  • R. Esper-Chaín
  • M. Estupiñán
  • E. Macías
  • A. Suárez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1482)

Abstract

The parallelization of numerical algorithms is very important in scientific applications, but many aspects of this parallelization remain open. Specifically, the overhead introduced by loading and unloading data degrades efficiency and, in a realistic approach, should be taken into account in performance estimation. This paper presents a way of overcoming the bottleneck of loading and unloading data by overlapping computations and communications in a specific algorithm, matrix-vector multiplication. A hardware mapping of this algorithm is also presented in order to demonstrate the parallelization methodology.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • C. N. Ojeda-Guerra (1)
  • R. Esper-Chaín (2)
  • M. Estupiñán (1)
  • E. Macías (1)
  • A. Suárez (1)
  1. Dpto. de Ingeniería Telemática, U.L.P.G.C., Spain
  2. Dpto. de Ingeniería Electrónica y Automática, U.L.P.G.C., Spain