Overlapping communication and computation in hypercubes

  • Luis Díaz de Cerio
  • Miguel Valero-García
  • Antonio González
Workshop 02: Routing and Communication in Interconnection Networks
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1123)


This paper presents a method for deriving efficient algorithms for hypercubes. The method exploits two features of the underlying hardware: (a) the parallelism provided by the multiple communication links of each node, and (b) the ability to overlap computation with communication, which is available on machines that support an asynchronous communication protocol. The method applies to a generic class of hypercube algorithms, many examples of which appear in the literature for different problems. The paper demonstrates the efficiency of the method on two of these problems: FFT and Vector Add. The results show that the reduction in communication overhead is very significant in many cases, and that the algorithms produced by the method are always very close to the optimum in terms of execution time.
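
As a rough illustration of the kind of algorithm such a method could start from, the sketch below simulates a Vector Add on a hypercube using the dimension-exchange pattern that is common in the hypercube literature. The abstract does not define the class of algorithms or the method itself, so everything here is an assumption: the function name `hypercube_vector_add`, the list-of-lists data layout, and the choice of dimension exchange are all hypothetical. The sketch shows only the non-overlapped baseline, in which each step uses a single link per node and the addition waits for the whole message. That idle link capacity and waiting time are what the two hardware features named above could recover.

```python
def hypercube_vector_add(vectors):
    """Simulate a global vector sum on a hypercube with len(vectors) nodes.

    Hypothetical baseline, not the paper's method: step d uses only the
    link of dimension d, and computation starts after the full exchange.
    """
    n = len(vectors)
    dims = n.bit_length() - 1
    if n == 0 or (1 << dims) != n:
        raise ValueError("number of nodes must be a power of two")

    state = [list(v) for v in vectors]
    for d in range(dims):
        # Communication: every node receives the partial sum of its
        # neighbour across dimension d (node label with bit d flipped).
        received = [state[node ^ (1 << d)] for node in range(n)]
        # Computation: every node adds the received vector to its own.
        state = [
            [a + b for a, b in zip(own, other)]
            for own, other in zip(state, received)
        ]
    return state


if __name__ == "__main__":
    # Eight nodes form a 3-dimensional hypercube; node i holds [i, 2*i].
    result = hypercube_vector_add([[i, 2 * i] for i in range(8)])
    print(result[0])
```

After log2(n) exchange steps every node holds the same global sum. An overlapped variant would presumably split each vector into packets, so that several links are busy at once and additions proceed while later packets are still in transit, but how the paper achieves this is not stated in the abstract.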





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Luis Díaz de Cerio (1)
  • Miguel Valero-García (1)
  • Antonio González (1)
  1. Dept. d'Arquitectura de Computadors, Univ. Polit. de Catalunya, Barcelona, Spain
