Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers

  • Yariv Aridor
  • Tamar Domany
  • Oleg Goldshmidt
  • Edi Shmueli
  • Jose Moreira
  • Larry Stockmeier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3277)

Abstract

The three-dimensional torus is a common topology for the network interconnects of multicomputers because of its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces the resource utilization of the machine. In particular, toroidal partitions often require the allocation of additional communication links to close the torus. If these links are treated as dedicated resources (because of the partition isolation requirement), this may prevent the allocation of other partitions that could otherwise use those links. Overall, on toroidal machines, the likelihood of successfully allocating a new partition decreases as the number of toroidal partitions increases.
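The link contention described above can be illustrated with a deliberately simplified one-dimensional model. This is a sketch under stated assumptions, not the model used in the paper: the machine is a single ring of `n` nodes, a partition is a contiguous run of nodes, and closing a partition as a torus is assumed to route the wrap-around connection over the ring links outside the partition. The function names `mesh_links`, `torus_links`, and `can_allocate` are hypothetical.

```python
from typing import Set, Tuple

# A link joins node i to node (i + 1) % n on a ring of n nodes.
Link = Tuple[int, int]


def mesh_links(start: int, length: int, n: int) -> Set[Link]:
    """Links inside a contiguous partition connected as a mesh (a line in 1D)."""
    return {((start + k) % n, (start + k + 1) % n) for k in range(length - 1)}


def torus_links(start: int, length: int, n: int) -> Set[Link]:
    """Links needed to connect the partition as a torus (a ring in 1D).

    Assumption (illustrative only): the wrap-around path from the last node
    of the partition back to its first node travels over the ring links
    that lie outside the partition.
    """
    last = start + length - 1
    wrap = {((last + k) % n, (last + k + 1) % n) for k in range(n - length + 1)}
    return mesh_links(start, length, n) | wrap


def can_allocate(needed: Set[Link], in_use: Set[Link]) -> bool:
    """With links treated as dedicated resources, any shared link is a conflict."""
    return needed.isdisjoint(in_use)


if __name__ == "__main__":
    n = 8
    second = mesh_links(4, 4, n)  # nodes 4..7 requested as a mesh

    # First partition (nodes 0..3) as a mesh: the second partition still fits.
    print(can_allocate(second, mesh_links(0, 4, n)))

    # First partition as a torus: its wrap-around path holds the links
    # that the second partition would need.
    print(can_allocate(second, torus_links(0, 4, n)))
```

In this toy model the nodes 4 to 7 remain free in both cases; only the dedicated wrap-around links of the toroidal partition block the second allocation, which is the effect the abstract attributes to toroidal partitions.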

This paper presents a novel "multi-toroidal" interconnect topology that can accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows every free partition of the machine to be connected as a torus without affecting existing partitions. We also show that, for toroidal jobs, this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect holds across different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.

Keywords

Additional Link, Machine Utilization, Rectangular Partition, Allocation Unit, Connection Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Yariv Aridor (1)
  • Tamar Domany (1)
  • Oleg Goldshmidt (1)
  • Edi Shmueli (1)
  • Jose Moreira (2)
  • Larry Stockmeier (3)

  1. IBM Haifa Research Labs, Haifa, Israel
  2. IBM Watson Research Center, Yorktown
  3. IBM Almaden Research Center
