Scalable Ethernet Clos-Switches

  • Norbert Eicker
  • Thomas Lippert
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4128)


The scalability of cluster computers that use Gigabit Ethernet as an interconnect is limited by the unavailability of scalable switches providing full bisection bandwidth. Clos' idea of combining small crossbar switches into a large, non-blocking crossbar – widespread in the field of high-performance networks – cannot be applied to Ethernet fabrics in a straightforward manner. This paper presents the techniques required to implement such large crossbar switches with off-the-shelf Gigabit Ethernet technology. We show how to build Gigabit Ethernet crossbar switches of up to 1152 ports with full bisection bandwidth, at a cost of about €125 per port and an observed latency of less than 10 μs. Using the ParaStation cluster middleware [2], we measured a bi-directional point-to-point throughput of 210 MB/s.
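The 1152-port figure follows from the standard arithmetic of a two-stage folded Clos (leaf–spine) fabric built from identical 48-port switches: each leaf splits its ports evenly between end nodes and uplinks, with one uplink to every spine. A minimal sketch of this calculation (the function name and decomposition are ours, not the paper's):

```python
def two_stage_clos_ports(k: int) -> int:
    """Maximum number of end-node ports in a non-blocking two-stage
    (folded) Clos fabric built from identical k-port switches.

    Each leaf dedicates k/2 ports to end nodes and k/2 uplinks, one to
    each of the k/2 spine switches.  A spine has k ports, so up to k
    leaves can attach, preserving full bisection bandwidth.
    """
    assert k % 2 == 0, "port count must be even to split node/uplink ports"
    leaves = k                # limited by the k ports of each spine switch
    ports_per_leaf = k // 2   # the other k/2 ports are uplinks to the spines
    return leaves * ports_per_leaf

# With commodity 48-port Gigabit Ethernet switches:
print(two_stage_clos_ports(48))  # → 1152
```

With k = 48 this yields 48 leaves × 24 node ports = 1152 ports, matching the configuration described in the abstract.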


Keywords: Spanning Tree · IEEE Standard · Address Resolution Protocol · Broadcast Packet · Network Fabric



References

  1. IEEE standard 802.3z, IEEE standard 802.3ab
  2.
  3. Clos, C.: A Study of Non-blocking Switching Networks. The Bell System Technical Journal 32(2), 406–424 (1953)
  4.
  5.
  6.
  7. Plummer, D.C.: An Ethernet Address Resolution Protocol. RFC 826 (November 1982)
  8. IEEE standard 802.1D
  9. IEEE standard 802.1Q, IEEE standard 802.3ac
  10. IEEE standard 802.1s
  11. Sharma, S., Gopalan, K., Nanda, S., Chiueh, T.: Viking: A Multi-Spanning-Tree Ethernet Architecture for Metropolitan Area and Cluster Networks. In: IEEE INFOCOM (2004)
  12. Dubinski, J., Humble, R., Pen, U.-L., Loken, C., Martin, P.: High Performance Commodity Networking in a 512-CPU Teraflops Beowulf Cluster for Computational Astrophysics. In: SC 2003 Conference (submitted, 2003), eprint arXiv:astro-ph/0305109
  13.
  14. Fodor, Z., Katz, S.D., Papp, G.: Better than $1/Mflops sustained: a scalable PC-based parallel computer for lattice QCD. Comput. Phys. Commun. 152, 121–134 (2003)
  15.
  16.
  17. Pallas MPI Benchmark, now available from Intel as Intel MPI Benchmark (IMB)
  18. Patent: Data Communication System and Method, EP 05 012 567.3

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Norbert Eicker (1)
  • Thomas Lippert (1, 2)

  1. Central Institute for Applied Mathematics, John von Neumann Institute for Computing (NIC), Research Center Jülich, Jülich, Germany
  2. Department C, Bergische Universität Wuppertal, Wuppertal, Germany
