Explicit coordination to prevent congestion in data center networks

Cluster Computing

Abstract

Large cluster-based cloud computing platforms increasingly use commodity Ethernet technologies, such as Gigabit Ethernet, 10GigE, and Fibre Channel over Ethernet (FCoE), for intra-cluster communication. Traffic congestion can become a performance concern in Ethernet because data, storage, and control traffic are consolidated over a common layer-2 fabric, and because multiple virtual machines (VMs) are consolidated onto fewer physical machines. Even as networking vendors race to develop switch-level hardware support for congestion management, we make the case that virtualization has opened up a complementary set of opportunities to reduce or even eliminate network congestion in cloud computing clusters. We present the design, implementation, and evaluation of a system called XCo that explicitly coordinates network transmissions over a shared Ethernet fabric to proactively prevent network congestion. XCo is a software-only distributed solution that executes only in the end-nodes. A central controller uses explicit permissions to temporally separate (at millisecond granularity) the transmissions from competing senders through congested links. XCo is fully transparent to applications, presently deployable, and independent of any switch-level hardware support. We present a detailed evaluation of our XCo prototype across a number of network congestion scenarios and demonstrate that XCo significantly improves network performance during periods of congestion. We also evaluate the behavior of XCo for large topologies using NS3 simulations.
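
The idea of temporally separating competing senders can be illustrated with a small sketch. The code below is not XCo's actual algorithm, which the abstract does not specify; it is a toy greedy scheduler written under stated assumptions. The names `build_timetable` and `permitted`, the link and sender labels, and the 1 ms default slot length are all hypothetical. It assumes the controller already knows each sender's path and which links are congested, and it grants each time slot to a set of senders that share no congested link.

```python
from typing import Dict, List, Set


def build_timetable(paths: Dict[str, Set[str]], congested: Set[str]) -> List[Set[str]]:
    """Return a repeating list of slots; each slot is the set of senders
    that hold permission to transmit during that slot (hypothetical model)."""
    # Only the congested links on each sender's path constrain the schedule.
    hot = {sender: links & congested for sender, links in paths.items()}
    # Senders that cross no congested link may transmit in every slot.
    unconstrained = {sender for sender, links in hot.items() if not links}

    slots: List[Set[str]] = []       # senders granted in each slot
    slot_links: List[Set[str]] = []  # congested links already claimed per slot

    # Greedy first-fit: place each constrained sender in the first slot
    # where none of its congested links is already claimed.
    for sender in sorted(s for s, links in hot.items() if links):
        for i, used in enumerate(slot_links):
            if not (used & hot[sender]):
                slots[i].add(sender)
                used |= hot[sender]  # in-place update of slot_links[i]
                break
        else:
            slots.append({sender})
            slot_links.append(set(hot[sender]))

    slots = slots or [set()]
    return [slot | unconstrained for slot in slots]


def permitted(timetable: List[Set[str]], t_ms: int, slot_ms: int = 1) -> Set[str]:
    """Senders allowed to transmit at time t_ms, cycling through the timetable."""
    return timetable[(t_ms // slot_ms) % len(timetable)]


if __name__ == "__main__":
    # Hypothetical topology: A and B share the congested link "up".
    paths = {
        "A": {"l1", "up"},
        "B": {"l2", "up"},
        "C": {"l3", "down"},
        "D": {"l4"},
    }
    timetable = build_timetable(paths, congested={"up", "down"})
    for t in range(4):
        print(t, sorted(permitted(timetable, t)))
```

In this example A and B alternate slots because both cross the congested link `up`, while C shares a slot with A and D is never held back. A real deployment would also need to distribute the permissions to end-nodes and react to changing traffic, which this sketch leaves out.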

Author information

Corresponding author

Correspondence to Kartik Gopalan.

About this article

Cite this article

Rajanna, V.S., Jahagirdar, A., Shah, S. et al. Explicit coordination to prevent congestion in data center networks. Cluster Comput 15, 183–200 (2012). https://doi.org/10.1007/s10586-011-0156-9

  • DOI: https://doi.org/10.1007/s10586-011-0156-9
