Abstract
Large cluster-based cloud computing platforms increasingly use commodity Ethernet technologies, such as Gigabit Ethernet, 10GigE, and Fibre Channel over Ethernet (FCoE), for intra-cluster communication. Traffic congestion can become a performance concern in Ethernet because data, storage, and control traffic are consolidated over a common layer-2 fabric, and because multiple virtual machines (VMs) are consolidated onto less physical hardware. Even as networking vendors race to develop switch-level hardware support for congestion management, we make the case that virtualization has opened up a complementary set of opportunities to reduce or even eliminate network congestion in cloud computing clusters. We present the design, implementation, and evaluation of a system called XCo that explicitly coordinates network transmissions over a shared Ethernet fabric to proactively prevent network congestion. XCo is a software-only distributed solution that executes only in the end-nodes. A central controller uses explicit permissions to temporally separate (at millisecond granularity) the transmissions from competing senders through congested links. XCo is fully transparent to applications, presently deployable, and independent of any switch-level hardware support. We present a detailed evaluation of our XCo prototype across a number of network congestion scenarios and demonstrate that XCo significantly improves network performance during periods of congestion. We also evaluate the behavior of XCo for large topologies using NS3 simulations.
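The abstract does not spell out how the central controller decides who may transmit, so the following is only an illustrative sketch of the general idea of temporally separating competing senders with explicit permissions. It assumes a simple round-robin policy with fixed-length timeslices; the names `Grant`, `round_robin_grants`, and `may_transmit`, the sender labels, and the 2 ms slice length are all hypothetical and are not taken from XCo.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass(frozen=True)
class Grant:
    """Permission for one sender to transmit during [start_ms, end_ms)."""
    sender: str
    start_ms: int
    end_ms: int


def round_robin_grants(senders: List[str], slice_ms: int, horizon_ms: int) -> List[Grant]:
    """Assumed policy (not XCo's): hand out non-overlapping timeslices in turn."""
    if not senders or slice_ms <= 0:
        return []
    grants: List[Grant] = []
    turn = cycle(senders)
    start = 0
    while start < horizon_ms:
        end = min(start + slice_ms, horizon_ms)
        grants.append(Grant(next(turn), start, end))
        start = end
    return grants


def may_transmit(grants: List[Grant], sender: str, now_ms: int) -> bool:
    """An end-node would consult its grants before sending on the congested link."""
    return any(g.sender == sender and g.start_ms <= now_ms < g.end_ms for g in grants)


# Three hypothetical senders share one congested link for 10 ms.
schedule = round_robin_grants(["vm-a", "vm-b", "vm-c"], slice_ms=2, horizon_ms=10)
```

Under this assumed policy at most one sender holds a permission at any millisecond, so transmissions through the congested link never overlap in time. A real controller would also need to detect which links are congested and distribute the grants to the end-nodes, which this sketch leaves out.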
Cite this article
Rajanna, V.S., Jahagirdar, A., Shah, S. et al. Explicit coordination to prevent congestion in data center networks. Cluster Comput 15, 183–200 (2012). https://doi.org/10.1007/s10586-011-0156-9