ER-TCP: An Efficient Fault-Tolerance Scheme for TCP Connections

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3758)


This paper proposes a novel scheme, called ER-TCP, which transparently masks failures of server nodes in a cluster from clients at the TCP connection level. Connections at the server side are actively and fully replicated so that the replicas remain consistent. A log mechanism is designed to cooperate with the replication so that the scheme sacrifices little communication performance and scales beyond a few nodes, even when the nodes have different processing capacities. The scheme is supported by experiments conducted on a prototype implementation.


Keywords: Failure Detection, Request Message, Server Side, Server Node, Incoming Request





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  1. Cluster and Grid Computing Lab., School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
