Abstract
Most cloud data centers now deploy high-availability systems to provide continuous service, so a high-availability cluster must detect node failures (physical machine failures) accurately and promptly while consuming little bandwidth. However, virtualization has made cloud data centers grow far faster than traditional cluster environments, and traditional node failure detection models now face several new problems. In this paper, we present a three-role, two-layer node failure detection model, named Smart Ring, which fits cloud data centers well and strikes a balance among accuracy, timeliness, and bandwidth consumption. It can simultaneously detect the status of physical machines and virtual machines, and it handles multiple-node failures and network partitions well. Our experimental results show that Smart Ring performs better than most existing models.
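The abstract does not spell out Smart Ring's mechanism, so the following is only a generic illustration of timeout-based heartbeat detection on a ring, not the authors' model. The class `RingDetector`, its methods, and the timeout value are all hypothetical; the three roles, the two layers, virtual machine monitoring, and network partition handling are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class RingDetector:
    """Hypothetical timeout-based detector over nodes arranged in a ring."""
    nodes: list      # node names in ring order (assumed, not from the paper)
    timeout: float   # seconds of silence before a node is considered failed
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node, now):
        # Record the time at which a heartbeat from `node` was received.
        self.last_seen[node] = now

    def failed(self, now):
        # A node is considered failed if it has been silent longer than the timeout.
        return [n for n in self.nodes
                if now - self.last_seen.get(n, float("-inf")) > self.timeout]

    def successor(self, node, now):
        # Next live node after `node` in ring order, skipping failed nodes,
        # so the ring stays connected when several adjacent nodes fail.
        dead = set(self.failed(now))
        i = self.nodes.index(node)
        for step in range(1, len(self.nodes)):
            candidate = self.nodes[(i + step) % len(self.nodes)]
            if candidate not in dead:
                return candidate
        return None  # no other live node remains


detector = RingDetector(nodes=["a", "b", "c", "d"], timeout=3.0)
for name in detector.nodes:
    detector.heartbeat(name, now=0.0)
# Only "a" and "d" keep sending heartbeats; "b" and "c" go silent.
detector.heartbeat("a", now=5.0)
detector.heartbeat("d", now=5.0)
failed_nodes = detector.failed(now=5.0)
next_after_a = detector.successor("a", now=5.0)
```

In this sketch each node exchanges heartbeats only with a ring neighbor, which keeps per-round traffic roughly proportional to the number of nodes, whereas a shorter timeout speeds up detection at the cost of more false suspicions. That is one simple way to see the trade-off among accuracy, timeliness, and bandwidth that the abstract refers to.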
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Xu, L., Chen, W., Wang, Z., Ni, H., Wu, J. (2012). Smart Ring: A Model of Node Failure Detection in High Available Cloud Data Center. In: Park, J.J., Zomaya, A., Yeo, SS., Sahni, S. (eds) Network and Parallel Computing. NPC 2012. Lecture Notes in Computer Science, vol 7513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35606-3_33
DOI: https://doi.org/10.1007/978-3-642-35606-3_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35605-6
Online ISBN: 978-3-642-35606-3
eBook Packages: Computer Science, Computer Science (R0)