Adaptive Failure Detection Algorithm for Grid Systems

  • Dong Tian
  • Tai-ping Mao
  • Jun Xie
Part of the Advances in Soft Computing book series (AINSC, volume 54)


Grid systems are prone to failures, and existing failure detection algorithms do not meet the particular requirements of grids. To address this, the paper presents an adaptive failure detection algorithm. Based on the characteristics of grids and small-world theory, the authors establish a small-world-based grid system model and a failure detection model. By combining the unreliable failure detector approach with a heartbeat strategy and a grey prediction model, they design a dynamic heartbeat mechanism and, building on it, the adaptive failure detection algorithm for grid systems. Experimental results indicate that the method is valid and effective and that it can be used for fault detection in grid environments.
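
The abstract does not give the details of the dynamic heartbeat mechanism, so the following is only a minimal sketch of the general idea, assuming the standard GM(1,1) grey model is used to predict the next heartbeat inter-arrival time from a short window of recent ones. The function names, the window size, and the fixed safety margin are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def gm11_predict_next(samples):
    """Predict the next value of a positive series with a GM(1,1) grey model.

    Assumption: `samples` are recent heartbeat inter-arrival times (seconds).
    """
    n = len(samples)
    if n < 4:
        raise ValueError("GM(1,1) needs at least 4 samples")

    # Accumulated generating operation: x1[k] = samples[0] + ... + samples[k].
    x1, total = [], 0.0
    for x in samples:
        total += x
        x1.append(total)

    # Fit samples[k] = a * u[k] + b by least squares, where u[k] is the
    # negated mean of consecutive accumulated values.
    u = [-0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = samples[1:]
    m = n - 1
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(v * w for v, w in zip(u, y))
    denom = m * suu - su * su
    if denom == 0:
        return samples[-1]  # degenerate input, fall back to the last sample
    a = (m * suy - su * sy) / denom
    b = (sy - a * su) / m

    if abs(a) < 1e-12:
        return b  # no trend: the model reduces to a constant

    # Time-response function of GM(1,1), differenced to recover the raw series.
    c = samples[0] - b / a
    return c * (math.exp(-a * n) - math.exp(-a * (n - 1)))


def next_timeout(intervals, margin=0.5, window=8):
    """Hypothetical adaptive timeout: predicted interval plus a safety margin."""
    return gm11_predict_next(intervals[-window:]) + margin


def is_suspected(elapsed_since_last_heartbeat, timeout):
    """Suspect the monitored node once the adaptive timeout has expired."""
    return elapsed_since_last_heartbeat > timeout
```

In this sketch a monitor would call `next_timeout` after each received heartbeat and suspect the node when no heartbeat arrives within that time. In the spirit of an unreliable failure detector, a suspicion can later be withdrawn if a late heartbeat arrives.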


Keywords: Grid, Small-world, Grey prediction, Heartbeat strategy, Fault detection





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Dong Tian (1, 2)
  • Tai-ping Mao (2)
  • Jun Xie (2)
  1. College of Software Engineering, Chongqing University, Chongqing, P.R. China
  2. Guizhou Electronic Computer Software Development Center, Guizhou, P.R. China
