Abstract
This paper proposes a policy-driven, multi-agent model to enhance the fault tolerance and recovery capabilities of Web services in distributed environments. Both the evaluation functions for fault specifications and the corresponding handling mechanisms of the services are defined in policies expressed in XML. During service execution, the service monitor agent watches for faults using its local knowledge of the faults. This local knowledge is generated dynamically by the service policy agent, which queries and parses the service policies from the service policy repository. When a fault occurs, the service process agent takes over fault handling and service recovery, directed by the actions that the policies define for the specific conditions. This policy-driven, multi-agent approach to fault handling can address the issues of flexibility, automation and availability.
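The division of labour among the three agents can be illustrated with a minimal Python sketch. It is not the authors' implementation: the XML element and attribute names, the metric names, the comparison operators, and the handler functions are all hypothetical, chosen only to show how a policy document could be parsed into local knowledge, matched against an observation, and turned into recovery actions.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical policy document. The element and attribute names are
# illustrative assumptions, not the schema used in the paper.
POLICY_XML = """
<policies service="OrderService">
  <policy fault="timeout">
    <condition metric="response_ms" op="gt" value="2000"/>
    <action>retry</action>
    <action>switch_replica</action>
  </policy>
  <policy fault="unavailable">
    <condition metric="http_status" op="eq" value="503"/>
    <action>switch_replica</action>
  </policy>
</policies>
"""

# Assumed set of comparison operators a condition may name.
OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def load_local_knowledge(xml_text):
    """Service policy agent: parse policies into rules the monitor can evaluate."""
    rules = []
    for policy in ET.fromstring(xml_text.strip()).findall("policy"):
        cond = policy.find("condition")
        rules.append({
            "fault": policy.get("fault"),
            "metric": cond.get("metric"),
            "op": OPS[cond.get("op")],
            "value": float(cond.get("value")),
            "actions": [a.text for a in policy.findall("action")],
        })
    return rules


def detect_faults(rules, observation):
    """Service monitor agent: return every rule whose condition holds."""
    return [
        r for r in rules
        if r["metric"] in observation
        and r["op"](observation[r["metric"]], r["value"])
    ]


def handle_fault(rule, handlers):
    """Service process agent: run the policy's actions in the listed order."""
    return [handlers[name](rule["fault"]) for name in rule["actions"]]


# Placeholder handlers; a real system would re-invoke the service or
# redirect the request to another replica.
HANDLERS = {
    "retry": lambda fault: f"retry:{fault}",
    "switch_replica": lambda fault: f"switch_replica:{fault}",
}
```

Because the rules are rebuilt from the XML each time `load_local_knowledge` is called, changing a threshold or an action in the policy document changes the monitoring and recovery behaviour without touching the agent code, which is the flexibility the abstract attributes to the policy-driven design.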
Cite this article
Jing-fan, T., Bo, Z. & Zhi-jun, H. Policy driven and multi-agent based fault tolerance for Web services. J Zhejiang Univ Sci A 6, 676–682 (2005). https://doi.org/10.1631/jzus.2005.A0676
DOI: https://doi.org/10.1631/jzus.2005.A0676