Abstract
A fault in a computer system is the failure of a component that prevents the system from operating normally. As a computer system operates, it may experience faults for a variety of reasons. Each fault generates alerts or error messages that are reported to the monitoring infrastructure. These alert messages are stored in the management database that is responsible for fault management.
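The flow described above (a fault raises an alert, and the alert is stored in a management database) can be sketched in a few lines. This is only an illustrative sketch, not the chapter's design: the `Alert` fields, the `alerts` table layout, and the function names `open_management_db`, `report_alert`, and `alerts_for` are all assumptions made for the example, and an in-memory SQLite database stands in for a real management database.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record; the field names are assumptions."""
    component: str  # component that reported the fault
    severity: str   # e.g. "warning" or "critical"
    message: str    # free-text error message


def open_management_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the management database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alerts ("
        "id INTEGER PRIMARY KEY, "
        "component TEXT NOT NULL, "
        "severity TEXT NOT NULL, "
        "message TEXT NOT NULL)"
    )
    return conn


def report_alert(conn: sqlite3.Connection, alert: Alert) -> int:
    """Store one alert and return the row id assigned to it."""
    cur = conn.execute(
        "INSERT INTO alerts (component, severity, message) VALUES (?, ?, ?)",
        (alert.component, alert.severity, alert.message),
    )
    conn.commit()
    return cur.lastrowid


def alerts_for(conn: sqlite3.Connection, component: str) -> list[Alert]:
    """Return the stored alerts for one component, oldest first."""
    rows = conn.execute(
        "SELECT component, severity, message FROM alerts "
        "WHERE component = ? ORDER BY id",
        (component,),
    ).fetchall()
    return [Alert(*row) for row in rows]
```

A monitoring agent would call `report_alert` whenever it observes an error, and a fault-management tool could later query `alerts_for` to examine the history of a suspect component.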
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Chandra Verma, D. (2009). Fault Management. In: Principles of Computer Systems and Network Management. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-89009-8_6
DOI: https://doi.org/10.1007/978-0-387-89009-8_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-89008-1
Online ISBN: 978-0-387-89009-8
eBook Packages: Engineering, Engineering (R0)