Abstract
Network failures can have a major impact on society. Among their many possible causes, operator error is the most significant, so new network management schemes that address operator errors are important. We previously proposed the basic idea of a network-wide rollback scheme for this purpose. The scheme introduces a server that manages historical versions of sets of device configurations; when an operator detects a network failure, the operator rolls back a set of device configurations via the server. In this paper, we present the scheme in detail. We also provide three rollback procedures and implement a prototype system to evaluate their rollback time. The minimum rollback time is about 41 seconds when 50 routers are rolled back, which suggests that the proposed scheme supports fast recovery from operator errors.
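As a rough illustration of the idea summarized above, the following minimal sketch models a server that stores historical versions of network-wide configuration sets and restores one on request. It is not the authors' implementation and does not correspond to any of the three rollback procedures, which the abstract does not describe. The class `ConfigServer`, its methods, the device names, and the configuration strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConfigServer:
    """Hypothetical server holding historical versions of configuration sets."""
    # Each stored version maps a device name to its full configuration text.
    versions: List[Dict[str, str]] = field(default_factory=list)

    def commit(self, config_set: Dict[str, str]) -> int:
        """Store a snapshot of every device's configuration; return its version number."""
        self.versions.append(dict(config_set))
        return len(self.versions) - 1

    def rollback(self, version: int, network: Dict[str, str]) -> List[str]:
        """Restore `network` to a stored version; return the devices that were changed."""
        target = self.versions[version]
        # Only devices whose current configuration differs need to be touched.
        changed = sorted(d for d, cfg in target.items() if network.get(d) != cfg)
        for device in changed:
            network[device] = target[device]
        return changed


# Usage: snapshot a known-good state, simulate an operator error, then roll back.
network = {"r1": "ospf cost 10", "r2": "ospf cost 10"}  # assumed toy configurations
server = ConfigServer()
good = server.commit(network)
network["r2"] = "ospf cost 1000"  # simulated operator error on one router
changed = server.rollback(good, network)
```

In this sketch the network is just an in-memory dictionary; a real system would have to push configurations to the routers themselves, which is presumably where the measured rollback time is spent.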
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arai, D., Yoshihara, K., Idoue, A. (2008). Network-Wide Rollback Scheme for Fast Recovery from Operator Errors Toward Dependable Network. In: Ma, Y., Choi, D., Ata, S. (eds) Challenges for Next Generation Network Operations and Service Management. APNOMS 2008. Lecture Notes in Computer Science, vol 5297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88623-5_5
DOI: https://doi.org/10.1007/978-3-540-88623-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88622-8
Online ISBN: 978-3-540-88623-5
eBook Packages: Computer Science (R0)