Abstract
A common approach to fault tolerance in embedded systems is to reboot the computer whenever a non-permanent error is detected. All system code and data are recreated from scratch, and a previously established checkpoint, hopefully not corrupted, is used to restore the application data. Confidence in the computer's operation is thus restored. The idea explored in this paper is to reset the computer unconditionally in every control frame (the classic read sensors → calculate control action → update actuators cycle). A RAM-based stable storage preserves the system's state between consecutive cleanups, and a standard watchdog timer guarantees that a reset is forced whenever an error crashes the system. We have evaluated this approach using fault injection on the controller of a standard temperature control system. The experimental observations show that Reset-Driven Fault Tolerance is a simple yet effective technique for improving reliability at extremely low cost, since it is a software-only solution that is also application-independent.
This work was partially supported by the Portuguese Foundation for Science and Technology under the POSI programme and the FEDER programme of the European Union, through the R&D Unit 326/94 (CISUC) and the project PRAXIS/P/EEI/10205/1998 (CRON).
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cunha, J.C., Correia, A., Henriques, J., Rela, M.Z., Silva, J.G. (2002). Reset-Driven Fault Tolerance. In: Bondavalli, A., Thevenod-Fosse, P. (eds) Dependable Computing EDCC-4. EDCC 2002. Lecture Notes in Computer Science, vol 2485. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36080-8_13
DOI: https://doi.org/10.1007/3-540-36080-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00012-9
Online ISBN: 978-3-540-36080-3
eBook Packages: Springer Book Archive