Abstract
The idea that diverse or dissimilar computations could be used to detect errors can be traced back to Dionysius Lardner’s analysis of Babbage’s mechanical computers in the early 19th century. In the modern era of electronic computers, diverse redundancy techniques were pioneered in the 1970s by Elmendorf, Randell, Avižienis and Chen. Since then, the tolerance of design faults has been a very active research topic, which has had practical impact on real critical applications. In this paper, we present a brief history of the topic and then describe two contemporary studies on the application of diversity in the fields of robotics and security.
References
Abrial, J.: The B-Book - Assigning Programs to Meanings. Cambridge University Press, Cambridge (1996)
Alami, R., Chatila, R., Fleury, S., Ghallab, M., Ingrand, F.: An architecture for autonomy. International Journal of Robotic Research 17(4), 315–337 (1998)
Ammann, P.E., Knight, J.C.: Data diversity: An approach to software fault tolerance. IEEE Trans. on Computers 37(4), 418–425 (1988)
Anderson, T., Barrett, P., Halliwell, D., Moulding, M.: Software fault tolerance: an evaluation. IEEE Trans. on Software Engineering SE-11(12), 1502–1510 (1985)
Anderson, T., Lee, P.: Fault Tolerance - Principles and Practice. Prentice-Hall, Englewood Cliffs (1981)
Arlat, J., Kanekawa, N., Amendola, A., Dufour, J.L., Hirao, Y., Profeta III, J.: Dependability of railway control systems. In: 16th IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-16), pp. 150–155. IEEE CS Press, Vienna (1996)
Arlat, J., Kanoun, K., Laprie, J.C.: Dependability modeling and evaluation of software fault-tolerant systems. IEEE Trans. on Computers 39(4), 504–513 (1990)
Avižienis, A.: The N-version approach to fault-tolerant systems. IEEE Trans. on Software Engineering 11(12), 1491–1501 (1985)
Avižienis, A., Chen, L.: On the implementation of N-version programming for software fault tolerance during execution. In: 1st IEEE-CS Int. Computer Software and Applications Conference (COMPSAC 1977), pp. 149–155. IEEE CS Press, Chicago (1977)
Avižienis, A., Kelly, J.: Fault-tolerance by design diversity: Concepts and experiments. Computer 17(8), 67–80 (1984)
Avižienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and terminology of dependable and secure computing. IEEE Trans. on Dependable and Secure Computing 1(1), 11–33 (2004)
Avižienis, A., Lyu, M., Schutz, W., Tso, K., Voges, U.: DEDIX 87 - a supervisory system for design diversity experiments at UCLA. In: Voges, U. (ed.) Software Diversity in Computerized Control Systems, pp. 129–168. Springer, Wien (1988)
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: 19th ACM Symp. on Operating Systems Principles (SOSP), pp. 164–177. ACM, New York (2003)
Bartlett, J., Gray, J., Horst, B.: Fault tolerance in Tandem computer systems. In: Avižienis, A., Kopetz, H., Laprie, J.C. (eds.) The Evolution of Fault-Tolerant Systems, pp. 55–76. Springer, Vienna (1987)
Beder, D., Randell, B., Romanovsky, A., Rubira, C.: On applying atomic actions and dependable software architectures for developing complex systems. In: 4th IEEE Int. Symp. on Object-Oriented Real-Time Distributed Computing, pp. 103–112. IEEE CS Press, Magdeburg (2001)
Bishop, P., Esp, D., Barnes, M., Humphreys, P., Dahl, G., Lahti, J., Yoshimura, S.: Project on diverse software - an experiment in software reliability. In: Safety of Computer Control Systems (SAFECOMP), pp. 153–158 (1985)
Brière, D., Traverse, P.: Airbus A320/A330/A340 electrical flight controls - a family of fault-tolerant systems. In: 23rd IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-23), pp. 616–623. IEEE CS Press, Toulouse (1993)
Castelli, V., Harper, R., Heidelberger, P., Hunter, S., Trivedi, K., Vaidyanathan, K., Zeggert, W.: Proactive management of software aging. IBM Journal of Research and Development 45(2), 311–332 (2001)
Crosby, P.B.: Cutting the cost of quality; the defect prevention workbook for managers. Industrial Education Institute, Boston (1967)
Crouzet, Y., Waeselynck, H., Lussier, B., Powell, D.: The SESAME experience: from assembly languages to declarative models. In: Mutation 2006 - The Second Workshop on Mutation Analysis, 17th IEEE Int. Symp. on Software Reliability Engineering (ISSRE 2006). IEEE, Raleigh (2006)
Deswarte, Y., Kanoun, K., Laprie, J.C.: Diversity against accidental and deliberate faults. In: Ammann, P., Barnes, B., Jajodia, S., Sibley, E. (eds.) Computer Security, Dependability and Assurance: From Needs to Solutions, pp. 171–182. IEEE CS Press, Los Alamitos (1999)
Duflot, L., Levillain, O., Morin, B.: ACPI: Design principles and concerns. In: Chen, L., Mitchell, C., Martin, A. (eds.) Trust 2009. LNCS, vol. 5471, pp. 14–28. Springer, Heidelberg (2009)
Dugan, J., Lyu, M.: Dependability modeling for fault-tolerant software and systems. In: Lyu, M. (ed.) Software Fault Tolerance, pp. 109–138. Wiley & Sons, Chichester (1995)
Eckhardt, D., Caglayan, A., Knight, J., Lee, L., McAllister, D., Vouk, M., Kelly, J.: An experimental evaluation of software redundancy as a strategy for improving reliability. IEEE Trans. on Software Engineering 17(7), 692–702 (1991)
Eckhardt, D., Lee, L.: A theoretical basis of multiversion software subject to coincident errors. IEEE Trans. on Software Engineering SE-11, 1511–1517 (1985)
Elmendorf, W.: Fault-tolerant programming. In: 2nd IEEE Int. Symp. on Fault Tolerant Computing (FTCS-2), pp. 79–83. IEEE CS Press, Newton (1972)
Ghallab, M., Laruelle, H.: Representation and control in IxTeT, a temporal planner. In: 2nd Int. Conf. on Artificial Intelligence Planning Systems (AIPS 1994), pp. 61–67. AIAA Press, Chicago (1994)
Goldberg, A., Havelund, K., McGann, C.: Runtime verification for autonomous spacecraft software. In: IEEE Aerospace Conference, pp. 507–516 (2005)
Gray, J.: Why do computers stop and what can be done about it? In: 5th Symp. on Reliability in Distributed Software and Database Systems, pp. 3–12. IEEE CS Press, Los Angeles (1986)
Grnarov, A., Arlat, J., Avižienis, A.: On the performance of software fault-tolerant strategies. In: 10th IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-10), pp. 251–253. IEEE CS Press, Kyoto (1980)
Hatton, L.: N-version design vs. one good version. IEEE Software 14(6), 71–76 (1997)
Hennebert, C., Guiho, G.: SACEM: A fault-tolerant system for train speed control. In: 23rd IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-23), pp. 624–628. IEEE CS Press, Toulouse (1993)
Howey, R., Long, D., Fox, M.: VAL: Automatic plan validation, continuous effects and mixed initiative planning using PDDL. In: 16th IEEE International Conference on Tools with Artificial Intelligence, pp. 294–301 (2004)
Huang, Y., Kintala, C., Kolettis, N., Fulton, N.: Software rejuvenation: Analysis, module and applications. In: 25th IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-25), pp. 381–390. IEEE CS Press, Pasadena (1995)
ISO/IEC-15408: Common criteria for information technology security evaluation
Jaeger, E., Hardin, T.: A few remarks about formal development of secure systems. In: 11th IEEE High Assurance Systems Engineering Symposium (HASE), pp. 165–174 (2008)
Jalote, P., Huang, Y., Kintala, C.: A framework for understanding and handling transient software failures. In: 2nd ISSAT Int. Conf. Reliability and Quality in Design, Orlando, FL, USA, pp. 231–237 (1995)
Kanoun, K.: Real-world design diversity: a case study on cost. IEEE Software 18(4), 29–33 (2001)
Kelly, J., Eckhardt Jr., D.E., Vouk, M., McAllister, D., Caglayan, A.: A large scale second generation experiment in multi-version software: description and early results. In: 18th IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-18), pp. 9–14. IEEE CS Press, Los Alamitos (1988)
Khatib, L., Muscettola, N., Havelund, K.: Mapping temporal planning constraints into timed automata. In: 8th Int. Symp. on Temporal Representation and Reasoning (TIME 2001), pp. 21–27. IEEE, Cividale Del Friuli (2001)
Kim, K., Welch, H.: Distributed execution of recovery blocks: an approach to uniform treatment of hardware and software faults in real-time applications. IEEE Trans. on Computers 38(5), 626–636 (1989)
Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: Formal verification of an OS kernel. In: 22nd Symp. on Operating Systems Principles (SOSP), pp. 207–220. ACM, Big Sky (2009)
Knight, J., Leveson, N.: An experimental evaluation of the assumption of independence in multi-version programming. IEEE Trans. on Software Engineering SE-12(1), 96–109 (1986)
Laarouchi, Y., Deswarte, Y., Powell, D., Arlat, J., de Nadai, E.: Connecting commercial computers to avionics systems. In: IEEE/AIAA 28th Digital Avionics Systems Conference (DASC 2009), Orlando, FL, USA, pp. 6.D.1–6.D.9 (2009)
Lacombe, E., Nicomette, V., Deswarte, Y.: Enforcing kernel constraints by hardware-assisted virtualization. Journal in Computer Virology 7(1), 1–21 (2011)
Laprie, J.C.: Dependability: Basic concepts and associated terminology. Dependability : Basic Concepts and Terminology LAAS-CNRS, 7 Ave. Colonel Roche, 31077 Toulouse, France, p. 33 (1990)
Laprie, J.C., Arlat, J., Béounes, C., Kanoun, K.: Definition and analysis of hardware-and-software fault-tolerant architectures. Computer 23(7), 39–51 (1990)
Laprie, J.C., Arlat, J., Béounes, C., Kanoun, K.: Architectural issues in software fault tolerance. In: Lyu, M. (ed.) Software Fault Tolerance, pp. 47–78. Wiley & Sons, Chichester (1995)
Lardner, D.: Babbage’s calculating engine. Edinburgh Review 59, 263–327 (1834)
Leveson, N., Cha, S., Knight, J., Shimeall, T.: The use of self checks and voting in software error detection: an empirical study. IEEE Transactions on Software Engineering 16(4), 432–443 (1990)
Littlewood, B., Popov, P., Strigini, L.: Assessment of the reliability of fault-tolerant software: A bayesian approach. In: Koornneef, F., van der Meulen, M.J.P. (eds.) SAFECOMP 2000. LNCS, vol. 1943, pp. 294–308. Springer, Heidelberg (2000)
Lone Sang, F., Lacombe, É., Nicomette, V., Deswarte, Y.: Exploiting an I/OMMU vulnerability. In: 5th Int’l Conf. on Malicious and Unwanted Software (MALWARE), pp. 7–14 (2010)
Lussier, B., Gallien, M., Guiochet, J., Ingrand, F., Killijian, M.O., Powell, D.: Fault tolerant planning for critical robots. In: 37th Annual IEEE/IFIP Int. Conf. on Dependable Systems and Networks (DSN 2007), pp. 144–153. IEEE CS Press, Edinburgh (2007)
Lussier, B., Gallien, M., Guiochet, J., Ingrand, F., Killijian, M.O., Powell, D.: Planning with diversified models for fault-tolerant robots. In: 17th Int. Conf. on Automated Planning and Scheduling (ICAPS), pp. 216–223. AAAI, Providence (2007)
Lussier, B., Lampe, A., Chatila, R., Guiochet, J., Ingrand, F., Killijian, M.O., Powell, D.: Fault tolerance in autonomous systems: How and how much? In: 4th IARP - IEEE/RAS - EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments, Nagoya, Japan (2005)
van der Meulen, M.J.P., Revilla, M.A.: The effectiveness of software diversity in a large population of programs. IEEE Trans. on Software Engineering 34(6), 753–764 (2008)
Migneault, G.E.: The cost of software fault tolerance. In: AGARD Symposium on Software Avionics, The Hague, The Netherlands, pp. 37/1–37/8 (1992)
Mongardi, G.: Dependable computing for railway control systems. In: 3rd IFIP Working Conf. on Dependable Computing for Critical Applications (DCCA-3), Palermo, Italy, pp. 255–273 (1993)
Muscettola, N., Dorais, G., Fry, C., Levinson, R., Plaunt, C.: IDEA: Planning at the core of autonomous reactive agents. In: 3rd Int. NASA Workshop on Planning and Scheduling for Space, Houston, TX, USA (2002)
Nguyen-Tuong, A., Evans, D., Knight, J., Cox, B., Davidson, J.: Security through redundant data diversity. In: IEEE/IFIP Int. Conf. on Dependable Systems and Networks, Anchorage, Alaska, USA, pp. 187–196 (2008)
Nicola, V., Goyal, A.: Modeling of correlated failures and community error recovery in multiversion software. IEEE Trans. on Software Engineering 16(3), 350–359 (1990)
Oh, N., Mitra, S., McCluskey, E.: ED4I: Error detection by diverse data and duplicated instructions. IEEE Trans. on Computers 51(2), 180–199 (2002)
Penix, J., Pecheur, C., Havelund, K.: Using model checking to validate AI planner domain models. In: 23rd Annual Software Engineering Workshop, NASA Goddard (1998)
Popov, P., Strigini, L.: Assessing asymmetric fault-tolerant software. In: 21st Int. Symp. on Software Reliability Engineering (ISSRE), pp. 41–50. IEEE CS Press, Los Alamitos (2010)
Randell, B.: System structure for software fault tolerance. IEEE Trans. on Software Engineering SE-1(2), 220–232 (1975)
Randell, B., Romanovsky, A., Rubira, C., Stroud, R., Wu, Z., Xu, J.: From recovery blocks to coordinated atomic actions. In: Randell, B., Laprie, J.C., Kopetz, H., Littlewood, B. (eds.) Predictably Dependable Computer Systems, pp. 87–101. Springer, Heidelberg (1995)
Scott, R., Gault, J., McAllister, D.: Fault-tolerant software reliability modeling. IEEE Trans. on Software Engineering SE-13(5), 582–592 (1987)
Smith, J., Nair, R.: Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann, San Francisco (2005)
Sullivan, G.F., Masson, G.M.: Certification trails for data structures. In: 21st IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-21), pp. 240–247. IEEE CS Press, Montreal (1991)
Tai, A., Meyer, J., Avižienis, A.: Performability enhancement of fault-tolerant software. IEEE Trans. on Reliability 42(2), 227–237 (1993)
Tomek, L., Muppala, J., Trivedi, K.: Analyses using reward nets. In: Lyu, M. (ed.) Software Fault Tolerance, pp. 139–165. Wiley & Sons, Chichester (1995)
Traverse, P., Lacaze, I., Souyris, J.: Airbus fly-by-wire: A total approach to dependability. In: Jacquart, J. (ed.) Building the Information Society, 18th IFIP World Computer Congress, pp. 191–212. Kluwer Academic Publishers, Dordrecht (2004)
Varian, M.: VM and the VM community: Past, present, and future (1997), http://web.me.com/melinda.varian/
Voges, U.: Software Diversity in Computerized Control Systems, vol. 2. Springer, Heidelberg (1988)
Xia, C., Lyu, M.: An empirical study on reliability modeling for diverse software systems. In: 15th Int. Symp. on Software Reliability Engineering (ISSRE), pp. 125–136 (2004)
Xia, C., Lyu, M., Vouk, M.: An experimental evaluation on reliability features of N-version programming. In: 16th Int. Symp. on Software Reliability Engineering, ISSRE, pp. 10pp.– 170 (2005)
Xu, J.: The t/(n − 1)-diagnosability and its applications to fault tolerance. In: 21st IEEE Int. Symp. on Fault-Tolerant Computing (FTCS-21), pp. 496–503. IEEE CS Press, Montreal (1991)
Yeh, Y.: Dependability of the 777 primary flight control system. In: Iyer, R., Morganti, M., Fuchs, W.K., Gligor, V. (eds.) Dependable Computing for Critical Applications (DCCA-5), pp. 3–17. IEEE CS Press, Los Alamitos (1998)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Powell, D., Arlat, J., Deswarte, Y., Kanoun, K. (2011). Tolerance of Design Faults. In: Jones, C.B., Lloyd, J.L. (eds) Dependable and Historic Computing. Lecture Notes in Computer Science, vol 6875. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24541-1_32
DOI: https://doi.org/10.1007/978-3-642-24541-1_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24540-4
Online ISBN: 978-3-642-24541-1
eBook Packages: Computer Science, Computer Science (R0)