Software-Based Fault Detection and Recovery for Cyber-Physical Systems
Cyber-physical systems demand higher levels of reliability for two reasons. First, they are more vulnerable to faults than traditional computer-based systems because they operate under harsh working conditions. For instance, sensors and actuators may not always obey their specifications due to wear-out or radiation. Second, even a minor fault in a cyber-physical system may lead to serious consequences, since such systems operate under minimal supervision by human operators. In this paper we propose a software framework for fault detection and recovery in cyber-physical systems, called Fault Detection and Recovery for CPS (FDR-CPS). FDR-CPS focuses on specific types of faults related to sensors and actuators, which appear to be a likely cause of critical system failures such as system hangs and crashes. We divide such critical failures into four classes and then present the design and implementation of FDR-CPS, which can successfully handle all four classes. We also describe a case study with a quadrotor to demonstrate how FDR-CPS can be applied in a real-world application.
Keywords: Cyber-physical system, Reliability, Fault detection, Recovery
This work was supported partly by Mid-career Researcher Program through NRF (National Research Foundation) grant NRF-2011-0015997 funded by the MEST (Ministry of Education, Science and Technology), partly by the IT R&D Program of MKE/KEIT [10035708, “The Development of CPS (Cyber-Physical Systems) Core Technologies for High Confidential Autonomic Control Software”], partly by Seoul Creative Human Development Program (HM120006), and partly by the MKE (The Ministry of Knowledge Economy), Korea, under the CITRC (Convergence Information Technology Research Center) support program (NIPA-2013-H0401-13-1009) supervised by the NIPA (National IT Industry Promotion Agency).