Transient Processor/Bus Fault Tolerance for Embedded Systems

With hybrid redundancy and data fragmentation
  • Alain Girault
  • Hamoudi Kalla
  • Yves Sorel
Part of the IFIP International Federation for Information Processing book series (IFIPAICT, volume 225)

Abstract

We propose an approach for building fault-tolerant distributed real-time embedded systems. From a given system description (application algorithm and architecture) and a given fault hypothesis (type and number of faults to be tolerated), we automatically generate a static fault-tolerant multiprocessor schedule of the algorithm components on the target architecture. This schedule minimizes the schedule length and tolerates transient faults of both processors and communication media. Our approach is dedicated to heterogeneous architectures with multiple processors linked by several shared buses. It is based on hybrid redundancy and data fragmentation strategies, which allow fast fault detection and handling. Because this scheduling problem is NP-hard, we rely on a heuristic algorithm to efficiently obtain an approximate solution. Our simulation results show that our approach generally reduces the schedule length overhead.
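To make the idea of a static fault-tolerant schedule more concrete, the following is a minimal sketch of a greedy list scheduler that uses plain active replication. It is not the heuristic of the paper: the hybrid redundancy and data fragmentation strategies are not reproduced, and communication costs on the buses are ignored. The function name `replicated_list_schedule`, the parameter `npf` (number of processor faults to tolerate), and the input layout are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Replica = Tuple[str, float, float]  # (processor, start time, finish time)


def replicated_list_schedule(
    tasks: List[str],                         # tasks, already in topological order
    preds: Dict[str, List[str]],              # task -> list of predecessor tasks
    exec_time: Dict[str, Dict[str, float]],   # task -> processor -> execution time
    npf: int,                                 # hypothetical: processor faults to tolerate
) -> Dict[str, List[Replica]]:
    """Place npf + 1 replicas of every task on distinct processors.

    Illustrative assumptions: zero communication cost, and a replica may
    start only after every replica of each predecessor has finished
    (a pessimistic rule that covers the case where only the slowest
    predecessor replica survives).
    """
    processors = sorted({p for times in exec_time.values() for p in times})
    free_at = {p: 0.0 for p in processors}    # time at which each processor is idle
    schedule: Dict[str, List[Replica]] = {}

    for task in tasks:
        # Earliest time at which all inputs are guaranteed to be available.
        ready = max(
            (end for q in preds.get(task, []) for (_, _, end) in schedule[q]),
            default=0.0,
        )
        # Candidate placements on every processor able to run the task
        # (heterogeneous execution times), sorted by finish time.
        candidates = sorted(
            (max(ready, free_at[p]) + exec_time[task][p], p, max(ready, free_at[p]))
            for p in processors
            if p in exec_time[task]
        )
        if len(candidates) < npf + 1:
            raise ValueError(f"not enough processors to replicate {task!r}")

        # Keep the npf + 1 placements with the earliest finish times.
        schedule[task] = []
        for finish, proc, start in candidates[: npf + 1]:
            schedule[task].append((proc, start, finish))
            free_at[proc] = finish

    return schedule


if __name__ == "__main__":
    demo = replicated_list_schedule(
        tasks=["A", "B"],
        preds={"B": ["A"]},
        exec_time={
            "A": {"P1": 2.0, "P2": 3.0, "P3": 4.0},
            "B": {"P1": 1.0, "P2": 1.0, "P3": 5.0},
        },
        npf=1,
    )
    for name, replicas in demo.items():
        print(name, replicas)
```

Because each task runs on `npf + 1` distinct processors, a result is still produced when up to `npf` of them fail. The price is a longer schedule, which is the overhead that a smarter redundancy scheme would try to reduce.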

Keywords

real-time embedded systems; safety-critical systems; transient faults; scheduling heuristics; hybrid redundancy; data fragmentation; heterogeneous architectures

Copyright information

© International Federation for Information Processing 2006

Authors and Affiliations

  • Alain Girault (1)
  • Hamoudi Kalla (2)
  • Yves Sorel (3)
  1. INRIA Rhône-Alpes, Saint-Ismier Cedex, France
  2. IRISA, Campus Universitaire de Beaulieu, Rennes Cedex, France
  3. INRIA Rocquencourt, Le Chesnay Cedex, France
