Unreliable Error Correction in Dynamic Systems

  • Christoforos N. Hadjicostis
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 660)

Abstract

This chapter focuses on constructing reliable dynamic systems exclusively out of unreliable components, including unreliable components in the error-correcting mechanism. At each time step, a particular component can suffer a transient fault with a probability that is bounded by a constant. Faults between different components and between different time steps are treated as independent. Essentially, the chapter considers an extension of the techniques described in Chapter 2 to a dynamic system setting. Since dynamic systems evolve in time according to their internal state, the major task is to effectively deal with the effects of error propagation, i.e., the effects of errors that corrupt the system state.
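
The fault model and the error-propagation problem described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the chapter's construction: the toy 3-bit counter, the use of three replicas with one voter per replica, and the names step, maybe_corrupt, majority, and simulate are all hypothetical choices made for illustration. Every state update and every voter faults independently with probability p at each time step, so the error-correcting mechanism is itself unreliable.

```python
import random


def step(state, u):
    """Fault-free next-state function of a toy 3-bit counter (assumed example system)."""
    return (state + u) % 8


def maybe_corrupt(value, p, rng):
    """Transient fault: with probability p, flip one randomly chosen bit of a 3-bit value."""
    if rng.random() < p:
        return value ^ (1 << rng.randrange(3))
    return value


def majority(values):
    """Return the most frequent value; if all values differ, an arbitrary one is returned."""
    return max(set(values), key=values.count)


def simulate(inputs, p, seed, protected):
    """Count the time steps at which the read-out state differs from the fault-free state.

    protected=False: a single unreliable copy of the system, no correction.
    protected=True:  three unreliable copies, each followed by its own unreliable voter.
    """
    rng = random.Random(seed)
    reference = 0  # fault-free trajectory, used only for scoring
    replicas = [0, 0, 0] if protected else [0]
    wrong_steps = 0
    for u in inputs:
        reference = step(reference, u)
        # State update: each replica's transition may fault independently.
        replicas = [maybe_corrupt(step(s, u), p, rng) for s in replicas]
        if protected:
            # Error correction: one voter per replica, and each voter may fault too.
            snapshot = list(replicas)
            replicas = [maybe_corrupt(majority(snapshot), p, rng) for _ in snapshot]
        # The overall state is read out as the majority of the replicas.
        if majority(replicas) != reference:
            wrong_steps += 1
    return wrong_steps


if __name__ == "__main__":
    inputs = [1] * 200
    trials = range(100)
    unprotected = sum(simulate(inputs, 0.01, s, protected=False) for s in trials)
    protected = sum(simulate(inputs, 0.01, s, protected=True) for s in trials)
    print("wrong steps without correction:", unprotected)
    print("wrong steps with unreliable voting:", protected)
```

In the unprotected run, a single transient fault shifts the counter and the offset persists at later steps, which is the error propagation the abstract refers to. In the protected run, an isolated fault in one replica or one voter is outvoted at the next correction step, but two coinciding faults can still drive all replicas to a common wrong state. With fixed, finite redundancy the sketch therefore only lowers the per-step failure probability; it does not remove failures.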

Keywords

Fault Tolerance, Linear Code, LDPC Code, Parity Check, Propagation Failure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media New York 2002

Authors and Affiliations

  • Christoforos N. Hadjicostis, Coordinated Science Laboratory and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA
