Self-Checking and Self-Exercising Design for Hierarchic Long-Life Fault-Tolerant Systems

  • Chapter
Foundations of Dependable Computing

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 285)

Abstract

This research deals with fault-tolerant computers capable of operating for extended periods without external maintenance. Conventional fault-tolerance techniques such as majority voting are unsuitable for these applications because performance is too low, power consumption is too high, and an excessive number of spares must be included to keep all of the replicated systems working over an extended life. The preferred design approach is to operate as many different computations as possible on single computers, thus maximizing the amount of processing available from limited hardware resources. Fault tolerance is implemented in a hierarchic fashion: fault recovery is done either locally within an afflicted computer or, if that is unsuccessful, by the other working computers when one fails. Concurrent error detection is required in the computers making up these systems, since errors must be quickly detected and isolated to allow recovery to begin.

This chapter discusses ways of implementing concurrent error detection (i.e., self-checking) and, in addition, providing self-exercising capabilities that can rapidly expose dormant faults and latent errors. The fundamentals of self-checking design are presented along with an example: the design of a self-checking, self-exercising memory system. A new methodology for implementing self-checking in asynchronous subsystems is discussed, along with error simulation results that examine its effectiveness.

This work was supported by the Office of Naval Research, grant N00014-91-J-1009.

Copyright information

© 1994 Kluwer Academic Publishers

About this chapter

Cite this chapter

Rennels, D., Kim, H. (1994). Self-Checking and Self-Exercising Design for Hierarchic Long-Life Fault-Tolerant Systems. In: Koob, G.M., Lau, C.G. (eds) Foundations of Dependable Computing. The Kluwer International Series in Engineering and Computer Science, vol 285. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-28002-8_1

  • DOI: https://doi.org/10.1007/978-0-585-28002-8_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-9486-0

  • Online ISBN: 978-0-585-28002-8

  • eBook Packages: Springer Book Archive
