Self-Checking and Self-Exercising Design for Hierarchic Long-Life Fault-Tolerant Systems

Rennels, David; Kim, Hyeongil

doi:10.1007/978-0-585-28002-8_1


Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 285)


Abstract

This research deals with fault-tolerant computers capable of operating for extended periods without external maintenance. Conventional fault-tolerance techniques such as majority voting are unsuitable for these applications because performance is too low, power consumption is too high, and an excessive number of spares must be included to keep all of the replicated systems working over an extended life. The preferred design approach is to operate as many different computations as possible on single computers, thus maximizing the amount of processing available from limited hardware resources. Fault tolerance is implemented in a hierarchic fashion. Fault recovery is either done locally within an afflicted computer or, if that is unsuccessful, by the other working computers when one fails. Concurrent error detection is required in the computers making up these systems, since errors must be quickly detected and isolated to allow recovery to begin.

This chapter discusses ways of implementing concurrent error detection (i.e., self-checking) and, in addition, providing self-exercising capabilities that can rapidly expose dormant faults and latent errors. The fundamentals of self-checking design are presented along with an example: the design of a self-checking, self-exercising memory system. A new methodology for implementing self-checking in asynchronous subsystems is discussed, along with error simulation results that examine its effectiveness.
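As a rough illustration of the two ideas named in the abstract, the following Python sketch models a memory that checks every read (concurrent error detection) and can sweep all addresses to expose latent errors (self-exercising). It is not the chapter's design: the use of a single parity bit per word, the class `SelfCheckingMemory`, and the methods `exercise` and `inject_bit_flip` are all assumptions made for this example.

```python
from typing import List


def parity(word: int) -> int:
    """Return the even-parity bit of a non-negative integer word."""
    return bin(word).count("1") % 2


class SelfCheckingMemory:
    """Toy model (hypothetical): each word is stored with one parity bit."""

    def __init__(self, size: int) -> None:
        self.words = [0] * size
        self.parity_bits = [0] * size

    def write(self, addr: int, value: int) -> None:
        # Store the data word together with its check bit.
        self.words[addr] = value
        self.parity_bits[addr] = parity(value)

    def read(self, addr: int) -> int:
        # Concurrent check: every read recomputes parity and compares it
        # with the stored check bit before the value is used.
        if parity(self.words[addr]) != self.parity_bits[addr]:
            raise RuntimeError(f"parity error at address {addr}")
        return self.words[addr]

    def exercise(self) -> List[int]:
        # Self-exercising pass: read every address so that latent errors
        # in rarely used words are exposed; return the failing addresses.
        bad = []
        for addr in range(len(self.words)):
            try:
                self.read(addr)
            except RuntimeError:
                bad.append(addr)
        return bad

    def inject_bit_flip(self, addr: int, bit: int) -> None:
        # Test helper (assumption): simulate a fault by flipping one data bit.
        self.words[addr] ^= 1 << bit
```

A single parity bit detects only an odd number of flipped bits per word, so a practical memory would likely use a stronger code; the sketch only shows where the check and the exercising sweep sit relative to normal reads and writes.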

This work was supported by the Office of Naval Research, grant N00014-91-J-1009.



Author information

Authors and Affiliations

Computer Science Department, University of California at Los Angeles, USA
Hyeongil Kim


Editor information

Editors and Affiliations

Office of Naval Research, USA
Gary M. Koob & Clifford G. Lau


About this chapter

Cite this chapter

Rennels, D., Kim, H. (1994). Self-Checking and Self-Exercising Design for Hierarchic Long-Life Fault-Tolerant Systems. In: Koob, G.M., Lau, C.G. (eds) Foundations of Dependable Computing. The Kluwer International Series in Engineering and Computer Science, vol 285. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-28002-8_1


DOI: https://doi.org/10.1007/978-0-585-28002-8_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-9486-0
Online ISBN: 978-0-585-28002-8
eBook Packages: Springer Book Archive
