The Evolution of Fault Tolerant Computing at the University of Illinois

Abraham, J. A.; Metze, G.; Iyer, R. K.; Patel, J. H.

doi:10.1007/978-3-7091-8871-2_11

Part of the book series: Dependable Computing and Fault-Tolerant Systems (DEPENDABLECOMP, volume 1)

Abstract

The University of Illinois has been active in research in the fault-tolerant computing field for over 25 years. Fundamental ideas have been proposed and major contributions made by researchers at the University of Illinois in the areas of testing and diagnosis, concurrent error detection, and fault tolerance. This paper traces the origins of these ideas and their development within the University of Illinois, as well as their influence upon research at other institutions, and outlines current directions of research.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois, Urbana, Illinois, 61801, USA
J. A. Abraham, G. Metze, R. K. Iyer & J. H. Patel

Editor information

Editors and Affiliations

UCLA, Los Angeles, Calif., USA
Algirdas Avižienis
Technical University, Wien, Austria
Hermann Kopetz
LAAS, Toulouse, France
Jean-Claude Laprie

About this paper

Cite this paper

Abraham, J.A., Metze, G., Iyer, R.K., Patel, J.H. (1987). The Evolution of Fault Tolerant Computing at the University of Illinois. In: Avižienis, A., Kopetz, H., Laprie, JC. (eds) The Evolution of Fault-Tolerant Computing. Dependable Computing and Fault-Tolerant Systems, vol 1. Springer, Vienna. https://doi.org/10.1007/978-3-7091-8871-2_11

DOI: https://doi.org/10.1007/978-3-7091-8871-2_11
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-8873-6
Online ISBN: 978-3-7091-8871-2
eBook Packages: Springer Book Archive
