Experimental Research in Reliable Computing at Carnegie Mellon University

  • Conference paper
The Evolution of Fault-Tolerant Computing

Part of the book series: Dependable Computing and Fault-Tolerant Systems (DEPENDABLECOMP, volume 1)

Abstract

In 1945 the Carnegie Plan for higher education was developed. The basic philosophy of the plan is “learning by doing”. The strong emphasis on experimental research at Carnegie Mellon University (CMU) is an example of the Carnegie Plan in operation. Research in reliable computing at Carnegie Mellon University has spanned three decades. In the early 1960s, Westinghouse Corporation in Pittsburgh had an active research program in the use of redundancy to enhance system reliability. William Mann, who had been associated with Carnegie Mellon University, was one of those researchers. In 1962, a symposium on redundancy techniques was held in Washington, D.C.; it led to the first comprehensive book on redundancy and reliability. Bill Mann was one of the coauthors of that book [73]. One of the papers in that volume, on adaptive voting, was written by CMU’s Professor William H. Pierce [41]. Later Professor Pierce published one of the first textbooks on redundancy [42].

References

  1. Accetta, M., R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young, Mach: A New Kernel Foundation for UNIX Development, In Proceedings of Summer Usenix, Atlanta, July, 1986.

  2. Barbacci, M., Instruction Set Processor Specifications (ISPS): The Notation and Its Application, In IEEE Transactions on Computers, C-30, nr. 1, pp. 24–40, January, 1981.

  3. Barbacci, M., G. Barnes, R. Cattell, and D. P. Siewiorek, The ISPS Computer Description Language, Carnegie Mellon University, Department of Computer Science Technical Report, 1977.

  4. Bell, C. G. and A. Newell, Computer Structures: Readings and Examples, McGraw-Hill, New York, 1971.

  5. Bhandarkar, D. P., Analytic Models for Memory Interference in Multiprocess Computer Systems, PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, September, 1972.

  6. Bloch, J. J., D. S. Daniels, and A. Z. Spector, Weighted Voting for Directories: A Comprehensive Study, Technical Report CMU-CS-84–114, April, 1984.

  7. Castillo, X., A Compatible Hardware/Software Reliability Prediction Model, PhD thesis, Carnegie Mellon University, July 1981, Also Department of Computer Science Technical Report.

  8. Castillo, X. and D. P. Siewiorek, A Workload Dependent Software Reliability Prediction Model, In 12th International Fault Tolerant Computing Symposium, June, 1982.

  9. Castillo, X., S. R. McConnel, and D. P. Siewiorek, Derivation and Calibration of a Transient Error Reliability Model, IEEE Transactions on Computers, C-31, nr. 7, pp. 658–671, July 1982

  10. Clune, E., Z. Segall and D. P. Siewiorek, Validation of Fault-Free Behavior of a Reliable Multiprocessor System, FTMP: A Case Study, In International Conference on Automation and Computer Control, San Diego, CA, June 6–8, 1984

  11. Clune, E., Z. Segall, and D. Siewiorek, Fault-Free Behavior of Reliable Multiprocessor Systems: FTMP Experiments in AIR-LAB, NASA Contractor Report 177967, Grant NAG1–190, Carnegie Mellon University, August 1985.

  12. Czeck, E. W., F. E. Feather, A. M. Grizzaffi, G. B. Finelli, Z. Z. Segall, and D. P. Siewiorek, Fault-Free Performance Validation of Avionic Multiprocessors, In 7th Digital Avionic Systems Conference, October 1986, Dallas, Texas

  13. Daniels, D. S. and A. Z. Spector, An Algorithm for Replicated Directories, In Proceedings of the Second Annual Symposium on Principles of Distributed Computing, pp. 104–113, August 1983, Montreal, Canada.

  14. Eifert, J. and J. P. Shen, Processor Monitoring Using Asynchronous Signatured Instruction Streams, In Proceedings of 14th International Fault-Tolerant Computing Symposium, June 1984.

  15. Elkind, S. A., LAMBDA User Manual, Carnegie Mellon University, 1983.

  16. Elkind, S. A., Approaches to Reliable Systems Design, PhD thesis, Carnegie Mellon University, May 1985

  17. Elkind, S. and D. P. Siewiorek, Reliability and Performance of Error-Correcting Memory and Register Arrays, In IEEE Transactions on Computers, vol. 29, nr. 10, pp. 920–927, October 1980.

  18. Feather, F., D. Siewiorek, and Z. Segall, Validation of a Fault-Tolerant Multiprocessor: Baseline Experiments and Workload Implementation, Technical Report CMU-CS-85–145, Carnegie Mellon University, July 1985.

  19. Feather, Frank, Daniel P. Siewiorek, and Zary Segall, Validation of a Fault-Tolerant Multiprocessor: Baseline Experiments and Workload Implementation, Technical Report CMU-CS-85–127, Carnegie Mellon University, July 1985.

  20. Ferguson, F. J. and J. P. Shen, The Design of Two Easily-Testable Array Multipliers, In Proceedings of the 6th Computer Arithmetic Conference, June 1983.

  21. Gehringer, E. F., D. P. Siewiorek, and Z. Segall, Parallel Processing: The Cm* Experience, Digital Press, Bedford MA, 1987.

  22. Guise, D., D. P. Siewiorek, and W. P. Birmingham, DEMETER: A Design Methodology and Environment, In Proceedings of the IEEE International Conference on Computer Design/VLSI in Computers, 1983.

  23. Kini, V., Automatic Generation of Reliability Functions for Processor-Memory-Switch Structures, PhD thesis, Carnegie Mellon University, Department of Electrical Engineering, February 1981.

  24. Kini, V. and D. P. Siewiorek, Automatic Generation of Symbolic Reliability Functions for Processor-Memory-Switch Structures, IEEE Transactions on Computers, vol. C-31, nr. 8, pp. 752–771, August 1982.

  25. Lai, K. W., Functional Testing of Digital Systems, PhD thesis, Carnegie Mellon University, Department of Computer Science, 1981

  26. Lai, K. W. and D. P. Siewiorek, Functional Testing of Digital Systems, In 20th Design Automation Conference Proceedings, Miami Beach, FL, June 27–29 1983.

  27. Lee, D. C. H. and J. P. Shen, Easily-Testable (N,K) Shuffle-Exchange Networks, In Proceedings of International Conference on Parallel Processing, August 1983.

  28. Lin, T-T. Y. and D. P. Siewiorek, Architectural Issues for On-Line Diagnostics in a Distributed Environment, In IEEE International Conference on Computer Design, Port Chester NY, October 1986.

  29. Maly, W., F. J. Ferguson, and J. P. Shen, Systematic Characterization of Physical Defects for Fault Analysis of MOS IC Cells, In Proceedings of International Test Conference, October 1984.

  30. Mashburn, H. H., The C.mmp/Hydra Project: An Architectural Overview, In Siewiorek, D. P., C. G. Bell, and A. Newell, Computer Structures: Principles and Examples, pp. 350–370, McGraw-Hill, New York 1982.

  31. Maxion, R. A., Distributed Diagnostic Performance Reporting and Analysis, In IEEE International Conference on Computer Design, Port Chester NY, October 1986.

  32. Maxion, R. A., Toward Fault-Tolerant User Interfaces, In Proceedings of the 5th IFAC International Conference on Achieving Safe Real-Time Computing Systems (SAFECOMP-86), Sarlat, France, October 1986.

  33. Maxion, R. A., Human and Machine Diagnosis of Computer Hardware Faults, IEEE Computer Society Workshop on Reliability of Local Area Networks, South Padre Island, Texas, February 1982.

  34. McConnel, S. R., Analysis and Modeling of Transient Errors in Digital Computers, PhD thesis, Carnegie Mellon University, Department of Electrical Engineering, June 1981, Also Department of Computer Science Technical Report.

  35. McConnel S. R. and D. P. Siewiorek, The CMU Voter Chip, Technical Report Carnegie Mellon University, Department of Computer Science, 1980

  36. McConnel, S. R. and D. P. Siewiorek, Synchronization and Voting, In IEEE Transactions on Computers, vol. C-30, nr. 2, pp. 161–164, February 1981.

  37. McConnel, S. R., D. P. Siewiorek and M. M. Tsao, Transient Error Data Analysis, Technical Report, Carnegie Mellon University, Department of Computer Science, May 1979

  38. NASA-Langley Research Center, Validation Methods for Fault-Tolerant Avionics and Control Systems-Working Group Meeting I, NASA Conference Publication 2114, Research Triangle Institute, 1979.

  39. NASA-Langley Research Center, Validation Methods for Fault-Tolerant Avionics and Control Systems-Working Group Meeting II, NASA Conference Publication 2130, Research Triangle Institute, 1979.

  40. Perq System Overview, March Edition, Perq Systems Corporation, Pittsburgh, Pennsylvania, 1984.

  41. Pierce, W. H., Adaptive Vote-Takers Improve the Use of Redundancy, In Wilcox R. H. and W. C. Mann, Redundancy Techniques for Computing Systems, pp. 229–250, Spartan Books, Washington, D. C. 1962.

  42. Pierce, W. H., Failure Tolerant Design, Academic Press, New York 1965.

  43. Rashid, R. and G. G. Robertson, Accent: A Communication-Oriented Network Operating System Kernel, Computer Science Department Technical Report, Carnegie Mellon University, 1981.

  44. Robinson, S. H. and J. P. Shen, Switch-Level Automatic Test Generation for CMOS Circuits, In Proceedings of International Conference on Computer-Aided Design, November 1985.

  45. Schuette, M. A., J. P. Shen, D. P. Siewiorek and Y. X. Zhu, An Experimental Evaluation of Two Concurrent Error Detection Approaches, In Proceedings of 16th International Fault-Tolerant Computing Symposium, July 1986.

  46. Schwarz, P. M., Transactions on Typed Objects, PhD thesis, Computer Science Department, Carnegie Mellon University, December 1984, Available as Technical Report CMU-CS-84–166, Carnegie Mellon University

  47. Shen, J. P. and F. J. Ferguson, Easily-Testable Array Multipliers, In Proceedings of 13th International Fault-Tolerant Computing Symposium, June 1983.

  48. Shen, J. P. and F. J. Ferguson, The Design of Easily-Testable VLSI Array Multipliers, In IEEE Transactions on Computers, June 1984.

  49. Shen, J. P. and M. A. Schuette, On-Line Self-Monitoring Using Signatured Instruction Streams, In Proceedings of International Test Conference, October 1983.

  50. Shen, J. P. and M. A. Schuette, Processor Control Flow Monitoring Using Signatured Instruction Streams, IEEE Transactions on Computers, 1986.

  51. Shen, J. P. and S. P. Tomas, A Roving Monitoring Processor for Detection of Control Flow Errors in Multiple Processor Systems, In Microprocessing and Microprogramming: The Euromicro Journal, 1986.

  52. Shen, J. P., W. Maly, and F. J. Ferguson, Inductive Fault Analysis of MOS Integrated Circuits, In IEEE Design and Test of Computers, December 1985.

  53. Shombert, L., The C.vmp Statistics Board Experiment, Master’s thesis, Carnegie Mellon University, Department of Electrical Engineering, 1981.

  54. Shombert, L. A., Using Redundancy for Testable and Repairable Systolic Arrays, PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, 1985.

  55. Siewiorek, D. P., Architecture of Fault-Tolerant Computers, In Computer, vol. 17, nr. 8, pp. 9–18, August 1984.

  56. Siewiorek, D. P., Architecture of Fault-Tolerant Computers, In D. K. Pradhan, Fault-Tolerant Computing: Theory and Techniques, Vol. II, pp. 417–466, Prentice-Hall, Englewood Cliffs, N. J., 1986.

  57. Siewiorek, D. P. and K. W. Lai, Testing of Digital Systems, In Proceedings of the IEEE, pp. 1321–1333, October 1981.

  58. Siewiorek, D. P. and S. R. McConnel, C.vmp: The Implementation, Performance, and Reliability of a Fault-Tolerant Multiprocessor, In Proceedings of the Third US-Japan Computer Conference, October 1978.

  59. Siewiorek, D. P. and R. S. Swarz, The Theory and Practice of Reliable System Design, Digital Press, Bedford MA, 1982.

  60. Siewiorek, D., V. Kini, R. Joobbani, and H. Bellis, A Case Study of C.mmp, Cm* and C.vmp: Part II. Predicting and Calibrating Reliability of Multiprocessor Systems, In Proceedings of the IEEE, vol. 66, nr. 10, pp. 1200–1220, October 1978.

  61. Siewiorek, D. P., C. G. Bell, R. C. Chen, S. H. Fuller, J. Grason, and S. Rege, The Architecture and Applications of Computer Modules: A Set of Components for Digital Systems Design, In Proceedings of the 1973 COMPCON Conference, pp. 177–180, San Francisco, CA, February 1973.

  62. Siewiorek, D. P., V. Kini, H. Mashburn, S. McConnel and M. Tsao, A Case Study of C.mmp, Cm* and C.vmp: Part I. Experiences with Fault Tolerance in Multiprocessor Systems, In Proceedings of the IEEE, vol. 66, pp. 1178–1199, October 1978.

  63. Siewiorek, D. P., M. Canepa, and S. Clark, C.vmp: The Architecture and Implementation of a Fault-Tolerant Multiprocessor, In 7th International Symposium on Fault-Tolerant Computing, Los Angeles CA, June 1977.

  64. Spector, A. Z., J. Butcher, D. S. Daniels, D. J. Duchamp, J. L. Eppinger, C. E. Fineman, A. Heddaya, P. M. Schwarz, Support for Distributed Transactions in the TABS Prototype, In IEEE Transactions on Software Engineering, vol. SE-11, nr. 6, pp. 520–530, June 1985.

  65. Spector, A. Z., D. S. Daniels, D. J. Duchamp, J. L. Eppinger, R. Pausch, Distributed Transactions for Reliable Systems, Proceedings of the Tenth Symposium on Operating System Principles, December 1985.

  66. Swan, R. J., S. H. Fuller, D. P. Siewiorek, Cm*: A Modular, Multi-Microprocessor, In AFIPS: Proceedings of the National Computer Conference, June 1977.

  67. Tomas, S. P. and J. P. Shen, A Roving Monitoring Processor for Detection of Control Flow Errors in Multiple Processor Systems, In Proceedings of the International Conference on Computer Design, October 1985.

  68. W. N. Toy, Fault-Tolerant Design of Local ESS Processors, In Proceedings of the IEEE, vol. 66, nr. 10, pp. 1126–1145, October 1978.

  69. Tsao, M. M., Transient Error and Fault Prediction, PhD thesis, Carnegie Mellon University, Department of Electrical Engineering, January 1981.

  70. Tsao, M. M. and D. P. Siewiorek, Trend Analysis on System Error Files, In 13th International Fault Tolerant Computing Symposium, June 1983, Milan, Italy.

  71. Tsao, M. M., A. W. Wilson, R. C. McGarity, C-J. Tseng and D. P. Siewiorek, The Design of C.fast: A Single Chip Fault-Tolerant Microprocessor, In 12th International Fault-Tolerant Computing Symposium, June 1982, Santa Monica, CA.

  72. U. S. Department of Defense, Military Standardization Handbook: Reliability Prediction of Electronic Equipment, MIL-STD-HDBK-217B, Notice 1, 1976.

  73. Wilcox, R. H. and W. C. Mann, Redundancy Techniques for Computing Systems, Spartan Books, Washington, D. C., 1962.

  74. Wulf, W. A. and C. G. Bell, C.mmp: A Multi-Mini-Processor, In AFIPS Conference Proceedings, vol. 41, pp. 765–777, Montvale, NJ., 1972.

  75. Wulf, W. A., R. Levin, and S. Harbison, Hydra/C.mmp: An Experimental Computer System, McGraw-Hill, New York, New York, 1980.

  76. York, G., D. P. Siewiorek, Y. X. Zhu, Compensating Faults in Triple Modular Redundancy, In Proceedings of the Fifteenth International Symposium on Fault-Tolerant Computing, June 19–21 1985.

Copyright information

© 1987 Springer-Verlag/Wien

About this paper

Cite this paper

Siewiorek, D.P., Shen, J.P., Maxion, R.A. (1987). Experimental Research in Reliable Computing at Carnegie Mellon University. In: Avižienis, A., Kopetz, H., Laprie, JC. (eds) The Evolution of Fault-Tolerant Computing. Dependable Computing and Fault-Tolerant Systems, vol 1. Springer, Vienna. https://doi.org/10.1007/978-3-7091-8871-2_13

  • DOI: https://doi.org/10.1007/978-3-7091-8871-2_13

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-8873-6

  • Online ISBN: 978-3-7091-8871-2

  • eBook Packages: Springer Book Archive
