A Systematic Approach to Automatically Generate Multiple Semantically Equivalent Program Versions

  • Sri Hari Krishna Narayanan
  • Mahmut Kandemir
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5026)

Abstract

Classic methods for overcoming software faults include design diversity, which involves creating multiple versions of an application. However, design-diversity techniques typically require a staggering investment of time and manpower, and there is no guarantee that the resulting versions are correct or equivalent. This paper presents a novel approach that addresses these problems by automatically producing multiple, semantically equivalent copies of a given array/loop-based application. The copies, when used within the framework of common design-diversity techniques, provide a high degree of software fault tolerance at practically no additional cost. We also apply our automated version generation approach to detect the occurrence of soft errors during the execution of an application.
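
As a rough illustration of the idea, and not the authors' actual generation algorithm, the sketch below hand-writes three versions of a small loop nest that differ only in iteration order (original, loop-interchanged, and reversed), then compares their outputs with a majority vote in the style of N-version programming. The loop body, the function names, and the `run_and_vote` helper are all assumptions made for this example. The reordering is legal here only because each iteration writes a distinct array element and reads no value produced by another iteration.

```python
from collections import Counter


def version_original(a):
    """Reference loop nest: i is the outer loop, j the inner loop."""
    n, m = len(a), len(a[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            out[i][j] = 2 * a[i][j] + i
    return out


def version_interchanged(a):
    """Same computation with the two loops interchanged (j outer, i inner)."""
    n, m = len(a), len(a[0])
    out = [[0] * m for _ in range(n)]
    for j in range(m):
        for i in range(n):
            out[i][j] = 2 * a[i][j] + i
    return out


def version_reversed(a):
    """Same computation with both loops traversed in reverse order."""
    n, m = len(a), len(a[0])
    out = [[0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in reversed(range(m)):
            out[i][j] = 2 * a[i][j] + i
    return out


def run_and_vote(versions, a):
    """Run every version on `a`; return (majority result, mismatch flag).

    The flag is True when at least two versions disagree, which in this
    hypothetical setup is taken as a sign that a fault occurred.
    """
    # Convert each result to nested tuples so it can be counted.
    results = [tuple(tuple(row) for row in version(a)) for version in versions]
    counts = Counter(results)
    winner, _ = counts.most_common(1)[0]
    return [list(row) for row in winner], len(counts) > 1
```

Calling `run_and_vote([version_original, version_interchanged, version_reversed], [[1, 2, 3], [4, 5, 6]])` runs all three versions on the same input. In a fault-free run they agree and the mismatch flag is `False`; if one version's output were corrupted, the flag would be `True` and the majority result would still be returned.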

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sri Hari Krishna Narayanan (1)
  • Mahmut Kandemir (1)
  1. Computer Science and Engineering Department, The Pennsylvania State University, University Park, USA
