Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning

  • Alex Groce
  • Klaus Havelund
  • Gerard Holzmann
  • Rajeev Joshi
  • Ru-Gang Xu

Abstract

In this paper we discuss the application of a range of techniques to the verification of mission-critical flight software at NASA’s Jet Propulsion Laboratory. For this type of application we want to achieve a higher level of confidence than can be achieved through standard software testing. Unfortunately, given the current state of the art, especially when efforts are constrained by the tight deadlines and resource limitations of a flight project, it is not feasible to produce a rigorous formal proof of correctness of even a well-specified stand-alone module such as a file system (much less more tightly coupled or difficult-to-specify modules). This means that we must look for a practical alternative in the area between traditional testing and proof, as we attempt to optimize rigor and coverage. The approaches we describe here are based on testing, model checking, constraint-solving, monitoring, and finite-state machine learning, in addition to static code analysis. The results we have obtained in the domain of file systems are encouraging, and suggest that for more complex properties of programs with complex data structures, it is possibly more beneficial to use constraint solvers to guide and analyze execution (i.e., as in testing, even if performed by a model checking tool) than to translate the program and property into a set of constraints, as in abstraction-based and bounded model checkers. Our experience with non-file-system flight software modules shows that methods even further removed from traditional static formal methods can be assisted by formal approaches, yet readily adopted by test engineers and software developers, even as the key problem shifts from test generation and selection to test evaluation.
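To make the testing side of this trade-off concrete: the abstract's argument is that driving a real implementation with generated operation sequences and checking its observable behaviour against a reference can be more practical than encoding the program and property as constraints. The sketch below (in Python, purely illustrative; the actual JPL effort tests a C flash file system against a POSIX reference and is far more elaborate) shows the general shape of such a randomized differential test. All names in it, such as RefModel and run_differential_test, are hypothetical.

    # Illustrative sketch only: randomized differential testing of a toy file
    # store against an in-memory reference model. Any observable divergence
    # fails the test; the fixed seed makes the failing sequence replayable.
    import os
    import random
    import tempfile

    class RefModel:
        """Reference model: file names mapped to contents."""
        def __init__(self):
            self.files = {}
        def create(self, name):
            if name in self.files:
                return "EEXIST"
            self.files[name] = b""
            return "OK"
        def write(self, name, data):
            if name not in self.files:
                return "ENOENT"
            self.files[name] += data
            return "OK"
        def read(self, name):
            return self.files.get(name, "ENOENT")
        def unlink(self, name):
            if name not in self.files:
                return "ENOENT"
            del self.files[name]
            return "OK"

    class ImplUnderTest:
        """Stand-in implementation: real files in a temporary directory."""
        def __init__(self):
            self.root = tempfile.mkdtemp()
        def _path(self, name):
            return os.path.join(self.root, name)
        def create(self, name):
            if os.path.exists(self._path(name)):
                return "EEXIST"
            open(self._path(name), "wb").close()
            return "OK"
        def write(self, name, data):
            if not os.path.exists(self._path(name)):
                return "ENOENT"
            with open(self._path(name), "ab") as f:
                f.write(data)
            return "OK"
        def read(self, name):
            if not os.path.exists(self._path(name)):
                return "ENOENT"
            with open(self._path(name), "rb") as f:
                return f.read()
        def unlink(self, name):
            if not os.path.exists(self._path(name)):
                return "ENOENT"
            os.remove(self._path(name))
            return "OK"

    def run_differential_test(steps=1000, seed=0):
        rng = random.Random(seed)
        names = ["a", "b", "c"]  # a small name pool makes collisions likely
        ref, impl = RefModel(), ImplUnderTest()
        for step in range(steps):
            name = rng.choice(names)
            op = rng.choice(["create", "write", "read", "unlink"])
            if op == "write":
                data = bytes([rng.randrange(256)])
                r_ref, r_impl = ref.write(name, data), impl.write(name, data)
            else:
                r_ref, r_impl = getattr(ref, op)(name), getattr(impl, op)(name)
            assert r_ref == r_impl, (
                f"divergence at step {step}: {op}({name}) -> {r_ref} vs {r_impl}")

    if __name__ == "__main__":
        run_differential_test()

A real harness would additionally record and minimize failing operation sequences before reporting them, so that test evaluation, not generation, becomes the main cost.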

Keywords

File systems · Testing · Model checking · Verification · Flight software · Formal proof

Mathematics Subject Classifications (2010)

68N30 

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Alex Groce (1)
  • Klaus Havelund (2)
  • Gerard Holzmann (2)
  • Rajeev Joshi (2)
  • Ru-Gang Xu (3)

  1. School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, USA
  2. Laboratory for Reliable Software, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, USA
  3. Department of Computer Science, University of California, Los Angeles, USA