Why There is no General Solution to the Problem of Software Verification


How can we be certain that software is reliable? Is there any method that can verify the correctness of software for all cases of interest? Computer scientists and software engineers have informally assumed that there is no fully general solution to the verification problem. In this paper, we survey approaches to the problem of software verification and offer a new proof for why there can be no general solution.



  1.

    Characterizing the satisfaction relation in the verification problem in model-theoretic terms may seem to differ from the way some computer scientists characterize verification. Emerson (2008, 28), for example, suggests the verification problem is determining “whether or not the behavior [our emphasis] of M meets the specification h” where M is the program and h is the specification. Our approach does not depend on behavioral properties, as such, of a program. Instead, we characterize the satisfaction relation in terms of a function that relates the models of a program/software system to models of the specification (see Sect. 3 for more detail).

  2.

    The root toolkit example shows how difficult it can be in practice to distinguish specification issues from implementation issues. To put the point more sharply: the requirement to cut a piece of wood could be met with a saw or with a stick of dynamite. Both would do the job; the dynamite would simply cause much more collateral damage.

  3.

    Collectively, D2, D3, and D5 significantly overlap what we mean when we say a verification procedure is algorithmic. In particular, an algorithmic procedure is an effective procedure, and an effective procedure by definition implies (D2), (D3), and (D5). One might, therefore, replace the union of (D2), (D3), and (D5) with a desideratum requiring an approach to verification to be algorithmic. Articulating (D) as shown, however, supports some informative distinctions among the relative strengths of approaches that have been taken to the verification problem, and for this reason we choose to adopt (D) in the more expansive form shown.

  4.

    Two programs, A and B, are concurrent if at least some portions of those programs execute at the same time.
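    A minimal Python sketch (our illustration, not part of the original text) of this definition: two threads execute concurrently, and the interleaving of their steps is scheduler-dependent, which is one source of the verification difficulties discussed in the paper.

    ```python
    import threading

    events = []  # shared record of execution order

    def worker(name, n):
        # Each thread appends its name n times; the interleaving of
        # appends from the two threads is decided by the scheduler,
        # not by the program text.
        for _ in range(n):
            events.append(name)

    a = threading.Thread(target=worker, args=("A", 5))
    b = threading.Thread(target=worker, args=("B", 5))
    a.start(); b.start()   # A and B now execute concurrently
    a.join(); b.join()

    # Both threads ran to completion, but the order of "A"s and "B"s
    # in `events` may differ from run to run.
    print(sorted(events))
    ```
    
    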

  5.

    For a detailed survey of the latter kind, see Emerson (2008) and Clarke et al. (2018). See also Blum et al. (1979).

  6.

    Copeland et al. (2016) make the case that hypercomputation should be taken seriously as a candidate for what is meant by “computation”, given hypercomputation’s compatibility with the Church-Turing thesis. We will not defend our choice to exclude hypercomputation in this paper; see, however, Davis (2004) for reasons to be skeptical.

  7.

    This problem has been known, at least informally, since the earliest days of software testing.

  8.

    Among philosophers, Jim Fetzer was the first to point out that software verification characterized as a testing problem poses challenges (Fetzer 1988) that are intractable in practice.

  9.

    A control-flow statement in S is a statement that can, based on a condition that may not always obtain, change the order of execution of the statements in S.
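    For instance (our example, not part of the definition), the `if` below is a control-flow statement: which statement executes next depends on a condition that may or may not obtain at run time.

    ```python
    def classify(x):
        # The `if` statement is a control-flow statement: depending on
        # whether the condition `x < 0` obtains, execution either enters
        # the branch or falls through to the final return.
        if x < 0:
            return "negative"
        return "non-negative"

    print(classify(-3))  # → "negative"
    print(classify(7))   # → "non-negative"
    ```
    
    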

  10.

    Note that this definition requires that S can run to “completion”. Some software systems, such as operating systems, by design “run forever”, and thus have no “completion”.

  11.

    This definition of path complexity is different from McCabe complexity, which is a count of the number of independent paths in S (McCabe 1976).

  12.

    A conditional statement is a statement of the form “If X, do Y”. A Turing complete language, in addition to providing a way to implement conditional statements, must also provide a way to implement loops. For the purpose of this paper, we can restrict the analysis to a program whose control statements are “if–then” statements only. Why? To show that verification-as-testing fails to satisfy (D1), it is sufficient to show that even if S contained only if–then control constructs, verification-as-testing would fail to satisfy (D1). (Accommodating loop control constructs in S only increases complexity.)
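    A back-of-the-envelope sketch (ours) of why exhaustive path testing is infeasible even under this restriction: a program containing n independent if–then statements has 2^n execution paths, since each conditional is either taken or skipped, while McCabe’s cyclomatic measure (see note 11) grows only linearly in n.

    ```python
    from itertools import product

    def execution_paths(n):
        # Each of the n independent if-then statements is either taken
        # or skipped, so a path is a length-n sequence of booleans:
        # 2**n paths in total.
        return list(product([False, True], repeat=n))

    n = 10
    print(len(execution_paths(n)))  # 2**10 = 1024 paths to test
    print(n + 1)                    # McCabe-style count grows only linearly
    ```

    At n = 300 conditionals, modest for real software, the path count already exceeds the commonly cited estimates of the number of atoms in the observable universe.
    
    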

  13.

    For a complete discussion of the path complexity catastrophe see Symons and Horner (2014).

  14.

    In typical practice, “soak testing” refers to informally observing the behavior of S over P under nominal operating conditions.

  15.

    Some variants of Linux, for example, contain a coverage-analysis utility, gcov.

  16.

    A liveness property asserts that program execution eventually reaches some desirable state (Owicki and Lamport 1982).
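    Schematically (our notational sketch, using standard linear temporal logic rather than the paper’s notation), liveness contrasts with safety as follows:

    ```latex
    % Safety: nothing bad ever happens
    \Box \neg \mathit{bad}
    % Liveness: every request is eventually granted
    \Box(\mathit{request} \rightarrow \Diamond\, \mathit{granted})
    % where \Box = "always" and \Diamond = "eventually".
    % A liveness property cannot be refuted by any finite execution
    % prefix: the desired state may still be reached later.
    ```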

  17.

    Concurrent systems are often reactive systems, i.e., they execute in response to stimuli (e.g., from sensors) originating outside those programs. Reactive systems are often nondeterministic, so their non-repeatable behavior is not amenable to testing. Their semantics can be given as infinite sequences of computation states.

  18.

    A system K has the small model property if and only if any satisfiable formula in K has a “small” finite model, i.e., a model whose size is a polynomial function of the formula size.
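    Stated slightly more formally (our paraphrase of the definition above):

    ```latex
    % Small model property, schematically: there is a polynomial p such
    % that for every formula \varphi of K,
    \text{if } \varphi \text{ is satisfiable, then } \varphi
    \text{ has a model } \mathcal{M} \text{ with }
    |\mathcal{M}| \leq p(|\varphi|).
    % Decidability then follows by exhaustive search over the finitely
    % many structures of size at most p(|\varphi|).
    ```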

  19.

    The model checker used in this case found errors in the original design of the system. Some of these errors would have made buildings less resistant to earthquakes.

  20.

    Robinson arithmetic is “ordinary” (Peano) arithmetic without the Peano induction axiom.
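    For reference (the standard axiomatization, not quoted from the paper), Robinson arithmetic Q keeps the Peano axioms governing successor, addition, and multiplication but drops the induction schema:

    ```latex
    \begin{align*}
    & S(x) \neq 0                                  && \text{0 is not a successor}\\
    & S(x) = S(y) \rightarrow x = y                && \text{successor is injective}\\
    & x \neq 0 \rightarrow \exists y\,(x = S(y))   && \text{every nonzero number is a successor}\\
    & x + 0 = x \\
    & x + S(y) = S(x + y) \\
    & x \cdot 0 = 0 \\
    & x \cdot S(y) = (x \cdot y) + x
    \end{align*}
    % The first-order induction schema of Peano arithmetic is omitted;
    % Q is nonetheless essentially undecidable (Mostowski et al. 1953).
    ```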

  21.

    Virtually all business and scientific software must implement arithmetic.

  22.

    Note that a theory of arithmetic (or anything else) that is not finitely axiomatizable cannot be implemented on a finite Universal Turing Machine. No second- or higher-order theory of arithmetic, for example, can be implemented on a Universal Turing Machine. (See, for example, Chang and Keisler 2012, Chapter 1.)

  23.

    To avoid a problem of self-reference one need only partition the specification into two components. One component would state the requirement for the relationship between the models of H and the models of S, and the other component would describe everything not included in or implied by the first component of the specification.

  24.

    Although beyond the scope of this paper, it’s worth noting that criteria (A) and (Q) rest on a theory of verification that does not appear to be limited to software regimes as such, and thus might help to characterize verification in ordinary empirical science (and even more generally, in any regime in which verification must be accomplished by a multi-step procedure in finite time).

  25.

    In practice, we would attempt to limit the range of such sequences to those that had passed some testing or formal verification regimen.

  26.

    We thank a reviewer of an earlier version of this paper for this objection.

  27.

    We thank Troy Catterson for suggesting this elegant construction.


  1. Amdahl, G. M. (1967). Validity of the single processor approach to achieving large-scale computing capabilities. AFIPS Conference Proceedings, 30, 483–485. https://doi.org/10.1145/1465482.1465560.

  2. Ammann, P., & Offutt, J. (2016). Introduction to software testing (2nd ed.). Cambridge: Cambridge University Press.

  3. Baier, C., & Katoen, J. P. (2008). Principles of model checking. Cambridge: MIT Press.

  4. Black, R., Veenendaal, E., & Graham, D. (2012). Foundations of software testing: ISTQB certification. Boston: Cengage Learning EMEA.

  5. Blum, E. K., Paul, M., & Takasu, S. (Eds.). (1979). Mathematical studies of information processing: Proceedings of the international conference, Kyoto, Japan, August 23–26, 1978. Lecture notes in computer science (Vol. 75). Berlin: Springer.

  6. Boolos, G., Burgess, J., & Jeffrey, R. (2007). Computability and logic (5th ed.). Cambridge: Cambridge University Press.

  7. Boschetti, F., Fulton, E. A., Bradbury, R., & Symons, J. (2012). What is a model, why people don’t trust them, and why they should. Negotiating our future: Living scenarios for Australia to 2050, 2, 107–119.

  8. Brockwell, P. J., & Davis, R. A. (2006). Time series: Theory and methods (2nd ed.). Berlin: Springer.

  9. Chang, C., & Keisler, J. (2012). Model theory (3rd ed.). New York: Dover.

  10. Chung, K. L. (2001). A course in probability theory (3rd ed.). New York: Academic Press.

  11. Clarke, E. M., Bloem, R., Veith, H., & Henzinger, T. A. (Eds.). (2018). Handbook of model checking. Berlin: Springer.

  12. Clarke, E. M., & Emerson, E. A. (1981). Design and synthesis of synchronization skeletons for branching time temporal logic. In D. Kozen (Ed.), Logic of programs. Lecture notes in computer science (Vol. 131, pp. 52–71). Berlin: Springer.

  13. Copeland, J., Dresner, E., Proudfoot, D., & Shagrir, O. (2016). Time to reinspect the foundations? Communications of the ACM, 59(11), 34–38.

  14. Cover, T. M., & Thomas, J. A. (2006). Elements of information theory (2nd ed.). Hoboken: Wiley.

  15. Davis, M. (2004). The myth of hypercomputation. In C. Teuscher (Ed.), Alan Turing: Life and legacy of a great thinker (pp. 195–211). Berlin: Springer.

  16. DeMarco, T. (1979). Structured analysis and system specification. Englewood Cliffs: Prentice-Hall.

  17. Diestel, R. (1997). Graph theory. New York: Springer.

  18. Emerson, E. A. (2008). The beginning of model checking: A personal perspective. In O. Grumberg & H. Veith (Eds.), 25 years of model checking: History, achievements, perspectives. Lecture notes in computer science (Vol. 5000). Berlin: Springer.

  19. Fetzer, J. H. (1988). Program verification: The very idea. Communications of the ACM, 31(9), 1048–1063.

  20. Floridi, L., Fresco, N., & Primiero, G. (2015). On malfunctioning software. Synthese, 192(4), 1199–1220.

  21. Floyd, R. W. (1967). Assigning meanings to programs. In J. T. Schwartz (Ed.), Proceedings of a symposium in applied mathematics: Mathematical aspects of computer science (Vol. 19, pp. 19–32). Dordrecht: Springer.

  22. Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173–198.

  23. Gries, D. (1981). The science of programming. New York: Springer.

  24. Hennessy, J., & Patterson, D. (2007). Computer architecture: A quantitative approach (4th ed.). New York: Elsevier.

  25. Hoare, C. A. R. (1969). An axiomatic basis for computer programming. Communications of the ACM, 12, 576–580.

  26. Hogg, R., McKean, J., & Craig, A. (2005). Introduction to mathematical statistics (6th ed.). London: Pearson.

  27. Horner, J. K. (2003). The development programmatics of large scientific codes. In Proceedings of the 2003 international conference on software engineering research and practice (pp. 224–227). Athens, Georgia: CSREA Press.

  28. Horner, J. K., & Symons, J. (2014). Reply to Angius and Primiero on software intensive science. Philosophy & Technology, 27(3), 491–494.

  29. Huth, M., & Ryan, M. (2004). Logic in computer science. Cambridge: Cambridge University Press.

  30. IEEE. (2000). IEEE-STD-1471-2000. Recommended practice for architectural description of software-intensive systems. http://standards.IEEE.org. Accessed 10 Nov 2018.

  31. Koopman, P. (2014). A case study of Toyota unintended acceleration and software safety. https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_slides.pdf. Accessed 17 Apr 2018.

  32. Kozen, D. (1983). Results on the propositional μ-calculus. Theoretical Computer Science, 27, 333–354.

  33. Littlewood, B., & Strigini, L. (2000). Software reliability and dependability: A roadmap. In Proceedings of the conference on the future of software engineering. https://doi.org/10.1145/336512.336551.

  34. Löwenheim, L. (1915). Über Möglichkeiten im Relativkalkül. Mathematische Annalen, 76(4), 447–470. https://doi.org/10.1007/bf01458217. English translation: “On possibilities in the calculus of relatives”, in From Frege to Gödel: A source book in mathematical logic, 1879–1931 (3rd ed., pp. 228–251). Cambridge, MA: Harvard University Press, 1977.

  35. McCabe, T. (1976). A complexity measure. IEEE Transactions on Software Engineering, 2, 308–320.

  36. Millikan, R. G. (1989). In defense of proper functions. Philosophy of Science, 56(2), 288–302.

  37. Mostowski, A., Robinson, R. M., & Tarski, A. (1953). Undecidability and essential undecidability in arithmetic. In A. Tarski, A. Mostowski, & R. M. Robinson (Eds.), Undecidable theories. New York: Dover reprint.

  38. Nielson, F., Nielson, H. R., & Hankin, C. (1999). Principles of program analysis. Berlin: Springer.

  39. Owicki, S., & Lamport, L. (1982). Proving liveness properties of concurrent programs. ACM Transactions on Programming Languages and Systems, 4, 455–495.

  40. Pnueli, A. (1977). The temporal logic of programs. In Proceedings of the 18th annual symposium on foundations of computer science (pp. 46–57).

  41. Reichenbach, H. (1957). The philosophy of space and time (Maria Reichenbach, Trans.). New York: Dover.

  42. Skolem, T. (1920). Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder Beweisbarkeit mathematischer Sätze nebst einem Theoreme über dichte Mengen. Videnskapsselskapet Skrifter, I. Matematisk-naturvidenskabelig Klasse, 6, 1–36. English translation: “Logico-combinatorial investigations in the satisfiability or provability of mathematical propositions: A simplified proof of a theorem by L. Löwenheim and generalizations of the theorem”, in From Frege to Gödel: A source book in mathematical logic, 1879–1931 (3rd ed., pp. 252–263). Cambridge, MA: Harvard University Press, 1977.

  43. Symons, J., & Alvarado, R. (2016). Can we trust Big Data? Applying philosophy of science to software. Big Data & Society, 3(2), 2053951716664747.

  44. Symons, J., & Alvarado, R. (2019). Epistemic entitlements and the practice of computer simulation. Minds and Machines. https://doi.org/10.1007/s11023-018-9487-0.

  45. Symons, J., & Horner, J. K. (2014). Software intensive science. Philosophy and Technology, 27(3), 461–477.

  46. Symons, J., & Horner, J. K. (2017). Software error as a limit to inquiry for finite agents: Challenges for the post-human scientist. In T. Powers (Ed.), Philosophy and computing: Essays in epistemology, philosophy of mind, logic, and ethics (pp. 85–97). Berlin: Springer.

  47. Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42, 230–265.

  48. Turing, A. M. (1950). Computing machinery and intelligence. Mind, LIX, 433–460.

  49. Valmari, A. (1998). The state explosion problem. In Lectures on Petri nets I: Basic models. Lecture notes in computer science (Vol. 1491, pp. 429–528). Berlin: Springer.

  50. Venema, Y. (2001). Temporal logic. In L. Goble (Ed.), The Blackwell guide to philosophical logic (pp. 259–281). Hoboken: Blackwell.

Acknowledgements


We are grateful to Perry Alexander, Ray Bongiorni, Troy Catterson, Richard de George, Corey Maley, Eileen Nutting, and two anonymous referees for this journal for their critical comments. This work is partly supported by The National Security Agency through the Science of Security initiative contract #H98230-18-D-0009.

Author information



Corresponding author

Correspondence to John Symons.



Symons, J., Horner, J.K. Why There is no General Solution to the Problem of Software Verification. Found Sci 25, 541–557 (2020). https://doi.org/10.1007/s10699-019-09611-w



  • Software verification
  • Error
  • Model checker
  • Philosophy of computer science