This paper argues that the difference between contemporary software intensive scientific practice and more traditional non-software intensive varieties results from the characteristically high conditionality of software. We explain why the path complexity of programs with high conditionality imposes limits on standard error correction techniques and why this matters. While it is possible, in general, to characterize the error distribution in inquiry that does not involve high conditionality, we cannot characterize the error distribution in inquiry that depends on software. Software intensive science presents distinctive error and uncertainty modalities that pose new challenges for the epistemology of science.
See Symons 2008 for a discussion of how computational models have figured in discussions of the metaphysics and epistemology of science.
Software has begun to perform some of the functions that, in the pre-software era, were considered distinctively human aspects of science. For example, Michael Schmidt and Hod Lipson described how their program Eureqa inferred Newton’s second law and the law of conservation of momentum from descriptions of the behaviour of a double-pendulum system (Schmidt and Lipson 2009). More recently, Eugen Lounkine and colleagues demonstrated a model that was able to predict unforeseen side-effects for pharmaceuticals that were already approved for consumption (Lounkine 2012). These two papers represent very different examples of software intensive science: one is a system capable of generating theoretical insights and law-like relationships from a data set, while the other makes dramatic progress on a specific practical question of great importance. Examples like these indicate that across a broad swath of scientific endeavor, from highly theoretical to applied science, inquiry itself is no longer purely a matter of individual or collective human effort. Across the sciences, software-intensive systems are increasingly driving the direction of research and in some cases are already beginning to displace human researchers. Unlike previous improvements in scientific technology, computers not only extend our capacities but are taking on at least some of the cognitive aspects of theoretical work in the sciences. Fundamental to understanding the character of post-human science is careful attention to the nature of its distinctive kinds of error and uncertainty.
Of course, there are trivial counterexamples, such as a program consisting of the same instruction, say “a = 2”, repeated an arbitrary number of times. Such examples are not representative of typical or even useful software, and they certainly have no role in scientific inquiry.
As we use the term here, a method is effective for a class of problems iff (Hunter 1971, pp. 13–15)
it consists of a finite number of exact, finite instructions;
when applied to a problem from its class, it always finishes (terminates) after a finite number of steps; and
when applied to a problem from its class, it always produces a correct answer.
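As an illustration (ours, not Hunter’s), Euclid’s greatest-common-divisor algorithm is a standard example satisfying all three conditions:

```python
def gcd(a, b):
    """Euclid's algorithm: a finite list of exact instructions (condition 1)."""
    while b != 0:
        # b strictly decreases on every pass, so the loop terminates (condition 2)
        a, b = b, a % b
    return a  # the result is the greatest common divisor (condition 3)

print(gcd(48, 18))  # 6
```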
A computer language is Turing complete if it can simulate a single-tape Turing machine (Boolos et al. 2002). Being Turing complete is a condition of adequacy for being a general-purpose computer language.
Here’s why: the equivalent of the “if-then” schema is realizable in a Turing machine; for example, “not-x or y”, which is logically equivalent to “if x, then y”, is representable in a Turing machine (Boolos et al. 2002). Therefore, any Turing complete language must be able to simulate the “if-then-else” schema. How, specifically, one maps the Turing-machine equivalent of the “if-then” schema into a particular Turing complete language will in general depend on the particulars of the language of interest. For the purposes of this paper, we do not need to consider the precise details of those mappings: it is enough for our purposes that such mappings exist.
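The equivalence invoked here can be checked mechanically. The following sketch (ours, in Python) verifies by exhaustive truth table that “not-x or y” agrees with a native if-then-else branch on every input:

```python
# "if x then y" rendered as the material conditional "not-x or y"
def if_then(x, y):
    return (not x) or y

# exhaustive truth-table check against a native if-then-else branch
for x in (True, False):
    for y in (True, False):
        branch = y if x else True  # vacuously true when the antecedent is false
        assert if_then(x, y) == branch
print("equivalence holds on all four rows")
```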
More precisely, the sample size required to attain a given confidence level is a function of the distribution of interest.
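For a concrete instance of this dependence, the textbook normal-approximation formula for estimating a proportion (a standard statistical result, not specific to this paper) ties the required sample size to the assumed distribution, the desired confidence level, and the tolerated margin of error:

```python
import math

def sample_size(z, p, margin):
    # n = z^2 * p(1-p) / E^2 for a proportion p estimated to within +/- E
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# 95% confidence (z ~ 1.96), worst-case p = 0.5, margin of error 3%:
print(sample_size(1.96, 0.5, 0.03))  # 1068
```

Halving the margin of error roughly quadruples the required sample size, which is one reason confidence claims about software behaviour are so costly to earn.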
One can distinguish several kinds of testing in terms of properties of control-flow graphs (Nielson et al. 1999). By “testing every path” in a software system, we mean “executing, and analyzing the results of that execution of, all edges and all combinations of condition-edges in the control-flow-graph representation of the software system of interest.”
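The combinatorics behind “testing every path” can be made concrete with a toy sketch (ours): even three independent conditions already require 2³ = 8 executions, and the count doubles with every further condition:

```python
from itertools import product

def toy_program(c1, c2, c3):
    # three independent conditions: 2**3 = 8 distinct execution paths
    x = 0
    if c1:
        x += 1
    if c2:
        x *= 2
    if c3:
        x -= 1
    return x

# exhaustive path testing: execute every combination of condition outcomes
outcomes = {flags: toy_program(*flags) for flags in product((True, False), repeat=3)}
print(len(outcomes))  # 8
```

At, say, 300 independent conditions, the number of paths exceeds the number of atoms in the observable universe, which is why exhaustive path testing is infeasible for realistic scientific software.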
The path-test cases for some software could be executed in parallel (Hennessy and Patterson 2007, p. 68); in theory, given a large enough parallel machine, all path tests in such a case could be executed in the same time it takes to execute one test case. But these are special cases. In general, we must consider cases in which we must execute the tests serially.
High path complexity is not the only aspect of SIS that has yet to receive adequate attention from philosophers. As one anonymous referee for this paper points out, the high variability in the methods, algorithms, and language choices evident in SIS also has no counterpart in NSIS, leading to, among other things, fundamental questions of commensurability among different software systems that nominally concern the same subject matter. For example, there are at least 10 widely used numerical methods for solving systems of partial differential equations, and the results they produce are in general not identical (Morton and Mayers 2005). In addition, simply changing the computer language in which an algorithm is realized is not, for some pairs of languages such as Fortran 77 and C, even well-defined, because the language standards do not provide an adequate basis for inter-language translation of certain numerical types (ANSI 1977; ISO/IEC 2005; Feldman et al. 1990). Problems of this kind have led to serious errors whose origin is quite difficult to isolate in practice. (No such problems arise in NSIS.) All these issues clearly bear on the reliability of software and of scientific inferences based on the use of software. These topics merit careful treatment in their own right. Here, however, we focus on the distinctively high conditionality of SIS.
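A miniature analogue of this non-identity (our illustration, far simpler than competing PDE solvers) appears even in floating-point summation, where two mathematically equivalent orderings of the same operations disagree:

```python
# Two mathematically equivalent computations that differ in IEEE-754 doubles:
vals = [1e16, 1.0, -1e16]
left_to_right = (vals[0] + vals[1]) + vals[2]  # the 1.0 is absorbed by rounding
reordered = (vals[0] + vals[2]) + vals[1]      # cancellation first, then + 1.0
print(left_to_right, reordered)  # 0.0 1.0
```

If reordering three additions can change a result, it is unsurprising that distinct numerical methods, or distinct languages realizing “the same” algorithm, produce non-identical outputs.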
For a derivation, see Hogg et al. 2005 (Sections 2.6 and 9.4).
An algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function (Boolos et al. 2002).
This does not imply, of course, that the software does not have the same error distribution as M: it merely means that we would not have a warrant to make the inference that the software has the error distribution of M, on the basis of the procedure.
The existence of a requirement for high confidence does not, as such, imply that this requirement is satisfied.
In some computer languages, this can be done by implicitly accepting the default specification. So-called “interpreted” languages, which include many of the widely used scripting languages in the UNIX family of operating environments, determine “type” only during execution. Type management in these contexts is obviously fragile.
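A small sketch (ours) of the fragility in question: because a dynamically typed, interpreted language fixes types only at run time, a type mismatch can pass silently and yield a wrong result rather than being caught before execution:

```python
def total_cost(price, quantity):
    return price * quantity

print(total_cost(2.5, 4))    # 10.0  -- the intended numeric result
print(total_cost("2.5", 4))  # 2.52.52.52.5 -- no error: string repetition
```

The second call raises no exception on any execution path; the defect surfaces only if some later computation happens to notice that the “number” is a string.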
Alexandrova, A. (2008). Making models count. Philosophy of Science, 75(3), 383–404.
ANSI. (1977). American National Standard Programming Language Fortran. ANSI, X3, 9–1977.
Batterman, R. W. (2009). Idealization and modeling. Synthese, 169(3), 427–446.
Black, R., van Veenendaal, E., & Graham, D. (2012). Foundations of software testing ISTQB certification. Cengage Learning EMEA.
Bokulich, A. (2011). How scientific models can explain. Synthese, 180(1), 33–45.
Bolinska, A. (2013). Epistemic representation, informativeness and the aim of faithful representation. Synthese, 190(2), 219–234.
Boolos, G., Burgess, J., & Jeffrey, R. (2002). Computability and Logic (4th ed.). Cambridge: Cambridge University Press.
Boschetti, F., Fulton, E. A., Bradbury, R. H., & Symons, J. (2012). What is a model, why people don’t trust them, and why they should. In Negotiating our future: Living scenarios for Australia to 2050 (Vol. 2, pp. 107–119). Australian Academy of Science.
Center for Systems and Software Engineering, University of Southern California. (2013). COCOMO II. http://csse.usc.edu/csse/research/COCOMOII/cocomo_main.html.
Chakravartty, A. (2011). Scientific realism. In Stanford encyclopedia of philosophy. E. Zalta (Ed.). http://plato.stanford.edu/entries/scientific-realism/.
Chang, C. C., & Keisler, H. J. (1990). Model theory (3rd ed.). North-Holland.
Chung, K. (2001). A course in probability theory (3rd ed.). New York: Academic.
Cox, D. (2006). Principles of statistical inference. Cambridge: Cambridge University Press.
Diestel, R. (1997). Graph theory. New York: Springer.
Lounkine, E., et al. (2012). Large-scale prediction and testing of drug activity on side-effect targets. Nature, 486(7403), 361–367.
Feldman, S. I., Gay, D. M. Maimone, M. W., & Schryer, N. (1990). A Fortran to C Converter. AT&T Bell Laboratories technical report.
Fewster, M., & Graham, D. (1999). Software test automation. Reading: Addison-Wesley.
Frigg, R., & Reiss, J. (2009). The philosophy of simulation: hot new issues or same old stew? Synthese, 169(3), 593–613.
Giere, R. (1976). Empirical probability, objective statistical methods, and scientific inquiry. In C. A. Hooker & W. Harper (Eds.), Foundations of probability theory, statistical inference, and statistical theories of science (Vol. 2, pp. 63–101). Dordrecht: Reidel.
Good, I. J. (1983). Good thinking: The Foundations of probability and its applications. University of Minnesota Press. Republished by Dover, 2009.
Graham, R. M., Clancy, G. J., Jr., & DeVaney, D. B. (1973). A software design and evaluation system. Communications of the ACM, 16(2), 110–116. Reprinted in E Yourdon, (Ed.), Writings of the Revolution. New York: Yourdon Press, 1982 (pp. 112–122).
Guala, F. (2002). Models, simulations, and experiments. In Model-based reasoning (pp. 59–74). Springer.
Gustafson, J. (1998). Computational verifiability and the ASCI Program. Computational Science and Engineering 5, 36–45. http://www.johngustafson.net/pubs/pub55/ASCIPaper.htm.
Halmos, P. (1950). Measure theory. D. Van Nostrand Reinhold.
Hatton, L. (1997). The T experiments: errors in scientific software. IEEE Computational Science and Engineering 4, 27–38. Also available at http://www.leshatton.org/1997/04/the-t-experiments-errors-in-scientific-software/.
Hatton, L. (2013). Power-laws and the conservation of information in discrete token systems: Part 1: General theory. http://www.leshatton.org/Documents/arxiv_jul2012_hatton.pdf.
Hennessy, J., & Patterson, D. (2007). Computer architecture: A quantitative approach (4th ed.). New York: Elsevier.
Hogg, R., McKean, J., & Craig, A. (2005). Introduction to mathematical statistics (6th ed.). Upper Saddle River: Pearson.
Horner, J. K. (2003). The development programmatics of large scientific codes. Proceedings of the 2003 International Conference on Software Engineering Research and Practice (pp. 224–227). Athens: CSREA Press.
Horner, J. K. (2013). Persistence of Plummer-distributed small globular clusters as a function of primordial-binary population size. Proceedings of the 2013 International Conference on Scientific Computing (pp. 38–44). Athens: CSREA Press.
Humphreys, P. (1994). Numerical experimentation. In Patrick Suppes: Scientific philosopher (pp. 103–121). Kluwer.
Hunter, G. (1971). Metalogic: An introduction to the metatheory of standard first-order logic. Berkeley: University of California Press.
IEEE. (2000). IEEE-STD-1471-2000. Recommended practice for architectural description of software-intensive systems. http://standards.IEEE.org.
ISO/IEC. (2005). ISO/IEC 9899: TC2—Programming languages – C—Open standards.
ISO/IEC. (2008). ISO/IEC 12207:2008. Systems and software engineering—Software life cycle processes.
Kuhn, T. (1970). The structure of scientific revolutions (2nd ed., enlarged). Chicago: University of Chicago Press.
Littlewood, B., & Strigini, L. (2000). Software reliability and dependability: a roadmap. ICSE ‘00 Proceedings of the Conference on the Future of Software Engineering (pp. 175–188).
Maxwell, J. (1891). A treatise on electricity and magnetism (3rd ed.). Dover reprint, 1954.
Mayo, D., & Spanos, A. (2011). Error statistics. In P.S. Bandyopadhyay & M. R. Forster (volume Eds.). D. M. Gabbay, P. Thagard & J. Woods (general Eds.), Philosophy of statistics, Handbook of philosophy of science, Volume 7, Philosophy of statistics. (pp. 1–46). Elsevier.
McCabe, T. (1976). A complexity measure. IEEE Transactions on Software Engineering 2, 308–320. Also available at http://www.literateprogramming.com/mccabe.pdf.
Morton, K. W., & Mayers, D. F. (2005). Numerical solution of partial differential equations. Cambridge: Cambridge University Press.
National Coordination Office for Networking and Information Technology Research and Development. (2013). DoE’s ASCI Program. http://www.nitrd.gov/pubs/bluebooks/2001/asci.html.
Newton, I. (1726). The Principia (Trans: Motte, A., 1848). Prometheus reprint, 1995.
Nielson, F., Nielson, H. R., & Hankin, C. (1999). Principles of program analysis. Heidelberg: Springer.
Oreskes, N., Shrader-Frechette, K., & Belitz, K. (1994). Verification, validation, and confirmation of numerical models in the earth sciences. Science, 263(5147), 641–646.
Parker, W. S. (2009). II—Confirmation and adequacy‐for‐purpose in climate modelling. Aristotelian Society Supplementary Volume, 83 (1).
Peled, D., Pelliccione, P., & Spoletini, P. (2008). Model checking. In B. Wah (Ed.). Wiley encyclopedia of computer science and engineering
Primiero, G. (2013). A taxonomy of errors for information systems. Minds and Machines. doi:10.1007/s11023-013-9307-5.
Reichenbach, H. (1958). The philosophy of space and time. (Trans: Reichenbach, M., & Freund, J). New York: Dover.
Salmon, W. (1967). The foundations of scientific inference. Pittsburgh: University of Pittsburgh Press.
Schmidt, M., & Lipson, H. (2009). Distilling free-form natural laws from experimental data. Science, 324(5923), 81–85.
Silva, J. (2012). A vocabulary of program slicing-based techniques. ACM Computing Surveys 44, Article No. 12.
Sorenson, R. (2011). Epistemic paradoxes. In E. Zalta (Ed.), Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/epistemic-paradoxes/.
Symons, J. (2008). Computational models of emergent properties. Minds and Machines, 18(4), 475–491.
Symons, J., & Boschetti, F. (2013). How computational models predict the behavior of complex systems. Foundations of Science, 18, 809–821.
Taylor, J. (1982). An introduction to error analysis: The study of uncertainties in physical measurements (2nd ed.). Sausalito: University Science.
United Nations. (1996). Resolution adopted by the General Assembly: 50/245. Comprehensive Nuclear-Test-Ban Treaty.
Waite, W. M., & Goos, G. (1984). Compiler construction. New York: Springer.
Winsberg, E. (1999). Sanctioning models: the epistemology of simulation. Science in Context, 12(2), 275–292.
Winsberg, E., & Lenhard, J. (2010). Holism and entrenchment in climate model validation. In M. Carrier & A. Nordmann (Eds.), Science in the context of application: Methodological change, conceptual transformation, cultural reorientation. Dordrecht: Springer.
Woodward, J. (2009). Scientific explanation. In E. Zalta (Ed.), Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/scientific-explanation/.
This work benefited from discussions with Sam Arbesman, George Crawford, Paul Humphreys, and Tony Pawlicki. We are grateful to the reviewers of earlier versions of this paper for extensive and insightful criticisms. For any errors that remain, we blame the path complexity of our (biological) software.
Symons, J., Horner, J. Software Intensive Science. Philos. Technol. 27, 461–477 (2014). https://doi.org/10.1007/s13347-014-0163-x