Abstract
In this paper, I critically evaluate several related, provocative claims made by proponents of data-intensive science and “Big Data” which bear on scientific methodology, especially the claim that scientists will soon no longer have any use for familiar concepts like causation and explanation. After introducing the issue, I elaborate in Section 2 on the alleged changes to scientific method that feature prominently in discussions of Big Data. In Section 3, I argue that these methodological claims are in tension with a prominent account of scientific method, often called “Inference to the Best Explanation” (IBE). Later in Section 3, I consider an argument against IBE that will be congenial to proponents of Big Data, namely, the argument due to Roche and Sober (2013) that “explanatoriness is evidentially irrelevant.” This argument is based on Bayesianism, one of the most prominent general accounts of theory-confirmation. In Section 4, I consider some extant responses to this argument, especially that of Climenhaga (2017). In Section 5, I argue that Roche and Sober’s argument does not show that explanatory reasoning is dispensable. In Section 6, I argue that there is good reason to think explanatory reasoning will continue to prove indispensable in scientific practice. Drawing on Cicero’s oft-neglected De Divinatione, I formulate what I call the “Ciceronian Causal-nomological Requirement” (CCR), which states, roughly, that causal-nomological knowledge is essential for relying on correlations in predictive inference. I defend a version of the CCR by appealing to the challenge of “spurious correlations,” chance correlations which we should not rely upon for predictive inference. In Section 7, I offer some concluding remarks.
Notes
See Kitchin (2014, pp. 5–7) for a more extensive discussion and critical examination of such claims. One moderate and less provocative thesis that Kitchin advances is that, rather than replacing human scientists, new data-mining techniques should be used to supplement traditional scientific methods by “reveal[ing] information which will be of potential interest and is worthy of further research” (2014, p. 6). Here, Kitchin locates the primary epistemological significance of Big Data analytics in the “context of discovery” (or the “context of pursuit”) rather than the “context of justification.”
See Northcott (2020) for some helpful case studies on the extent to which Big Data analytics actually improves predictive performance.
R&S choose to evaluate EER rather than the similar inequality Pr(H|O&E) > Pr(H) because O might confirm H by itself, while E might be irrelevant to H once joined with O, neither raising nor lowering H’s probability. Thus, it is not enough to simply show that Pr(H|O&E) > Pr(H) to demonstrate that explanatoriness is evidentially relevant for the Bayesian. What must be shown is EER.
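A small numerical illustration may make the point concrete. The values below are hypothetical, chosen only for illustration, and the sketch assumes that EER is the inequality Pr(H|O&E) > Pr(H|O), as the contrast drawn in this note suggests:

```latex
% Hypothetical probabilities, not taken from R&S
\[
\Pr(H) = 0.3, \qquad \Pr(H \mid O) = 0.6, \qquad \Pr(H \mid O \wedge E) = 0.6
\]
% The weaker inequality holds, yet E adds nothing once O is given
\[
\Pr(H \mid O \wedge E) > \Pr(H)
\quad\text{but}\quad
\Pr(H \mid O \wedge E) = \Pr(H \mid O)
\]
```

On these numbers the weaker inequality holds only because O alone raises the probability of H; conditioning on E as well leaves that probability unchanged.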
This is in contrast to the “old argument” for incompatibilism laid out by van Fraassen (1989).
A few points of clarification are in order. First, it should be noted that R&S do not explicitly make this argument, and so they do not defend P1 of the New Argument. Second, in their defense of SOT, R&S presuppose a Bayesian conception of evidential irrelevance; however, they leave open the possibility that “there are alternative senses of evidential irrelevance on which explanatoriness is evidentially relevant” (Roche and Sober 2017, p. 582). As we will see below, this possibility can be used to push back against the New Argument for Incompatibilism.
Climenhaga explicitly puts the point in terms of “epistemic” probabilities, the sort traditionally defended by Keynes (1921) and Carnap (1950). These probability statements are said to codify supposed objective relationships between propositions, and are such that if Pr(H|O&K) = r, then our degree of belief in H given that our evidence is O&K ought to be set equal to r. Thus, the sort of Bayesianism assumed here is one in between the purely subjective and the purely objective view. The objectivity of epistemic probabilities is supposed to be analogous to the way in which the entailment relation between propositions is objective.
My response to R&S is thus similar in kind, though different in its details, to that which has already been explored by McCain and Poston (2014, 2018), who can also be interpreted as rejecting P1 of the New Argument. I’ll have more to say about McCain and Poston’s response to R&S and its relation to my main argument below.
Although no explicit discussion of the different versions of IBE has hitherto appeared in connection with the debate over SOT, it should be noted that at one point R&S briefly allude to a weaker version of explanationism, admitting that their argument will not, of course, undermine views according to which “IBE is entirely parasitic on a Bayesian calculation of posterior probabilities” (Roche and Sober 2013, p. 665, fn. 3).
Given the many different ways in which explanatory reasoning manifests itself, clearly this task is easier said than done. Roche and Sober (2014, p. 195) attempt to dispatch the Einstein and Newton cases put forward by M&P by pointing out that these are cases of hypothetico-deductive reasoning, which can be given a straightforward Bayesian rationale; however, since Lavoisier’s reasoning makes essential reference to some form of simplicity, it is not obvious how to analyze this instance of explanatory reasoning in non-explanationist terms. See Cabrera (2017) for a discussion of the relationship between Bayesianism and the various explanatory virtues.
See Roche and Sober (2014, pp. 196–7) for a response to M&P’s proposal regarding the connection between explanatory considerations and the resiliency of a probability function. Roche and Sober (2014, p. 197) remark that their SOT is “neutral on M&P’s thesis regarding explanatoriness and evidential relevance.” It is likely that they will respond similarly to the proposal I develop below. However, one worry is that if R&S admit other senses in which explanatoriness can be evidentially relevant, then the slogan that is used to characterize their thesis may end up misleading.
In order to avoid trivial falsity, A and B need to be logically distinct event-types. For instance, if A = “is a bachelor” and B = “is an adult, unmarried, male”, then, obviously, one should infer B on the basis of A, even though there is no causal connection between A and B. In this case, the connection between A and B is logical rather than causal; knowledge of this logical connection ensures that the inference from A to B is rational.
Drawing on Steel (2003), Climenhaga considers the possibility that time could be a common cause of the correlation between British bread prices and Venetian sea-levels (2017, p. 364); however, it is unclear if time itself can stand in causal relations, given the plausible assumption that causation is a relation between events. On a standard account of events (e.g., Kim 1993), events are objects instantiating a property at a time. Since time is necessary but not sufficient for being an event, it is unclear how time itself could be the cause of anything.
Something like the CCR may be what McCain and Poston have in mind when they write that the inference in the smoking-and-cancer cases is justified only if we have a “justified belief in an unknown explanatory story” (2014, p. 150).
References
Arntzenius, F. (2010). “Reichenbach’s Common Cause Principle,” The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.), URL = https://plato.stanford.edu/archives/fall2010/entries/physics-Rpcc. Accessed 13 Aug 2020.
Beard, M. (1986). Cicero and divination: the formation of a Latin discourse. Journal of Roman Studies, 76, 33–46.
Cabrera, F. (2017). Can there be a Bayesian explanationism? On the prospects of a productive partnership. Synthese, 194(4), 1245–1272.
Cabrera, F. (2020). Evidence and explanation in Cicero’s on divination. Studies in History and Philosophy of Science Part A, 82, 34–43.
Calude, C. S., & Longo, G. (2016). The deluge of spurious correlations in big data. Foundations of Science, 1–18. https://doi.org/10.1007/s10699-016-9489-4.
Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago Press.
Cicero. (1923). On Old Age. On Friendship. On Divination. Translated by W. A. Falconer. Loeb Classical Library 154. Cambridge: Harvard University Press.
Climenhaga, N. (2017). How explanation guides confirmation. Philosophy of Science, 84, 359–368.
Conee, E., & Feldman, R. (2004). Evidentialism: essays in epistemology. Oxford University Press.
Darwin, C. (1872). On the origin of species. London: John Murray.
Douven, I. (2011). “Abduction”, The Stanford Encyclopedia of Philosophy (spring 2011 edition), ed. E. N. Zalta, URL = http://plato.stanford.edu/archives/spr2011/entries/abduction/. Accessed 13 Aug 2020.
Gauch, H. G. (2012). Scientific method in brief. Cambridge: Cambridge University Press.
Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.
Harman, G. (1986). Change in view: Principles of reasoning. MIT Press.
Hey, T., Tansley, S., & Tolle, K. (2009). Jim Gray on eScience: a transformed scientific method. In T. Hey, S. Tansley, & K. Tolle (Eds.), The Fourth Paradigm: Data-Intensive Scientific Discovery (pp. xvii–xxxi). Redmond: Microsoft Research.
Keynes, J. (1921). A treatise on probability. London: Macmillan.
Kim, J. (1993). Supervenience and mind: Selected philosophical essays. New York: Cambridge University Press.
Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 1–12.
Leonelli, S. (2012). Introduction: making sense of data-driven research in the biological and biomedical sciences. Studies in History and Philosophy of Biological and Biomedical Sciences, 43(1), 1–3.
Lewis, D. (1980). A subjectivist’s guide to objective chance. The University of Western Ontario Series in Philosophy of Science, 15, 267–297.
Lipton, P. (2004). Inference to the best explanation (2nd ed.). New York: Routledge.
Lycan, W. G. (1988). Judgement and justification. Cambridge: Cambridge University Press.
Lycan, W. G. (2002). Explanation and epistemology. In P. Moser (Ed.), The Oxford handbook of epistemology. Oxford: Oxford University Press.
Mayer-Schonberger, V., & Cukier, K. (2013). Big data: a revolution that will transform how we live, work and think. London: John Murray Publisher.
Mazzocchi, F. (2015). Could big data be the end of theory in science? A few remarks on the epistemology of data-driven science. EMBO Reports, 16(10), 1250–1255.
McCain, K., & Poston, T. (2014). Why explanatoriness is evidentially relevant. Thought, 3, 145–153.
McCain, K., & Poston, T. (2018). The evidential impact of explanatory considerations. In McCain & Poston (Eds.), Best Explanations: New Essays on Inference to the Best Explanation (pp. 121–129). Oxford: Oxford University Press.
Mittelstadt, B., & Floridi, L. (2016). The ethics of big data: current and foreseeable issues in biomedical contexts. Science and Engineering Ethics, 22(2), 303–341.
Northcott, R. (2020). Big data and prediction: four case studies. Studies in History and Philosophy of Science Part A, 81, 96–104.
Norton, J. D. (2003). Causation as folk science. Philosophers’ Imprint, 3(4), 1–22.
Okasha, S. (2000). Van Fraassen’s critique of inference to the best explanation. Studies in History and Philosophy of Science, 31, 691–710.
Pietsch, W. (2016). The causal nature of modeling with big data. Philosophy and Technology, 29(2), 137–171.
Pietsch, W., & Wernecke, J. (2017). Introduction: ten theses on big data and computability. In W. Pietsch, J. Wernecke, & M. Ott (Eds.), Berechenbarkeit der Welt? Wiesbaden: Springer VS.
Poston, T. (2014). Reason & explanation: a defense of explanatory coherentism. New York: Palgrave-MacMillan.
Psillos, S. (2002). Simply the best: A case for abduction. In A. C. Kakas & F. Sadri (Eds.), Computational logic: logic programming and beyond (pp. 605–626). Berlin: Springer-Verlag.
Reichenbach, H. (1956). The direction of time. Berkeley: University of California Press.
Roche, W., & Sober, E. (2013). Explanatoriness is evidentially irrelevant, or inference to the best explanation meets Bayesian confirmation theory. Analysis, 73, 659–668.
Roche, W., & Sober, E. (2014). Explanatoriness and evidence: a reply to McCain and Poston. Thought, 3(3), 193–199.
Roche, W., & Sober, E. (2017). Is explanatoriness a guide to confirmation? A reply to Climenhaga. Journal for General Philosophy of Science, 48(4), 581–590.
Siegel, E. (2013). Predictive analytics: the power to predict who will click, buy, lie, or die. Hoboken: Wiley.
Sober, E. (2001). Venetian sea levels, British bread prices, and the principle of the common cause. British Journal for the Philosophy of Science, 52(2), 331–346.
Sober, E. (2002). “Bayesianism—its scope and limits” in R. Swinburne, (ed.), Bayes’ Theorem, Proceedings of the British Academy Press, 113: 21–38.
Steel, D. (2003). Making time stand still: a response to Sober’s counter-example to the principle of the common cause. The British Journal for the Philosophy of Science, 54(2), 309–317.
Strevens, M. (2006). Explanation. In D. M. Borchert (Ed.), Encyclopedia of philosophy (2nd ed., pp. 518–527). Detroit: Macmillan.
Thagard, P. (1978). The best explanation: criteria for theory choice. The Journal of Philosophy, 75(2), 76–92.
van Fraassen, B. C. (1989). Laws and Symmetry. Oxford: Oxford University Press.
Acknowledgements
I am grateful to Elliott Sober and to the audience at the 2018 Meeting of the Philosophy of Science Association for comments on earlier drafts of this paper. In addition, I am grateful to the anonymous reviewers at Philosophy & Technology for many helpful comments and suggestions.
Cite this article
Cabrera, F. The Fate of Explanatory Reasoning in the Age of Big Data. Philos. Technol. 34, 645–665 (2021). https://doi.org/10.1007/s13347-020-00420-9