The Fate of Explanatory Reasoning in the Age of Big Data

  • Research Article
  • Philosophy & Technology

Abstract

In this paper, I critically evaluate several related, provocative claims made by proponents of data-intensive science and “Big Data” which bear on scientific methodology, especially the claim that scientists will soon no longer have any use for familiar concepts like causation and explanation. After introducing the issue, in Section 2, I elaborate on the alleged changes to scientific method that feature prominently in discussions of Big Data. In Section 3, I argue that these methodological claims are in tension with a prominent account of scientific method, often called “Inference to the Best Explanation” (IBE). Later on, in Section 3, I consider an argument against IBE that will be congenial to proponents of Big Data, namely, the argument due to Roche and Sober (Analysis, 73:659–668, 2013) that “explanatoriness is evidentially irrelevant.” This argument is based on Bayesianism, one of the most prominent general accounts of theory-confirmation. In Section 4, I consider some extant responses to this argument, especially that of Climenhaga (Philosophy of Science, 84:359–368, 2017). In Section 5, I argue that Roche and Sober’s argument does not show that explanatory reasoning is dispensable. In Section 6, I argue that there is good reason to think explanatory reasoning will continue to prove indispensable in scientific practice. Drawing on Cicero’s oft-neglected De Divinatione, I formulate what I call the “Ciceronian Causal-nomological Requirement” (CCR), which states, roughly, that causal-nomological knowledge is essential for relying on correlations in predictive inference. I defend a version of the CCR by appealing to the challenge of “spurious correlations,” chance correlations which we should not rely upon for predictive inference. In Section 7, I offer some concluding remarks.

Notes

  1. However, see Pietsch (2016), who critically evaluates some of these claims, and with whose views I am broadly sympathetic, as well as Leonelli (2012), who considers the impact of Big Data on biological practice.

  2. See Kitchin (2014, pp. 5–7) for a more extensive discussion and critical examination of such claims. One moderate and less provocative thesis that Kitchin advances is that, rather than replacing human scientists, new data-mining techniques should be used to supplement traditional scientific methods by “reveal[ing] information which will be of potential interest and is worthy of further research” (2014, p. 6). Here, Kitchin locates the primary epistemological significance of Big Data analytics in the “context of discovery” (or the “context of pursuit”) rather than the “context of justification.”

  3. See Northcott (2020) for some helpful case studies on the extent to which Big Data analytics actually improves predictive performance.

  4. In addition to Darwin’s theory of natural selection and common ancestry (Okasha 2000), this list includes the Copernican argument for the heliocentric model of the solar system (Gauch 2012), and Huygens’ argument for the wave theory of light over Newton’s particle theory (Thagard 1978).

  5. The reason that R&S choose to evaluate EER instead of the similar inequality Pr(H|O&E) > Pr(H) is that O might confirm H by itself, and E might be irrelevant to H once joined with O, neither lowering nor raising the probability of H. Thus, it is not enough to simply show that Pr(H|O&E) > Pr(H) to demonstrate that explanatoriness is evidentially relevant for the Bayesian. What must be shown is EER.

  6. This is in contrast to the “old argument” for incompatibilism laid out by van Fraassen (1989).

  7. A few points of clarification are in order. First, it should be noted that R&S do not explicitly make this argument, and so they do not defend P1 of the New Argument. Second, in their defense of SOT, R&S presuppose a Bayesian conception of evidential irrelevance; however, they leave open the possibility that “there are alternative senses of evidential irrelevance on which explanatoriness is evidentially relevant” (Roche and Sober 2017, p. 582). As we will see below, this possibility can be used to push back against the New Argument for Incompatibilism.

  8. Climenhaga explicitly puts the point in terms of “epistemic” probabilities, the sort traditionally defended by Keynes (1921) and Carnap (1950). These probability statements are said to codify supposed objective relationships between propositions, and are such that if Pr(H|O&K) = r, then our degree of belief in H given that our evidence is O&K ought to be set equal to r. Thus, the sort of Bayesianism assumed here is one in between the purely subjective and the purely objective view. The objectivity of epistemic probabilities is supposed to be analogous to the way in which the entailment relation between propositions is objective.

  9. My response to R&S is thus similar in kind, though different in its details, to that which has already been explored by McCain and Poston (2014, 2018), who can also be interpreted as rejecting P1 of the New Argument. I’ll have more to say about McCain and Poston’s response to R&S and its relation to my main argument below.

  10. Although no explicit discussion of the different versions of IBE has hitherto appeared in connection with the debate over SOT, it should be noted that at one point R&S briefly allude to a weaker version of explanationism, admitting that their argument will not, of course, undermine views according to which “IBE is entirely parasitic on a Bayesian calculation of posterior probabilities” (Roche and Sober 2013, p. 665, fn. 3).

  11. As Lycan (2002, p. 417) notes, the Ferocious view, although having its defenders, is “disputed by almost everyone.” Surprisingly, this version of explanationism has a number of proponents, including Harman (1986), Lycan (1988), Conee and Feldman (2004), and Poston (2014).

  12. Given the many different ways in which explanatory reasoning manifests itself, this task is clearly easier said than done. Roche and Sober (2014, p. 195) attempt to dispatch the Einstein and Newton cases put forward by M&P by pointing out that these are cases of hypothetico-deductive reasoning, which can be given a straightforward Bayesian rationale; however, since Lavoisier’s reasoning makes essential reference to some form of simplicity, it is not obvious how to analyze this instance of explanatory reasoning in non-explanationist terms. See Cabrera (2017) for a discussion of the relationship between Bayesianism and the various explanatory virtues.

  13. See Roche and Sober (2014, pp. 196–7) for a response to M&P’s proposal regarding the connection between explanatory considerations and the resiliency of a probability function. Roche and Sober (2014, p. 197) remark that their SOT is “neutral on M&P’s thesis regarding explanatoriness and evidential relevance.” It is likely that they will respond similarly to the proposal I develop below. However, one worry is that if R&S admit other senses in which explanatoriness can be evidentially relevant, then the slogan that is used to characterize their thesis may end up misleading.

  14. Here, I follow Beard (1986) in using “Cicero” to denote the author of the dialog and “Marcus” to denote the character in the dialog. In addition, I rely on the Loeb translation of De Divinatione (Cicero 1923). 

  15. In order to avoid trivial falsity, A and B need to be logically distinct event-types. For instance, if A = “is a bachelor” and B = “is an adult, unmarried, male”, then, obviously, one should infer B on the basis of A, even though there is no causal connection between A and B. In this case, the connection between A and B is logical rather than causal; knowledge of this logical connection ensures that the inference from A to B is rational.

  16. Drawing on Steel (2003), Climenhaga considers the possibility that time could be a common cause of the correlation between British bread prices and Venetian sea-levels (2017, p. 364); however, it is unclear if time itself can stand in causal relations, given the plausible assumption that causation is a relation between events. On a standard account of events (e.g., Kim 1993), events are objects instantiating a property at a time. Since time is necessary but not sufficient for being an event, it is unclear how time itself could be the cause of anything.

  17. Something like the CCR may be what McCain and Poston have in mind when they write that the inference in the smoking-and-cancer cases is justified only if we have a “justified belief in an unknown explanatory story” (2014, p. 150).
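
As a numerical illustration of the point made in note 5, the following Python sketch builds a toy joint distribution over H, O, and E in which O confirms H while E adds nothing once O is known. All numbers are hypothetical and chosen only for illustration, and the sketch assumes (the notes themselves do not state it) that EER is the strict inequality Pr(H|O&E) > Pr(H|O). Under these assumptions, Pr(H|O&E) > Pr(H) holds even though EER fails, which is why the weaker inequality cannot settle whether explanatoriness is evidentially relevant.

```python
from fractions import Fraction
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_H = Fraction(3, 10)                                       # prior Pr(H)
P_O_GIVEN_H = {True: Fraction(4, 5), False: Fraction(1, 5)}  # Pr(O | H), Pr(O | not-H)
P_E_GIVEN_O = {True: Fraction(1, 2), False: Fraction(1, 2)}  # Pr(E | O); E ignores H


def joint(h, o, e):
    """Probability of one complete truth-value assignment to (H, O, E)."""
    p_h = P_H if h else 1 - P_H
    p_o = P_O_GIVEN_H[h] if o else 1 - P_O_GIVEN_H[h]
    p_e = P_E_GIVEN_O[o] if e else 1 - P_E_GIVEN_O[o]
    return p_h * p_o * p_e


def pr(event, given=lambda h, o, e: True):
    """Conditional probability Pr(event | given), computed by enumeration."""
    worlds = list(product([True, False], repeat=3))
    denominator = sum(joint(*w) for w in worlds if given(*w))
    numerator = sum(joint(*w) for w in worlds if given(*w) and event(*w))
    return numerator / denominator


pr_h = pr(lambda h, o, e: h)
pr_h_given_o = pr(lambda h, o, e: h, lambda h, o, e: o)
pr_h_given_oe = pr(lambda h, o, e: h, lambda h, o, e: o and e)

print(pr_h, pr_h_given_o, pr_h_given_oe)  # 3/10 12/19 12/19
print(pr_h_given_oe > pr_h)               # True: the weaker inequality holds
print(pr_h_given_oe > pr_h_given_o)       # False: EER (as assumed here) fails
```

Exact fractions are used instead of floats so that the equality Pr(H|O&E) = Pr(H|O) can be checked without rounding error.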

References

  • Arntzenius, F. (2010). “Reichenbach’s Common Cause Principle,” The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.), URL = https://plato.stanford.edu/archives/fall2010/entries/physics-Rpcc. Accessed 13 Aug 2020.

  • Beard, M. T. (1986). Cicero and divination: the formation of a Latin discourse. Journal of Roman Studies, 76, 33–46.

  • Cabrera, F. (2017). Can there be a Bayesian explanationism? On the prospects of a productive partnership. Synthese, 194(4), 1245–1272.

  • Cabrera, F. (2020). Evidence and explanation in Cicero’s on divination. Studies in History and Philosophy of Science Part A, 82, 34–43.

  • Calude, C. S., & Longo, G. (2016). The deluge of spurious correlations in big data. Foundations of Science, 1–18. https://doi.org/10.1007/s10699-016-9489-4.

  • Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago Press.

  • Cicero. (1923). On Old Age. On Friendship. On Divination. Translated by W. A. Falconer. Loeb Classical Library 154. Cambridge: Harvard University Press.

  • Climenhaga, N. (2017). How explanation guides confirmation. Philosophy of Science, 84, 359–368.

  • Conee, E., & Feldman, R. (2004). Evidentialism: essays in epistemology. Oxford University Press.

  • Darwin, C. (1872). On the origin of species. London: John Murray.

  • Douven, I. (2011). “Abduction”, The Stanford Encyclopedia of Philosophy (spring 2011 edition), ed. E. N. Zalta, URL = http://plato.stanford.edu/archives/spr2011/entries/abduction/. Accessed 13 Aug 2020.

  • Gauch, H. G. (2012). Scientific method in brief. Cambridge: Cambridge University Press.

  • Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74, 88–95.

  • Harman, G. (1986). Change in view: Principles of reasoning. MIT Press.

  • Hey, T., Tansley, S., & Tolle, K. (2009). Jim Gray on eScience: a transformed scientific method. In T. Hey, S. Tansley, & K. Tolle (Eds.), The Fourth Paradigm: Data-Intensive Scientific Discovery (pp. xvii–xxxi). Redmond: Microsoft Research.

  • Keynes, J. (1921). A treatise on probability. London: Macmillan.

  • Kim, J. (1993). Supervenience and mind: Selected philosophical essays. New York: Cambridge University Press.

  • Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 1–12.

  • Leonelli, S. (2012). Introduction: making sense of data-driven research in the biological and biomedical sciences. Studies in History and Philosophy of Biological and Biomedical Sciences, 43(1), 1–3.

  • Lewis, D. (1980). A subjectivist’s guide to objective chance. The University of Western Ontario Series in Philosophy of Science, 15, 267–297.

  • Lipton, P. (2004). Inference to the best explanation (2nd ed.). New York: Routledge.

  • Lycan, W. G. (1988). Judgement and justification. Cambridge: Cambridge University Press.

  • Lycan, W. G. (2002). Explanation and epistemology. In P. Moser (Ed.), The Oxford handbook of epistemology. Oxford: Oxford University Press.

  • Mayer-Schönberger, V., & Cukier, K. (2013). Big data: a revolution that will transform how we live, work and think. London: John Murray.

  • Mazzocchi, F. (2015). Could big data be the end of theory in science? A few remarks on the epistemology of data-driven science. EMBO Reports, 16(10), 1250–1255.

  • McCain, K., & Poston, T. (2014). Why explanatoriness is evidentially relevant. Thought, 3, 145–153.

  • McCain, K., & Poston, T. (2018). The evidential impact of explanatory considerations. In McCain & Poston (Eds.), Best Explanations: New Essays on Inference to the Best Explanation (pp. 121–129). Oxford: Oxford University Press.

  • Mittelstadt, B., & Floridi, L. (2016). The ethics of big data: current and foreseeable issues in biomedical contexts. Science and Engineering Ethics, 22(2), 303–341.

  • Northcott, R. (2020). Big data and prediction: four case studies. Studies in History and Philosophy of Science Part A, 81, 96–104.

  • Norton, J. D. (2003). Causation as folk science. Philosophers’ Imprint, 3(4), 1–22.

  • Okasha, S. (2000). Van Fraassen’s critique of inference to the best explanation. Studies in History and Philosophy of Science, 31, 691–710.

  • Pietsch, W. (2016). The causal nature of modeling with big data. Philosophy & Technology, 29(2), 137–171.

  • Pietsch, W., & Wernecke, J. (2017). Introduction: ten theses on big data and computability. In W. Pietsch, J. Wernecke, & M. Ott (Eds.), Berechenbarkeit der Welt? Wiesbaden: Springer VS.

  • Poston, T. (2014). Reason & explanation: a defense of explanatory coherentism. New York: Palgrave Macmillan.

  • Psillos, S. (2002). Simply the best: A case for abduction. In A. C. Kakas & F. Sadri (Eds.), Computational logic: logic programming and beyond (pp. 605–626). Berlin: Springer-Verlag.

  • Reichenbach, H. (1956). The direction of time. Berkeley: University of California Press.

  • Roche, W., & Sober, E. (2013). Explanatoriness is evidentially irrelevant, or inference to the best explanation meets Bayesian confirmation theory. Analysis, 73, 659–668.

  • Roche, W., & Sober, E. (2014). Explanatoriness and evidence: a reply to McCain and Poston. Thought, 3(3), 193–199.

  • Roche, W., & Sober, E. (2017). Is explanatoriness a guide to confirmation? A reply to Climenhaga. Journal for General Philosophy of Science, 48(4), 581–590.

  • Siegel, E. (2013). Predictive analytics: the power to predict who will click, buy, lie, or die. Hoboken: Wiley.

  • Sober, E. (2001). Venetian sea levels, British bread prices, and the principle of the common cause. British Journal for the Philosophy of Science, 52(2), 331–346.

  • Sober, E. (2002). “Bayesianism—its scope and limits” in R. Swinburne, (ed.), Bayes’ Theorem, Proceedings of the British Academy Press, 113: 21–38.

  • Steel, D. (2003). Making time stand still: a response to Sober’s counter-example to the principle of the common cause. The British Journal for the Philosophy of Science, 54(2), 309–317.

  • Strevens, M. (2006). Explanation. In D. M. Borchert (Ed.), Encyclopedia of philosophy (2nd ed., pp. 518–527). Detroit: Macmillan.

  • Thagard, P. (1978). The best explanation: criteria for theory choice. The Journal of Philosophy, 75(2), 76–92.

  • van Fraassen, B. C. (1989). Laws and Symmetry. Oxford: Oxford University Press.

Acknowledgements

I am grateful to Elliott Sober and to the audience at the 2018 Meeting of the Philosophy of Science Association for comments on earlier drafts of this paper. In addition, I am grateful to the anonymous reviewers at Philosophy & Technology for many helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Frank Cabrera.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cabrera, F. The Fate of Explanatory Reasoning in the Age of Big Data. Philos. Technol. 34, 645–665 (2021). https://doi.org/10.1007/s13347-020-00420-9

  • DOI: https://doi.org/10.1007/s13347-020-00420-9
