Evidence, Explanation and Predictive Data Modelling
Predictive risk modelling is a computational method for generating probabilities that correlate events. The output of such systems is typically a statistical score derived from various related, and often arbitrary, datasets. In many cases, the information generated by such systems is treated as a form of evidence that justifies further action. This paper examines the nature of that information and compares it with more orthodox notions of evidence found in epistemology. The paper focuses on a specific example to illustrate the issues: the New Zealand Government has proposed implementing a predictive risk modelling system which purportedly identifies children at risk of a maltreatment event before the age of five. Timothy Williamson’s (2002) conception of epistemology places a requirement on knowledge that it be explanatory. Furthermore, Williamson argues that knowledge is equivalent to evidence. This approach is compared with the claim that the output of such computational systems constitutes evidence. While there may be some utility in using predictive risk modelling systems, I argue that, since no explanatory account of the output of such algorithms can be given that meets Williamson’s requirements, there is reason to doubt, on generally accepted epistemic grounds, that the resulting statistical scores constitute evidence. The algorithms employed in such systems are geared towards identifying patterns which turn out to be good correlations. However, rather than providing information about specific individuals and their exposure to risk, a more valid explanation of a high probability score is that the particular variables related to incidents of maltreatment are simply higher amongst certain subgroups in a population than amongst others.
The paper concludes that any justification of the information generated by such systems is generalised and pragmatic at best, and that applying this information to individual cases raises various ethical issues.
Keywords: Big data; Information and computing ethics; Epistemology; Explanation; Predictive modelling
I wish to acknowledge and sincerely thank Dr Brian Ballsun-Stanton who read a draft of this paper and in true ANZAC and academic spirit made many inspiring and instructive comments. I also thank the two reviewers who challenged me to clarify and think deeper about many of the issues discussed in this article. Any remaining errors are mine alone.
- Ferraris, V., Bosco, F., Cafiero, G., D’Angelo, E., & Suloyeva, Y. (2014). Defining Profiling: Working Paper. From http://www.unicri.it/special_topics/citizen_profiling/WP1_final_version_9_gennaio.pdf
- Goldman, A., & Beddor, B. (2015). Reliabilist epistemology. The Stanford Encyclopedia of Philosophy. From http://plato.stanford.edu/entries/reliabilism/
- Hempel, C. G. (1962). Deductive-nomological vs. statistical explanation. In H. Feigl & G. Maxwell (Eds.), Scientific explanation, space and time, vol. 3, Minnesota studies in the philosophy of science (pp. 98–169). Minneapolis: University of Minnesota Press.
- Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: Free Press.
- Kelly, T. (2014). Evidence. In E. N. Zalta (Ed.), The Stanford Encyclopedia of philosophy (Fall 2014 Edition). From http://plato.stanford.edu/archives/fall2014/entries/evidence.
- Kim, J. (1988). What is naturalised epistemology? In J. Tomberlin (Ed.), Philosophical perspectives 2, epistemology (pp. 381–405). Atascadero: Ridgeview Publishing Co.
- Laplace, P.-S., Marquis de. (1951). A philosophical essay on probabilities (trans: Truscott, F. W. and Emory, F. L.). New York: Dover Publications.
- O’Connor, D. (2008). Attributions and cognitive closure; stereotypes of perpetrators and victims of child sexual abuse. All Volumes (2001–2008). Paper 13. From http://digitalcommons.unf.edu/ojii_volumes/13
- Salmon, W. (1989). Four decades of scientific explanation. Minneapolis: University of Minnesota Press.
- Talbott, W. (2008). Bayesian epistemology. In E. N. Zalta (Ed.), The Stanford Encyclopedia of philosophy. From http://plato.stanford.edu/archives/sum2015/entries/epistemology-bayesian
- Vaithianathan, R., Maloney, T., Jiang, N., De Haan, I., Dale, C., Putnam-Hornstein, E., & Dare, T. (2013). Vulnerable children: Can administrative data be used to identify children at risk of adverse outcomes? Centre for Applied Research in Economics, Auckland University. Retrieved from https://www.msd.govt.nz/documents/about-msd-and-our-work/publications-resources/research/vulnerable-children/auckland-university-can-administrative-data-be-used-to-identify-children-at-risk-of-adverse-outcome.pdf
- Woodward, J. (2014). Scientific explanation. The Stanford Encyclopedia of Philosophy. From http://plato.stanford.edu/entries/scientific-explanation/#WhaDoStaTheExp
- Zdziarski, J. (2014). The importance of forensic tools validation. From http://www.zdziarski.com/blog/?p=3112