Abstract
Many philosophers have pointed out that statistical evidence, or at least some forms of it, lacks desirable epistemic or non-epistemic properties, and that this should make us wary of litigation in which the case against the defendant rests in whole or in part on statistical evidence. Others have responded that such broad reservations about statistical evidence are overly restrictive since appellate courts have expressed nuanced views about statistical evidence. In an effort to clarify and reconcile these positions, I put forward an interpretive analysis of why statistical evidence should raise concerns in some cases but not others. I argue that when there is a mismatch between the specificity of the evidence and the expected specificity of the accusation, statistical evidence, like any other kind of evidence, should be considered insufficient to sustain a conviction. I rely on different stylized court cases to illustrate the explanatory power of this analysis.
Notes
See, for example, Tribe (1971), Nesson (1979), Cohen (1977), Thomson (1986), Wasserman (1991), Stein (2005), Ho (2008), Enoch et al. (2012), Cheng (2013), Buchak (2014), Pritchard (2015), Blome-Tillmann (2015), Nunn (2015), Staffel (2016), Smith (2018), Pundik (2017), Littlejohn (2020), Gardiner (2018), Di Bello (2019), Dahlman (2020), Bolinger (2020), Moss (2021), Nelkin (forthcoming).
For an earlier version of this specificity argument, see Chapter 7 of Di Bello (2013).
On the distinction, see the helpful remarks by Picinali (2016).
For a critique of the use of hypothetical scenarios in theorizing about evidence law, see Allen (2021).
Some scholars speak of reasonable completeness of the evidence or lack thereof. Kaye (1986) gives an anecdotal example of a drunk driving case in which the prosecutor called the arresting officer to the stand who testified about the smell and breath of the man arrested, as well as the results of a field sobriety test. The incriminating evidence did not, however, include the results of a breathalyzer. Kaye notes that the jurors expected to hear about that type of evidence but did not, and thus acquitted the defendant. A similar idea is invoked by Judge Richard Posner in Howard v. Wal-Mart Stores, Inc., 160 F.3d 358 (7th Cir. 1998). Nance (2016) gives a more detailed account of what it means for the evidence to be reasonably complete: it should be all the evidence that someone tasked with making a decision at trial would reasonably expect to see from a conscientious investigation of the facts in the type of case at hand.
The literature has refined this point in different ways, drawing on epistemological and moral considerations. According to Thomson (1986), the 99-to-1 statistics are not causally connected, in the appropriate manner, with the facts of the case, while other forms of evidence, say eyewitness testimony, typically are. Colyvan et al. (2001) and Allen and Pardo (2007) emphasize the reference class problem, which tends to affect inferences made on the basis of quantitative information. Enoch et al. (2012) and Pritchard (2015) argue that naked statistics are not specific because they lack the modal properties of sensitivity and safety. Wasserman (1991) and Pundik (2017) point out that convictions are attributions of individual (as opposed to group-based) culpability, and naked statistical evidence conflicts with this assumption since it describes group-level features.
Fitelson (2006) offers an excellent discussion of many probability-based accounts of evidential support.
For more on this point, see Duff (2001), Burns (2004). That criminal accusations should usually be specific is a peculiarity of criminal trials. There are many situations in which the same degree of detail is not required. Consider a bank that gives out loans. The bank uses a scoring system that draws on factors such as assets, credit history and job security. Risk scores track the likelihood of default if a loan is granted. Scores range between 0 (zero likelihood of default) and 100 (certain default). Suppose Tera needs a loan to purchase a home. She applies for a loan and is assigned a risk score of 3. The bank deems her at low enough risk of default and grants her the loan. The bank can do business without settling the question whether Tera—specifically—will default or not. If the scoring system is trustworthy, Tera’s default is 3% likely. This is all the bank needs to know. Since the bank expects that 3 out of 100 people with a risk score of 3 will default, it will adjust the interest rate appropriately. So banks—just like casinos and insurance companies—do not need individualized information. They make decisions about long-run patterns. So they can routinely rely on the relevant statistical or actuarial information without problem.
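The arithmetic behind the bank example can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the break-even calculation rests on assumptions the note does not make, namely a one-period loan on which a default means the whole principal and interest are lost.

```python
def expected_defaults(risk_score: float, n_borrowers: int) -> float:
    """Expected number of defaults among borrowers who share one risk score.

    The score is read as a percentage likelihood of default, as in the note:
    0 means zero likelihood of default and 100 means certain default.
    """
    if not 0 <= risk_score <= 100:
        raise ValueError("risk score must lie between 0 and 100")
    return n_borrowers * risk_score / 100


def break_even_rate(risk_score: float) -> float:
    """Interest rate at which expected repayments equal the amount lent.

    Assumption (not in the source): a defaulting borrower repays nothing,
    so the bank breaks even when (1 - p) * (1 + r) = 1, i.e. r = p / (1 - p).
    """
    if not 0 <= risk_score < 100:
        raise ValueError("risk score must lie in [0, 100)")
    p = risk_score / 100
    return p / (1 - p)


if __name__ == "__main__":
    # Tera's risk score is 3: about 3 of 100 similar borrowers should default.
    print(expected_defaults(3, 100))
    # Under the stated assumptions, the rate that offsets those losses.
    print(f"{break_even_rate(3):.2%}")
```

Neither function says anything about whether Tera herself will default. Both operate on the group of borrowers with the same score, which is the sense in which the bank needs no individualized information.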
The distinction between the informativeness of the testimony and the error statistics used to qualify its reliability helps to see why the statistics in the worker scenario are not specific enough. Suppose the first-order evidence in the worker case is the statement ‘I saw the worker in the factory where the guard was killed’. Further, suppose the 99-to-1 statistics are used to qualify this testimony. They offer the additional information that 99 of the 100 workers in the factory killed the guard. For a similar analysis of naked statistical evidence, see Dahlman (2020). This formulation makes clear that the first-order information bearing on the incident, namely the testimony that the worker was in the factory, cannot be specific enough to sustain a homicide accusation. Adding the 99-to-1 statistics does not change this fact. Likewise, adding error statistics does not change the fact that the assertion ‘I saw the defendant run along 5th Avenue’ is not specific enough to sustain, say, a robbery charge.
For an analogous case, see United States v. Coscia, 866 F.3d 782 (7th Cir. 2017).
Another factor that may affect the expected degree of specificity of the accusation is the availability of the evidence. It may be hard to reconstruct someone’s whereabouts in detail if it is difficult to find adequate evidence. Consider a tort case, leaving the criminal realm for a moment. Say a person has developed cancer while working in a field and using a chemical pesticide. Epidemiological studies show that repeated exposure to the pesticide increases one’s likelihood of cancer. Should the company that marketed the pesticide be held responsible? This is a difficult question. But think how hard it would be to hold the company responsible if the plaintiff were required to show event-specific or individualized causation, i.e., that the pesticide caused the plaintiff’s cancer in that particular case. Or consider again the trader example. It would be unreasonable to expect the prosecutor to offer evidence showing what the trader did each time and with what purpose, because such evidence would normally not be available. So the degree of specificity of the accusation will also depend on what evidence one would reasonably expect to see given the circumstances of a case.
On the question of uniqueness of genetic profiles, see Kaye (2013).
For a comparison between fingerprint evidence and genetic evidence, see Zabell (2005).
References
Allen, R. J. (2010). No plausible alternative to a plausible story of guilt as the rule of decision in criminal cases. In J. Cruz & L. Laudan (Eds.), Prueba y Estándares de Prueba en el Derecho. Instituto de Investigaciones Filosoficas-UNAM.
Allen, R. J. (2021). Naturalized epistemology and the law of evidence revisited. Quaestio Facti, 2, 253–284.
Allen, R. J., & Leiter, B. (2001). Naturalized epistemology and the law of evidence. Virginia Law Review, 87, 1491–1550.
Allen, R. J., & Pardo, M. S. (2007). The problematic value of mathematical models of evidence. Journal of Legal Studies, 36(1), 107–140.
Allen, R. J., & Pardo, M. S. (2019). Relative plausibility and its critics. International Journal of Evidence and Proof, 23(1/2), 5–59.
Balding, D. J., & Donnelly, P. (1996). Evaluating DNA profile evidence when the suspect is identified through a database search. Journal of Forensic Science, 41(4), 603–607.
Blome-Tillmann, M. (2015). Sensitivity, causality, and statistical evidence in courts of law. Thought: A Journal of Philosophy, 4(2), 102–112.
Bolinger, R. J. (2020). The rational impermissibility of accepting (some) racial generalizations. Synthese, 197, 2415–2431.
Buchak, L. (2014). Belief, credence, and norms. Philosophical Studies, 169(2), 285–311.
Burns, R. P. (2004). The distinctiveness of trial narratives. In A. Duff, L. Farmer, S. Marshall, & V. Tadros (Eds.), The trial on trial (Vol. 1): Truth and due process. Hart Publishing.
Carnap, R., & Bar-Hillel, Y. (1952). An outline of a theory of semantic information. Technical Report MIT.
Cheng, E. K. (2013). Reconceptualizing the burden of proof. Yale Law Journal, 122(5), 1254–1279.
Cheng, E. K., & Nunn, G. A. (2016). DNA, blue bus, and phase changes. The International Journal of Evidence and Proof, 20(2), 112–120.
Cohen, L. J. (1977). The Probable and the Provable. Oxford University Press.
Colyvan, M., Regan, H. M., & Ferson, S. (2001). Is it a crime to belong to a reference class? Journal of Political Philosophy, 9(2), 168–181.
Dahlman, C. (2020). Naked statistical evidence and incentives for lawful conduct. International Journal of Evidence and Proof, 24(2), 162–179.
De Macedo, C. (2008). Guilt by statistical association: Revisiting the prosecutor’s fallacy and the interrogator’s fallacy. Journal of Philosophy, 105(5), 320–332.
Di Bello, M. (2013). Statistics and Probability in Criminal Trials: The Good, the Bad and the Ugly. Ph.D. thesis, Stanford University.
Di Bello, M. (2019). Trial by statistics: Is a high probability of guilt enough to convict? Mind, 128(512), 1045–1084.
Di Bello, M., & O’Neil, C. (2020). Profile evidence, fairness and the risk of mistaken convictions. Ethics, 130(2), 147–178.
Duff, A. (2001). Punishment, Communication and Community. Oxford University Press.
Ebert, P. A., Smith, M., & Durbach, I. (2018). Lottery judgments: A philosophical and experimental study. Philosophical Psychology, 31(1), 110–138.
Enoch, D., Fisher, T., & Spectre, L. (2012). Statistical evidence, sensitivity, and the legal value of knowledge. Philosophy and Public Affairs, 40(3), 197–224.
Fienberg, S. E., & Kaye, D. H. (1991). Legal and statistical aspects of some mysterious clusters. Journal of the Royal Statistical Society Series A (Statistics in Society), 154, 61–74.
Fitelson, B. (2006). Likelihoodism, Bayesianism, and relational confirmation. Synthese, 16(47), 1–22.
Floridi, L. (2004). Outline of a theory of strongly semantic information. Minds and Machines, 14, 197–222.
Gardiner, G. (2018). Legal burdens of proof and statistical evidence. In D. Coady & J. Chase (Eds.), Routledge Handbook of Applied Epistemology. Routledge.
Gardiner, G. (2019). The reasonable and the relevant: Legal standards of proof. Philosophy and Public Affairs, 47(3), 288–318.
Gardiner, G. (2020). Profiling and proof: Are statistics safe? Philosophy, 95(2), 161–183.
Groenendijk, J., & Stokhof, M. (1997). Questions. In J. van Benthem (Ed.), Handbook of Logic and Language. Elsevier and MIT Press.
Haack, S. (2014). Evidence Matters: Science, Proof, and Truth in the Law. Cambridge University Press.
Hedden, B., & Colyvan, M. (2019). Legal probabilism: A qualified defence. Journal of Political Philosophy, 27(4), 448–468.
Ho, H. L. (2008). Philosophy of Evidence Law. Oxford University Press.
Kaye, D. H. (1979). The paradox of the gatecrasher and other stories. The Arizona State Law Journal, 1979(1), 101–110.
Kaye, D. H. (1986). Do we need a calculus of weight to understand proof beyond a reasonable doubt? Boston University Law Review, 66, 657–672.
Kaye, D. H. (2013). Beyond uniqueness: the birthday paradox, source attribution and individualization in forensic science. Law, Probability and Risk, 12(1), 3–11.
Koehler, J. J. (2002). When do courts think base rate statistics are relevant? Jurimetrics Journal, 42, 373–402.
Krauss, S. F. (2020). Against the alleged insufficiency of statistical evidence. Florida State University Law Review, 47, 801–825.
Laudan, L. (2011). The rules of trial, political morality and the costs of error: Or, Is proof beyond a reasonable doubt doing more harm than good? In L. Green & B. Leiter (Eds.), Oxford Studies in Philosophy of Law (Vol. 1). Oxford University Press.
Littlejohn, C. (2020). Truth, knowledge, and the standard of proof in criminal law. Synthese, 197, 5253–5286.
Loftus, E. F. (1996). Eyewitness Testimony (revised edition). Harvard University Press.
Malcom, B. G. (2008). Convictions predicated on DNA evidence alone: How reliable evidence became infallible. Columbia Law Review, 38(2), 313–338.
Mayo, D. (2018). Statistical Inference as Severe Testing. Cambridge University Press.
Meester, R., Collins, M., Gill, R., & van Lambalgen, M. (2006). On the (ab)use of statistics in the legal case against the nurse Lucia de B. Law, Probability and Risk, 5(3–4), 233–250.
Moss, S. (2021). Knowledge and legal proof. In T. Szabo Gendler & J. Hawthorne (Eds.), Oxford studies in epistemology (Vol. 7). Oxford University Press.
Nance, D. A. (2016). The Burdens of Proof: Discriminatory Power, Weight of Evidence, and Tenacity of Belief. Cambridge University Press.
Nelkin, D. N. (forthcoming). Rational belief and statistical evidence: Blame, bias, and the law. In I. Douven (Ed.), The Lottery Paradox. Cambridge University Press.
Nesson, C. R. (1979). Reasonable doubt and permissive inferences: The value of complexity. Harvard Law Review, 92(6), 1187–1225.
Niedermeier, K. E., Kerr, N. L., & Messé, L. A. (1999). Jurors’ use of naked statistical evidence: Exploring bases and implications of the Wells effect. Journal of Personality and Social Psychology, 76(4), 533–542.
NRC. (1996). The Evaluation of Forensic DNA Evidence. National Academy Press.
Nunn, A. G. (2015). The incompatibility of due process and naked statistical evidence. Vanderbilt Law Review, 68(5), 1407–1433.
Papineau, D. (2021). The disvalue of knowledge. Synthese, 198, 5311–5332.
Pennington, N., & Hastie, R. (1991). A cognitive theory of juror decision making: the story model. Cardozo Law Review, 13, 519–557.
Picinali, F. (2016). Base-rates of negative traits: Instructions for use in criminal trials. Journal of Applied Philosophy, 33(1), 69–87.
Porat, A., & Posner, E. (2012). Aggregation and law. Yale Law Journal, 122, 2–69.
Pritchard, D. (2015). Risk. Metaphilosophy, 46(3), 436–461.
Pundik, A. (2017). Freedom and generalisation. Oxford Journal of Legal Studies, 37(1), 189–216.
Redmayne, M. (2015). Character in the Criminal Trial. Oxford University Press.
Ross, L. (2021). Rehabilitating statistical evidence. Philosophy and Phenomenological Research, 102(1), 3–23.
Roth, A. (2010). Safety in numbers? Deciding when DNA alone is enough to convict. New York University Law Review, 85(4), 1130–1185.
Schauer, F. (2003). Profiles, Probabilities, and Stereotypes. Belknap Press.
Schauer, F. (2021). Statistical evidence and the problem of specification. Unpublished manuscript.
Schmalbeck, R. (1986). The trouble with statistical evidence. Law and Contemporary Problems, 49(3), 221–236.
Schoeman, F. (1987). Statistical vs. direct evidence. Noûs, 21(2), 179–198.
Smith, M. (2018). When does evidence suffice for conviction? Mind, 127(508), 1193–1218.
Staffel, J. (2016). Beliefs, buses and lotteries: Why rational belief can’t be stably high credence. Philosophical Studies, 173, 1721–1734.
Stein, A. (2005). Foundations of Evidence Law. Oxford University Press.
Thomson, J. J. (1986). Liability and individualized evidence. Law and Contemporary Problems, 49(3), 199–219.
Tribe, L. H. (1971). A further critique of mathematical proof. Harvard Law Review, 84(8), 1810–1820.
Urbaniak, R. (2018). Narration in judiciary fact-finding: a probabilistic explication. Artificial Intelligence and Law, 26(4), 345–376.
van Koppen, P., & Mackor, A. R. (2020). A scenario approach to the Simonshaven case. Topics in Cognitive Science, 12(4), 1132–1151.
Wasserman, D. T. (1991). The morality of statistical proof and the risk of mistaken liability. Cardozo Law Review, 13, 935–976.
Weinstein, J. B., & Dewsbury, I. (2006). Comment on the meaning of ‘proof beyond a reasonable doubt’. Law, Probability and Risk, 5(2), 167–173.
Wells, G. L. (1992). Naked statistical evidence of liability: Is subjective probability enough? Journal of Personality and Social Psychology, 62(5), 752–793.
Wixted, J. T., & Wells, G. L. (2017). The relationship between eyewitness confidence and identification accuracy: A new synthesis. Psychological Science in the Public Interest, 18(1), 10–65.
Zabell, S. L. (2005). Fingerprint evidence. Journal of Law and Policy, 13(1), 143–179.
Additional information
This article belongs to the topical collection “Recent Issues in Philosophy of Statistics: Evidence, Testing, and Applications”, edited by Sorin Bangu, Emiliano Ippoliti, and Marianna Antonutti.
Cite this article
Di Bello, M. When statistical evidence is not specific enough. Synthese 199, 12251–12269 (2021). https://doi.org/10.1007/s11229-021-03331-0
DOI: https://doi.org/10.1007/s11229-021-03331-0