When statistical evidence is not specific enough

  • Original Research
Synthese

Abstract

Many philosophers have pointed out that statistical evidence, or at least some forms of it, lacks desirable epistemic or non-epistemic properties, and that this should make us wary of litigation in which the case against the defendant rests in whole or in part on statistical evidence. Others have responded that such broad reservations about statistical evidence are overly restrictive, since appellate courts have expressed nuanced views about statistical evidence. In an effort to clarify and reconcile, I put forward an interpretive analysis of why statistical evidence should raise concerns in some cases but not others. I argue that when there is a mismatch between the specificity of the evidence and the expected specificity of the accusation, statistical evidence, like any other kind of evidence, should be considered insufficient to sustain a conviction. I rely on different stylized court cases to illustrate the explanatory power of this analysis.

Notes

  1. See, for example, Tribe (1971), Nesson (1979), Cohen (1977), Thomson (1986), Wasserman (1991), Stein (2005), Ho (2008), Enoch et al. (2012), Cheng (2013), Buchak (2014), Pritchard (2015), Blome-Tillmann (2015), Nunn (2015), Staffel (2016), Smith (2018), Pundik (2017), Littlejohn (2020), Gardiner (2018), Di Bello (2019), Dahlman (2020), Bolinger (2020), Moss (2021), Nelkin (forthcoming).

  2. See, for example, Kaye (1979), Schmalbeck (1986), Schoeman (1987), Allen and Leiter (2001), Schauer (2003), Redmayne (2015), Hedden and Colyvan (2019), Krauss (2020), Ross (2021), Papineau (2021).

  3. For a representative set of cases, see Malcom (2008), Roth (2010).

  4. For an earlier version of this specificity argument, see Chapter 7 of Di Bello (2013).

  5. On the distinction, see the helpful remarks by Picinali (2016).

  6. See, for example, the discussion in Di Bello and O’Neil (2020). For a helpful review of the case law in the United States on the question of admissibility and statistical evidence, see Koehler (2002).

  7. For variations of this scenario, see Nesson (1979) and Cohen (1977).

  8. Psychologists and experimental philosophers have examined the descriptive puzzle, sometimes called the Wells effect; see, for example, Wells (1992), Niedermeier et al. (1999), Ebert et al. (2018).

  9. For a critique of the use of hypothetical scenarios in theorizing about evidence law, see Allen (2021).

  10. Some scholars speak of the reasonable completeness of the evidence, or the lack thereof. Kaye (1986) gives an anecdotal example of a drunk driving case in which the prosecutor called the arresting officer to the stand. The officer testified about the smell and breath of the man arrested, as well as the results of a field sobriety test. The incriminating evidence did not, however, include the results of a breathalyzer. Kaye notes that the jurors expected to hear about that type of evidence but did not, and thus acquitted the defendant. A similar idea is invoked by Judge Richard Posner in Howard v. Wal-Mart Stores, Inc., 160 F.3d 358 (7th Cir. 1998). Nance (2016) gives a more detailed account of what it means for the evidence to be reasonably complete: it should be all the evidence that someone tasked with making a decision at trial would reasonably expect to see from a conscientious investigation of the facts in the type of case at hand.

  11. The literature has refined this point in different ways, drawing on epistemological and moral considerations. According to Thomson (1986), the 99-to-1 statistics are not causally connected, in the appropriate manner, with the facts of the case, while other forms of evidence, say eyewitness testimony, typically are. Colyvan et al. (2001) and Allen and Pardo (2007) emphasize the reference class problem, which tends to affect inferences made on the basis of quantitative information. Enoch et al. (2012) and Pritchard (2015) argue that naked statistics are not specific because they lack the modal properties of sensitivity and safety. Wasserman (1991) and Pundik (2017) point out that convictions are attributions of individual (as opposed to group-based) culpability, and naked statistical evidence conflicts with this assumption since it describes group-level features.

  12. The debate in the literature about the meaning of ‘proof beyond a reasonable doubt’ is ongoing. For helpful discussions, see among others Weinstein and Dewsbury (2006), Laudan (2011), Gardiner (2019).

  13. See, for example, Stein (2005), Ho (2008), Allen (2010), Haack (2014), Gardiner (2019), Moss (2021).

  14. Schauer (2021) makes a similar point and draws a connection to the so-called problem of aggregation as discussed in Porat and Posner (2012).

  15. I am indebted to the ‘story model’ of judicial decision-making; see Pennington and Hastie (1991), Urbaniak (2018), van Koppen and Mackor (2020). See, however, Allen and Pardo (2019) who prefer the notion of ‘explanation.’

  16. Fitelson (2006) offers an excellent discussion of many probability-based accounts of evidential support.

  17. On informativeness, see Carnap and Bar-Hillel (1952), Groenendijk and Stokhof (1997) and Floridi (2004).

  18. For more on this point, see Duff (2001), Burns (2004). That criminal accusations should usually be specific is a peculiarity of criminal trials. There are many situations in which the same degree of detail is not required. Consider a bank that gives out loans. The bank uses a scoring system that draws on factors such as assets, credit history and job security. Risk scores track the likelihood of default if a loan is granted. Scores range between 0 (zero likelihood of default) and 100 (certain default). Suppose Tera needs a loan to purchase a home. She applies for a loan and is assigned a risk score of 3. The bank deems her at low enough risk of default and grants her the loan. The bank can do business without settling the question whether Tera—specifically—will default or not. If the scoring system is trustworthy, Tera’s default is 3% likely. This is all the bank needs to know. Since the bank expects that 3 out of 100 people with a risk score of 3 will default, it will adjust the interest rate appropriately. So banks—just like casinos and insurance companies—do not need individualized information. They make decisions about long-run patterns. So they can routinely rely on the relevant statistical or actuarial information without problem.

  19. On the reliability of eyewitness testimony, see Loftus (1996), Wixted and Wells (2017).

  20. The distinction between the informativeness of the testimony and the error-statistics used to qualify its reliability helps to see why the statistics in the worker scenario are not specific enough. Suppose the first-order evidence in the worker case is the statement ‘I saw the worker in the factory where the guard was killed’. Further, suppose the 99-to-1 statistics are used to qualify this testimony. They offer the additional information that 99 of the 100 workers in the factory killed the guard. For a similar analysis of naked statistical evidence, see Dahlman (2020). This formulation makes clear that the first-order information bearing on the incident, namely the testimony that the worker was in the factory, cannot be specific enough to sustain a homicide accusation. Adding the 99-to-1 statistics does not change this fact. Likewise, adding error statistics does not change the fact that the assertion ‘I saw the defendant run along 5th Avenue’ is not specific enough to sustain, say, a robbery charge.

  21. For similar examples, see Pundik (2017) and Gardiner (2020).

  22. For court cases along these lines, see Fienberg and Kaye (1991), Meester et al. (2006).

  23. For an analogous case, see US v. Coscia, 866 F. 3d 782 - Court of Appeals, 7th Circuit 2017.

  24. Another factor that may affect the expected degree of specificity of the accusation is the availability of the evidence. It may be hard to reconstruct someone’s whereabouts in detail if it is difficult to find adequate evidence. Consider a tort case, leaving the criminal realm for a moment. Say a person has developed cancer while working in a field and using a chemical pesticide. Epidemiological studies show that repeated exposure to the pesticide increases one’s likelihood of cancer. Should the company that marketed the pesticide be held responsible? This is a difficult question. But think how hard it would be to hold the company responsible if the requirement was for the plaintiff to show event-specific or individualized causation—i.e. that the pesticide caused the plaintiff’s cancer in that particular case. Or consider again the trader example. It would be unreasonable to expect the prosecutor to offer evidence showing what the trader did each time and with what purpose. It would be unreasonable because such evidence would normally not be available. So the degree of specificity of the accusation will also depend on what evidence one would reasonably expect to see given the circumstances of a case.

  25. Among forensic scientists and legal scholars, see NRC (1996), Balding and Donnelly (1996), Stein (2005), Allen and Pardo (2007), Roth (2010), Cheng and Nunn (2016). Among philosophers, see De Macedo (2008), Enoch et al. (2012), Pritchard (2015), Smith (2018), Mayo (2018).

  26. On the question of uniqueness of genetic profiles, see Kaye (2013).

  27. For a comparison between fingerprint evidence and genetic evidence, see Zabell (2005).
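
The actuarial reasoning in note 18 (the bank that lends to Tera at a risk score of 3) can be illustrated with a short sketch. Only the risk score of 3 and the pool of 100 borrowers come from the note. The function names, the one-period loan, and the total-loss-on-default assumption are hypothetical simplifications introduced here for illustration, not part of the original argument.

```python
def expected_defaults(risk_score: float, pool_size: int) -> float:
    """Expected number of defaults among borrowers who share a risk score.

    A risk score of s is read, as in note 18, as an s% likelihood of default.
    """
    return risk_score * pool_size / 100


def break_even_rate(risk_score: float) -> float:
    """Interest rate at which repayments offset expected losses.

    Hypothetical simplification: a one-period loan where a defaulting
    borrower repays nothing. Break-even requires
        (1 - p) * (1 + r) = 1, hence r = p / (1 - p).
    """
    p = risk_score / 100
    if not 0 <= p < 1:
        raise ValueError("risk score must be in [0, 100)")
    return p / (1 - p)


tera_score = 3
print(expected_defaults(tera_score, 100))   # expected defaults in a pool of 100
print(round(break_even_rate(tera_score), 4))  # break-even rate under the assumptions above
```

Neither function says anything about whether Tera herself will default. Both operate on the long-run pattern for borrowers with her score, which is the sense in which the bank, unlike a criminal court, does not need individualized information.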

References

  • Allen, R. J. (2010). No Plausible Alternative to a Plausible Story of Guilt as the Rule of Decision in Criminal Cases. In J. Cruz & L. Laudan (Eds.), Prueba y Estándares de Prueba en el Derecho. Instituto de Investigaciones Filosóficas-UNAM.

  • Allen, R. J. (2021). Naturalized epistemology and the law of evidence revisited. Quaestio Facti, 2, 253–284.

  • Allen, R. J., & Leiter, B. (2001). Naturalized epistemology and the law of evidence. Virginia Law Review, 87, 1491–1550.

  • Allen, R. J., & Pardo, M. S. (2007). The problematic value of mathematical models of evidence. Journal of Legal Studies, 36(1), 107–140.

  • Allen, R. J., & Pardo, M. S. (2019). Relative plausibility and its critics. International Journal of Evidence and Proof, 23(1/2), 5–59.

  • Balding, D. J., & Donnelly, P. (1996). Evaluating DNA profile evidence when the suspect is identified through a database search. Journal of Forensic Science, 41(4), 603–607.

  • Blome-Tillmann, M. (2015). Sensitivity, causality, and statistical evidence in courts of law. Thought: A Journal of Philosophy, 4(2), 102–112.

  • Bolinger, R. J. (2020). The rational impermissibility of accepting (some) racial generalizations. Synthese, 197, 2415–2431.

  • Buchak, L. (2014). Belief, credence, and norms. Philosophical Studies, 169(2), 285–311.

  • Burns, R. P. (2004). The distinctiveness of trial narratives. In A. Duff, L. Farmer, S. Marshall, & V. Tadros (Eds.), The trial on trial (Vol. 1): Truth and due process. Hart Publishing.

  • Carnap, R., & Bar-Hillel, Y. (1952). An outline of a theory of semantic information. Technical Report MIT.

  • Cheng, E. K. (2013). Reconceptualizing the burden of proof. Yale Law Journal, 122(5), 1254–1279.

  • Cheng, E. K., & Nunn, G. A. (2016). DNA, blue bus, and phase changes. The International Journal of Evidence and Proof, 20(2), 112–120.

  • Cohen, J. L. (1977). The Probable and the Provable. Oxford University Press.

  • Colyvan, M., Regan, H. M., & Ferson, S. (2001). Is it a crime to belong to a reference class? Journal of Political Philosophy, 9(2), 168–181.

  • Dahlman, C. (2020). Naked statistical evidence and incentives for lawful conduct. International Journal of Evidence and Proof, 24(2), 162–179.

  • De Macedo, C. (2008). Guilt by statistical association: Revisiting the prosecutor’s fallacy and the interrogator’s fallacy. Journal of Philosophy, 105(5), 320–332.

  • Di Bello, M. (2013). Statistics and Probability in Criminal Trials: The Good, the Bad and the Ugly. Ph.D. thesis, Stanford University.

  • Di Bello, M. (2019). Trial by statistics: Is a high probability of guilt enough to convict? Mind, 128(512), 1045–1084.

  • Di Bello, M., & O’Neil, C. (2020). Profile evidence, fairness and the risk of mistaken convictions. Ethics, 130(2), 147–178.

  • Duff, A. (2001). Punishment, Communication and Community. Oxford University Press.

  • Ebert, P. A., Smith, M., & Durbach, I. (2018). Lottery judgments: A philosophical and experimental study. Philosophical Psychology, 31(1), 110–138.

  • Enoch, D., Fisher, T., & Spectre, L. (2012). Statistical evidence, sensitivity, and the legal value of knowledge. Philosophy and Public Affairs, 40(3), 197–224.

  • Fienberg, S. E., & Kaye, D. H. (1991). Legal and statistical aspects of some mysterious clusters. Journal of the Royal Statistical Society Series A (Statistics in Society), 154, 61–74.

  • Fitelson, B. (2006). Likelihoodism, Bayesianism, and relational confirmation. Synthese, 16(47), 1–22.

  • Floridi, L. (2004). Outline of a theory of strongly semantic information. Minds and Machines, 14, 197–222.

  • Gardiner, G. (2018). Legal burdens of proof and statistical evidence. In D. Coady & J. Chase (Eds.), Routledge Handbook of Applied Epistemology. Routledge.

  • Gardiner, G. (2019). The reasonable and the relevant: Legal standards of proof. Philosophy and Public Affairs, 47(3), 288–318.

  • Gardiner, G. (2020). Profiling and proof: Are statistics safe? Philosophy, 95(2), 161–183.

  • Groenendijk, J., & Stokhof, M. (1997). Questions. In J. van Benthem (Ed.), Handbook of Logic and Language. Elsevier and MIT Press.

  • Haack, S. (2014). Evidence Matters: Science, Proof, and Truth in the Law. Cambridge University Press.

  • Hedden, B., & Colyvan, M. (2019). Legal probabilism: A qualified defence. Journal of Political Philosophy, 27(4), 448–468.

  • Ho, H. L. (2008). Philosophy of Evidence Law. Oxford University Press.

  • Kaye, D. H. (1979). The paradox of the gatecrasher and other stories. The Arizona State Law Journal, 1979(1), 101–110.

  • Kaye, D. H. (1986). Do we need a calculus of weight to understand proof beyond a reasonable doubt? Boston University Law Review, 66, 657–672.

  • Kaye, D. H. (2013). Beyond uniqueness: the birthday paradox, source attribution and individualization in forensic science. Law, Probability and Risk, 12(1), 3–11.

  • Koehler, J. J. (2002). When do courts think base rate statistics are relevant? Jurimetrics Journal, 42, 373–402.

  • Krauss, S. F. (2020). Against the alleged insufficiency of statistical evidence. Florida State University Law Review, 47, 801–825.

  • Laudan, L. (2011). The rules of trial, political morality and the costs of error: Or, Is proof beyond a reasonable doubt doing more harm than good? In L. Green & B. Leiter (Eds.), Oxford Studies in Philosophy of Law (Vol. 1). Oxford University Press.

  • Littlejohn, C. (2020). Truth, knowledge, and the standard of proof in criminal law. Synthese, 197, 5253–5286.

  • Loftus, E. F. (1996). Eyewitness Testimony (revised edition). Harvard University Press.

  • Malcom, B. G. (2008). Convictions predicated on DNA evidence alone: How reliable evidence became infallible. Columbia Law Review, 38(2), 313–338.

  • Mayo, D. (2018). Statistical Inference as Severe Testing. Cambridge University Press.

  • Meester, R., Collins, M., Gill, R., & van Lambalgen, M. (2006). On the (ab)use of statistics in the legal case against the nurse Lucia de B. Law, Probability and Risk, 5(3–4), 233–250.

  • Moss, S. (2021). Knowledge and legal proof. In T. Szabo Gendler & J. Hawthorne (Eds.), Oxford studies in epistemology (Vol. 7). Oxford University Press.

  • Nance, D. A. (2016). The Burdens of Proof: Discriminatory Power, Weight of Evidence, and Tenacity of Belief. Cambridge University Press.

  • Nelkin, D. N. (forthcoming). Rational belief and statistical evidence: Blame, bias, and the law. In I. Douven (Ed.), The Lottery Paradox. Cambridge University Press.

  • Nesson, C. R. (1979). Reasonable doubt and permissive inferences: The value of complexity. Harvard Law Review, 92(6), 1187–1225.

  • Niedermeier, K. E., Kerr, N. L., & Messé, L. A. (1999). Jurors’ use of naked statistical evidence: Exploring bases and implications of the Wells effect. Journal of Personality and Social Psychology, 76(4), 533–542.

  • NRC. (1996). The Evaluation of Forensic DNA evidence. National Academy Press.

  • Nunn, A. G. (2015). The incompatibility of due process and naked statistical evidence. Vanderbilt Law Review, 68(5), 1407–1433.

  • Papineau, D. (2021). The disvalue of knowledge. Synthese, 198, 5311–5332.

  • Pennington, N., & Hastie, R. (1991). A cognitive theory of juror decision making: the story model. Cardozo Law Review, 13, 519–557.

  • Picinali, F. (2016). Base-rates of negative traits: Instructions for use in criminal trials. Journal of Applied Philosophy, 33(1), 69–87.

  • Porat, A., & Posner, E. (2012). Aggregation and law. Yale Law Journal, 122, 2–69.

  • Pritchard, D. (2015). Risk. Metaphilosophy, 46(3), 436–461.

  • Pundik, A. (2017). Freedom and generalisation. Oxford Journal of Legal Studies, 37(1), 189–216.

  • Redmayne, M. (2015). Character in the Criminal Trial. Oxford University Press.

  • Ross, L. (2021). Rehabilitating statistical evidence. Philosophy and Phenomenological Research, 102(1), 3–23.

  • Roth, A. (2010). Safety in numbers? Deciding when DNA alone is enough to convict. New York University Law Review, 85(4), 1130–1185.

  • Schauer, F. (2003). Profiles, Probabilities, and Stereotypes. Belknap Press.

  • Schauer, F. (2021). Statistical evidence and the problem of specification. Unpublished manuscript.

  • Schmalbeck, R. (1986). The trouble with statistical evidence. Law and Contemporary Problems, 49(3), 221–236.

  • Schoeman, F. (1987). Statistical vs. direct evidence. Noûs, 21(2), 179–198.

  • Smith, M. (2018). When does evidence suffice for conviction? Mind, 127(508), 1193–1218.

  • Staffel, J. (2016). Beliefs, buses and lotteries: Why rational belief can’t be stably high credence. Philosophical Studies, 173, 1721–1734.

  • Stein, A. (2005). Foundations of Evidence Law. Oxford University Press.

  • Thomson, J. J. (1986). Liability and individualized evidence. Law and Contemporary Problems, 49(3), 199–219.

  • Tribe, L. H. (1971). A further critique of mathematical proof. Harvard Law Review, 84(8), 1810–1820.

  • Urbaniak, R. (2018). Narration in judiciary fact-finding: a probabilistic explication. Artificial Intelligence and Law, 26(4), 345–376.

  • van Koppen, P., & Mackor, A. R. (2020). A scenario approach to the Simonshaven case. Topics in Cognitive Science, 12(4), 1132–1151.

  • Wasserman, D. T. (1991). The morality of statistical proof and the risk of mistaken liability. Cardozo Law Review, 13, 935–976.

  • Weinstein, J. B., & Dewsbury, I. (2006). Comment on the meaning of ‘proof beyond a reasonable doubt’. Law, Probability and Risk, 5(2), 167–173.

  • Wells, G. L. (1992). Naked statistical evidence of liability: Is subjective probability enough? Journal of Personality and Social Psychology, 62(5), 752–793.

  • Wixted, J. T., & Wells, G. L. (2017). The relationship between eyewitness confidence and identification accuracy: A new synthesis. Psychological Science in the Public Interest, 18(1), 10–65.

  • Zabell, S. L. (2005). Fingerprint evidence. Journal of Law and Policy, 13(1), 143–179.

Author information

Corresponding author

Correspondence to Marcello Di Bello.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the topical collection “Recent Issues in Philosophy of Statistics: Evidence, Testing, and Applications”, edited by Sorin Bangu, Emiliano Ippoliti, and Marianna Antonutti.

About this article

Cite this article

Di Bello, M. When statistical evidence is not specific enough. Synthese 199, 12251–12269 (2021). https://doi.org/10.1007/s11229-021-03331-0

  • DOI: https://doi.org/10.1007/s11229-021-03331-0
