## Abstract

It is well known that Bayes’ theorem (with likelihood ratios) can be used to calculate the impact of evidence, such as a ‘match’ of some feature of a person. Typically the feature of interest is the DNA profile, but the method applies in principle to any feature of a person or object, including not just DNA, fingerprints, or footprints, but also more basic features such as skin colour, height, hair colour or even name. Notwithstanding concerns about the extensiveness of databases of such features, a serious challenge to the use of Bayes in such legal contexts is that its standard formulaic representations are not readily understandable to non-statisticians. Attempts to get round this problem usually involve representations based around some variation of an event tree. While this approach works well in explaining the most trivial instance of Bayes’ theorem (involving a single hypothesis and a single piece of evidence) it does not scale up to realistic situations. In particular, even with a single piece of match evidence, if we wish to incorporate the possibility that there are potential errors (both false positives and false negatives) introduced at any stage in the investigative process, matters become very complex. As a result we have observed expert witnesses (in different areas of speciality) routinely ignore the possibility of errors when presenting their evidence. To counter this, we produce what we believe is the first full probabilistic solution of the simple case of generic match evidence incorporating both classes of testing errors. Unfortunately, the resultant event tree solution is too complex for intuitive comprehension. And, crucially, the event tree also fails to represent the causal information that underpins the argument. In contrast, we also present a simple-to-construct graphical Bayesian Network (BN) solution that automatically performs the calculations and may also be intuitively simpler to understand. 
Although there have been multiple previous applications of BNs for analysing forensic evidence, including very detailed models for the DNA matching problem, these models have not widely penetrated the expert witness community. Nor have they addressed the basic generic match problem incorporating the two types of testing error. Hence we believe our basic BN solution provides an important mechanism for convincing experts, and eventually the legal community, that it is possible to rigorously analyse and communicate the full impact of match evidence on a case, in the presence of possible errors.

## Notes

Note that even if the suspect is determined to have feet requiring size 13 *or* 14 shoes, we would still refer to it as a ‘match’; thus, we deliberately avoid using the term ‘consistent with’ even though forensic scientists typically use that expression rather than ‘match’ in such situations. The distinction between ‘match’ and ‘consistent with’ is actually artificial and leads to much confusion since it suggests, wrongly, that a ‘match’ is somehow unique. Even using the term ‘exact match’ to distinguish “14” from “13 or 14” is potentially misleading because again it wrongly implies uniqueness.

Although the likelihoods P(*E*|*H*) and P(*E*|not *H*) are independent of the value of the prior P(*H*), they must take account of the same background knowledge that is implicit in the prior. For example, suppose that the prior P(*H*) = 0.5 is based on the background knowledge that the defendant was one of only two men known to be at the scene of the crime and both men were a similar large size. Then if *E* is a matching shoe size 12, P(*E*|not *H*) is certainly not the random match probability. In fact, in this case P(*E*|not *H*), like P(*E*|*H*), will be close to 1.

The odds of any hypothesis *H* (in this case the prosecution hypothesis) is simply the ratio of the probability of *H* over the probability of the negation of *H* (i.e. the defence hypothesis in this case). So the prior odds is just P(*H*) divided by P(not *H*) and the posterior odds of *H* is just P(*H*|*E*) divided by P(not *H*|*E*). Odds can easily be transformed into probabilities: specifically, if the odds are *x* to *y* for hypothesis *H* over not *H* then the probability of *H* is *x*/(*x* + *y*) and the probability of not *H* is *y*/(*x* + *y*). So odds of 100 to 1 in favour of *H* means the probability of *H* is 100/101 and the probability of not *H* is 1/101. Also note (we will assume this later) that if the prior odds are ‘evens’, i.e. 50:50, then the posterior odds will be the same as the likelihood ratio.

It is important to note that, as explained in (Fenton et al. 2013), these crucial properties of the LR apply only when the defence hypothesis is the negation of the prosecution hypothesis *H*. Forensic scientists sometimes consider defence hypotheses that are not the negation of *H*. In such circumstances the LR is somewhat meaningless as it tells us nothing about the probative value of the evidence. Moreover (Fenton et al. 2013) also showed that even when *H* and not *H* *are* used, the LR may tell us nothing about the probative value of *E* on some other hypothesis relevant to a case. In particular, this means that evidence *E* with an LR of one may still be probative elsewhere.

We use the term ‘full branch’ instead of ‘posterior branch’ because the term ‘posterior probability’ technically applies to the conditional probabilities \( P(H|e) \) and \( P(\neg H|e) \), where *H* is the prosecution hypothesis, ¬*H* is the defence hypothesis, and *e* is the evidence. In contrast, the probabilities of the ‘full branches’ are actually the respective joint probabilities \( P(H,e) \) and \( P(\neg H,e) \). Because \( P(H|e) = P(H,e)/P(e) \), the posterior odds can be equivalently written as the ratio of the posterior probabilities or the ratio of the joint probabilities. That is: \( \frac{P(H|e)}{P(\neg H|e)} = \frac{P(H|e)P(e)}{P(\neg H|e)P(e)} = \frac{P(H,e)}{P(\neg H,e)} \).

In fact, a Bayesian network is the most tractable way of calculating complex statistical problems for which brute-force equation-based calculations become unwieldy and even intractable. Note, the results of a Bayesian network will be mathematically equivalent to the formal manual derivations for discrete variables.
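The note on ‘full branches’ states that the posterior odds equal the ratio of the joint probabilities. A minimal Python sketch of that identity follows. The function name `branch_probabilities` and all numerical inputs are hypothetical, chosen only for illustration; they are not taken from the paper's model and do not include any testing-error terms.

```python
from fractions import Fraction


def branch_probabilities(prior_h, p_e_given_h, p_e_given_not_h):
    """Return the joint ('full branch') probabilities P(H, e) and P(not H, e)."""
    joint_h = prior_h * p_e_given_h
    joint_not_h = (1 - prior_h) * p_e_given_not_h
    return joint_h, joint_not_h


# Hypothetical inputs (assumptions for illustration only).
prior_h = Fraction(1, 2)            # 50:50 prior
p_e_given_h = Fraction(1)           # P(e | H)
p_e_given_not_h = Fraction(1, 100)  # P(e | not H)

joint_h, joint_not_h = branch_probabilities(prior_h, p_e_given_h, p_e_given_not_h)

# P(e) is the sum of the two full branches that contain e.
p_e = joint_h + joint_not_h

# Posterior probabilities: P(H | e) = P(H, e) / P(e), and likewise for not H.
posterior_h = joint_h / p_e
posterior_not_h = joint_not_h / p_e

odds_from_posteriors = posterior_h / posterior_not_h
odds_from_joints = joint_h / joint_not_h
likelihood_ratio = p_e_given_h / p_e_given_not_h

print(odds_from_posteriors, odds_from_joints, likelihood_ratio)
```

Exact fractions are used so that the two ways of computing the odds can be compared without rounding. With the 50:50 prior assumed here, both also coincide with the likelihood ratio.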

Recall that, by assuming a 50:50 prior, we know that the posterior odds are equal to the likelihood ratio.

The likelihood ratio is 100, meaning equivalently that the probability the prosecution hypothesis is true is 100/101 = 99.01 %.

The likelihood ratio is 65/35, meaning equivalently that the probability the prosecution hypothesis is true is 65 %.
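The conversions quoted in these notes (odds of *x* to *y* giving a probability of *x*/(*x* + *y*)) can be reproduced with a short Python sketch. The helper name `odds_to_probability` is an assumption introduced here for illustration; the two odds values are the ones stated in the notes, read as posterior odds under a 50:50 prior.

```python
from fractions import Fraction


def odds_to_probability(x, y):
    """Convert odds of x to y for H over not H into (P(H), P(not H))."""
    x, y = Fraction(x), Fraction(y)
    total = x + y
    return x / total, y / total


# Odds of 100 to 1 in favour of H.
p_h, p_not_h = odds_to_probability(100, 1)
percent_h = round(float(p_h) * 100, 2)

# Odds of 65 to 35 in favour of H.
p_h_65, p_not_h_65 = odds_to_probability(65, 35)

print(p_h, p_not_h, percent_h)
print(p_h_65, p_not_h_65)
```

The function assumes non-negative odds that are not both zero; it does no input validation.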

## References

ABS Consulting (2002) Marine safety: tools for risk-based decision making, Government Institutes

AgenaRisk software (2013). www.agenarisk.com

Aitken CGG, Taroni F (2004) Statistics and the evaluation of evidence for forensic scientists, 2nd edn. Wiley, New Jersey

Aitken C et al (2011) Expressing evaluative opinions: a position statement. Sci Justice 51(1):1–2

Balding D (2004) Comment on: why the effect of prior odds should accompany the likelihood ratio when reporting DNA evidence. Law Probab Risk 3(1):63–64

Balding DJ (2005) Weight-of-evidence for forensic DNA profiles. Wiley, New Jersey

Bedford T, Cooke R (2001) Probabilistic risk analysis, foundations and method. Cambridge University Press, Cambridge

Berger CEH, Buckleton J, Champod C, Evett I, Jackson G (2011) Evidence evaluation: a response to the court of appeal judgement in R v T. Sci Justice 51:43–49

Bex FJ, van Koppen PJ, Prakken H, Verheij B (2010) A hybrid formal theory of arguments, stories and criminal evidence. Artif Intell Law 18(2):123–152

Broeders T (2009) Decision-making in the forensic arena. In: Kaptein H, Prakken H, Verheij B (eds) Legal evidence and proof: statistics, stories and logic. Ashgate, p 71–92

Buckleton J, Triggs CM, Walsh SJ (2005) Forensic DNA evidence interpretation. CRC Press

Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Probabilistic networks and expert systems. Springer, New York

Cowell RG, Lauritzen SL, Mortera J (2008) Probabilistic modelling for DNA mixture analysis. Forensic Sci Int Genet Suppl Series 1(1):640–642

Dawid AP (2004) Which likelihood ratio (comment on ‘why the effect of prior odds should accompany the likelihood ratio when reporting DNA evidence’). Law Probab Risk 3(1):65–71

Dawid AP, Evett IW (1997) Using a graphical model to assist the evaluation of complicated patterns of evidence. J Forensic Sci 42:226–231

Dawid AP, Mortera J, Pascali VL, Van Boxel D (2002) Probabilistic expert systems for forensic inference from genetic markers. Scand J Stat 29(4):577–595

Dawid AP, Mortera J, Vicard P (2007) Object-oriented Bayesian networks for complex forensic DNA profiling problems. Forensic Sci Int 169:195–205

Evett IW, Weir BS (1998) Interpreting DNA evidence: statistical genetics for forensic scientists. Sinauer Associates, Sunderland

Evett IW, Foreman LA, Jackson G, Lambert JA (2000) DNA profiling: a discussion of issues relating to the reporting of very small match probabilities. Crim Law Rev (May) 341–355

Fenton NE (2011) Science and law: improve statistics in court. Nature 479:36–37

Fenton N, Neil M (2010) Comparing risks of alternative medical diagnosis using Bayesian arguments. J Biomed Inform 43:485–495

Fenton NE, Neil M (2011) Avoiding legal fallacies in practice using Bayesian networks. Aust J Legal Philos 36:114–150

Fenton N, Neil M (2012) Risk assessment and decision analysis with Bayesian networks. CRC Press, Boca Raton

Fenton NE, Neil M, Lagnado D (2012) A general structure for legal arguments about evidence using Bayesian networks. Cognit Sci 37(1):61–102

Fenton NE, Berger D, Lagnado D, Neil M, Hsu A (2013) When ‘neutral’ evidence still has probative value (with implications from the Barry George Case). Sci Justice. http://dx.doi.org/10.1016/j.scijus.2013.07.002

Foreman LA, Evett IW (2001) Statistical analysis to support forensic interpretation for a new ten-locus STR profiling system. Int J Legal Med 114(3):147–155

Gigerenzer G (2002) Reckoning with risk: learning to live with uncertainty. Penguin Books, London

Gill R (2013) Forensic statistics: ready for consumption? http://www.math.leidenuniv.nl/~gill/forensic.statistics.pdf

Gittelson S, Biedermann A, Bozza S, Taroni F (2013) Modeling the forensic two-trace problem with Bayesian networks. Artif Intell Law 21:221–252

Hepler AB, Dawid AP, Leucari V (2007) Object-oriented graphical representations of complex patterns of evidence. Law Probab Risk 6(1–4):275–293

Kadane JB, Schum DA (1996) A probabilistic analysis of the Sacco and Vanzetti evidence. Wiley, New Jersey

Kaye DH (2009) Identification, individualization, uniqueness. Law Probab Risk 8(2):85–94

Kaye DH, Bernstein DE, Mnookin JL (2010) The new Wigmore: a treatise on evidence: expert evidence, 2nd edn. Aspen Publishers

Koehler JJ (1993) Error and exaggeration in the presentation of DNA evidence at trial. Jurimetrics 34:21–39

Koehler JJ (1996) On conveying the probative value of DNA evidence: frequencies, likelihood ratios and error rates. Univ Colo Law Rev 67:859–886

Koehler JJ (2012) Proficiency tests to estimate error rates in the forensic sciences. Law Probab Risk 12(1):89–98. doi:10.1093/lpr/mgs013

Meester R, Sjerps M (2004) Why the effect of prior odds should accompany the likelihood ratio when reporting DNA evidence. Law Probab Risk 3(1):51–62

Morrison GM (2012) The likelihood ratio framework and forensic evidence in court: a response to R v T. Int J Evidence Proof 16(1)

Mortera J, Dawid AP, Lauritzen SL (2003) Probabilistic expert systems for DNA mixture profiling. Theor Pop Biol 63:191–205

Nordgaard A, Hedell R, Ansell R (2012) Assessment of forensic findings when alternative explanations have different likelihoods: “Blame-the-brother” syndrome. Sci Justice 52:226–236

Puch-Solis R, Roberts P, Pope S, Aitken C (2012) Practitioner guide no 2: assessing the probative value of DNA evidence, guidance for judges, lawyers, forensic scientists and expert witnesses. Royal Statistical Society. http://www.rss.org.uk/uploadedfiles/userfiles/files/Practitioner-Guide-2-WEB.pdf

Queen Mary University of London, ERC Advanced Grant (2013) Effective Bayesian modelling with knowledge before data (BAYES-KNOWLEDGE) www.eecs.qmul.ac.uk/~norman/projects/B_Knowledge.html

R v Adams (1996) 2 Cr App R 467, [1996] Crim LR 898, CA and R v Adams [1998] 1 Cr App R 377

R v T (2009). EWCA Crim 2439 www.bailii.org/ew/cases/EWCA/Crim/2010/2439.pdf

Redmayne M (2001) Expert evidence and criminal justice. Oxford University Press, Oxford

Redmayne M, Roberts P, Aitken C, Jackson G (2011) Forensic science evidence in question. Crim Law Rev 5:347–356

Robertson B, Vignaux T (1995) Interpreting evidence: evaluating forensic science in the courtroom. Wiley, New Jersey

Robertson B, Vignaux GA, Berger CEH (2011) Extending the confusion about Bayes. Mod Law Rev 74(3):444–455

Saks MJ, Koehler JJ (2007) The individualization fallacy in forensic science evidence http://works.bepress.com/michael_saks/1

Schum DA, Starace S (2001) The evidential foundations of probabilistic reasoning. Northwestern University Press, Evanston

Shaw A (2013) Do people trust Bayesian calculations better if they are shown a simple version first? MSc Thesis, University of London

Sjerps M, Berger C (2012) How clear is transparent? Reporting expert reasoning in legal cases. Law Probab Risk 11(4):317–329

Sjerps M, Meester R (2009) Selection effects and database screening in forensic science. Forensic Sci Int 192(1–3):56–61

Taroni F, Aitken C, Garbolino P, Biedermann A (2006) Bayesian networks and probabilistic inference in forensic science. Wiley, New Jersey

Thompson WC (2008) The potential for error in forensic DNA testing (and how that complicates the use of DNA databases for criminal identification). In: Council for Responsible Genetics (CRG) national conference: forensic DNA databases and race: issues, abuses and actions, June 19–20, 2008, New York University. http://www.councilforresponsiblegenetics.org/pageDocuments/H4T5EOYUZI.pdf

Thompson WC, Taroni F, Aitken CGG (2003) How the probability of a false positive affects the value of DNA evidence. J Forensic Sci 48(1):47–54

Triggs CM, Buckleton JS (2004) Comment on: why the effect of prior odds should accompany the likelihood ratio when reporting DNA evidence. Law Probab Risk 3(1):73–82

Verheij B (2007) Argumentation support software: boxes-and-arrows and beyond. Law Probab Risk 6:187–208

## Acknowledgments

We are indebted to the following for providing comments, corrections, relevant information, and contacts: David Balding, Daniel Berger, Sheila Bird, Tiernan Coyle, David Kaye, Joseph Kadane, Jay Koehler, Margarita Kotti, David Lagnado, Amber Marks, William Marsh, Geoff Morrison, Richard Nobles, David Ormerod, Mike Redmayne, David Schiff, Bill Thompson, Patricia Wiltshire.

## About this article

### Cite this article

Fenton, N., Neil, M. & Hsu, A. Calculating and understanding the value of any type of match evidence when there are potential testing errors. *Artif Intell Law* **22**, 1–28 (2014). https://doi.org/10.1007/s10506-013-9147-x

DOI: https://doi.org/10.1007/s10506-013-9147-x