Exclusion probabilities and likelihood ratios with applications to mixtures

Slooten, Klaas-Jan; Egeland, Thore

doi:10.1007/s00414-015-1217-z

Original Article
Published: 10 July 2015

International Journal of Legal Medicine, Volume 130, pages 39–57 (2016)

Klaas-Jan Slooten¹,³ & Thore Egeland²

Abstract

The statistical evidence obtained from mixed DNA profiles can be summarised in several ways in forensic casework including the likelihood ratio (LR) and the Random Man Not Excluded (RMNE) probability. The literature has seen a discussion of the advantages and disadvantages of likelihood ratios and exclusion probabilities, and part of our aim is to bring some clarification to this debate. In a previous paper, we proved that there is a general mathematical relationship between these statistics: RMNE can be expressed as a certain average of the LR, implying that the expected value of the LR, when applied to an actual contributor to the mixture, is at least equal to the inverse of the RMNE. While the mentioned paper presented applications for kinship problems, the current paper demonstrates the relevance for mixture cases, and for this purpose, we prove some new general properties. We also demonstrate how to use the distribution of the likelihood ratio for donors of a mixture, to obtain estimates for exceedance probabilities of the LR for non-donors, of which the RMNE is a special case corresponding to LR>0. In order to derive these results, we need to view the likelihood ratio as a random variable. In this paper, we describe how such a randomization can be achieved. The RMNE is usually invoked only for mixtures without dropout. In mixtures, artefacts like dropout and drop-in are commonly encountered and we address this situation too, illustrating our results with a basic but widely implemented model, a so-called binary model. The precise definitions, modelling and interpretation of the required concepts of dropout and drop-in are not entirely obvious, and we attempt to clarify them here in a general likelihood framework for a binary model.

References

1. Balding D, Buckleton J (2009) Interpreting low template DNA profiles. Forensic Sci Int Genet 4(1):1–10
2. Buckleton J, Curran J (2008) A discussion of the merits of random man not excluded and likelihood ratios. Forensic Sci Int Genet 2:343–348
3. Buckleton J, Triggs C, Walsh S (eds) (2005) Forensic DNA Evidence Interpretation. CRC Press, Florida, USA
4. Cowell R, Graversen T, Lauritzen S, Mortera J (2015) Analysis of forensic DNA mixtures with artefacts. J R Stat Soc Ser C Appl Stat 64(1):1–48
5. Curran J, Gill P, Bill M (2005) Interpretation of repeat measurement DNA evidence allowing for multiple contributors and population substructure. Forensic Sci Int 148(1):47–53
6. Dørum G, Kling D, Baeza-Richer C, Magariños MG, Sæbø S, Desmyter S, Egeland T (2014) Models and implementation for relationship problems with dropout. Int J Legal Med 129(3):411–423
7. Gill P, Gusmão L, Haned H, Mayr W, Morling N, Parson W, Prieto L, Prinz M, Schneider H, Schneider P, Weir B (2012) DNA commission of the International Society of Forensic Genetics: Recommendations on the evaluation of STR typing results that may include drop-out and/or drop-in using probabilistic methods. Forensic Sci Int Genet 6(6):679–688
8. Gill P, Haned H (2013) A new methodological framework to interpret complex DNA profiles using likelihood ratios. Forensic Sci Int Genet 7:251–263
9. Haned H, Slooten K, Gill P (2012) Exploratory data analysis for the interpretation of low template DNA mixtures. Forensic Sci Int Genet 6(6):762–774
10. Kruijver M (2015) Efficient computations with the likelihood ratio distribution. Forensic Sci Int Genet 14:116–124
11. Kruijver M, Meester R, Slooten K (2015) P-values should not be used for evaluating the strength of DNA evidence. Forensic Sci Int Genet 16:226–231
12. Nothnagel M, Schmidtke J, Krawczak M (2010) Potentials and limits of pairwise kinship analysis using autosomal short tandem repeat loci. Int J Legal Med 124(3):205–215
13. Slooten K, Meester R (2011) Forensic identification: the Island Problem and its generalizations. Statistica Neerlandica 65:202–237
14. Slooten K, Egeland T (2014) Exclusion probabilities and likelihood ratios with applications to kinship problems. Int J Legal Med 128(3):415–425
15. Slooten K, Meester R (2014) Probabilistic strategies for familial DNA searching. J R Stat Soc Ser C Appl Stat 63(3):361–384
16. Steele C, Balding D (2014) Statistical evaluation of forensic DNA profile evidence. Annual Review of Statistics and Its Application 1:361–384
17. Thompson E (2000) Statistical inference from genetic data on pedigrees. In: NSF-CBMS regional conference series in probability and statistics. JSTOR
18. Westen A, Kraaijenbrink T, de Medina AR, Harteveld J, Willemse P, Zuniga S, van der Gaag K, Weiler N, Warnaar J, Kayser M, Sijen T, de Knijff P (2014) Comparing six commercial autosomal STR kits in a large Dutch population sample. Forensic Sci Int Genet 10:55–63

Acknowledgments

The work of the second author leading to these results was financially supported by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 285487 (EUROFORGEN-NoE).

Author information

Authors and Affiliations

1. Netherlands Forensic Institute, P.O. Box 24044, 2490 AA, The Hague, The Netherlands (Klaas-Jan Slooten)
2. Norwegian University of Life Sciences, 1432, Aas, Norway (Thore Egeland)
3. VU Department of Mathematics, De Boelelaan 1081a, 1081 HV, Amsterdam, The Netherlands (Klaas-Jan Slooten)

Corresponding author

Correspondence to Thore Egeland.

Appendices

Appendix A: LR properties

In this section, we prove the properties (2.13) and (2.15). The first one is a generalization of the proof of Eq. 2.11 in [14].

$$\begin{array}{@{}rcl@{}} P(LR(\mathcal{H}_{d}) \geq x) &=& \sum\limits_{y \geq x}P(LR(\mathcal{H}_{d})=y)\\ &=& \sum\limits_{y \geq x}\frac{1}{y}P(LR(\mathcal{H}_{p})=y)\\ &=& \sum\limits_{y \geq x} \frac{1}{y}P(LR(\mathcal{H}_{p})=y \mid LR(\mathcal{H}_{p}) \geq x)\\&&P(LR(\mathcal{H}_{p}) \geq x) \\ &=& P(LR(\mathcal{H}_{p}) \geq x)E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) \geq x]. \end{array} $$

Note that we need x>0 in this equation, and that the inequalities may be replaced by strict inequalities.

As for the variance,

$$\begin{array}{@{}rcl@{}} Var(LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) >x )&=&E[LR(\mathcal{H}_{d})^{2}\mid LR(\mathcal{H}_{d}) >x]\\&&-E[LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) >x]^{2}, \end{array} $$

we compute both terms separately. First, we have

$$\begin{array}{@{}rcl@{}} E[LR(\mathcal{H}_{d}) \mid LR(\mathcal{H}_{d}) > x] &=& \sum\limits_{y>x} y P(LR(\mathcal{H}_{d})=y \mid LR(\mathcal{H}_{d}) > x) \\ &=& \sum\limits_{y>x} y \frac{P(LR(\mathcal{H}_{d})=y)}{P(LR(\mathcal{H}_{d})>x)} \\ &=& \sum\limits_{y>x}\frac{P(LR(\mathcal{H}_{p})=y)}{P(LR(\mathcal{H}_{d})>x)} \\ &=& \frac{P(LR(\mathcal{H}_{p}) >x)}{P(LR(\mathcal{H}_{d})>x)} \\ &=& \frac{1}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]}. \end{array} $$

The other term equals

$$\begin{array}{@{}rcl@{}} E[LR(\mathcal{H}_{d})^{2}\mid LR(\mathcal{H}_{d}) >x] &=& \sum\limits_{y>x} y^{2} P(LR(\mathcal{H}_{d})=y \mid LR(\mathcal{H}_{d}) > x) \\ &=& \sum\limits_{y>x} y^{2} \frac{P(LR(\mathcal{H}_{d})=y)}{P(LR(\mathcal{H}_{d})>x)}\\ &=& \sum\limits_{y>x}y\frac{P(LR(\mathcal{H}_{p})=y)}{P(LR(\mathcal{H}_{d})>x)}\\ &=& \frac{1}{P(LR(\mathcal{H}_{d})>x)}\sum\limits_{y>x}yP(LR(\mathcal{H}_{p})=y \mid LR(\mathcal{H}_{p})>x) \cdot P(LR(\mathcal{H}_{p})>x) \\ &=& \frac{P(LR(\mathcal{H}_{p})>x)}{P(LR(\mathcal{H}_{d})>x)} E[LR(\mathcal{H}_{p})\mid LR(\mathcal{H}_{p})>x] \\ &=& \frac{E[LR(\mathcal{H}_{p}) \mid LR(\mathcal{H}_{p})>x]}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]}. \end{array} $$

Combining these results completes the argument:

$$\begin{array}{@{}rcl@{}} Var(LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) >x ) &=& \frac{E[LR(\mathcal{H}_{p}) \mid LR(\mathcal{H}_{p})>x]}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]}-\frac{1}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]^{2}}\\ &=& \frac{E[LR(\mathcal{H}_{p}) \mid LR(\mathcal{H}_{p})>x]E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]-1}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >x]^{2}}. \end{array} $$

Appendix B: Detailed calculations for SNPs

We explain some of the mathematical expressions using a simple example without dropout or drop-in. There are two contributors to a mixture, and the question is whether a person S has contributed (corresponding to $H_{p}$) or not (corresponding to $H_{d}$). We assume the contributors to be unrelated, and the suspect to be either a contributor or unrelated to the contributors. This means, for example, that $P(\mathcal {M}=M \mid \mathcal {S}=g, \mathcal {S} \neq \mathcal {C}_{1})=P(\mathcal {M}=M)$: because a non-contributor is unrelated to the contributors, the genotype of a non-contributor does not influence the mixture’s likelihood. Similarly, $P(\mathcal {C}_{i}=g)=f_{g}$ for both contributors. We work with the hypotheses as random variables according to Eqs. 2.4 and 2.5, meaning that we also regard the mixture itself as random.

B.1 One marker

Only one SNP marker is considered initially and the frequencies of the alleles denoted 1 and 2 are p and 1−p with p=0.4 in the numerical examples. We next exemplify Eqs. 2.11–2.15. Table 1 shows the possible mixtures along with their probabilities and RMNE, and Table 2 gives the distribution of LR.

Table 1 The possible two-person SNP mixtures, with their probabilities of occurring and their RMNE, if the minor allele has frequency 0.4

Table 2 The distribution of LR for a two person SNP mixture

As an example, note that LR attains the maximal value $1/p^{2}=6.25$ when both contributors are homozygous for the rarer allele, and this occurs with probabilities

$$P(LR(\mathcal{H}_{p})=6.25)=P(\mathcal{M}=(1), \mathcal{S}=(1/1) \mid \mathcal{S}=\mathcal{C}_{1})=p^{4}=0.0256, $$

$$P(LR(\mathcal{H}_{d})=6.25)=P(\mathcal{M}=(1), \mathcal{S}=(1/1) \mid \mathcal{S} \neq \mathcal{C}_{1}) =p^{6}=0.004096. $$
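The two probabilities above, and the rest of the LR distribution for this one-marker example, can be recomputed by brute-force enumeration. The following is a minimal Python sketch, assuming Hardy-Weinberg genotype frequencies and the model without dropout or drop-in described in this appendix; the helper name `lr_table` is illustrative and does not come from the paper.

```python
def lr_table(p):
    """Enumerate (mixture, genotype, LR, P under H_p, P under H_d) for one SNP.

    Assumes two unrelated contributors, Hardy-Weinberg proportions,
    and no dropout or drop-in.
    """
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    rows = []
    for mix in ({1}, {1, 2}, {2}):
        # P(M = mix | C1 = s): the second contributor must complete the allele set
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)  # P(M = mix)
        for s in freq:
            lr = given[s] / p_mix           # P(M | S = s, S = C1) / P(M)
            p_hp = freq[s] * given[s]       # P(M, S = s | S = C1)
            p_hd = freq[s] * p_mix          # P(M, S = s | S != C1)
            rows.append((sorted(mix), s, lr, p_hp, p_hd))
    return rows


rows = lr_table(0.4)
for mix, s, lr, p_hp, p_hd in rows:
    print(f"M={mix} S={s} LR={lr:.4f} P(Hp)={p_hp:.4f} P(Hd)={p_hd:.6f}")
```

Rows with LR equal to zero correspond to genotypes that are excluded by the mixture; they have probability zero under the prosecution hypothesis.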

In Table 3, we give the distributions conditional on the LR being greater than one.

Table 3 The distribution of LR for a two person SNP mixture conditioned on LR>1

We first confirm (2.11) with input from Tables 1 and 2:

$$RMNE=0.0256 \cdot 0.1600 + 0.8448 \cdot 1.0000+0.1296 \cdot 0.3600=0.8956,$$

$$E[LR^{-1}(\mathcal{H}_{p})]=\frac{1}{0.7576}0.2304+ {\cdots} +\frac{1}{6.2500} \cdot 0.0256=0.8956.$$

Note that

$$\begin{array}{@{}rcl@{}} E[LR(\mathcal{H}_{p})]=0.2304 \cdot 0.7576+{\cdots} +0.0256 \cdot 6.2500=1.3964 \end{array} $$

which exceeds 1/RMNE=1.1166 as it should according to Eq. 2.12. Note that, since we are working with random variables as in Eqs. 2.4 and 2.5, the RMNE probability is an average over all possible mixtures.
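The checks of (2.11) and (2.12) can be reproduced numerically. The sketch below is an illustration under the same assumptions as the example (one SNP, p=0.4, two unrelated contributors, no dropout or drop-in); the function name `one_snp` is hypothetical. It computes the RMNE mixture by mixture, as in Table 1, and separately the expectations of $LR^{-1}$ and $LR$ under the prosecution hypothesis.

```python
def one_snp(p):
    """Genotype frequencies and, per mixture, P(M) and P(M | C1 = s)."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    mixtures = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        mixtures.append((mix, p_mix, given))
    return freq, mixtures


freq, mixtures = one_snp(0.4)

# RMNE averaged over mixtures: a random person is "not excluded"
# when all of his or her alleles are present in the mixture.
rmne = sum(p_mix * sum(f for s, f in freq.items() if set(s) <= mix)
           for mix, p_mix, _ in mixtures)

inv_mean = 0.0  # E[1 / LR(H_p)]
mean_lr = 0.0   # E[LR(H_p)]
for mix, p_mix, given in mixtures:
    for s, f in freq.items():
        p_hp = f * given[s]          # P(M = mix, S = s | S = C1)
        if p_hp > 0:
            lr = given[s] / p_mix
            inv_mean += p_hp / lr
            mean_lr += p_hp * lr

print(f"RMNE={rmne:.4f}  E[1/LR]={inv_mean:.4f}  E[LR]={mean_lr:.4f}  1/RMNE={1 / rmne:.4f}")
```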

Consider next (2.13), i.e.

$$\begin{array}{@{}rcl@{}} P(LR(\mathcal{H}_{d}) > x)=P(LR(\mathcal{H}_{p})> x) \cdot E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) > x]. \end{array} $$

We exemplify for x=1. From Tables 2 and 3,

$$\begin{array}{@{}rcl@{}} P(LR(\mathcal{H}_{d})>1)&=&0.4055+0.0467+0.0041=0.4563,\\ P(LR(\mathcal{H}_{p})>1)&=&0.4800+0.1296+0.0256=0.6352, \end{array} $$

$$E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) > 1]= \frac{0.7557}{1.1837} +\frac{0.2040}{2.7778}+\frac{0.0403}{6.2500} = 0.7183$$

which confirms (2.13), i.e. 0.4563=0.6352⋅0.7183.
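The same confirmation of (2.13) can be carried out for any threshold x. The following Python sketch assumes the one-marker model of this appendix; `lr_atoms` and `exceedance_terms` are illustrative names, not functions from the paper or from DNAprofiles.

```python
def lr_atoms(p):
    """List of (LR, P under H_p, P under H_d) over all (mixture, genotype) pairs."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    atoms = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        for s in freq:
            atoms.append((given[s] / p_mix, freq[s] * given[s], freq[s] * p_mix))
    return atoms


def exceedance_terms(x, p=0.4):
    """Return P(LR(H_d) > x), P(LR(H_p) > x) and E[1/LR(H_p) | LR(H_p) > x]."""
    atoms = lr_atoms(p)
    tail_hd = sum(d for lr, _, d in atoms if lr > x)
    tail_hp = sum(h for lr, h, _ in atoms if lr > x)
    inv = sum(h / lr for lr, h, _ in atoms if lr > x) / tail_hp
    return tail_hd, tail_hp, inv


tail_hd, tail_hp, inv = exceedance_terms(1.0)
print(f"{tail_hd:.4f} vs {tail_hp:.4f} * {inv:.4f} = {tail_hp * inv:.4f}")
```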

Consider next (2.15), i.e.

$$\begin{array}{@{}rcl@{}} && Var(LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) > 1)=\\ && \frac{E[LR(\mathcal{H}_{p}) \mid LR(\mathcal{H}_{p})>1]E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >1]-1}{E[LR^{-1} (\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) >1]^{2}}. \end{array} $$

From Table 3,

$$\begin{array}{@{}rcl@{}} && E(LR^{2}(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) > 1)=\\&& 1.1837^{2}\cdot 0.8888+2.7778^{2}\cdot 0.1023+6.2500^{2}\cdot 0.0090= 2.386265,\\ && E(LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) > 1)=\\&& 1.1837\cdot 0.8888+2.7778\cdot 0.1023+6.2500\cdot 0.0090=1.392491,\\ && Var(LR(\mathcal{H}_{d})\mid LR(\mathcal{H}_{d}) > 1)=2.386265-1.392491^{2}=0.447. \end{array} $$

On the other hand,

$$\begin{array}{@{}rcl@{}} && E[LR(\mathcal{H}_{p})\mid LR(\mathcal{H}_{p}) > 1]=\\&&1.1837 \cdot 0.7557+2.7778 \cdot 0.2040+6.2500 \cdot 0.0403=1.7133 \end{array} $$

and so the right-hand side of Eq. 2.15 becomes

$$\begin{array}{@{}rcl@{}} \frac{1.7133 \cdot 0.7183-1}{0.7183^{2}}=0.447, \end{array} $$

as it should.
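Because the hand calculation above uses rounded table entries, it can be helpful to repeat it at full precision. The sketch below, under the same one-marker assumptions, computes the conditional variance directly under the defence hypothesis and via the right-hand side of (2.15); the function names are hypothetical.

```python
def lr_atoms(p):
    """List of (LR, P under H_p, P under H_d) over all (mixture, genotype) pairs."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    atoms = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        for s in freq:
            atoms.append((given[s] / p_mix, freq[s] * given[s], freq[s] * p_mix))
    return atoms


def conditional_variance(x, p=0.4):
    """Var(LR(H_d) | LR(H_d) > x), computed directly and via Eq. 2.15."""
    tail = [(lr, h, d) for lr, h, d in lr_atoms(p) if lr > x]
    tail_hp = sum(h for _, h, _ in tail)
    tail_hd = sum(d for _, _, d in tail)
    # Direct computation from the distribution under H_d
    m1 = sum(lr * d for lr, _, d in tail) / tail_hd
    m2 = sum(lr * lr * d for lr, _, d in tail) / tail_hd
    direct = m2 - m1 * m1
    # Right-hand side of Eq. 2.15, using only the distribution under H_p
    e_lr = sum(lr * h for lr, h, _ in tail) / tail_hp
    e_inv = sum(h / lr for lr, h, _ in tail) / tail_hp
    formula = (e_lr * e_inv - 1) / e_inv ** 2
    return direct, formula


direct, formula = conditional_variance(1.0)
print(f"direct={direct:.4f}  via Eq. 2.15={formula:.4f}")
```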

Table 4 illustrates the different interpretations of LR as a random variable discussed in section “The likelihood ratio as a random variable”.

Table 4 LR is the ratio of column 3 to 4 or 5 to 6. This exemplifies the two equivalent interpretations of LR as a random variable

The details for the first line of the table are

$$\begin{array}{@{}rcl@{}} P(\mathcal{M}=(1,2) \mid\mathcal{S} =(2/2),\mathcal{S}=\mathcal{C}_{1}) &=&p^{2}+2p(1-p)=0.6400,\\ P(\mathcal{M}=(1,2) \mid \mathcal{S} =(2/2),\mathcal{S} \neq\mathcal{C}_{1}) &=&1-p^{4}-{(1-p)}^{4}=0.8448,\\ P(\mathcal{S}=(2/2) \mid\mathcal{M}=(1,2),\mathcal{S}=\mathcal{C}_{1}) &=& \frac{(2p-p^{2}){(1-p)}^{2}}{1-p^{4}-{(1-p)}^{4}}=0.2727273,\\ P(\mathcal{S}=(2/2) \mid\mathcal{M}=(1,2),\mathcal{S} \neq\mathcal{C}_{1}) &=& {(1-p)}^{2}=0.36, \\ LR(\mathcal{M}=(1,2),\mathcal{S}=(2/2)) &=&\frac{0.6400}{0.8448}=\frac{0.2727273}{0.3600}=0.7576. \end{array} $$
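The equality of the two ratios for this row can be checked with a few lines of arithmetic. This is a small sketch using only the formulas displayed above with p=0.4; the variable names are chosen for illustration.

```python
p = 0.4
q = 1 - p
f22 = q ** 2                       # population frequency of genotype 2/2

p_mix = 1 - p ** 4 - q ** 4        # P(M = (1,2)) for two random contributors
p_mix_given_s_hp = p ** 2 + 2 * p * q   # other contributor must carry allele 1
p_mix_given_s_hd = p_mix           # S is unrelated to both contributors

# Bayes' rule; under H_p the suspect is a random contributor, so P(M | S = C1) = P(M)
p_s_given_mix_hp = p_mix_given_s_hp * f22 / p_mix
p_s_given_mix_hd = f22

lr_from_mixture = p_mix_given_s_hp / p_mix_given_s_hd
lr_from_genotype = p_s_given_mix_hp / p_s_given_mix_hd
print(f"{lr_from_mixture:.4f} {lr_from_genotype:.4f}")
```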

B.2 Several markers

We next include several independent markers, all with the same allele frequencies as the single marker above, and first calculate $P(Log_{10}(LR(\mathcal {H}_{p})) \leq x)$ exactly and approximately. Exact calculations are possible in this case based on functions in the R-package DNAprofiles (there is some specific code at the end of this section) if the number of markers does not exceed roughly 14; for a general number of markers, we can always obtain the distribution from sampling or by the normal approximation. Figure 7 shows the distribution of $LR(\mathcal {H}_{p})$. The vertical dashed line shows

$$Log_{10}(1/RMNE)=-14\,Log_{10}(0.8956)=0.6704 $$

and the upper bound

$$Log_{10}(E[LR(\mathcal{H}_{p})])=14\,Log_{10}(1.3964)=2.0301 $$

is the vertical solid line to the right.
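Both reference values follow from the single-marker quantities because the markers are independent and identically distributed, so the RMNE and the expected LR multiply across markers. A short Python sketch of this arithmetic, assuming the one-marker model of B.1 (the helper `lr_atoms` is an illustrative name):

```python
from math import log10


def lr_atoms(p):
    """List of (LR, P under H_p, P under H_d) over all (mixture, genotype) pairs."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    atoms = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        for s in freq:
            atoms.append((given[s] / p_mix, freq[s] * given[s], freq[s] * p_mix))
    return atoms


n_markers = 14
atoms = lr_atoms(0.4)
rmne_one = sum(d for lr, _, d in atoms if lr > 0)   # single-marker RMNE
mean_lr_one = sum(lr * h for lr, h, _ in atoms)     # single-marker E[LR(H_p)]

lower = -n_markers * log10(rmne_one)    # Log10(1/RMNE) over all markers
upper = n_markers * log10(mean_lr_one)  # Log10(E[LR(H_p)]) over all markers
print(f"Log10(1/RMNE)={lower:.4f}  Log10(E[LR])={upper:.4f}")
```

The printed values differ slightly in the last digits from those in the text, which are computed from single-marker quantities rounded to four decimals.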

Consider next $P(LR(\mathcal {H}_{d})>x)$. Table 5 includes the exact values and the estimates based on importance sampling.

Table 5 Exact and simulated values are shown for 14 SNP markers with minor allele frequency 0.4 for a two person mixture
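One way such an importance sampling estimate can be set up follows directly from (2.13): sample LR under the prosecution hypothesis and average $LR^{-1}$ over the samples that exceed x. The Python sketch below is an assumption about how this could be done for the SNP example, not the authors' code; `importance_estimate` and its parameters are hypothetical names.

```python
import random


def lr_atoms(p):
    """List of (LR, P under H_p, P under H_d) over all (mixture, genotype) pairs."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    atoms = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        for s in freq:
            atoms.append((given[s] / p_mix, freq[s] * given[s], freq[s] * p_mix))
    return atoms


def importance_estimate(x, n_markers, n_samples, p=0.4, seed=1):
    """Estimate P(LR(H_d) > x) as the H_p-average of 1/LR on the event LR > x."""
    support = [(lr, h) for lr, h, _ in lr_atoms(p) if h > 0]
    values = [lr for lr, _ in support]
    weights = [h for _, h in support]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        lr = 1.0
        # Independent markers: the overall LR is the product of per-marker LRs
        for marker_lr in rng.choices(values, weights=weights, k=n_markers):
            lr *= marker_lr
        if lr > x:
            total += 1.0 / lr
    return total / n_samples


single_marker = importance_estimate(1.0, n_markers=1, n_samples=20000)
print(single_marker)                        # compare with the exact value 0.4563
print(importance_estimate(10.0, 14, 50000)) # 14 markers, threshold LR > 10
```

Sampling under the prosecution hypothesis concentrates the draws on large LR values, which are rare for non-donors; this is why the estimator remains usable for thresholds where naive sampling of non-donors would produce almost no exceedances.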

Regarding the computation of $P(LR(\mathcal {H}_{p}) \leq 1/RMNE)$ discussed in the main text, the code in Table 6 gives the exact answer 0.27 based on the R-package DNAprofiles.

Table 6 The probability that LR is less than 1/RMNE is calculated exactly based on the R package DNAprofiles
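For readers without access to the R code, the same probability can be approximated independently by enumerating how often each single-marker LR value occurs among the 14 markers. The Python sketch below is not the DNAprofiles implementation; it is a plain multinomial enumeration under the one-marker model of B.1, and the names `compositions` and `prob_lr_at_most` are illustrative.

```python
from math import factorial, log10


def lr_atoms(p):
    """List of (LR, P under H_p, P under H_d) over all (mixture, genotype) pairs."""
    freq = {(1, 1): p * p, (1, 2): 2 * p * (1 - p), (2, 2): (1 - p) ** 2}
    atoms = []
    for mix in ({1}, {1, 2}, {2}):
        given = {s: sum(f for g, f in freq.items() if set(s) | set(g) == mix)
                 for s in freq}
        p_mix = sum(freq[s] * given[s] for s in freq)
        for s in freq:
            atoms.append((given[s] / p_mix, freq[s] * given[s], freq[s] * p_mix))
    return atoms


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def prob_lr_at_most(log10_threshold, n_markers, p=0.4):
    """Exact P(Log10 LR(H_p) <= threshold) for iid markers, by enumeration."""
    support = [(log10(lr), h) for lr, h, _ in lr_atoms(p) if h > 0]
    prob = 0.0
    for counts in compositions(n_markers, len(support)):
        log_lr = sum(c * v for c, (v, _) in zip(counts, support))
        if log_lr <= log10_threshold:
            coef = factorial(n_markers)
            weight = 1.0
            for c, (_, h) in zip(counts, support):
                coef //= factorial(c)   # multinomial coefficient, kept exact
                weight *= h ** c
            prob += coef * weight
    return prob


n = 14
rmne_one = sum(d for lr, _, d in lr_atoms(0.4) if lr > 0)
prob = prob_lr_at_most(-n * log10(rmne_one), n)
print(round(prob, 4))
```

The result should be compared with the value 0.27 reported in the text.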

Based on the accurate approximation provided by sampling, we proceed to a larger number of markers, accepting that exact calculations are not possible. The asymptotic lognormal approximation discussed in [12] worked reasonably well for 14 markers (details omitted) and is expected to improve for a larger number of markers. We choose the number of markers so that the power is comparable to the numerical results for the NGM markers in the main text, and this is achieved for $-8.47135/Log_{10}(RMNE)\approx 176$ markers. Figure 8 shows the results.

Appendix C: Non-standard hypotheses

Finally, we look at two examples of non-standard likelihood ratios, where the hypotheses $H_{p}$ and $H_{d}$ differ in more than the presence of the person of interest in the mixture. First, we consider the model itself to be different in the most general way, i.e. the number of contributors, the known contributors, the dropout probabilities and the drop-in parameter need not be the same. Second, we consider that the hypotheses agree on the number of contributors, their dropout probabilities and the drop-in parameter, but that one contributor is known under $H_{p}$ and none under $H_{d}$. Then (2.7) still holds, and we show how the random variable approach of this paper applies in that situation.

C.1 Different models per hypothesis

We have thus far assumed that the probability model for both hypotheses is the same, i.e., that $H_{p}$ and $H_{d}$ specify the same number of contributors, dropout probabilities, etc. In this example, we show that this is not strictly necessary, but that the probabilities involving $\mathcal {H}_{p}$ and $\mathcal {H}_{d}$ then acquire a different interpretation in terms of $H_{p}$ and $H_{d}$. Suppose, for example, that $H_{p}$ and $H_{d}$ state $n$ and $n^{\prime }$ contributors, respectively, with dropout vectors $\mathbf {d}$ and $\mathbf {d}^{\prime }$ and drop-in parameters $c$ and $c^{\prime }$, and that furthermore the suspect is a contributor according to $H_{p}$ but not according to $H_{d}$. We also suppose that some contributors are known according to $H_{p}$, and some (possibly different ones) according to $H_{d}$. We now set

$$P(\mathcal{H}_{p}=g)= P_{\mathbf{d},c}(\mathcal{S}=g \mid\mathcal{M}=M,\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T}), $$

$$P(\mathcal{H}_{d}=g)= P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{S}=g \mid\mathcal{M}=M,\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}}).$$

In that case,

$$\begin{array}{@{}rcl@{}} LR_{\mathcal{H}_{p},\mathcal{H}_{d}}(g)&=&\frac{P(\mathcal{H}_{p}=g)}{P(\mathcal{H}_{d}=g)}\\ &=&\frac{P_{\mathbf{d},c}(\mathcal{S}=g \mid\mathcal{M}=M,\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{S}=g \mid\mathcal{M}=M,\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}\\ &=& \frac{P_{\mathbf{d},c}(\mathcal{M}=M \mid\mathcal{S}=g,\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})P_{\mathbf{d},c}(\mathcal{S}=g \mid\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}{P_{\mathbf{d},c}(\mathcal{M}=M \mid\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}\\ && \cdot \frac{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{M}=M \mid\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{M}=M \mid\mathcal{S} =g,\mathcal{S}\sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{S}=g \mid\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}\\ &=& \frac{P_{\mathbf{d},c}(\mathcal{M}=M \mid\mathcal{S}=g,\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{M}=M\mid\mathcal{S}=g,\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}\\ &&\cdot\frac{P_{\mathbf{d},c}(\mathcal{S}=g \mid\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{S}=g \mid\mathcal{S}\sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}\\ &&\cdot \frac{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{M}=M\mid\mathcal{S} \sim\mathcal{G},\mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}{P_{\mathbf{d},c}(\mathcal{M}=M \mid\mathcal{S}=\mathcal{C}_{1},\mathcal{C}_{T}=\mathbf{g}_{T})}\\ &=& LR_{H_{p},H_{d}}(g)\frac{P_{\mathbf{d}^{\prime},c^{\prime}}(\mathcal{M}=M\mid \mathcal{C}_{T^{\prime}}=\mathbf{g}^{\prime}_{T^{\prime}})}{P_{\mathbf{d},c}(\mathcal{M}=M \mid\mathcal{C}_{T}=\mathbf{g}_{T})} \end{array} $$

since we assume that in the absence of mixture data, the marginal distributions of $\mathcal {S}$ are the same for both hypotheses. We see that $LR_{\mathcal {H}_{p},\mathcal {H}_{d}}(g)=P(\mathcal {H}_{p}=g)/P(\mathcal {H}_{d}=g)$ can again be interpreted as the realization of a random variable, but now as the realization of a random likelihood ratio that tests both the presence of the suspect (in $LR_{H_{p},H_{d}}(g)$) and the change in model (represented by the second factor). Thus, while the general framework presented in section “The likelihood ratio as a random variable” still applies, the relation between $\mathcal {H}_{p},\mathcal {H}_{d}$ and $H_{p},H_{d}$ is not the same as when $H_{p}$ and $H_{d}$ define the same probabilistic model: the random variable $P(\mathcal {H}_{p}=g)/P(\mathcal {H}_{d}=g)$ can be viewed as a simultaneous test for the contribution of the suspect and for the two different models stating the composition of the mixture.

Note that (2.7) need not apply in this situation. If it does, then the results of section “The likelihood ratio as a random variable” apply to $LR_{\mathcal {H}_{p},\mathcal {H}_{d}}$ and not necessarily to the original LR. Also, as a result of testing two separate issues simultaneously, it is no longer the case that (2.21) holds. Indeed, the mixture likelihood under $H_{d}$ can be made arbitrarily small. If for instance $H_{d}$ stipulates that all dropout rates are equal to one, then all detected alleles must be the result of drop-in. The likelihood of this happening can be made arbitrarily small by decreasing the drop-in parameter.

C.2 Testing two donors simultaneously

We now illustrate the preceding results in the case where the hypotheses $H_{p}$ and $H_{d}$ differ in more than one alleged contributor. Consider as an example a two-person mixture where $H_{p}$ states that the mixture is a mixture of suspect S and victim V, and $H_{d}$ states that its contributors are two unknown individuals. In that case, if the LR in favour of $H_{p}$ is large, this does not necessarily imply that the evidence against S is strong. In [8], it is proposed to carry out a simulation experiment in which S is replaced by a random person and the LRs are calculated. This may yield many large LRs. At first glance, this may seem to contradict (2.14), but this is actually not the case.

To see this, we recast this example in our random variable framework. We suppose that we have chosen parameters for the binary model, that the mixture is a two-person mixture, and that there is no relatedness to take into account, and we let as before $\mathcal {C}_{1}$ and $\mathcal {C}_{2}$ be the random variables that describe the genotypes of the first and second donor. Let $g_{V}$ be the observed genotype of the victim. According to $H_{p}$, the second donor has genotype $g_{V}$, and according to $H_{d}$, there are no known contributors. In order to define $\mathcal {H}_{p}$ and $\mathcal {H}_{d}$, we need them to have the same sample space, but $\mathcal {H}_{p}$ needs to keep the profile of the second donor equal to that of the victim. Thus, we let

$$\begin{array}{@{}rcl@{}} P(\mathcal{H}_{p}=(g_{1},g_{2}))&=& P(\mathcal{C}_{1}=g_{1},\mathcal{C}_{2}=g_{2} \mid\mathcal{M}=M,\mathcal{C}_{2}=g_{V}) \\ &=& \left\{\begin{array}{ll} 0 & \text{if } g_{2} \neq g_{V}, \\ P(\mathcal{C}_{1}=g_{1} \mid\mathcal{C}_{2}=g_{V},\mathcal{M}=M) & \text{if } g_{2}=g_{V}, \end{array}\right. \end{array} $$

$$\begin{array}{@{}rcl@{}} P(\mathcal{H}_{d}=(g_{1},g_{2}))&=&P(\mathcal{S}_{1}=g_{1},\mathcal{S}_{2}=g_{2} \mid\mathcal{M}=M,\mathcal{S}_{1} \sim\mathcal{G},\mathcal{S}_{2} \sim\mathcal{G})\\ &=&f_{g_{1}}f_{g_{2}}, \end{array} $$

where $f_{g}$ is the population frequency of profile g. Then $LR_{\mathcal {H}_{p},\mathcal {H}_{d}}(g_{1},g_{2})=0$ unless $g_{2}=g_{V}$, and in that case,

$$\begin{array}{@{}rcl@{}} LR_{\mathcal{H}_{p},\mathcal{H}_{d}}(g_{1},g_{2})&=&\frac{P(\mathcal{H}_{p}=(g_{1},g_{2}))}{P(\mathcal{H}_{d}=(g_{1},g_{2}))}\\ &=&\frac{P(\mathcal{M}=M\mid\mathcal{C}_{1}=g_{1},\mathcal{C}_{2}=g_{V})}{P(\mathcal{M}=M)}\cdot\frac{1}{P(\mathcal{C}_{2}=g_{V}\mid\mathcal{M}=M)}\\ &=&LR_{H_{p},H_{d}}(g_{1})\frac{1}{P(\mathcal{C}_{2}=g_{V} \mid\mathcal{M}=M)}. \end{array} $$

In this product, the first term corresponds to the LR calculated with the original hypotheses $H_{p}$ and $H_{d}$ (mixture of suspect with genotype $g_{1}$ and victim with genotype $g_{V}$ versus mixture of two unknowns), which we denote again by $LR_{H_{p},H_{d}}(g_{1})$. We still have

$$P(LR(\mathcal{H}_{d}) > x) < \frac{1}{x},$$

but it is extremely unlikely that $\mathcal {H}_{d}$ selects $g_{2}=g_{V}$ as second genotype. If we condition on this, we get for x>0, after some calculations, that for a genotype $g_{1}$ drawn at random from the population,

$$ P(LR_{H_{p},H_{d}}(g_{1}) > x)< \frac{1}{x}\frac{P(\mathcal{C}_{2}=g_{V} \mid \mathcal{M}=M)}{P(\mathcal{C}_{2}=g_{V})}. \qquad \text{(C.1)} $$
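The calculations leading to (C.1) are not spelled out in the text. One way to fill them in, offered here as a sketch rather than as the authors' own derivation, uses only the factorisation of $LR_{\mathcal {H}_{p},\mathcal {H}_{d}}$ given above and the fact that $\mathcal {H}_{d}$ draws $(g_{1},g_{2})$ with probability $f_{g_{1}}f_{g_{2}}$. The auxiliary threshold $x^{\prime}$ is introduced only for this derivation. For any $x^{\prime}>0$,

$$\begin{array}{@{}rcl@{}} \frac{1}{x^{\prime}} &>& P(LR_{\mathcal{H}_{p},\mathcal{H}_{d}}(\mathcal{H}_{d}) > x^{\prime})\\ &=& \sum\limits_{g_{1}} f_{g_{1}}f_{g_{V}}\,\mathbf{1}\left\{\frac{LR_{H_{p},H_{d}}(g_{1})}{P(\mathcal{C}_{2}=g_{V} \mid\mathcal{M}=M)} > x^{\prime}\right\}\\ &=& P(\mathcal{C}_{2}=g_{V})\,P\left(LR_{H_{p},H_{d}}(g_{1}) > x^{\prime}P(\mathcal{C}_{2}=g_{V} \mid\mathcal{M}=M)\right), \end{array} $$

where only the terms with $g_{2}=g_{V}$ contribute because the LR is zero otherwise, and $f_{g_{V}}=P(\mathcal {C}_{2}=g_{V})$. Setting $x=x^{\prime}P(\mathcal {C}_{2}=g_{V} \mid \mathcal {M}=M)$ and dividing by $P(\mathcal {C}_{2}=g_{V})$ gives

$$P(LR_{H_{p},H_{d}}(g_{1}) > x) < \frac{1}{x^{\prime}P(\mathcal{C}_{2}=g_{V})}=\frac{1}{x}\frac{P(\mathcal{C}_{2}=g_{V} \mid \mathcal{M}=M)}{P(\mathcal{C}_{2}=g_{V})}, $$

which is the bound (C.1).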

Note that

$$\frac{P(\mathcal{C}_{2}=g_{V} \mid\mathcal{M}=M)}{P(\mathcal{C}_{2}=g_{V})}=\frac{P(\mathcal{M}=M \mid\mathcal{C}_{2}=g_{V})}{P(\mathcal{M}=M)} $$

is the LR in favour of the second contributor having genotype $g_{V}$. Thus, the bound (C.1) is not useful when there is strong evidence that V is the second contributor. On the other hand, when there is evidence that V is not a contributor, then (C.1) may be informative.

About this article

Slooten, KJ., Egeland, T. Exclusion probabilities and likelihood ratios with applications to mixtures. Int J Legal Med 130, 39–57 (2016). https://doi.org/10.1007/s00414-015-1217-z

Received: 11 November 2014
Accepted: 23 April 2015
Published: 10 July 2015
Issue Date: January 2016
DOI: https://doi.org/10.1007/s00414-015-1217-z
