Pairwise relatedness testing in the context of inbreeding: expectation and variance of the likelihood ratio

In this paper we investigate various effects of inbreeding on the likelihood ratio (LR) in forensic kinship testing. The basic setup of such testing involves formulating two competing hypotheses, in the form of pedigrees, describing the relationship between the individuals. The likelihood of each hypothesis is computed given the available genetic data, and a conclusion is reached if the ratio of these exceeds some pre-determined threshold. An important aspect of this approach is that the hypotheses are usually not exhaustive: The true relationship may differ from both of the stated pedigrees. It is well known that this may introduce bias in the test results. Previous work has established formulas for the expected value and variance of the LR, given the two competing hypotheses and the true relationship. However, the proposed method only handles cases without inbreeding. In this paper we extend these results to all possible pairwise relationships. The key ingredient is formulating the hypotheses in terms of Jacquard coefficients instead of the more restricted Cotterman coefficients. While the latter describe the relatedness between outbred individuals, the more general Jacquard coefficients allow any level of inbreeding. Our approach also enables scrutiny of another frequently overlooked source of LR bias, namely background inbreeding. This ubiquitous phenomenon is usually ignored in forensic kinship computations, due to lack of adequate methods and software. By leveraging recent work on pedigrees with inbred founders, we show how background inbreeding can be modeled as a continuous variable, providing easy-to-interpret results in specific cases. For example, we show that if true siblings are subjected to a test for parent-offspring, moderate levels of background inbreeding are expected to inflate the LR by more than 50%.


Introduction
The conventional approach to forensic kinship testing includes formulating two hypotheses and calculating a likelihood ratio (LR) based on genetic data from genotyped individuals. Practice differs between countries and laboratories, but typically the LR or some version of it is included when the case is reported. The conclusion based on the LR may be flawed when the true pedigree connecting the individuals of interest differs from the pedigrees considered by the hypotheses. As an example, consider a standard paternity case, where the prosecution asserts that a certain man is the father of a child, while the defense claims that the man and the child are unrelated. The truth, on the other hand, may be that the man is the child's uncle. A special case of incorrect hypotheses occurs when inbreeding is not accounted for. For example, if the alleged father is inbred, and this is ignored when formulating the hypotheses, this may significantly bias the LR. One aim of this paper is to investigate and quantify this effect.
Slooten and Egeland derived explicit equations for the expected value and variance of the LR [1]. They also extended this to cases where the true relationship differs from those stated in the hypotheses [2]. However, in both of these works only non-inbred individuals were considered.
An important contribution of this paper is the extension of these results to general pairwise relationships. In particular, we show that exact expressions for the expected value and variance of the LR can be obtained also in cases with inbreeding. The expressions are in general more involved than in the non-inbred case, and not as easy to interpret. However, we derive interesting and practical results in important special cases.
A parametric approach to modeling background inbreeding in kinship testing was recently introduced [3], employing the concept of inbred founders [4]. To exemplify, consider a pair of paternal half siblings, whose father is assigned an inbreeding coefficient f . As f increases from 0 to 1, the relationship between the half siblings becomes genetically indistinguishable from that between parent and child. We extend the theoretical framework of [1,2] to pedigrees with inbred founders. As a result, the impact of background inbreeding on the expectation and variance of the LR can be studied based on exact expressions. In cases where the amount of inbreeding is unknown, we can still provide guidance on the expected values for the LR. Our approach conveniently allows a continuous range of possible true alternatives rather than a discrete set of specific alternatives. To arrive at explicit results of practical interest, we restrict attention to pairwise relationships. Furthermore, as in the work of Slooten and Egeland, we ignore mutations, dropouts, and silent alleles and we assume Hardy-Weinberg Equilibrium (HWE). However, we explain how deviation from HWE can be modeled by the so called theta (θ ) correction.
R scripts and functions used to obtain numerical results in this paper are gathered in an R library (see the "R implementation" section). Pedigree likelihoods and marker simulations are performed with the forrel package [3].
This paper is organized in the following manner: After establishing some terminology and notation we review the main results of [2] regarding the expected value and variance of the LR for non-inbred pairs of individuals. We then proceed to extend these results to general pairwise relationships, including relationships in pedigrees with background inbreeding. Several worked examples follow, including a simulation study comparing our formulas with real-life results. Finally, we discuss some consequences of this work and how it relates to other aspects of forensic genetics.

Definitions and notation
A central concept for measuring genetic relatedness is that of identity by descent (IBD). Two alleles are said to be IBD relative to a given pedigree if they are identical by state and originate from the same ancestral allele within the pedigree [5].

Coefficients of inbreeding and kinship
The coefficient of inbreeding f , introduced by Wright [6], is the probability that an individual is autozygous at a given autosomal locus, i.e., that the two homologous alleles are IBD. This is the same as the kinship coefficient ϕ between the parents of the same individual, defined as the probability that a random allele from the mother is IBD to a random allele from the father at the same locus.
Founders of a pedigree are conventionally assumed to be unrelated and non-inbred. Following [3] we relax the second assumption, allowing an arbitrary inbreeding coefficient f to be assigned to any founder individual. For a given pedigree with N founders, we denote the set of founder inbreeding coefficients by $\mathbf{f} = (f_1, f_2, \ldots, f_N)$.
Background inbreeding in human populations is normally low, but may exceed 5% in certain cases [7,8]. In forensic case work inbreeding is common, ranging from consanguineous marriages between cousins, f = 1/16 or lower, to incestuous relationships between siblings or parent-child, both with f = 1/4. In breeding applications values closer to 1 may occur.

Jacquard coefficients and likelihood of a pedigree
The kinship coefficient is a coarse measure of relatedness; for instance, it has the same value for a parent-child relationship as for full siblings. A more refined measure is given by the nine Jacquard coefficients [9] $\Delta = (\Delta_1, \ldots, \Delta_9)$, also called the condensed identity coefficients. These are the expected relative frequencies of the Jacquard states $J_1, \ldots, J_9$ depicted in Fig. 1. Alleles within each individual are unordered, and hence several IBD configurations can correspond to the same Jacquard state. Furthermore, $\Delta$ is related to $\phi$ through
$$\phi = \Delta_1 + \tfrac{1}{2}(\Delta_3 + \Delta_5 + \Delta_7) + \tfrac{1}{4}\Delta_8.$$
The likelihood of two individuals being related according to $\Delta$, given their genotypes $G = (g_1, g_2)$ at a marker, may be expressed by conditioning on the Jacquard state:
$$P(G \mid \Delta) = \sum_{i=1}^{9} \Delta_i \, P(G \mid J_i). \tag{1}$$
The conditional probabilities $P(G \mid J_i)$ are listed in Table 1. These probabilities are found by direct calculations; for instance, $P((aa, aa) \mid J_1) = p_a$ since $J_1$ dictates that all four alleles are IBD.
Fig. 1 The Jacquard states $J_1, \ldots, J_9$. Connected dots indicate IBD. The states $J_9$, $J_8$, and $J_7$ do not involve inbreeding and are sometimes denoted $K_0$, $K_1$, and $K_2$
Although the IBD coefficients are only defined for noninbred individuals, other members of the pedigree can be inbred. For example, a pair of half siblings remain outbred even if their shared parent is inbred. However, this inbreeding will affect the relatedness coefficients. Table 2 lists the kinship and the IBD coefficients for some common relationships, as functions of the founder inbreeding. The effects are visualized in Fig. 2. In the half sibling example, the genetic relationship approaches that of parent-child, as the founder inbreeding increases towards 1. Similarly, the IBD coefficients of full siblings with inbred parents may fall anywhere in the lightly shaded region towards the point of monozygotic twins.

Review of previous results
We next review the main results of [2] relevant for our work. In particular we restate the explicit formulas for the expectation and variance of the LR in the case of non-inbred individuals.

The likelihood ratio as a random variable
We consider a kinship test involving genetic data from two non-inbred individuals. Two hypotheses $H_P$ and $H_D$ about the relationship are to be compared using the LR. For our purposes, each hypothesis corresponds to a point in the IBD triangle, denoted by $\kappa_P$ and $\kappa_D$ respectively. However, the evidence may be generated from another pedigree, corresponding to a third point $\kappa_T$. We therefore have the following setup, comprising the competing hypotheses and the true relationship:
$$H_P: \kappa = \kappa_P, \qquad H_D: \kappa = \kappa_D, \qquad \text{Truth}: \kappa = \kappa_T.$$
Reflecting standard practice, we will always use unrelatedness as the defense hypothesis, i.e., $\kappa_D = (1, 0, 0)$. It should be noted, however, that this is not a theoretical requirement for the methods presented here. The concept of the likelihood ratio as a random variable was discussed by Slooten and Egeland [1]. We review the basics here, presented in a slightly simpler notation sufficient for our purposes.
Fig. 2 The IBD triangle with location of some common relationships. The gray area is inadmissible. The arrows illustrate the effect of founder inbreeding in the cases given in Table 2. PO, parent-child; MZ, monozygotic twins; S, siblings; H, half siblings; U, avuncular; G, grandparent-grandchild; FC, first cousins; UN, unrelated
Denote by $K_i$, $i = 0, 1, 2$, the event that the individuals share exactly $i$ alleles IBD. As shown in Fig. 1, $K_0$, $K_1$, and $K_2$ are identical to the Jacquard states $J_9$, $J_8$, and $J_7$ respectively. For fixed $\kappa_P$ the likelihood ratio for a given pair of genotypes $G = (g_1, g_2)$ can be written as
$$LR(G) = \frac{P(G \mid \kappa_P)}{P(G \mid \kappa_D)} = \frac{\sum_{i=0}^{2} \kappa_{P,i} \, P(G \mid K_i)}{\sum_{i=0}^{2} \kappa_{D,i} \, P(G \mid K_i)}. \tag{2}$$
Note that the final transition was obtained by applying (1) in both the numerator and denominator. The probabilities $P(G \mid K_i)$ are given in Table 1. Now, viewing the genotypes as a random variable $G$, we define the random variable $LR = LR(G)$. Note that the distribution of $G$ is completely determined by $\kappa_T$ (assuming HWE), hence the distribution of LR is determined by $\kappa_P$ and $\kappa_T$. If these parameters are clear from the context, we will suppress them in our notation; otherwise, we write $LR_{\kappa_P,\kappa_T}$. In the special case when $H_P$ equals the truth, i.e., $\kappa_P = \kappa_T$, we may simplify $LR_{\kappa_P,\kappa_T}$ to $LR_{\kappa_P}$.
Table 2 Relatedness coefficients as functions of founder inbreeding, in a selection of common relationships
Throughout, we assume the following condition to hold:
$$P(G \mid H_P) > 0 \;\Longrightarrow\; P(G \mid H_D) > 0. \tag{3}$$
In the present context, it means that all DNA profiles that can occur under $H_P$ can also occur under $H_D$.
In our examples $H_D$ specifies unrelated individuals, and then (3) holds. The condition also holds for mutation models provided all elements of the mutation matrix are positive. We do not model mutations in the work presented here, as practical exact expressions are then no longer available. However, the implementation allows for general mutation models. Without (3), likelihood ratios could be infinite, i.e., not defined.

Expected likelihood ratio
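The expectation of LR may be found by summing over all possible genotypes $G$ in the standard way:
$$E(LR) = \sum_{G} LR(G) \, P(G \mid \kappa_T), \tag{4}$$
where $P(G \mid \kappa_T) = \sum_{i=0}^{2} \kappa_{T,i} \, P(G \mid K_i)$. An exact expression for E(LR) when $\kappa_P = \kappa_T$ was first derived in [1] and extended in [2] to apply when $\kappa_P \neq \kappa_T$. For the latter situation it was shown that, for a single marker with $L$ alleles,
$$E(LR_{\kappa_P,\kappa_T}) = \kappa_P \, A_0 \, \kappa_T^{t}, \tag{5}$$
where $t$ denotes the vector transpose, and
$$A_0 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & \frac{L+3}{4} & \frac{L+1}{2} \\ 1 & \frac{L+1}{2} & \frac{L(L+1)}{2} \end{pmatrix}. \tag{6}$$
Importantly, the expected value depends only on the number of alleles, not on the allele frequencies. Furthermore, the expectation is symmetric in $\kappa_P$ and $\kappa_T$, so that
$$E(LR_{\kappa_P,\kappa_T}) = E(LR_{\kappa_T,\kappa_P}). \tag{7}$$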
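As an illustration of the matrix formula for the expected LR, the following is a minimal Python sketch. The function names (`a0_matrix`, `expected_lr`) are hypothetical and not part of the InbredLR library. The entries of the matrix are an assumption, chosen to agree with the single-marker values quoted in this paper, e.g. (L + 3)/4 for parent-child tested against the truth parent-child, and (L + 1)/2 for parent-child tested when the truth is monozygotic twins.

```python
from fractions import Fraction

# IBD coefficients (kappa_0, kappa_1, kappa_2) of some standard relationships.
UNRELATED = (Fraction(1), Fraction(0), Fraction(0))
PARENT_CHILD = (Fraction(0), Fraction(1), Fraction(0))
FULL_SIBS = (Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))
HALF_SIBS = (Fraction(1, 2), Fraction(1, 2), Fraction(0))  # also avuncular


def a0_matrix(L):
    """Assumed form of A_0 for one marker with L alleles.

    Entry (i, j) is E(LR) when H_P states i alleles IBD, H_D is unrelated,
    and the true relationship has j alleles IBD.
    """
    L = Fraction(L)
    one = Fraction(1)
    return [
        [one, one, one],
        [one, (L + 3) / 4, (L + 1) / 2],
        [one, (L + 1) / 2, L * (L + 1) / 2],
    ]


def expected_lr(kappa_p, kappa_t, L):
    """E(LR) = kappa_P * A_0 * kappa_T^t for a single marker."""
    a0 = a0_matrix(L)
    return sum(
        kappa_p[i] * a0[i][j] * kappa_t[j]
        for i in range(3)
        for j in range(3)
    )
```

With L = 10 alleles, `expected_lr(PARENT_CHILD, PARENT_CHILD, 10)` evaluates to 13/4, and swapping the two IBD vectors leaves the result unchanged, in line with the symmetry property.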

Variance of the likelihood ratio
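To derive the variance of LR we apply the general formula $\mathrm{var}(X) = E(X^2) - E(X)^2$. Since the last term follows from Eq. 5, all that remains is to find the first term. Supplementing the matrix $A_0$ given in Eq. 6, we define matrices $A_1$ and $A_2$, where entry $(j, k)$ of $A_i$ is $E(LR_{K_j,K_i} \, LR_{K_k,K_i})$. It was shown in [2] that
$$E(LR^2) = \sum_{i=0}^{2} \kappa_{T,i} \, \kappa_P \, A_i \, \kappa_P^{t}, \tag{9}$$
hence the complete variance expression becomes
$$\mathrm{var}(LR) = \sum_{i=0}^{2} \kappa_{T,i} \, \kappa_P \, A_i \, \kappa_P^{t} - \left(\kappa_P \, A_0 \, \kappa_T^{t}\right)^2. \tag{10}$$
Contrary to the expected LR, the variance of the LR depends on the allele frequencies.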

Example: paternity testing
This example serves as an illustration of the expected LR described above and the corresponding hypotheses. Consider a paternity case, where a man is claimed to be the father of a child ($H_P$). The truth is that a brother of the alleged father is the father of the child. In terms of the IBD coefficients, the hypotheses and the true relatedness are given as
$$\kappa_P = (0, 1, 0), \qquad \kappa_D = (1, 0, 0), \qquad \kappa_T = \left(\tfrac{1}{2}, \tfrac{1}{2}, 0\right). \tag{11}$$
Figure 3 illustrates the hypotheses in terms of pedigrees, and as points in the IBD triangle. Equation (5), with IBD coefficients as in Eq. 11, simplifies to
$$E(LR) = \frac{1}{2} + \frac{1}{2} \cdot \frac{L+3}{4} = \frac{L+7}{8}. \tag{12}$$
The variance of LR follows from Eq. 10 and depends on the allele frequencies. In the special case $L = 2$, with allele frequencies $q$ and $1 - q$, the variance expression reduces to
$$\mathrm{var}(LR) = \frac{11}{64} + \frac{1}{32}\left(\frac{q}{1-q} + \frac{1-q}{q}\right).$$
This expression is minimal when $q = 0.5$ and becomes infinitely large when $q$ or $1 - q$ approaches 0. If no assumption is made for $L$, but all alleles are assumed equally frequent, the variance reduces to
$$\mathrm{var}(LR) = \frac{(L-1)(L+13)}{64}.$$
Fig. 3 Pedigrees and location of IBD coefficients $\kappa_P$, $\kappa_D$, and $\kappa_T$ for a paternity case when the true relationship is avuncular
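The paternity example can be checked by brute force. The sketch below, with hypothetical helper names, enumerates all genotype pairs at a single marker, computes the LR for parent-child versus unrelated, and takes its mean and variance under a true relationship given by IBD coefficients. It assumes HWE and no mutations, as in the rest of the paper, and uses exact rational arithmetic so that results can be compared with closed-form expressions.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product

AVUNCULAR = (Fraction(1, 2), Fraction(1, 2), Fraction(0))


def genotype_prob(g, p):
    """Hardy-Weinberg probability of the unordered genotype g = (a, b)."""
    a, b = g
    return p[a] ** 2 if a == b else 2 * p[a] * p[b]


def one_ibd_prob(g1, g2, p):
    """P(g2 | g1) when the pair shares exactly one allele IBD (state K_1)."""
    total = Fraction(0)
    for shared in g1:  # each allele of g1 is the shared one with prob. 1/2
        rest = list(g2)
        if shared in rest:
            rest.remove(shared)
            total += Fraction(1, 2) * p[rest[0]]
    return total


def lr_moments(p, kappa_t):
    """Mean and variance of LR (parent-child vs unrelated) under kappa_t."""
    genotypes = list(combinations_with_replacement(range(len(p)), 2))
    mean = second = Fraction(0)
    for g1, g2 in product(genotypes, repeat=2):
        pk0 = genotype_prob(g1, p) * genotype_prob(g2, p)
        pk1 = genotype_prob(g1, p) * one_ibd_prob(g1, g2, p)
        pk2 = genotype_prob(g1, p) if g1 == g2 else Fraction(0)
        lr = pk1 / pk0
        p_true = kappa_t[0] * pk0 + kappa_t[1] * pk1 + kappa_t[2] * pk2
        mean += lr * p_true
        second += lr ** 2 * p_true
    return mean, second - mean ** 2
```

For two alleles with frequencies 1/3 and 2/3, `lr_moments([Fraction(1, 3), Fraction(2, 3)], AVUNCULAR)` gives mean 9/8, matching (L + 7)/8 with L = 2. The enumeration grows quadratically in the number of genotypes, so it is meant as a check for small L rather than a replacement for the matrix formulas.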

Likelihood ratio for general pairwise relationships
In this section we extend the results reviewed above to relationships between any pairs of individuals. In particular we now allow inbreeding. For this to work we must pass from the IBD coefficients to the full set of Jacquard coefficients. For details regarding derivations of the results, see the Appendix.

Expected likelihood ratio
We use the same setup for kinship testing as introduced previously, but in order to allow general inbreeding, we now formulate our hypotheses using Jacquard coefficients,
$$H_P: \Delta = \Delta_P, \qquad H_D: \Delta = \Delta_D = (0, \ldots, 0, 1), \qquad \text{Truth}: \Delta = \Delta_T.$$
Note that the defense hypothesis still corresponds to unrelatedness. We are interested in the likelihood ratio comparing $H_P$ with $H_D$ when the genotypes are generated by a pedigree with the Jacquard coefficients $\Delta_T$. Equation (1) implies that
$$LR(G) = \frac{\sum_{i=1}^{9} \Delta_{P,i} \, P(G \mid J_i)}{P(G \mid J_9)}. \tag{14}$$
As shown in the Appendix, the expected LR is
$$E(LR_{\Delta_P,\Delta_T}) = \Delta_P \, B_9 \, \Delta_T^{t}, \tag{15}$$
where $B_9$ is the symmetric 9 × 9 matrix given in Table 4, whose elements are $E(LR_{J_i,J_j})$, for $1 \le i, j \le 9$. As opposed to the non-inbred case, we see that the expected value in general depends on the allele frequencies.

Variance of the likelihood ratio
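In the Appendix matrices $B_1, \ldots, B_9$ are defined and it is shown that
$$E(LR^2) = \sum_{i=1}^{9} \Delta_{T,i} \, \Delta_P \, B_i \, \Delta_P^{t}. \tag{16}$$
From this we obtain the variance formula
$$\mathrm{var}(LR) = \sum_{i=1}^{9} \Delta_{T,i} \, \Delta_P \, B_i \, \Delta_P^{t} - \left(\Delta_P \, B_9 \, \Delta_T^{t}\right)^2. \tag{17}$$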

Pairwise relationships with inbred founders
As previously explained, a set of inbreeding coefficients $\mathbf{f}$ can be assigned to the founders of a pedigree to model background inbreeding. The Jacquard coefficients of any pair of pedigree members are then functions of $\mathbf{f}$. It follows that the formulas for expectation and variance of LR involving such pedigrees remain as in Eqs. 15 and 17, except that the parameters $\Delta_P$ and $\Delta_T$ must be updated.
Table 4 The matrix $B_9$. Each row represents $J_i$, a Jacquard state assumed by $H_P$, while each column represents $J_j$, the true Jacquard state
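Specifically, let $\mathbf{f}_P$ be a vector of founder inbreeding coefficients in the pedigree assumed by $H_P$, and $\mathbf{f}_T$ similarly in the true pedigree. The expectation and variance of LR in this situation are then given by
$$E(LR) = \Delta_P(\mathbf{f}_P) \, B_9 \, \Delta_T(\mathbf{f}_T)^{t},$$
$$\mathrm{var}(LR) = \sum_{i=1}^{9} \Delta_{T,i}(\mathbf{f}_T) \, \Delta_P(\mathbf{f}_P) \, B_i \, \Delta_P(\mathbf{f}_P)^{t} - \left(\Delta_P(\mathbf{f}_P) \, B_9 \, \Delta_T(\mathbf{f}_T)^{t}\right)^2.$$
Note that the matrices $B_i$ only depend on $L$ and the allele frequencies, and therefore are unchanged by founder inbreeding.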
Remark 1 It should be emphasized that the formulas (15) and (17) are needed only when at least one of the tested individuals is inbred in some of the involved pedigrees. If both are non-inbred, the simpler expressions (5) and (10) using IBD coefficients suffice. Importantly, this remains true if other members of the pedigree are inbred, as long as this does not lead to inbreeding in the tested individuals.
In particular, founder inbreeding may be accounted for in Eqs. 5 and 10 simply by replacing κ P and κ T by κ P (f P ) and κ T (f T ) respectively.

Founder inbreeding and θ correction
The conventional approach to background relatedness in forensics is the so-called θ correction [12]. In an inbred population, the composition of genotypes does not follow the Hardy-Weinberg principle, implying that the frequencies given in Table 1 no longer hold. The following approach compensates for this by adjusting the allele frequencies.
Without loss of generality we can assume that the observed alleles are sampled sequentially. The probability that allele $i$ is sampled as the $j$th allele is given by the sampling formula
$$\frac{b_j \theta + \bar{\theta} p_i}{1 + (j - 2)\theta}, \tag{18}$$
where $\bar{\theta} = 1 - \theta$ and $b_j$ denotes the number of alleles of type $i$ among the $j - 1$ previously sampled. Note that for pairwise cases, the likelihood can be written
$$L_\theta(\Delta) = \sum_{i=1}^{9} \Delta_i \, P(G \mid J_i, \theta),$$
where $P(G \mid J_i, \theta)$ is calculated using Eq. 18. The matrices $B_1, \ldots, B_9$ then change with θ, modifying the expectation and variance of the LR. This emphasises a fundamental difference between founder inbreeding and θ correction: $f$ modifies the relationship itself, while θ only impacts the genotype probabilities. If rather than using θ correction, we assign an inbreeding coefficient $f$ to A, the likelihood becomes
$$L_f(\Delta) = \sum_{i=1}^{9} \Delta_i(f) \, P(G \mid J_i). \tag{19}$$

Example: θ correction and founder inbreeding in a paternity case
Consider next the hypothesis $H_{P1}$: A is the father of B. Equation (18) now gives
$$L_\theta(H_{P1}) = p_a \, (\theta + \bar{\theta} p_a) \, \frac{2\theta + \bar{\theta} p_a}{1 + \theta}$$
and so the LR with θ correction is
$$LR_\theta = \frac{1 + 2\theta}{3\theta + \bar{\theta} p_a}.$$
The inbreeding coefficient approach gives $p_a$ and $LR_f = 1/p_a$. Note that the LR does not depend on $f$ and that this is true for all genotype combinations for A and B. The LRs for other genotype combinations for A and B with θ correction are given in Table 10.8 in [13].
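The sequential sampling formula behind the θ correction is easy to evaluate directly. The following Python sketch uses hypothetical function names and assumes that both A and B are homozygous a/a, which is the genotype configuration where three a alleles are sampled under paternity and four under unrelatedness. It is meant only to show how the likelihoods are built up allele by allele.

```python
from fractions import Fraction


def sampling_prob(p_i, b, j, theta):
    """Probability that the j-th sampled allele is of type i.

    p_i   : population frequency of allele i
    b     : number of type-i alleles among the j - 1 previously sampled
    theta : the theta (coancestry) parameter, assumed to be less than 1
    """
    return (b * theta + (1 - theta) * p_i) / (1 + (j - 2) * theta)


def lr_theta_paternity_aa(p_a, theta):
    """LR for 'A is the father of B' vs unrelated, with A = B = a/a (assumed)."""
    # Under paternity, three distinct a alleles are sampled in sequence.
    lik_paternity = Fraction(1)
    for j in (1, 2, 3):
        lik_paternity *= sampling_prob(p_a, j - 1, j, theta)
    # Under unrelatedness, a fourth a allele is sampled as well.
    lik_unrelated = lik_paternity * sampling_prob(p_a, 3, 4, theta)
    return lik_paternity / lik_unrelated
```

For example, `lr_theta_paternity_aa(Fraction(1, 10), Fraction(0))` returns 10, the familiar 1/p_a, while a positive θ gives a smaller LR.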
To illustrate (19) consider the hypothesis H P 2 : A and B are paternal half siblings whose father is inbred.
The LR comparing $H_{P2}$ with A and B being unrelated becomes $\tfrac{1}{2}(1 - f)$. If A and B share alleles, the LR will depend also on θ.

R implementation
Utilities to perform the computations in this paper are provided in an R library named InbredLR, available from the first author, building on several packages in the ped suite, notably pedprobr and forrel [3]. The core of InbredLR is a set of functions that compute the expectation and variance of the likelihood ratio for pairwise relationships. The user can specify the parameters (κ, f or Δ) or specify the pedigrees, possibly with inbred founders. A function for simulating marker data to estimate the distribution of LR is also provided, as well as a function for visualizing the pedigrees $H_P$ and $H_D$ and the true pedigree, and the location of the corresponding IBD coefficients in the IBD triangle.

Paternity case for siblings with inbred founders
Consider two individuals who claim to be related as parent and offspring. Their true relationship is siblings, and their parents' coefficients of inbreeding are $\mathbf{f}_T = (f_1, f_2)$. Figure 4 shows the case. This example can be relevant for family reunion cases, where a parent-child relationship would give right to residence permit, whereas a sibling relationship would not. In [14] such a case is considered. In terms of the IBD coefficients, $H_P$, $H_D$ and the true relationship are given as
$$\kappa_P = (0, 1, 0), \qquad \kappa_D = (1, 0, 0), \qquad \kappa_T = \kappa_T(\mathbf{f}_T), \tag{20}$$
where $\kappa_T(\mathbf{f}_T) = \kappa_T(f_1, f_2)$ are as in the first row of Table 2. Keeping in mind Remark 1, we apply (5) to find the expected LR:
$$E(LR) = \frac{L+3}{4} + \frac{(L-1)(f_1 + f_2)}{8} \tag{21}$$
(see Table 3). The variance of LR differs between the two cases, however (not shown here).
Fig. 4 Hypotheses involved in "Paternity case for siblings with inbred founders" and the location of the corresponding IBD coefficients $\kappa_P$, $\kappa_D$, and $\kappa_T$ in the IBD triangle
As the background inbreeding of the true sibling pedigree increases, E(LR) increases. The expected LR of the paternity case (and hence the trust in $H_P$) is therefore higher if the true relatedness is siblings with background inbreeding, rather than the tested parent-child relationship. The variance of LR decreases moderately for increasing founder inbreeding. For increasing number of alleles $L$, the slope of the expected LR increases.
Fig. 5 Expected LR for the case in Fig. 4. The shaded area shows one standard deviation below and above E(LR), for uniform allele frequencies
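The following calculation gives a simple approximation of the inflation in the expected LR caused by background inbreeding. Suppose $f_1 = f_2 = f$, and write (21) as $\mu_0 + \mu_f$, where $\mu_0 = \tfrac{1}{4}(L + 3)$ is the expected LR without founder inbreeding, and $\mu_f = \tfrac{1}{4}(L - 1)f$ is the expected contribution caused by founder inbreeding. Note that $\mu_0 + \mu_f = \left(1 + \frac{\mu_f}{\mu_0}\right)\mu_0$, and that for $L \ge 5$ we have
$$\frac{\mu_f}{\mu_0} = \frac{L-1}{L+3} f \ge \frac{1}{2} f.$$
This implies that with $N$ independent markers, each with at least 5 alleles, the total LR has expectation
$$\prod_{m=1}^{N} \left(1 + \frac{\mu_f}{\mu_0}\right)\mu_0 \;\ge\; \left(1 + \tfrac{1}{2} f\right)^{N} \prod_{m=1}^{N} \mu_0 \;\ge\; \left(1 + \tfrac{1}{2} f N\right) \prod_{m=1}^{N} \mu_0,$$
where $\mu_0$ and $\mu_f$ in each factor refer to marker $m$. This means that a background inbreeding level $f$ will inflate the expected LR by at least $\tfrac{1}{2} f N$. For example, if $N = 20$ and $f = 0.05$, the inflation rate is greater than 50%.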
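The inflation bound can be explored numerically. The sketch below is a hedged illustration with hypothetical function names. It assumes that the IBD coefficients of full siblings with parental inbreeding f1 and f2 are κ0 = (1 − f1)(1 − f2)/4, κ1 = (1 − f1 f2)/2 and κ2 = (1 + f1)(1 + f2)/4, which reproduces the expected LR quoted above, and it assumes that all N markers have the same number of alleles L.

```python
from fractions import Fraction


def kappa_full_siblings(f1, f2):
    """Assumed IBD coefficients of full siblings with inbred parents."""
    f1, f2 = Fraction(f1), Fraction(f2)
    k0 = (1 - f1) * (1 - f2) / 4
    k1 = (1 - f1 * f2) / 2
    k2 = (1 + f1) * (1 + f2) / 4
    return (k0, k1, k2)


def expected_lr_parent_child(kappa_t, L):
    """E(LR) for H_P = parent-child vs unrelated, true relationship kappa_t."""
    L = Fraction(L)
    return kappa_t[0] + kappa_t[1] * (L + 3) / 4 + kappa_t[2] * (L + 1) / 2


def inflation_factor(L, f, n_markers):
    """Ratio of expected total LR with and without founder inbreeding f."""
    with_f = expected_lr_parent_child(kappa_full_siblings(f, f), L)
    without_f = expected_lr_parent_child(kappa_full_siblings(0, 0), L)
    return (with_f / without_f) ** n_markers


if __name__ == "__main__":
    # 20 markers with 5 alleles each and f = 0.05
    print(float(inflation_factor(5, Fraction(1, 20), 20)))
```

With L = 5, the per-marker ratio is exactly 1 + f/2, so the example corresponds to the boundary case of the inequality; markers with more alleles give a larger inflation.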

Siblings and half siblings with founder inbreeding
Distinguishing between siblings and half siblings can be difficult based on unlinked markers. Mayor and Balding address the problem in [15], with focus on the number of loci needed. If the shared parent of the half siblings has inbreeding coefficient f T > 0, the problem becomes even more interesting.
Consider the situation shown in Fig. 6. The hypotheses are
$$H_P: \kappa = \kappa_P(\mathbf{f}_P), \qquad H_D: \kappa = \kappa_D = (1, 0, 0), \qquad \text{Truth}: \kappa = \kappa_T(f_T), \tag{22}$$
where $\mathbf{f}_P = (f_1, f_2)$ are the parental inbreeding coefficients in the $H_P$ pedigree and $\kappa_P(\mathbf{f}_P)$ and $\kappa_T(f_T)$ are as in the first and second rows of Table 2, respectively. This setup facilitates modeling background inbreeding in both the true pedigree and in $H_P$. Equation (5) gives
$$E(LR) = \frac{1 - f_T}{2} + \frac{1 + f_T}{2}\left(\frac{L+3}{4} + \frac{(L-1)(f_1 + f_2)}{8}\right). \tag{23}$$
In Fig. 7, the expectation of LR is shown as a function of founder inbreeding $f_T$ of the true half sibling pedigree, for $H_P$ stating a sibling pedigree with founder inbreeding $f_P = 0$ and 0.2 (assuming $f_1 = f_2$), and $L$ = 2, 10 and 20 alleles at a locus. For increasing values of $f_T$, E(LR) increases, for all values of $f_P$, and the evidence in favor of a sibling relationship becomes stronger.
Fig. 6 The hypotheses involved in "Siblings and half siblings with founder inbreeding" and the location of the corresponding IBD coefficients $\kappa_P$, $\kappa_D$, and $\kappa_T$ in the IBD triangle
Consider next the situation when $f_1 = f_2 = 0$. $H_P$ then assumes a sibling relationship without inbred founders. Figure 8 shows E(LR) (dashed line) and LR computations from 1000 sets of simulated data, as a function of $f_T$. The solid line gives the mean value of the simulated LR. The expected LR increases slightly as founder inbreeding increases. For Fig. 8a this seems to fit well with the mean values of the LRs from simulated data. These simulations assume 13 loci, each with 3 alleles with allele frequencies 0.4, 0.3 and 0.3. In Fig. 8b, on the other hand, there is a substantial difference between E(LR) and the mean of the simulated LRs. These simulations use 13 CODIS markers with allele frequencies ranging from 0.0003 to 0.5378 (allele frequencies are available as a part of the R library InbredLR, see the "R implementation" section). Alleles with low frequencies will more seldom be present in the simulations. The expected LR only depends on the number of alleles at a locus, but because of the rare alleles, the simulations give in practice a lower number of alleles at these loci. The simulations in Fig. 8c use the same markers, but with uniform allele frequencies for alleles at a locus. The expectation of the LR is independent of the allele frequencies and is therefore not changed, but now the mean of the simulated LRs is closer to the expected value. Even though E(LR) is independent of the allele frequencies, the …
Finally, we offer an approximation of the inflation in the expected LR due to background inbreeding. For simplicity, we assume $f_1 = f_2 = 0$ so that $H_P$ states a normal sibling relationship. From Eq. 23 the expected LR is $\mu_0 = \tfrac{1}{8}(L + 7)$ if $f_T = 0$. On the other hand, if $f_T > 0$, the expected contribution to the LR is
$$\mu_f = \tfrac{1}{8}(L - 1) f_T.$$
For $L \ge 5$ we have $\mu_f / \mu_0 = \frac{L-1}{L+7} f_T \ge \tfrac{1}{3} f_T$, and it follows that with $N$ independent markers, each with at least 5 alleles, the total LR has expectation
$$\prod_{m=1}^{N} \left(1 + \frac{\mu_f}{\mu_0}\right)\mu_0 \;\ge\; \left(1 + \tfrac{1}{3} f_T N\right) \prod_{m=1}^{N} \mu_0.$$
A background inbreeding level of $f_T$ will inflate the expected LR by at least $\tfrac{1}{3} f_T N$. For example, with $N = 20$ and $f_T = 0.05$, the inflation rate is greater than 33%.

Paternity case with inbreeding
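Consider a paternity case with hypotheses as shown in Fig. 9. The alleged father is indeed the true father and has inbreeding coefficient $f$. We will analyze the consequences of ignoring the inbreeding in $H_P$. The hypotheses are parameterized in the following way: $\Delta_P$ puts all probability on $J_8$ (parent-child without inbreeding), $\Delta_D$ puts all probability on $J_9$ (unrelated), and the true relationship $\Delta_T(f_T)$ assigns probability $1 - f_T$ to $J_8$ and $f_T$ to the Jacquard state in which the father's two alleles are IBD and shared with one allele of the child. The expression for the expected LR simplifies considerably since most elements of $\Delta_P$ and $\Delta_T(f_T)$ are zero. Equation (15) gives
$$E(LR) = (1 - f_T)\frac{L+3}{4} + f_T \frac{L+1}{2} = \frac{L+3}{4} + \frac{(L-1) f_T}{4},$$
and we see that E(LR) increases linearly from $(L + 3)/4$ to $(L + 1)/2$ as $f_T$ goes from 0 to 1. Consider next the variance. For brevity, we define
$$h(i, j, k) = E(LR_{J_i,J_k} \, LR_{J_j,J_k}) = \sum_{G} \frac{P(G \mid J_i) \, P(G \mid J_j) \, P(G \mid J_k)}{P(G \mid J_9)^2}.$$
Note that $h(i, j, k)$ is invariant under permutations of $i, j, k$. Equation (16) gives $E(LR^2)$ as $(1 - f_T)\, h(8, 8, 8)$ plus $f_T$ times the corresponding term for the inbred Jacquard state.
Fig. 9 The hypotheses in a paternity case with inbreeding. To the far right are the Jacquard states with nonzero probability in the true relationship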
Slooten and Egeland [1] derived the term not involving inbreeding, i.e., $h(8, 8, 8)$. To derive the remaining term we condition on the zygosity of the son. If he is homozygous a/a, the father must also be a/a (recall that we are conditioning on the Jacquard state in which the father is autozygous). The resulting variance is a concave function of $f_T$. Figure 10 shows E(LR) and one standard deviation on each side as a function of founder inbreeding $f_T$, for different numbers of alleles at a locus.
Fig. 10 E(LR) as a function of founder inbreeding $f_T$ for the case in Fig. 9, for a single marker with L = 2, 10 and 50 alleles. Shaded area shows one standard deviation below and above E(LR), for uniform allele frequencies
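The linear growth of E(LR) in the father's inbreeding coefficient can be verified by direct enumeration. The following Python sketch uses hypothetical helper names and assumes HWE and no mutations. The true relationship is modeled as a mixture: with probability 1 − f an ordinary parent-child pair, and with probability f a pair where the father is autozygous and the child carries one copy of his allele.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product


def genotype_prob(g, p):
    """Hardy-Weinberg probability of the unordered genotype g = (a, b)."""
    a, b = g
    return p[a] ** 2 if a == b else 2 * p[a] * p[b]


def parent_child_prob(father, child, p):
    """P(father, child) for an outbred parent-child pair."""
    total = Fraction(0)
    for transmitted in father:  # each paternal allele passed on w.p. 1/2
        rest = list(child)
        if transmitted in rest:
            rest.remove(transmitted)
            total += Fraction(1, 2) * p[rest[0]]
    return genotype_prob(father, p) * total


def autozygous_father_prob(father, child, p):
    """P(father, child) when the father's two alleles are IBD."""
    if father[0] != father[1] or father[0] not in child:
        return Fraction(0)
    rest = list(child)
    rest.remove(father[0])
    return p[father[0]] * p[rest[0]]


def expected_lr_inbred_father(p, f):
    """E(LR) of paternity vs unrelated when the true father has inbreeding f."""
    genotypes = list(combinations_with_replacement(range(len(p)), 2))
    mean = Fraction(0)
    for father, child in product(genotypes, repeat=2):
        p_po = parent_child_prob(father, child, p)
        p_un = genotype_prob(father, p) * genotype_prob(child, p)
        p_true = (1 - f) * p_po + f * autozygous_father_prob(father, child, p)
        mean += (p_po / p_un) * p_true
    return mean
```

For three alleles with frequencies 1/2, 1/3 and 1/6 and f = 1/4, the function returns 13/8, which equals (L + 3)/4 + (L − 1)f/4 with L = 3.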

Discussion
In testing theory, the formulation of hypotheses is crucial. Kinship problems, as considered in this paper, are no exception. The convention of kinship testing is to compare two specific relationships using the LR. In most applications other than kinship problems, the hypotheses together span many, if not all, alternatives. For instance, a common example is testing of HWE against all possible deviations from HWE. In forensic genetics, H P : "paternity" is typically tested only against H D : "unrelated," not all other alternatives. For this reason, it becomes essential to study what happens when the truth is neither of these hypotheses. A pairwise non-inbred relationship can be represented by a point in the IBD triangle (see Fig. 2), or in general by the Jacquard coefficients (see Fig. 1). We have presented two ways of expressing the hypotheses and the true relationship: (i) through the Jacquard coefficients, and (ii) through background relatedness or founder inbreeding. These approaches let us investigate the LR for a continuous range of relationships and values of background relatedness. In both cases, the impact on the LR has been studied by deriving exact expressions for its mean and variance. In the latter case, the required formula follows rather directly by extending results in [1] and [2]. Explicit formulas for the expected LR have been derived for several sets of relationships. In the case of Jacquard coefficients, the explicit formulas are complicated to derive, and they depend on allele frequencies. An exact expression is given also for the variance. However, as the variance depends on allele frequencies, simple closed formulas can only be derived in special cases. For general applications we rely instead on the exact numerical implementation freely available in the R library InbredLR accompanying this paper.
Equipped with the results of this paper, we can address the following question when presented with a standard LR comparing two completely specified hypotheses H P and H D : What if the true relationship between the individuals is not as stated by H P ? Or this slightly different question: What if the true relationship is restricted to some particular region of the IBD triangle? Obviously, the LR can be reevaluated to reflect the new specifications. However, the exact expressions for expectation and variance of the LR can in some cases directly allow for statements valid for a continuous range of alternatives. For instance, regions obtained by varying founder inbreeding have been displayed in Fig. 2. Assume an LR has been reported in a paternity case and that inbreeding in the father has been ignored. It is then useful to know that accounting for inbreeding would imply an increase in the expected LR. This finding could be essential as there may not be data available to estimate the inbreeding coefficient for the father, in which case exact LR calculation is not feasible.
Because the definition of "common ancestor" sometimes differs, there is a slight difference in the definition of IBD in the literature. The paper [16] gives three definitions of IBD: ancient IBD, recent IBD, and familial IBD. Our definition of IBD goes in the category of familial IBD, where "common ancestor" is restricted to a given pedigree.
The conventional approach to background relatedness in forensics is the so called theta (θ ) correction [12]. Typical values are θ ∈ (0.01, 0.03). The θ parameter applies on a population level. The genotype probabilities of all founders in the pedigree are modified compared with what HWE would give. Our approach does not model relatedness between founders, but offers a richer model of inbreeding, since individual inbreeding coefficients can be specified for each founder.
Several authors (see, e.g., [2] and the references therein) have discussed reporting the logarithm of the LR rather than the LR. Nice expressions like the ones presented for the expectation and the variance are then no longer available. In most cases, the LR is reported on the original scale. In some circumstances, as for paternity cases, the LR may be 0, and then the logarithm is not defined. Many papers including [17] study the distribution of Z = log(LR) by simulation. Equipped with the exact expressions of this paper, Z could be analyzed without resorting to simulation, since the mean and variance of Z can be derived from the counterparts for the LR. However, if some allele frequencies are close to 0, Z is not well approximated by a normal distribution for a realistic number of markers. The reason for this is the large variance when allele frequencies are small. For instance, (25) shows an example where the expression for the variance includes terms of the form 1/p a and these become large whenever the allele frequency p a is small. A similar problem related to small allele frequencies is discussed in the results section. This demonstrates that the center of the log(LR) distribution, calculated from the expectation of LR, can be inaccurate. However, this criticism applies to the use of LR instead of log(LR) in general, and not specifically to the expectations. We maintain that results like the ones presented for the expectation and variance have considerable theoretical interest, but should be used with caution in practice.
This paper has mainly addressed the likelihood ratio and its properties. The exclusion probability (EP), the probability that genotypes will be incompatible with a claimed relationship, is also an important statistic. The impact of founder inbreeding on EP is discussed in [3]. Figure 4 illustrates a case where the true inbred relationship is not known, and Fig. 5 shows the corresponding expected LR for a single marker. Increasing the number of markers will, in this paternity case, increase the inflation of the expected LR. This means that adding more markers to the LR computation will not solve the problem. In general, with a sufficient number of markers, the Jacquard, IBD, or inbreeding coefficients can be estimated accurately, and the true relationship detected. If such additional marker data is not available, the impact of inbreeding can be studied as exemplified by a paternity case with unknown inbreeding earlier in the discussion and as illustrated in, e.g., Fig. 5. As addressed in the "Introduction" section, different scenarios can be investigated and LR results can be evaluated in light of the analyses of these scenarios.
The present paper does not consider linked markers. For independent loci, the inbreeding coefficients contain sufficient information to compute the Jacquard coefficients needed in our formulas for LR. While a similar approach is conceivable also for linked markers, this would involve multi-locus coefficients, which is outside the scope of this work.
Funding Open Access funding provided by Norwegian University of Life Sciences.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval None required as no data from humans are used.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix: Expectation and variance of LR
Below we derive the expressions for the expectation and variance of LR in the general pairwise case. Let J i denote Jacquard state i and Δ P i and Δ T i the probabilities of J i according to the relationship stated by H P and the true relationship respectively. LR Δ P ,Δ T is then defined as the likelihood ratio comparing H P :Δ P with H D :Δ D when the marker data comes from the relationship Δ T . Similarly, LR J i ,J j denotes the likelihood ratio comparing Jacquard state J i with unrelated, i.e., J 9 when the marker data are generated by J j .
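Conditioning on the Jacquard states in both $H_P$ and the true relationship gives
$$E(LR_{\Delta_P,\Delta_T}) = \sum_{i=1}^{9} \sum_{j=1}^{9} \Delta_{P,i} \, \Delta_{T,j} \, E(LR_{J_i,J_j}) = \Delta_P \, B_9 \, \Delta_T^{t},$$
which is Equation (15). In the case of no inbreeding, i.e., $\Delta_1 = \cdots = \Delta_6 = 0$, the above expression reduces to (5). The part of the 9 × 9 matrix $B_9$ corresponding to $(J_7, J_8, J_9)$ coincides with the matrix given in Eq. 6. Since $E(LR_{J_i,J_j}) = E(LR_{J_j,J_i})$, $B_9$ is symmetric. The elements of $B_9$ are given in Table 4. Since the expectation has been calculated, to derive the variance it remains only to find
$$E(LR^2) = \sum_{i=1}^{9} \Delta_{T,i} \, \Delta_P \, B_i \, \Delta_P^{t}.$$
The matrices $B_1, \ldots, B_9$ are symmetric 9 × 9 matrices. The simplest of these matrices is $B_9$, given in Table 4. In general, $B_i$ consists of the elements $\{E(LR_{J_j,J_i} \, LR_{J_k,J_i})\}_{j,k=1,\ldots,9}$.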
The values for $i, j, k = 7, 8, 9$ have been provided in the "Review of previous results" section. Entry $(j, k)$ of $B_i$ is
$$E(LR_{J_j,J_i} \, LR_{J_k,J_i}) = \sum_{G} \frac{P(G \mid J_j) \, P(G \mid J_k)}{P(G \mid J_9)^2} \, P(G \mid J_i).$$
All matrices can in principle be found from the above expression, but exact calculations by hand become impractical and exact numerical calculation is more reasonable.