A general model for likelihood computations of genetic marker data accounting for linkage, linkage disequilibrium, and mutations

  • Original Article
  • Published in: International Journal of Legal Medicine


Several applications require an unbiased determination of relatedness, whether in linkage or association studies or in a forensic setting. This calls for an appropriate model to compute the joint probability of genetic data for a set of persons, given a hypothesis about the pedigree structure. The increasing number of markers available through high-density SNP microarray typing and NGS technologies intensifies the demand: using a large number of markers may lead to biased results because of strong dependencies between closely located loci, both within pedigrees (linkage) and in the population (allelic association, or linkage disequilibrium (LD)). We present a new general model based on a Markov chain for inheritance patterns and another Markov chain for founder allele patterns, the latter allowing us to account for LD. We also demonstrate a specific implementation for X-chromosomal markers that computes likelihoods from hypotheses of alleged relationships and genetic marker data. The algorithm can simultaneously account for linkage, LD, and mutations. We demonstrate its feasibility using simulated examples. The algorithm is implemented in the software FamLinkX, which provides a user-friendly GUI for Windows systems; FamLinkX and further usage instructions are freely available. Our software provides the necessary means to solve cases for which no previous implementation exists. In addition, the software can perform simulations to further study the impact of linkage and LD on computed likelihoods for an arbitrary set of markers.


Author information

Corresponding author

Correspondence to Daniel Kling.

Electronic supplementary material

Two supplementary files accompany the article (DOC, 35.0 KB; DOC, 130 KB).


This section describes the notation used in the paper in more detail. First, we assume that locus $i$ ($i = 1, \ldots, I$) has $A_i$ possible alleles, and we let $p_i$ be a vector specifying the probabilities of a haplotype's alleles at locus $i$ given the haplotype's alleles at lower-indexed loci. We let $r_2, \ldots, r_I$ denote the recombination rates between the loci, which are assumed known. For a locus $i$, let $t$ be a transmission, specifying a start allele in the parent, a resulting allele in the child, and whether the parent is the mother or the father. We then denote by $m_i(t)$ the probability that the child obtains the resulting allele, given that the parent has the start allele. This function specifies the mutation model at locus $i$. The parameters of our model are $p = (p_1, \ldots, p_I)$, $r = (r_2, \ldots, r_I)$, and $m = (m_1, \ldots, m_I)$.
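
To make the role of the conditional allele probabilities concrete, the following minimal sketch computes the probability of one haplotype as a product of per-locus terms. It assumes, as a simplification, that the allele at each locus depends only on the allele at the immediately preceding locus (a first-order Markov chain); the function name `haplotype_probability`, the table layout, and all numbers are hypothetical and are not taken from the paper or from FamLinkX.

```python
def haplotype_probability(haplotype, first_locus_freqs, conditionals):
    """Probability of a haplotype under a first-order Markov chain.

    haplotype         -- sequence of alleles, one per locus
    first_locus_freqs -- dict: allele -> frequency at locus 1
    conditionals      -- list with one dict per later locus;
                         conditionals[k][prev][cur] is the probability of
                         allele `cur` given allele `prev` at the locus before it
    (All three structures are illustrative assumptions.)
    """
    prob = first_locus_freqs[haplotype[0]]
    for i in range(1, len(haplotype)):
        prev_allele, cur_allele = haplotype[i - 1], haplotype[i]
        prob *= conditionals[i - 1][prev_allele][cur_allele]
    return prob


# Hypothetical two-locus example with alleles A/B at locus 1 and x/y at locus 2.
first_locus_freqs = {"A": 0.6, "B": 0.4}
conditionals = [
    {"A": {"x": 0.9, "y": 0.1},   # strong association between A and x
     "B": {"x": 0.2, "y": 0.8}},
]

all_haplotypes = [(a, b) for a in "AB" for b in "xy"]
total = sum(
    haplotype_probability(h, first_locus_freqs, conditionals)
    for h in all_haplotypes
)
```

Under linkage equilibrium every row of a conditional table would equal the marginal allele frequencies at that locus; rows that differ from each other, as in this example, are what encodes LD.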

If the parents' alleles follow the population frequencies, the probabilities for a child to have various alleles are not given by the population frequencies unless the process represented by the mutation model happens to have the population frequencies as its stationary distribution. Adding an untyped father or mother to a person in the pedigree may therefore change the probabilities we compute. To avoid this nuisance, we recommend that all untyped founders with only one child in the pedigree be (recursively) removed prior to computations. In our pedigree, a person may have no parents specified, only a mother, only a father, or both parents. Founders are those who have no parents in the pedigree. We also assume that the pedigree does not contain untyped children with no descendants, as such children cannot affect the result.
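
The recommended recursive removal of untyped founders with a single child can be sketched as follows. This is an illustration under assumed data structures, not the FamLinkX implementation: the pedigree is a hypothetical dict mapping each person to a `(father, mother)` pair in which either entry may be `None`, every person named as a parent is assumed to be a key of that dict, and `typed` is the set of persons with marker data.

```python
def prune_untyped_founders(parents, typed):
    """Repeatedly remove untyped founders that have exactly one child.

    parents -- dict: person -> (father, mother); None means "not in pedigree"
    typed   -- set of persons with observed marker data
    Returns a new dict; the input is left unchanged.
    """
    ped = {person: tuple(pair) for person, pair in parents.items()}
    while True:
        # Recompute the children of every person in the current pedigree.
        children = {person: [] for person in ped}
        for child, (father, mother) in ped.items():
            for parent in (father, mother):
                if parent is not None:
                    children[parent].append(child)

        removable = [
            person for person, (father, mother) in ped.items()
            if father is None and mother is None      # founder
            and person not in typed                   # untyped
            and len(children[person]) == 1            # only one child
        ]
        if not removable:
            return ped

        for person in removable:
            child = children[person][0]
            father, mother = ped[child]
            # Detach the removed founder from its child.
            ped[child] = (None if father == person else father,
                          None if mother == person else mother)
            del ped[person]


# Hypothetical example: untyped grandfather GF and untyped father F
# sit above a typed child C, whose mother M is typed.
pedigree = {
    "GF": (None, None),
    "F": ("GF", None),
    "M": (None, None),
    "C": ("F", "M"),
}
pruned = prune_untyped_founders(pedigree, typed={"M", "C"})
```

In this example the first pass removes GF, which turns F into an untyped founder with one child, so the second pass removes F as well. This is the sense in which the removal is recursive.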

Our observed data are divided into data $s$ for the $S$ typed founders and data $d$ for the $M$ typed non-founders. Let $s_{ij}$, for $i = 1, \ldots, I$ and $j = 1, \ldots, S$, denote the observed allele or alleles of typed founder $j$ at locus $i$. For males and X-chromosomal data, $s_{ij}$ specifies only one allele; otherwise, $s_{ij}$ specifies the two observed alleles in no particular order. For the typed non-founders, let $d_{ij}$ specify the corresponding data. We write $s_i = (s_{i1}, \ldots, s_{iS})$, $s = (s_1, \ldots, s_I)$, $d_i = (d_{i1}, \ldots, d_{iM})$, and $d = (d_1, \ldots, d_I)$.
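
Because an observation lists its alleles in no particular order, a program has to treat `("12", "15")` and `("15", "12")` as the same observation. The sketch below shows one hypothetical way to store an observation in an order-free form and to list the ordered (paternal allele first) genotypes compatible with it; the function names and the use of allele labels as strings are assumptions made for illustration.

```python
def normalize_observation(alleles):
    """Order-free representation of one observation (one or two alleles)."""
    if len(alleles) not in (1, 2):
        raise ValueError("an observation has one allele (male, X) or two")
    return tuple(sorted(alleles))


def ordered_genotypes(observation):
    """Ordered genotypes (paternal first) compatible with an observation."""
    if len(observation) == 1:
        # Male with X-chromosomal data: a single allele, nothing to order.
        return [tuple(observation)]
    a, b = observation
    if a == b:
        return [(a, a)]               # homozygote: one ordering
    return [(a, b), (b, a)]           # heterozygote: two orderings


# Hypothetical observations at one locus.
female_obs = normalize_observation(("15", "12"))
male_obs = normalize_observation(("13",))
```

A likelihood computation would then sum over the orderings returned by `ordered_genotypes` for each typed person, since the data do not reveal which allele is paternal.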

We also need a number of ancillary variables. The inheritance pattern at locus $i$ can be described as a vector $v_i$ of length $N$, with one component for each parent-child relationship in the pedigree when the locus is autosomal, and one for each mother-child relationship for X-chromosomal loci. Each component is 0 or 1 depending on whether the paternal or maternal allele is inherited. We write $v = (v_1, \ldots, v_I)$. We also need to describe the founder alleles of the pedigree: these are maternal or paternal alleles whose relevant parent is not in the pedigree. First, there are founder alleles belonging to typed founders: let $g_{ij}$ be the allele or alleles of typed founder $j$ at locus $i$, listed with the paternal allele first. We write $g_i = (g_{i1}, \ldots, g_{iS})$ and $g = (g_1, \ldots, g_I)$. For the remaining $F$ founder alleles, let $f_{ij}$ denote the $j$-th founder allele at locus $i$. Finally, we write $f_i = (f_{i1}, \ldots, f_{iF})$ and $f = (f_1, \ldots, f_I)$.
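
As an illustration of the inheritance vectors, the sketch below enumerates all binary patterns of length $N$ and evaluates a transition probability between the patterns at two neighbouring loci. The transition formula is the standard one used in Lander-Green-type multipoint calculations, assuming that each component switches independently with a common recombination rate `r` (no interference, no sex-specific rates). Whether the paper uses exactly this form is an assumption, and the function names are hypothetical.

```python
from itertools import product


def inheritance_patterns(n):
    """All binary inheritance vectors of length n (2**n patterns)."""
    return list(product((0, 1), repeat=n))


def transition_probability(v_prev, v_next, r):
    """Probability of moving from v_prev to v_next between two loci.

    Each component changes with probability r (a recombination) and stays
    the same with probability 1 - r, independently of the others.
    """
    changes = sum(a != b for a, b in zip(v_prev, v_next))
    return r ** changes * (1 - r) ** (len(v_prev) - changes)


# Hypothetical example: N = 3 transmissions, recombination rate 0.1.
patterns = inheritance_patterns(3)
row_sum = sum(transition_probability((0, 0, 0), v, 0.1) for v in patterns)
```

The number of patterns doubles with every additional transmission, which is why pruning uninformative pedigree members before the computation is worthwhile.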


Cite this article

Kling, D., Tillmar, A., Egeland, T. et al. A general model for likelihood computations of genetic marker data accounting for linkage, linkage disequilibrium, and mutations. Int J Legal Med 129, 943–954 (2015).
