Statistical Human Genetics pp 11-24

Part of the Methods in Molecular Biology book series (MIMB, volume 850)

Identification of Genotype Errors



Most large genotype datasets are documented to contain some errors, and an error rate of 1–2% is enough to distort map distances and to produce a false conclusion of linkage (Abecasis et al. Eur J Hum Genet 9(2):130–134, 2001). The data should therefore be as clean as possible. Data cleaning, however, is tedious and demands effort and experience. O’Connell and Weeks implemented four error-checking algorithms in a computer program called PedCheck. This chapter discusses those four algorithms, with a focus on the genotype-elimination method, and provides an example of the four levels of error checking that PedCheck permits, together with the required input files. Alternative algorithms implemented in other statistical computing programs are also briefly discussed.
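As a rough illustration of the kind of incompatibility such error checking looks for, the sketch below tests whether a child's genotype at a single marker can arise from one allele of each parent. It is a minimal example under stated assumptions, not PedCheck's implementation: the function names, the integer allele labels, and the trio data are all hypothetical, and it covers only a fully typed parent-child trio, not the genotype-elimination method for general pedigrees.

```python
from itertools import product


def child_genotypes(father, mother):
    """Return the unordered child genotypes that Mendelian inheritance allows.

    Each genotype is a pair of allele labels; the child receives one allele
    from the father and one from the mother.
    """
    return {tuple(sorted((a, b))) for a, b in product(father, mother)}


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype is compatible with both parental genotypes."""
    return tuple(sorted(child)) in child_genotypes(father, mother)


# Hypothetical trios at one marker (alleles are arbitrary integer labels).
trio_ok = is_mendelian_consistent((1, 2), (3, 4), (1, 3))   # one allele per parent
trio_bad = is_mendelian_consistent((1, 2), (3, 4), (1, 1))  # mother cannot transmit 1
```

A check of this kind flags only that some genotype in the trio is incompatible; it does not say which individual is mistyped, which is the harder question that methods for whole pedigrees address.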

Key words

Genotype; Genotype error; Parametric linkage analysis; LOD score; Computational efficiency; Automatic genotype elimination; Nuclear-pedigree method; Genotype-elimination method; Critical-genotype method; Odds-ratio method


References

  1. Abecasis GR, Cherny SS, Cardon LR. (2001) The impact of genotyping error on family-based analysis of quantitative traits. Eur J Hum Genet. 9(2):130–134.
  2. Morton NE. (1955) Sequential tests for the detection of linkage. Am J Hum Genet. 7(3):277–318.
  3. Ott J. (1974) Estimation of the recombination fraction in human pedigrees: efficient computation of the likelihood for human linkage studies. Am J Hum Genet. 26(5):588–597.
  4. Lathrop GM, Lalouel JM. (1984) Easy calculations of LOD scores and genetic risks on small computers. Am J Hum Genet. 36(2):460–465.
  5. Lathrop GM, Lalouel JM, Julier C, Ott J. (1984) Strategies for multilocus linkage analysis in humans. Proc Natl Acad Sci USA. 81(11):3443–3446.
  6. Lathrop GM, Lalouel JM, White RL. (1986) Construction of human linkage maps: likelihood calculations for multilocus linkage analysis. Genet Epidemiol. 3(1):39–52.
  7. Elston RC, Stewart J. (1971) A general model for the genetic analysis of pedigree data. Hum Hered. 21(6):523–542.
  8. Ott J. (1999) Analysis of Human Genetic Linkage. Baltimore: Johns Hopkins University Press.
  9. Lange K, Goradia TM. (1987) An algorithm for automatic genotype elimination. Am J Hum Genet. 40(3):250–256.
  10. Lange K, Boehnke M. (1983) Extensions to pedigree analysis. V. Optimal calculation of Mendelian likelihoods. Hum Hered. 33(5):291–301.
  11. Stringham HM, Boehnke M. (1996) Identifying marker typing incompatibilities in linkage analysis. Am J Hum Genet. 59(4):946–950.
  12. Lange K, Weeks D, Boehnke M. (1988) Programs for Pedigree Analysis: MENDEL, FISHER, and dGENE. Genet Epidemiol. 5(6):471–472.
  13. Sobel E, Papp JC, Lange K. (2002) Detection and integration of genotyping errors in statistical genetics. Am J Hum Genet. 70(2):496–508.
  14. O’Connell JR, Weeks DE. (1998) PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet. 63(1):259–266.
  15. Lange K, Weeks DE. (1989) Efficient computation of LOD scores: genotype elimination, genotype redefinition, and hybrid maximum likelihood algorithms. Ann Hum Genet. 53(Pt 1):67–83.
  16. Terwilliger JD, Ott J. (1994) Handbook of Human Genetic Linkage. 1st ed. Baltimore: The Johns Hopkins University Press.
  17. Sobel E, Lange K. (1996) Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics. Am J Hum Genet. 58(6):1323–1337.
  18. Abecasis GR, Cherny SS, Cookson WO, Cardon LR. (2002) Merlin--rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 30(1):97–101.
  19. Broman KW, Weber JL. (1998) Estimation of pairwise relationships in the presence of genotyping errors. Am J Hum Genet. 63(5):1563–1564.
  20. Sobel E, Sengul H, Weeks DE. (2001) Multipoint estimation of identity-by-descent probabilities at arbitrary positions among marker loci on general pedigrees. Hum Hered. 52(3):121–131.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Unit of Statistical Genomics, Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
  2. Department of Psychiatry and Behavioral Sciences, The Johns Hopkins University School of Medicine, Baltimore, USA
