Identification of Genotype Errors

  • Jeffery O’Connell
  • Yin Yao
Part of the Methods in Molecular Biology book series (MIMB, volume 1666)


Most large genotype datasets have been documented to contain some errors, and an error rate of only 1–2% is sufficient to distort map distances and to lead to a false conclusion of linkage (Abecasis et al., Eur J Hum Genet 9:130–134, 2001). One therefore needs to ensure that the data are as clean as possible. Data cleaning, however, is tedious and demands effort and experience. O’Connell and Weeks implemented four error-checking algorithms in a computer program called PedCheck. In this chapter, the four algorithms implemented in PedCheck are discussed, with a focus on the genotype-elimination method. An example of the four levels of error checking permitted by PedCheck is provided, together with the required input files. Alternative algorithms implemented in other statistical computing programs are also briefly discussed.
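As a rough illustration of the simplest kind of check named in the key words below (a nuclear-family check), the following minimal sketch flags children whose genotype cannot be formed from one allele of each parent. It is not PedCheck's implementation: the function names, the representation of a genotype as a pair of allele labels, and the assumption that both parents are fully typed are all hypothetical choices made for this example. The genotype-elimination method that the chapter focuses on is more general, since it also handles untyped individuals and larger pedigrees.

```python
from itertools import product


def child_is_compatible(father, mother, child):
    """Return True if the child's unordered genotype can be formed by
    taking one allele from the father and one from the mother.

    Each genotype is a pair of allele labels, e.g. (1, 2).
    Assumption of this sketch: both parents are fully typed.
    """
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def check_nuclear_family(father, mother, children):
    """Return the IDs of children that are Mendelian-inconsistent
    with the two typed parents at a single marker."""
    return [
        child_id
        for child_id, genotype in children.items()
        if not child_is_compatible(father, mother, genotype)
    ]


# Hypothetical example data: parents 1/2 and 3/4 cannot have a 2/2 child.
flagged = check_nuclear_family(
    father=(1, 2),
    mother=(3, 4),
    children={"child1": (1, 3), "child2": (2, 2)},
)
print(flagged)  # ['child2']
```

A check like this only reports that a family is inconsistent; it does not say whether the error lies in the child's genotype, a parent's genotype, or the pedigree structure itself.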

Key words

Genotype; Genotype error; Parametric linkage analysis; LOD score; Computational efficiency; Automatic genotype elimination; Nuclear pedigree method; Genotype-elimination method; Critical genotype method; Odds ratio method



The views expressed in this chapter do not necessarily represent the views of the NIMH, NIH, HHS, or the US Government.


References

  1. Morton NE (1955) Sequential tests for the detection of linkage. Am J Hum Genet 7(3):277–318
  2. Ott J (1974) Estimation of the recombination fraction in human pedigrees: efficient computation of the likelihood for human linkage studies. Am J Hum Genet 26(5):588–597
  3. Lathrop GM, Lalouel JM (1984) Easy calculations of lod scores and genetic risks on small computers. Am J Hum Genet 36(2):460–465
  4. Lathrop GM, Lalouel JM, Julier C, Ott J (1984) Strategies for multilocus linkage analysis in humans. Proc Natl Acad Sci U S A 81(11):3443–3446
  5. Lathrop GM, Lalouel JM, White RL (1986) Construction of human linkage maps: likelihood calculations for multilocus linkage analysis. Genet Epidemiol 3(1):39–52
  6. Elston RC, Stewart J (1971) A general model for the genetic analysis of pedigree data. Hum Hered 21(6):523–542
  7. Ott J (1999) Analysis of human genetic linkage. Johns Hopkins University Press, Baltimore
  8. Lange K, Goradia TM (1987) An algorithm for automatic genotype elimination. Am J Hum Genet 40(3):250–256
  9. Lange K, Boehnke M (1983) Extensions to pedigree analysis. V. Optimal calculation of Mendelian likelihoods. Hum Hered 33(5):291–301
  10. Stringham HM, Boehnke M (1996) Identifying marker typing incompatibilities in linkage analysis. Am J Hum Genet 59(4):946–950
  11. Lange K, Weeks D, Boehnke M (1988) Programs for pedigree analysis: MENDEL, FISHER, and dGENE. Genet Epidemiol 5(6):471–472
  12. Sobel E, Papp JC, Lange K (2002) Detection and integration of genotyping errors in statistical genetics. Am J Hum Genet 70(2):496–508
  13. O'Connell JR, Weeks DE (1998) PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet 63(1):259–266
  14. Lange K, Weeks DE (1989) Efficient computation of lod scores: genotype elimination, genotype redefinition, and hybrid maximum likelihood algorithms. Ann Hum Genet 53(Pt 1):67–83
  15. Terwilliger JD, Ott J (1994) Handbook of human genetic linkage, 1st edn. The Johns Hopkins University Press, Baltimore
  16. Sobel E, Lange K (1996) Descent graphs in pedigree analysis: applications to haplotyping, location scores, and marker-sharing statistics. Am J Hum Genet 58(6):1323–1337
  17. Abecasis GR, Cherny SS, Cookson WO, Cardon LR (2002) Merlin--rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30(1):97–101
  18. Broman KW, Weber JL (1998) Estimation of pairwise relationships in the presence of genotyping errors. Am J Hum Genet 63(5):1563–1564
  19. Sobel E, Sengul H, Weeks DE (2001) Multipoint estimation of identity-by-descent probabilities at arbitrary positions among marker loci on general pedigrees. Hum Hered 52(3):121–131
  20. Abecasis GR, Cherny SS, Cardon LR (2001) The impact of genotyping error on family-based analysis of quantitative traits. Eur J Hum Genet 9(2):130–134

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. University of Maryland, Baltimore, USA
  2. Unit of Genomic Statistics, Intramural Research Program, National Institute of Mental Health, Bethesda, USA
