Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data

  • Original Investigation
  • Human Genetics

Abstract

Genetic methods can complement epidemiological surveys and clinical registries in determining the prevalence of monogenic autosomal recessive diseases. Several large population-based genetic databases, such as the NHLBI GO Exome Sequencing Project, are now publicly available. Assuming Hardy–Weinberg equilibrium, the frequency of individuals in the general population who are homozygous for a particular pathogenic allele can be calculated directly from a sample of chromosomes, some of which harbor that allele. Further assuming that the penetrance of the pathogenic allele(s) is known, the prevalence of recessive phenotypes can be determined. Such work can inform public health efforts for rare recessive diseases. A Bayesian estimation procedure has yet to be applied to the problem of estimating disease prevalence from large population-based genetic data. Here, a Bayesian framework is developed to derive the posterior probability density of the prevalence of monogenic, autosomal recessive phenotypes, and explicit equations are presented for the credible intervals of these prevalence estimates. A primary impediment to accurate disease prevalence calculations is the identification of truly pathogenic alleles. This issue is discussed, but in many instances it remains a significant barrier to investigations that rely solely on statistical interrogation; functional studies can provide important evidence of variant pathogenicity. We also discuss several other challenges, including population structure in the sample of chromosomes, the treatment of allelic heterogeneity, and reduced penetrance of pathogenic variants. To illustrate the application of these methods, we used recently published genetic data collected on a large sample from the Schmiedeleut Hutterites, estimating prevalence and calculating 95% credible intervals for 13 autosomal recessive diseases. In addition, the Bayesian estimation procedure is applied to data from a central European study of hereditary fructose intolerance. The methods described herein offer a viable path to robustly estimating both the expected prevalence of autosomal recessive phenotypes and the corresponding credible intervals from population-based genetic databases that have recently become available. As these databases increase in number and size with the advent of cost-effective next-generation sequencing, we anticipate that these methods may be helpful in recessive disease prevalence calculations, potentially impacting public health management, health economic analyses, and the treatment of rare diseases.
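
The calculation outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' derivation: the paper's explicit credible-interval equations and choice of prior are not shown here, so the sketch assumes a single pathogenic allele, a Beta(1, 1) prior on the allele frequency q, binomial sampling of chromosomes, and Hardy–Weinberg equilibrium, so that prevalence is penetrance × q². The function name `prevalence_posterior`, its parameters, and the Monte Carlo interval are all hypothetical choices made for illustration.

```python
import random


def prevalence_posterior(k, n, penetrance=1.0, prior_a=1.0, prior_b=1.0,
                         draws=20_000, level=0.95, seed=0):
    """Summarise the posterior of prevalence = penetrance * q**2.

    k: pathogenic alleles observed, n: chromosomes sampled.
    Assumes (illustratively) a Beta(prior_a, prior_b) prior on the allele
    frequency q and Hardy-Weinberg equilibrium for the homozygote frequency.
    Returns (posterior mean, lower bound, upper bound).
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    # Beta prior + binomial likelihood gives a Beta(a, b) posterior for q.
    a = prior_a + k
    b = prior_b + (n - k)
    # Closed-form second moment of a Beta(a, b) variable: E[q^2].
    mean = penetrance * a * (a + 1) / ((a + b) * (a + b + 1))
    # Equal-tailed credible interval estimated by sampling q and squaring.
    rng = random.Random(seed)
    samples = sorted(penetrance * rng.betavariate(a, b) ** 2
                     for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[min(draws - 1, int((1.0 - tail) * draws))]
    return mean, lower, upper


if __name__ == "__main__":
    # Made-up counts: 10 pathogenic alleles among 2000 sampled chromosomes.
    mean, lower, upper = prevalence_posterior(k=10, n=2000)
    print(f"posterior mean prevalence: {mean:.3e}")
    print(f"95% credible interval: ({lower:.3e}, {upper:.3e})")
```

The counts in the usage example are invented. Because q² is a monotone function of q, the interval bounds could equally be obtained by squaring exact Beta quantiles; sampling is used here only to keep the sketch within the standard library. Allelic heterogeneity, population structure, and uncertainty in penetrance, all of which the abstract flags as challenges, are not modelled.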

Author information

Corresponding author

Correspondence to Steven J. Schrodi.

Cite this article

Schrodi, S.J., DeBarber, A., He, M. et al. Prevalence estimation for monogenic autosomal recessive diseases using population-based genetic data. Hum Genet 134, 659–669 (2015). https://doi.org/10.1007/s00439-015-1551-8

  • DOI: https://doi.org/10.1007/s00439-015-1551-8
