Determining the incidence of rare diseases


Extremely rare diseases are increasingly recognized due to widespread, inexpensive genomic sequencing. Understanding the incidence of a rare disease is important for appreciating its health impact and allocating resources for research. However, estimating the incidence of a rare disease is challenging because the individual contributory alleles are themselves extremely rare. We propose a new method to determine the incidence of rare, severe, recessive disease in non-consanguineous populations. The method uses known allele frequencies, estimates the combined allele frequency of observed alleles, and estimates the number of causative alleles that are thus far unobserved in a disease cohort. Experiments on simulated and real data show that this approach is feasible for estimating the incidence of rare disease in European populations. However, because of several limitations in our ability to assess the full spectrum of pathogenic mutations, it serves as a useful tool for providing a lower bound on disease incidence.
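The core arithmetic behind an incidence estimate of this kind can be illustrated with the Hardy-Weinberg relation: if the pathogenic alleles of a gene have a combined frequency q, the expected fraction of births carrying two pathogenic alleles is q². The following is a minimal sketch under stated assumptions (random mating, full penetrance, every pathogenic allele on a distinct haplotype); the function name and the example frequencies are hypothetical and are not taken from the article or its source code.

```python
from typing import Iterable


def recessive_incidence(allele_freqs: Iterable[float]) -> float:
    """Expected incidence of a recessive disease under Hardy-Weinberg.

    Assumes random mating, full penetrance, and that each pathogenic
    allele sits on its own haplotype, so frequencies can be summed.
    """
    q = sum(allele_freqs)  # combined pathogenic allele frequency
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie in [0, 1]")
    # Homozygotes and compound heterozygotes together occur at q squared.
    return q * q


if __name__ == "__main__":
    # Hypothetical population frequencies for three observed alleles.
    freqs = [1e-4, 5e-5, 2.5e-5]
    incidence = recessive_incidence(freqs)
    print(f"about 1 in {round(1 / incidence):,} births")
```

Because alleles that have not yet been observed are missing from the sum, q is underestimated, which is why a figure computed this way is better read as a lower bound.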



The author would like to thank Drs. Mario Cleves, Charlotte Hobbs, Michelle Clark, Svasti Haricharan, David Dimmock and Sara Raskin for commenting on the manuscript. The author would also like to thank the TESS foundation, a 501(c)(3) nonprofit corporation (https://tessresearch.org/), for providing cohort data. This work was funded in part by gifts from the Liguori Family, John Motter and Effie Simanikas, Ernest and Evelyn Rady, and Rady Children’s Hospital San Diego.

Author information

Correspondence to Matthew N. Bainbridge.

Ethics declarations

Conflict of interest

MNB is a member of the TESS scientific advisory board.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Supplementary material

Source code for the simulation program, the precomputed maximum likelihood lambda estimator and an example MAF file are available from https://github.com/mnb922/RareDiseaseEstimator.
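The lambda estimator itself is not described on this page. One plausible reading, assumed here and not taken from the repository, is a zero-truncated Poisson model: the number of times each causative allele appears in a cohort is Poisson with rate λ, but alleles seen zero times are invisible. The sketch below fits λ by maximum likelihood from the counts of observed alleles and then estimates how many alleles remain unobserved. All names are hypothetical, and the model treats every allele as equally frequent, which is a strong simplification.

```python
import math
from typing import Sequence


def truncated_poisson_lambda(counts: Sequence[int], tol: float = 1e-10) -> float:
    """Maximum likelihood lambda for a zero-truncated Poisson sample.

    The MLE solves lambda / (1 - exp(-lambda)) = mean(counts).
    The left side is increasing in lambda, so bisection is sufficient.
    """
    if not counts or any(c < 1 for c in counts):
        raise ValueError("counts must be non-empty and all at least 1")
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        return 0.0  # every allele seen exactly once: the estimate degenerates to 0
    lo, hi = 1e-12, mean  # the root always lies strictly below the sample mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def expected_unobserved(counts: Sequence[int]) -> float:
    """Estimated number of alleles with zero observations in the cohort."""
    lam = truncated_poisson_lambda(counts)
    if lam == 0.0:
        return math.inf  # no repeat sightings: the data give no upper limit
    p0 = math.exp(-lam)  # probability that an allele is never observed
    return len(counts) * p0 / (1 - p0)


if __name__ == "__main__":
    # Hypothetical cohort: how often each distinct allele was observed.
    cohort_counts = [1, 1, 1, 2, 2, 3, 5]
    print(truncated_poisson_lambda(cohort_counts))
    print(expected_unobserved(cohort_counts))
```

A cohort in which no allele recurs gives no information about how many alleles are still missing, which the sketch reports as an unbounded estimate rather than a number.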

About this article

Cite this article

Bainbridge, M.N. Determining the incidence of rare diseases. Hum Genet (2020). https://doi.org/10.1007/s00439-020-02135-5


  • Rare disease
  • Genetics
  • Incidence
  • Simulation