Human Genetics, Volume 124, Issue 6, pp 579–591

Lactose digestion and the evolutionary genetics of lactase persistence


  • Catherine J. E. Ingram
    • Department of Genetics, Evolution and Environment, University College London
  • Charlotte A. Mulcare
    • Department of Genetics, Evolution and Environment, University College London
  • Yuval Itan
    • Department of Genetics, Evolution and Environment, University College London
    • Centre for Mathematics and Physics in the Life Sciences and Experimental Biology (CoMPLEX), University College London
  • Mark G. Thomas
    • Department of Genetics, Evolution and Environment, University College London
Review Article

DOI: 10.1007/s00439-008-0593-6

Cite this article as:
Ingram, C.J.E., Mulcare, C.A., Itan, Y. et al. Hum Genet (2009) 124: 579. doi:10.1007/s00439-008-0593-6


It has been known for some 40 years that lactase production persists into adult life in some people but not in others. However, the mechanism and evolutionary significance of this variation have proved more elusive, and continue to excite the interest of investigators from different disciplines. This genetically determined trait differs in frequency worldwide and is due to cis-acting polymorphism of regulation of lactase gene expression. A single nucleotide polymorphism located 13.9 kb upstream from the lactase gene (C-13910 > T) was proposed to be the cause, and the −13910*T allele, which is widespread in Europe, was found to be located on a very extended haplotype of 500 kb or more. The long region of haplotype conservation reflects a recent origin, and this, together with high frequencies, is evidence of positive selection, but also means that −13910*T might be an associated marker, rather than being causal of lactase persistence itself. Doubt about function was increased when it was shown that the original SNP did not account for lactase persistence in most African populations. However, the recent discovery that there are several other SNPs associated with lactase persistence in close proximity (within 100 bp), and that they all reside in a piece of sequence that has enhancer function in vitro, does suggest that they may each be functional, and their occurrence on different haplotype backgrounds shows that several independent mutations led to lactase persistence. Here we provide access to a database of worldwide distributions of lactase persistence and of the C-13910*T allele, as well as reviewing lactase molecular and population genetics and the role of selection in determining present day distributions of the lactase persistence phenotype.

Supplementary material

439_2008_593_MOESM1_ESM.pdf (809 kb)
Supplementary figure 1a Interpolated maps of the ‘old world’ showing the distribution of (a) lactase persistence data taken from the literature (Supplementary data Table 1), (b) -13910*T distribution, and (c) lactase persistence frequency predicted from the -13910*T distribution, using the data collection to be found in Supplementary data Table 3. Maps were generated using PYNGL. The maps include only individuals over 12 years of age who are unrelated, and only literature for which the original publications have been located and checked. Articles in which there was clear selection bias, and recent immigrant populations, are excluded, but the data can be found in Supplementary data Table 1. The Americas are excluded from all maps because of the paucity of data. Most data are obtained from lactose tolerance tests using either breath hydrogen or blood glucose, though in some cases enzyme assay data were available. Locations were either as described precisely in the publication, or taken from capital cities or central points of a country or region where the precise location is not mentioned. Where more than one data set was available, weighted averages of the data were taken. Predicted frequency is taken to be p² + 2pq, where p is the frequency of -13910*T and q = 1 − p. Data points are shown as dots. It should be noted that the interpolation is inaccurate where there are few data points (PDF 809 kb)
439_2008_593_MOESM2_ESM.pdf (752 kb)
Supplementary figure 1b See supplementary figure 1a for legend (PDF 752 kb)
439_2008_593_MOESM3_ESM.pdf (766 kb)
Supplementary figure 1c See supplementary figure 1a for legend (PDF 766 kb)
439_2008_593_MOESM4_ESM.pdf (40 kb)
Primary source literature references of lactase persistence and lactose tolerance data used for maps depicting geographic distribution. Columns show numbers of people tested, country of origin, ethnic group, test method, and whether or not the data fulfilled all the criteria for inclusion (original reference found and checked; unrelated individuals; age 12 or more; unbiased selection criteria - e.g. not selected from patients with diarrhoea). Reasons for non-inclusion are shown in the notes. In those cases where children or family members were individually identifiable, they were excluded from the data sets and this is reflected in the numbers given. Recent immigrant populations are excluded from the maps shown in the review article. Locations (longitude and latitude) were either as described precisely in the original publication, or taken from capital cities or central points of a country where the precise location is not mentioned. Data included only in review articles were not used. Reviews searched for source references include: Flatz (1987); Scrimshaw and Murray (1988); Swallow and Hollox (1999) (PDF 39.5 kb)
439_2008_593_MOESM5_ESM.pdf (29 kb)
Estimates of lactase persistence frequency in different countries obtained by adjusting for population census size (taken from CIA data; World Fact Book) (PDF 29.0 kb)
439_2008_593_MOESM6_ESM.pdf (22 kb)
Literature and own frequency data for -13910*T. Data taken from SNP typing tests as well as from resequencing. Predicted lactase persistence frequency attributable to this allele taken to be p² + 2pq (PDF 21.7 kb)
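
The two calculations mentioned in the supplementary legends can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and two assumptions are made that the legends do not state: that p² + 2pq is read as the frequency of carriers of at least one -13910*T allele under Hardy-Weinberg proportions with a dominant trait, and that the "weighted averages" are weighted by the number of people tested.

```python
def predicted_persistence(p: float) -> float:
    """Predicted phenotype frequency p^2 + 2pq for allele frequency p.

    Assumes Hardy-Weinberg proportions and a dominant trait, so the
    result is the frequency of carriers of at least one copy of the allele.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p  # frequency of the alternative allele
    return p * p + 2.0 * p * q


def weighted_mean_frequency(samples):
    """Combine several data sets for one location.

    `samples` is a list of (frequency, n_tested) pairs. Weighting by the
    number of people tested is an assumption made for this sketch.
    """
    total_n = sum(n for _, n in samples)
    if total_n <= 0:
        raise ValueError("need at least one tested individual")
    return sum(f * n for f, n in samples) / total_n


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the functions.
    print(predicted_persistence(0.5))
    print(weighted_mean_frequency([(0.2, 100), (0.8, 300)]))
```

Because p² + 2pq equals 1 − q², the predicted phenotype frequency rises quickly with allele frequency: an allele at 50% already predicts 75% persistence under these assumptions.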

Copyright information

© Springer-Verlag 2008