Lactose digestion and the evolutionary genetics of lactase persistence
- Cite this article as:
- Ingram, C.J.E., Mulcare, C.A., Itan, Y. et al. Hum Genet (2009) 124: 579. doi:10.1007/s00439-008-0593-6
It has been known for some 40 years that lactase production persists into adult life in some people but not in others. However, the mechanism and evolutionary significance of this variation have proved more elusive, and continue to excite the interest of investigators from different disciplines. This genetically determined trait differs in frequency worldwide and is due to cis-acting polymorphism of regulation of lactase gene expression. A single nucleotide polymorphism located 13.9 kb upstream from the lactase gene (C-13910 > T) was proposed to be the cause, and the −13910*T allele, which is widespread in Europe, was found to be located on a very extended haplotype of 500 kb or more. The long region of haplotype conservation reflects a recent origin, and this, together with high frequencies, is evidence of positive selection, but it also means that −13910*T might be an associated marker rather than being causal of lactase persistence itself. Doubt about function was increased when it was shown that the original SNP did not account for lactase persistence in most African populations. However, the recent discovery that there are several other SNPs associated with lactase persistence in close proximity (within 100 bp), and that they all reside in a piece of sequence that has enhancer function in vitro, does suggest that they may each be functional, and their occurrence on different haplotype backgrounds shows that several independent mutations led to lactase persistence. Here we provide access to a database of worldwide distributions of lactase persistence and of the C-13910*T allele, as well as reviewing lactase molecular and population genetics and the role of selection in determining present day distributions of the lactase persistence phenotype.
Lactase, the small intestinal enzyme responsible for cleaving lactose into its constituent absorbable monosaccharides, glucose and galactose, is essential for the nourishment of newborn mammals, whose sole source of nutrition is milk, in which lactose is the major carbohydrate component. In adult mammals other than humans lactase production decreases significantly in quantity following weaning (Buller et al. 1990; Lacey et al. 1994; Sebastio et al. 1989). Although individual differences in the ability of human adults to digest milk had been remarked upon in Roman times, variation in expression of lactase was not established as a genetically determined trait until the second half of the twentieth century. Indeed before this, expression of high levels of lactase in adulthood was considered by people of European descent to be the ‘normal’ state of affairs, and widespread deficiency of lactase in adults was only appreciated in the early 1960s (Auricchio et al. 1963; Dahlqvist et al. 1963).
Here, we review all aspects of this polymorphism from description of phenotype to molecular and evolutionary genetics. Since we had noted that the population distribution data available in many literature reviews contained anomalous information (as will be discussed below) we also provide access to a newly constructed database of phenotypic data taken from source publications.
Determination of lactase persistence status
People whose lactase persists at high levels throughout adult life are said to be lactase persistent while those with little lactase as adults are described as lactase non-persistent (also referred to in the literature as primary adult hypolactasia). Since taking intestinal biopsies from healthy people is invasive and not acceptable unless the person is having other investigations, lactase persistence status is often inferred by a method depending on lactose digestion. This allows people to be classified as lactose digesters and maldigesters. This difference in digestion is measured by a test traditionally known as a ‘lactose tolerance test’ and thus the terms tolerant and intolerant are sometimes used, though this can be confused with dietary intolerance.
The lactose tolerance test usually involves giving a lactose load after an overnight fast and then measuring blood glucose or breath hydrogen. A baseline measurement of blood glucose or breath hydrogen is taken before ingestion of the lactose, and then at various time intervals thereafter. An increase in blood glucose indicates lactose digestion (glucose produced from the lactose hydrolysis is absorbed into the bloodstream), and no increase, or a ‘flat line’, is indicative of a lactose maldigester (probable lactase non-persistent) phenotype. An increase in breath hydrogen indicates maldigestion and reflects colonic fermentation of the lactose, as described in the following section. In both cases somewhat arbitrary cut-off points have to be set for distinguishing the two phenotypes, and both methods inform upon the person’s ability to digest lactose rather than the given individual’s lactase expression. It must therefore be borne in mind that there will be an underlying error rate, leading to both false negatives and false positives. The relative efficiency of the tests has been examined in more than one study, and the breath hydrogen method was found to be the most accurate (Mulcare et al. 2004; Newcomer et al. 1975; Peuhkuri 2000). It is also convenient and cheap. Lactase levels can, however, be secondarily reduced by gastrointestinal disease, leading to secondary lactose intolerance, and some people fail to produce hydrogen. In the clinical setting there are ways of improving the quality of the test. These include retesting, giving a dose of a non-digestible carbohydrate, lactulose, to test for the presence of hydrogen-producing bacteria (see section below), and investigating other causes of the lactose intolerance, which might include examination of biopsy material.
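The decision logic of the two test read-outs described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the cut-off is deliberately a required argument because the text gives no numerical thresholds. The values in the usage lines are placeholders, not recommendations from this review.

```python
from typing import Sequence


def classify_breath_hydrogen(baseline_ppm: float,
                             later_ppm: Sequence[float],
                             rise_cutoff_ppm: float) -> str:
    """Classify from breath hydrogen: a rise over baseline indicates maldigestion."""
    if not later_ppm:
        raise ValueError("at least one post-load measurement is required")
    peak_rise = max(later_ppm) - baseline_ppm
    return "maldigester" if peak_rise >= rise_cutoff_ppm else "digester"


def classify_blood_glucose(baseline_mmol_l: float,
                           later_mmol_l: Sequence[float],
                           rise_cutoff_mmol_l: float) -> str:
    """Classify from blood glucose: a rise indicates digestion, a flat line maldigestion."""
    if not later_mmol_l:
        raise ValueError("at least one post-load measurement is required")
    peak_rise = max(later_mmol_l) - baseline_mmol_l
    return "digester" if peak_rise >= rise_cutoff_mmol_l else "maldigester"


if __name__ == "__main__":
    # Hypothetical measurements and cut-offs, chosen only to exercise the logic.
    print(classify_breath_hydrogen(5.0, [8.0, 30.0, 42.0], rise_cutoff_ppm=20.0))
    print(classify_blood_glucose(4.8, [5.0, 5.2], rise_cutoff_mmol_l=1.1))
```

Note that the two read-outs point in opposite directions (a hydrogen rise means maldigestion, a glucose rise means digestion), and that neither function can detect the error sources mentioned above, such as people who do not produce hydrogen or lactase levels secondarily reduced by disease.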
Symptoms of lactose intolerance
Undigested lactose passing through the small intestine into the colon has two physiological effects. First, an osmotic gradient is set up across the gut wall, which results in an influx of water, causing symptoms of diarrhoea. Second, the lactose can be fermented by colonic bacteria, to produce fatty acids and gaseous by-products (including hydrogen, used in the tolerance test), potentially causing discomfort, bloating and flatulence. However most lactase non-persistent individuals can tolerate small amounts of lactose (as in tea or coffee), and some can consume a lot without ill effects (Scrimshaw and Murray 1988; Suarez et al. 1997). Variation in the composition of the gut flora between individuals (Hertzler et al. 1997; Hertzler and Savaiano 1996), as well as a psychosomatic component (Briet et al. 1997; Peuhkuri et al. 2000; Saltzman et al. 1999) may account for some of the interindividual variation in symptoms.
Worldwide distribution of lactase persistence
The noted correlation of the lactase persistence phenotype with the cultural practice of milking generated the hypothesis that this trait has been subject to strong positive selection (Aoki 1986; Holden and Mace 1997; McCracken 1971; Simoons 1970, 1978).
Identifying the causes of lactase persistence
By the early 1970s it was established that the lactase persistence polymorphism in humans has a genetic cause, and is inherited in an autosomal dominant manner (Ferguson and Maxwell 1967; Metneki et al. 1984; Sahi 1974). Further evidence that lactase persistence is a genetic trait, and more specifically that it is caused by a cis-acting element, was produced in the early 1980s. Ho et al. reported a trimodal distribution of sucrase:lactase ratios in intestinal samples from British adults of northern European ancestry. The trimodal distribution was interpreted as attributable to groups of individuals homozygous for lactase persistence (highest lactase activity), heterozygotes with mid-level activity and non-persistent homozygotes with low lactase activity (Ho et al. 1982), and similar results were subsequently obtained in individuals of German ancestry (Flatz 1984). The intermediate lactase activity observed in the heterozygotes indicated that only one copy of the lactase gene was being fully expressed. Evidence for transcriptional regulation (Escher et al. 1992) and confirmatory evidence for the cis-acting nature of this (Wang et al. 1995) was obtained from mRNA studies.
The −13910*T allele was found to associate completely with lactase persistence, ascertained directly by enzyme activity in 196 Finnish individuals, and subsequent studies have confirmed a tight but not absolute association between −13910*T and lactase persistence as judged by lactose tolerance testing in populations of northern European ancestry (Bernardes-Silva et al. 2007; Hogenauer et al. 2005; Kerber et al. 2007; Poulter et al. 2003) and there was also a correlation, but not absolute, between genotypes and enzymatic activity (Poulter et al. 2003). However the A haplotype extends far beyond the 50 kb LCT gene region, with carriers of the −13910*T allele having almost identical chromosomes extending for nearly 1 Mb (Bersaglieri et al. 2004; Poulter et al. 2003).
Evidence for function of −13910*T
In vitro studies provided evidence that the −13910*T allele increases transcription in promoter–reporter construct assays in cell lines (Lewinsky et al. 2005; Olds and Sibley 2003; Troelsen et al. 2003), suggesting that it may have enhancer activity in vivo. A transcription factor, Oct-1, was identified which bound more strongly to the −13910*T containing motif than to the alternative C allele, providing a possible mechanism for up-regulation of LCT (Lewinsky et al. 2005), and suggesting that the cause of lactase persistence had been identified (Rasinpera et al. 2004), although many questions remain unanswered.
Population distribution of −13910*T: −13910*T does not account for lactase persistence worldwide and is rare in sub-Saharan African populations
Using carefully checked primary source literature data (Supplementary Table 1) we failed to obtain the tight correlation of −13910*T with published worldwide lactase persistence phenotype frequency reported elsewhere (Enattah et al. 2007), but it is clear that in Europe the frequency distribution of −13910*T is in broad agreement with that expected from distribution of the phenotype (Fig. 1). Figure 1a shows an interpolated contour map depicting the distribution of lactase persistence, prepared from phenotypic data taken from all the available literature, in which we were confident of the phenotypic testing, and from which children, family members, patients selected for likely intolerance, and twentieth/twenty-first century immigrant status were excluded. Figure 1b shows the distribution of −13910*T and details of the worldwide −13910*T data can be found in the supplementary information (Supplementary Table 3). Figure 1c shows predicted lactose tolerance distribution taken from −13910*T frequencies, assuming that −13910*T is the sole cause of lactase persistence and is dominant (p² + 2pq).
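The prediction used for Fig. 1c can be written out explicitly. With p the frequency of −13910*T and q = 1 − p, the expected frequency of the dominant phenotype under Hardy-Weinberg proportions is p² + 2pq, which is the same as 1 − q². The short Python sketch below is illustrative only: the function name is hypothetical and the allele frequencies in the loop are made-up inputs, not values from the supplementary tables.

```python
def predicted_persistence(p: float) -> float:
    """Expected lactase persistence frequency for allele frequency p.

    Assumes Hardy-Weinberg proportions, that -13910*T is dominant, and that it
    is the sole cause of lactase persistence (the assumptions stated for Fig. 1c).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    # Homozygous carriers (p^2) plus heterozygous carriers (2pq).
    return p * p + 2.0 * p * q


if __name__ == "__main__":
    # Illustrative allele frequencies, not data from the paper.
    for p in (0.05, 0.25, 0.50, 0.75):
        print(f"p = {p:.2f} -> predicted persistence = {predicted_persistence(p):.4f}")
```

Because heterozygotes are counted as persistent, the predicted phenotype frequency rises much faster than the allele frequency: an allele frequency of 0.5 already predicts 75% persistence.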
In contrast to the high frequency in Europe, −13910*T is rare in sub-Saharan African populations (Fig. 1b) even in those populations where lactase persistence frequency is reported to be high (Mulcare et al. 2004), and it is also rare in the Bedouins of the Arabian peninsula, who are also frequently lactose digesters (Ingram et al. 2007). The allele was also absent from all but one of a series of phenotyped individuals of Sudanese ancestry (Ingram et al. 2007). An obvious interpretation was that −13910*T is not truly causal of lactase persistence, but is a very strongly associated marker of the causal element, which appeared on the lactase persistence carrying (A haplotype) chromosome after humans had spread out of Africa. However, there was also no association with the A haplotype in this African group, and subsequent research indicated genetic heterogeneity.
New variants in intron 13 of MCM6, and multiple causes of lactase persistence in Africa
Table. Details of SNPs known to be associated with lactase persistence as of July 2008
Columns: Position of SNP (in bp upstream of LCT); Substitution (ancestral allele first, from comparison with chimp); Evidence of association with lactase persistence; Evidence of function; Haplotype (Hollox et al. 2000 nomenclature); Geographic location of highest observed frequency
Surviving entries, in order: G > C; Not included in dbSNP; Tishkoff et al. (2007); Tishkoff et al. (2007); T > G; C > T; Enattah et al. (2002); C > G
Evidence of function for the alleles identified in Africa
It is important to critically evaluate the evidence for function of these recently described alleles. Footprint analysis, to determine DNA–protein binding sites, of sequence encompassing the intron 13 region revealed transcription factor recognition sequences for Cdx-2, GATA, HNF3α/Fox and HNF4α along with Oct-1 (Lewinsky et al. 2005). Two of the newly identified SNPs are located within the Oct-1 binding site (Fig. 4). Electrophoretic mobility shift assays (EMSAs) used to ascertain the effect of the new alleles on Oct-1 binding showed that only oligonucleotide probes containing the original allele, −13910*T, bound strongly to Oct-1, that −13907*G bound to a much lesser extent (Enattah et al. 2008; Ingram et al. 2007), and that binding of the other alleles was less still or undetectable. It can therefore be concluded that a simple change in binding of the protein Oct-1 to this site is unlikely to play a critical role in causing lactase persistence. The identification of the other associated allele, −14010*C (Tishkoff et al. 2007), situated 100 bp away from the predicted Oct-1 binding site, would appear to confirm this.
In vitro promoter/reporter analysis of the newly identified MCM6 intron 13 variant alleles, however, lends some support to the idea that they do affect enhancer activity. Transcriptional activity of the LCT core promoter was enhanced up to tenfold by addition of sequences from MCM6 intron 13 (Lewinsky et al. 2005; Olds and Sibley 2003; Tishkoff et al. 2007) which include the ancestral variant. This activity increased further (by up to 25% more) when one of the variant alleles (−14010*C, −13907*G or −13915*G) was present (Tishkoff et al. 2007). This effect is in fact small, and the authors did not include −13910*T as a positive control (previously shown to enhance transcription activity a further 80% compared to the ancestral allele; Troelsen et al. 2003). Although a recent paper of Enattah et al. (2008) does confirm an effect for −13915*G, the results are hard to evaluate because additional sequences are included in the construct, and the control −13910*T shows very little effect in this study. However, in the Enattah et al. (2008) paper the Caco-2 cells were not differentiated, as they had been in some of the previous studies (Troelsen et al. 2003). This also flags the problem of the appropriateness of the cell model. Caco-2 is a colon cell line, the only line known to express lactase, and has features more comparable with fetal small intestine (Hauri et al. 1985).
The predictive value of these in vitro functional studies with respect to the effect exerted in vivo by particular alleles is therefore uncertain, but the observations, together with those made previously (Lewinsky et al. 2005; Olds and Sibley 2003; Troelsen et al. 2003), do suggest, though do not confirm, that this region is important in regulation of LCT expression. But how it allows low expression in fetuses, high expression in babies and then down-regulation in some but not other people is currently hard to envisage. Studies in mice flag the complexities of interpretation of in vitro studies, and indeed in vivo studies highlight the subtleties of tissue and developmental control (Bosse et al. 2006a, b, 2007; van Wering et al. 2004). Unfortunately there are severe restrictions to animal models in elucidating this uniquely human polymorphism.
The role of other factors influencing lactase expression
The immediate promoter of LCT is moderately well characterised in rat, pig and human (Fang et al. 2000, 2001; Krasinski et al. 2001; Lee et al. 2002; Mitchelmore et al. 2000; Spodsberg et al. 1999; Troelsen et al. 1994, 1997; van Wering et al. 2004; Wang et al. 2006), and there are several allelic variants within the first kilobase of human sequence (Harvey et al. 1995; Hollox et al. 1999; Lloyd et al. 1992). Although none of them is causal of persistence, it is just possible that variations in these SNPs affect expression under certain circumstances or at certain developmental stages: one study shows that the allele -958*T (characteristic of the B haplotype) reduces binding to an uncharacterised transcription factor (Hollox et al. 1999). Whilst it has been well established that regulation of LCT is predominantly under genetically determined transcriptional control there is evidence that other factors influence inter-individual differences in expression of the enzyme. Heterogeneity of the lactase non-persistence phenotype was reported by a number of research groups in their early studies. Some investigators observed individuals who show slower/abnormal processing of their lactase protein (Sterchi et al. 1990; Witte et al. 1990) which may imply variation in post-translational controls such as proteolytic cleavage, glycosylation and/or transport to the cell surface, which are involved in the normal processing of lactase (Jacob et al. 1994, 1995, 1996, 2002; Naim and Lentze 1992). Others have made observations suggestive of epigenetic regulation (Maiuri et al. 1991, 1994). Although most non-persistent individuals show no staining for lactase in the jejunal biopsies of the small intestine (concordant with low lactase activity and transcriptional regulation of LCT), some individuals show patchy expression of the enzyme in the intestinal epithelia (Maiuri et al. 1991, 1994). 
This mosaic expression pattern might be attributable to somatic cell changes in methylation, or histone acetylation but curiously this is not attributable to an ‘inherited’ change in expression pattern from a single stem cell, since in that case ‘ribbons’ of positively stained cells would be expected.
The original observations in the 1970s and 1980s of a positive correlation between lactase persistence frequencies and milk drinking led to the widely held notion that lactase persistence has been subject to positive selection. In the intervening years molecular evidence has accumulated which would appear to corroborate this hypothesis. Our group first reported on the unusual pattern of lactase gene haplotype diversity across populations (Hollox et al. 2001). We found only four common 50 kb haplotypes outside Africa, with many more within Africa, and a very high frequency of the A haplotype in northern Europe, and suggested that the very different haplotype frequencies observed in N. Europeans as compared to other populations are most probably explained by a combination of genetic drift and strong positive selection for lactase persistence (Hollox et al. 2001).
In our own study (Mulcare 2006) we used a marker for A haplotype chromosomes so that we could compare A haplotype chromosomes which carry the −13910*T with A haplotype chromosomes which do not, thus reducing the effect of pooling haplotypes of totally different lineages. Interestingly, we can see from this that the microsatellite haplotype that carries −13910*T is also the most frequent of the ancestral A haplotype chromosomes in Europeans, and also in non-Europeans. It can also be seen that within the non-A lineages there is a fairly frequent microsatellite haplotype which occurs in Europeans as well as non-Europeans (Fig. 5). It is associated with the B core haplotype in Europeans, and non-persistence. These observations suggest demographic factors additional to selection for one particular allele, as proposed previously (Hollox et al. 2001). Indeed, in the case of European lactase persistence, recent demic computer simulations indicate that the spread of farming from the near east during the Neolithic transition may have contributed to the high frequencies and genetic homogeneity of lactase persistence on the continent (Y. Itan, M. Thomas et al. manuscript in preparation).
Historical origins of lactase persistence; dating of the lactase persistence associated alleles
Each of the microsatellite diversity studies used the microsatellites to attempt to date the expansion of the −13910*T allele; the date ranges were 7,450–12,300 (Coelho et al. 2005) and 7,400–10,200 years ago (Mulcare 2006), which agree with date estimates obtained from extended haplotypes of 2,188–20,650 years ago (Bersaglieri et al. 2004). These dates are consistent with models of selection for lactase persistence along with the recent practice of dairying, approximately 9,000 years ago in Europe. Ancient DNA data obtained from human bones have shown that the −13910*T allele was either absent, or present at low frequencies, in early Neolithic Europeans. This is consistent with the −13910*T allele age estimates and supports a model whereby the cultural trait of dairying was adopted prior to lactase persistence becoming frequent (Burger et al. 2007).
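The agreement between the three published date ranges can be checked with simple interval arithmetic. The Python sketch below intersects the ranges quoted above and tests whether the approximate onset of dairying in Europe falls inside the overlap. The function and variable names are hypothetical, and treating each published range as a plain closed interval is a simplifying assumption made here, not a method used by the cited studies.

```python
from typing import Dict, Optional, Sequence, Tuple

Interval = Tuple[int, int]  # (youngest, oldest) age in years before present

# Date ranges for the -13910*T allele as quoted in the text.
AGE_ESTIMATES: Dict[str, Interval] = {
    "Coelho et al. 2005 (microsatellites)": (7450, 12300),
    "Mulcare 2006 (microsatellites)": (7400, 10200),
    "Bersaglieri et al. 2004 (extended haplotypes)": (2188, 20650),
}

DAIRYING_ONSET_EUROPE = 9000  # approximate figure given in the text


def common_interval(ranges: Sequence[Interval]) -> Optional[Interval]:
    """Return the interval shared by all ranges, or None if they do not overlap."""
    if not ranges:
        raise ValueError("at least one range is required")
    lower = max(low for low, _ in ranges)   # largest of the lower bounds
    upper = min(high for _, high in ranges)  # smallest of the upper bounds
    return (lower, upper) if lower <= upper else None


if __name__ == "__main__":
    overlap = common_interval(list(AGE_ESTIMATES.values()))
    print("Shared interval:", overlap)
    if overlap is not None:
        inside = overlap[0] <= DAIRYING_ONSET_EUROPE <= overlap[1]
        print("Dairying onset inside shared interval:", inside)
```

The three ranges share the window from 7,450 to 10,200 years ago, which contains the approximate date of 9,000 years for European dairying.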
The newly discovered −14010*C allele is also reported to occur as part of an unusually extended haplotype, suggesting that Africans too carry these signatures of recent positive selection for lactase persistence. In this case the allele is estimated to be between 1,200 and 23,200 years old (Tishkoff et al. 2007).
The identification of the newly associated alleles themselves suggests that lactase persistence has arisen and been selected for independently in several different human populations, thus the ability to digest milk has been extremely advantageous, at least for some, in the last few thousand years.
What were the evolutionary forces?
Because of the worldwide distribution of lactase persistence and the generally coinciding pattern of historically milk-drinking populations, Simoons and McCracken independently suggested, more than 30 years ago, that milk dependence created strong selection for lactase persistence (McCracken 1971; Simoons 1970). This has become known as the ‘culture historical hypothesis’, and suggests that the rise in lactase persistence co-evolved alongside the cultural adaptation of milk drinking, and its associated nutritional benefits. Nevertheless, the correlation is not absolute and there are exceptions in both directions. For example, there are some ethnic groups who rely heavily on milk products and for whom cows or camels play a very important role in their lifestyle, but who have a low reported frequency of lactase persistence, such as the Dinka and Nuer in Sudan (Bayoumi et al. 1982) and the Somali in Ethiopia (Ingram 2008). Statistical modelling shows that an incomplete correlation can be accommodated if some lactase persistent populations have recently stopped milking or conversely have only recently adopted the habit, therefore allowing insufficient time for lactase persistence to be driven to high frequency (Aoki 1986). Population migration may also have played an important role. In addition, the cultural practice of milk fermentation (e.g. to yoghurt or cheese) reduces lactose content, allowing non-persistent individuals to benefit from milk products.
Holden and Mace, using regression analyses and correcting for relatedness of different populations, claimed that lactose digestion capacity had most likely evolved as an adaptation to dairying, and concluded that high frequency lactose digestion capacity had never ‘evolved’ without the prior presence of milking (Holden and Mace 1997). Other evidence suggested to be in support of the culture-historical hypothesis has been provided by the observation that high intra-allelic diversity of cattle milk protein genes in Europe coincides with the geographic incidence of lactase persistence, which is consistent with large herd sizes kept for dairying and selection for high milk yields (Beja-Pereira et al. 2003).
However, it is noteworthy that, at least in the Somali, one of us (CI) has obtained data to suggest that significant quantities of fresh milk are consumed by many who are lactase non-persistent (Ingram 2008), apparently without any adverse effects, and it seems likely that adaptation of the colonic bacterial flora allows digestion of lactose by these people. This means that under normal circumstances lactase persistence is unlikely to be under very strong selection in this population, and fits with the hypothesis that dairying and milk drinking can emerge before the genetic adaptation. It is likely that the strong selective force operates only at certain times and under more extreme circumstances, such as drought and famine. This is an extension of the arid climate hypothesis, first suggested by Cook and al-Torki (1975). These authors speculated that in desert climates (i.e. Middle and Near East) where water and food were scarce, nomadic groups could survive by utilizing milk as a food source, and in particular, as a source of clean, uncontaminated fluid (Cook and al-Torki 1975). This scenario is particularly pertinent to desert nomads whose major source of milk is obtained from camels, as these animals are able to survive up to 2 weeks without food and water by metabolising the fat contained in their humps. The benefits to persistent individuals may have become more pronounced during outbreaks of diarrhoeal disease, when non-persistent individuals would be unable to utilize milk as a water source without exacerbating their condition.
More recent research sought to address the question of why some populations and not others had adopted the cultural habit of milk drinking. The frequencies of lactose malabsorption were greater in populations where environmental conditions, such as extremes of climate or high incidence of endemic cattle disease, made it impossible to raise livestock (Bloom and Sherman 2005). The exceptions to the general distribution were a number of African groups with high lactase persistence frequency who managed to circumvent harsh environmental conditions by adopting a pastoralist way of life (Bloom and Sherman 2005).
Obviously, the benefits of milk drinking cannot be explained by the arid climate hypothesis in Northern Europe. Here, the advantage of improved calcium absorption has been suggested to explain the distribution of the trait (Flatz and Rotthauwe 1973). The low light levels experienced at high latitudes are associated with an increased risk of developing rickets and osteomalacia due to a lack of vitamin D production (which is synthesized by the skin in the presence of sunlight). Vitamin D is involved in the gut absorption of calcium, which is itself an essential mineral required for bone health. In addition, calcium may help to prevent rickets by impairing the breakdown of vitamin D in the liver (Thacher et al. 1999). Although lactase non-persistent individuals could obtain calcium from yoghurt or cheese (dairy foods that contain reduced lactose), milk proteins and lactose are believed to facilitate the absorption of calcium (for review see Gueguen and Pointillart 2000). Hence the ability to drink fresh milk, which contains both calcium and components that stimulate its uptake (including small amounts of vitamin D), may have provided an advantage to persistent individuals.
Just one hypothesis has been put forward which suggests selection for lactase non-persistence. Since lactase non-persistence is the ancestral state, the need to invoke selection for non-persistence is counter-intuitive, but should not be ignored. In this proposal the selective agent is thought to be malaria (Anderson and Vullo 1994). This proposal came from the observations of high frequency of lactase non-persistence in regions where malaria is endemic, and that individuals with flavin deficiency are at a slightly reduced risk of infection by malaria. The consumption of milk, which is rich in riboflavins, was therefore proposed to be unfavourable since it would keep flavin levels in the bloodstream high. There is currently no support for this hypothesis (Meloni et al. 1998), and it seems unlikely to contribute to the current distribution of lactase persistence.
Present day health and medical considerations
Lactose malabsorption can readily be confused with milk protein allergy, which has quite different causes (reviewed in Crittenden and Bennett 2005), and in recent times lactose intolerance has been blamed for causing a variety of systemic conditions, often without clear evidence (Campbell and Matthews 2005; Matthews et al. 2005). Nonetheless it does appear that consumption of milk and milk products by those who cannot digest lactose is a relatively common cause of irritable bowel syndrome in Europe and the USA (Vesa et al. 2000). Many commercial dairy products and other foods (including yoghurts) contain high concentrations of lactose introduced in manufacturing, so that lactose is more widespread in a person’s diet today than it was in the diet of that person’s ancestors. Lactose tolerance testing can be a useful way of detecting lactose malabsorption and enabling avoidance of the cause, but DNA testing is not yet useful, particularly for non-Europeans (Swallow 2006; Tag et al. 2008; Weiskirchen et al. 2007). In countries such as Finland, where there is a high frequency of lactase non-persistence in comparison with the rest of northern Europe, commercial low lactose products are readily available (Harju 2003).
Many association studies have attempted to demonstrate the health benefits of milk consumption in lactase persistent people, e.g. by providing protection against osteoporosis (Enattah et al. 2005a, b; Meloni et al. 2001; Obermayer-Pietsch et al. 2004), and others have claimed adverse effects of lactase persistence and associated high milk consumption (e.g. cataracts, ovarian cancer and diabetes) (Enattah et al. 2004; Larsson et al. 2006; Meloni et al. 2001; Meloni et al. 1999; Villako and Maaroos 1994). The often-contradictory findings are difficult to evaluate because of the high risk of confounding effects such as mixed ancestry, dietary intake and variation in gut flora.
Lactase persistence has been one of the leading examples of natural selection in humans, and also one of the first clear examples of polymorphism of a regulatory element. Further investigation of the molecular mechanisms as well as the evolutionary forces is however needed to fully understand this normal variation, which is providing an important model for understanding gene/culture co-evolution and disease susceptibility. The information accrued so far already illustrates the limitations of disease association studies and SNP tagging to find functional genetic variation attributable to multiple mutations, even if they are located in a single gene, and highlights the potential importance of distant regulatory elements.
CJEI and CAM were funded by BBSRC CASE studentships and YI was funded by UCL Graduate school, UCL ORS and B’nai B’rith/Leo Baeck London Lodge scholarships. We thank Neil Bradman, The Centre for Genetic Anthropology, UCL, for access to samples and Melford Charitable Trust for funding.