Introduction

Many papers on the genetics of common human diseases start with the following statement: 'Disease X is a complex, multifactorial disorder'. This rubric has been applied to schizophrenia, autism, depression, asthma, epilepsy, diabetes, rheumatoid arthritis, hypertension, coronary artery disease, obesity, Crohn's disease, Alzheimer's and Parkinson's disease, multiple sclerosis and probably hundreds of other conditions - even dandruff! But what does it mean? It means that the disease is influenced by multiple genetic and environmental factors. Diseases may earn this label if they are clearly heritable and also influenced by environmental factors (as in the case of diabetes), or if the inheritance of genetic liability is not sufficient to predict whether a person will actually develop the disease; that is, there is some probabilistic element to the emergence of the disease state itself (as in many inherited cancers or psychiatric disorders).

Quite apart from whether environmental or stochastic factors are at play, the terms complex and multifactorial are also commonly used to describe the architecture of just the genetic component of disease liability. In these cases, these terms are usually at least implicitly equated with the trait being polygenic; in fact, 'complex', 'multifactorial' and 'polygenic' are commonly used as synonyms. It is important here to make a distinction in how the term polygenic is used: the implication is that the disorder arises in each individual due to the combined effects of a large number of genetic variants [1]. This definition is distinct from a model of genetic heterogeneity, in which many different variants are involved across the population, but where each case is caused by a single variant (or a few variants).

Arguments that the inheritance of a disorder is polygenic usually derive from the observation that, while the disorder might aggregate in families, it does not tend to segregate in ways that are consistent with simple Mendelian inheritance. This is indeed true for the disorders referred to above, for which risk of disease is increased if an individual has a relative with the disease (and increased more, the closer the relative) but where sporadic cases are also common, sometimes forming the majority of cases. The precise values of the relative risks to family members of different degrees of relatedness can be fed into mathematical models of genetic architecture, and are, in many cases, consistent with polygenic inheritance (for example, for schizophrenia [2–4]). In some cases, these kinds of analyses have even been taken as proof that Mendelian inheritance with genetic heterogeneity can be rejected unequivocally as a model of the genetic architecture of the disorder (for example, [3–6]).

The conclusion that many common disorders are polygenic and caused by the aggregate effects in individuals of many common variants [1], along with the development of the HapMap Project [7], laid the foundation for genome-wide association studies (GWAS) designed to identify loci that harbor such common variants [8–10]. Unfortunately, this conclusion is based entirely on circular logic and a number of unfounded assumptions. The most fundamental of these is that the disorder in question represents a single, biologically valid category.

When is a disorder not a disorder?

Analyses of the segregation patterns of a 'disorder' across a population necessarily assume that the disorder in question is a valid category; otherwise, there would be no point in lumping all cases together and calculating things such as heritability and relative risks across the entire population. If the clinical diagnosis of a specific disorder is based on superficial criteria, then this assumption is unlikely to hold.

For example, 'blindness' is not a very informative diagnosis - genetic forms can be caused by cataracts, corneal defects, optic nerve atrophy and various forms of photoreceptor degeneration, such as retinitis pigmentosa (RP) [11]. Each of these, in turn, can arise due to mutations in any of a large number of different genes (over 100 for RP) [12]. Calculating the heritability of blindness or the relative risks to family members, averaged across all of these conditions, would not be a worthwhile or informative endeavor; in fact, the resultant figures would be pretty meaningless. Even within one 'condition', such as RP, such calculations would not be worthwhile as some cases are dominant, others recessive, some X-linked and others autosomal.

'Mental retardation' is another common condition that has very high underlying genetic heterogeneity [13, 14]. In many cases, this heterogeneity is apparent because the condition often arises as part of a distinct and discernible genetic syndrome (causing typical facial morphology, for example). But if we had only the intellectual disability to go on, there would be no way to distinguish these subtypes. If we looked at the inheritance of mental retardation as a whole, it would indeed fit the criteria for a 'complex' disorder. Yet there is no reason to think that most, or indeed any, cases of mental retardation arise due to a polygenic mechanism (that is, in the absence of a reasonably penetrant mutation).

Are 'diabetes', 'schizophrenia' or 'coronary artery disease' any more specific than 'mental retardation' as diagnoses? If two patients had different underlying causes, would we have any way to know this on the basis of their symptom profiles? Is it not possible, even likely, that as with blindness or mental retardation, many different insults could give rise to a similar end-state? This is especially likely if our descriptors are crude. For psychiatric disorders, for example, there is no definitive biomarker, brain scan or blood test that can aid in clinical diagnosis. These disorders are defined on the basis of surface criteria: the patient's behavior and reports of their subjective experience. The diagnostic categories are constantly being debated and the borders between them redefined (for example, [15]). Many patients' diagnoses are fluid over time and two patients can have the same diagnosis without sharing a single symptom in common.

None of this gives much confidence that many disease categories are natural kinds. Treating them as such is thus a massive leap of faith, and as we will see, the empirical evidence has not upheld this belief. GWAS have not uncovered the expected common variants that would explain polygenic inheritance across each of these disorders. By contrast, the identification of rare, individually causal variants in a large number of different genes in different people clearly demonstrates a very high degree of genetic heterogeneity underlying common, complex conditions.

This is especially noteworthy for psychiatric disorders such as autism and schizophrenia, where mutations in over 100 different loci have been found [16–19]. For schizophrenia, genetic heterogeneity had supposedly been definitively rejected on the basis of the observed distribution of familial relative risks [2–4]. As we have seen, this is a circular argument: those numbers only make any sense if the condition is indeed monolithic. As it happens, it is trivial to show that a similar distribution can be generated on the basis of genetic heterogeneity, even by an arbitrary division of cases into different modes of inheritance [18]. Indeed, as originally pointed out by James [20]: there is 'an infinite number of parameter sets ... which lead to the same frequencies in relatives'.

The other argument against genetic heterogeneity is that if rare mutations of high penetrance exist, they should have been found by linkage analysis [4, 21, 22]. This conclusion again rests on several assumptions: that linkage was sought with the right phenotype, that the inconsistent replication of linkage results necessarily means that the large number found are all false positives, and that the level of genetic heterogeneity is low enough that even lumping many different families together into one analysis should still yield real linkage peaks [18, 23]. Again, the data indicate otherwise. Thus, the hypothesis of a polygenic architecture for these disorders arises from the unfounded assumption that they are actually common disorders, as opposed to umbrella terms for a diverse set of very rare genetic conditions that happen to share symptoms. This is, however, just the first of a series of assumptions underlying the search for common variants conferring disease risk.

The theoretical foundation of genome-wide association studies

GWAS are founded on the polygenic model of disease liability, which itself arises from an assertion of breathtaking audacity by the godfather of quantitative genetics, DS Falconer. In an attempt to demonstrate the relevance of quantitative genetics to the study of human disease, Falconer, based on work of others before him (for example, [24]), came up with a nifty solution [25]. Even though disease states are typically all-or-nothing, and even though the actual risk of disease is clearly very discontinuously distributed in the population (being dramatically higher in relatives of affected people, for example), he claimed that it was reasonable to assume that there was something called the underlying liability to the disorder that was actually continuously distributed. This could be converted to a discontinuous distribution by further assuming that only individuals whose burden of genetic variants passed an imagined threshold actually got the disease. To transform discontinuous incidence data (that is, mean rates of disease in various groups, such as people with different levels of genetic relatedness to affected individuals) into mean liability on a continuous scale, it was necessary to further assume that this liability was normally distributed in the population. The corollary is that liability is affected by many genetic variants, each having a small effect [1, 26].

This declaration meant that the statistical techniques developed in animal breeding could supposedly be applied legitimately to the study of human disease. Unfortunately, there is no basis for the underlying assumption of normally distributed liability, nor for the invoked threshold of genetic burden (Box 1; Figure 1).
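To make concrete what the liability-threshold calculation does with incidence data, the following is a minimal sketch of the textbook approximation usually attributed to Falconer. It assumes exactly what the model assumes (a standard normal liability and a fixed threshold), so it inherits the assumptions criticized here. The function names and the prevalence values (1% in the population, 9% in first-degree relatives) are hypothetical placeholders chosen for illustration, not figures taken from any study.

```python
from statistics import NormalDist

# ASSUMPTION of the model: liability is standard normal (mean 0, SD 1).
STD_NORMAL = NormalDist()

def liability_threshold(prevalence):
    """Threshold (in SD units) above which a fraction `prevalence` of people fall."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)

def falconer_heritability(pop_prevalence, relative_prevalence, relatedness):
    """Approximate heritability of liability from two incidence figures."""
    x_pop = liability_threshold(pop_prevalence)
    x_rel = liability_threshold(relative_prevalence)
    # Mean liability of affected individuals, in SD units above the population mean.
    mean_affected = STD_NORMAL.pdf(x_pop) / pop_prevalence
    # Regression of relatives' liability on the liability of affected individuals.
    regression = (x_pop - x_rel) / mean_affected
    return regression / relatedness

# Hypothetical inputs: 1% population prevalence, 9% in first-degree relatives.
threshold = liability_threshold(0.01)
h2 = falconer_heritability(0.01, 0.09, 0.5)
print(f"threshold = {threshold:.2f} SD, heritability of liability = {h2:.2f}")
```

Note that two discontinuous incidence figures are all that goes in; the continuous distribution, the threshold and the resulting heritability are all products of the assumptions.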

Figure 1

Modeling the genetic components of variance. (a, b) The idea of the multifactorial liability-threshold model is, first, that the actual discontinuous distribution of risk (a) (estimates given for schizophrenia risk to monozygotic twins (MZ) and first and second degree relatives of affected people) can be modeled as a continuous distribution of 'liability' (b). Second, at the extreme end of the normal distribution of 'liability', the cumulative genetic burden of risk alleles suddenly passes a tipping point (from n alleles to n + 1 alleles), triggering pathogenicity (b). (c) Increased risk to relatives can be modeled with a distribution of risk allele load that is shifted to the right. If n is small (0 or 1, for example), then the idea of a threshold of burden makes sense (for example, when there are dominant or recessive alleles). If n is supposed to be in the hundreds or even the thousands, this scenario becomes rather fanciful.

Nevertheless, GWAS have gone ahead on a very large scale for many complex disorders and have produced statistically significant findings. Do these findings validate the assumptions I have claimed are flawed? They do not, at least not necessarily.

What have we learned about complex disorders from GWAS?

GWAS follow a simple design: compare allele frequencies for hundreds of thousands of common variants spread across the genome between large samples of disease cases and controls [10]. If a variant predisposes to the disease, even slightly, this should be apparent as an increased allelic frequency in cases versus controls, given a large enough sample size. This design has been applied to many different complex disorders, with varying degrees of success.
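The comparison described above can be sketched for a single SNP as a test on a 2x2 table of allele counts. This is a minimal illustration, assuming a plain allelic chi-square test with one degree of freedom; real studies typically use more elaborate models and correct for testing many SNPs. The function name and the counts (10,000 cases and 10,000 controls, with risk-allele frequencies of 32% and 30%) are hypothetical.

```python
import math

def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic odds ratio and 1-df chi-square test for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic for a 2x2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical counts: 20,000 case alleles and 20,000 control alleles.
odds_ratio, chi2, p_value = allelic_association(6400, 13600, 6000, 14000)
print(f"OR = {odds_ratio:.3f}, chi2 = {chi2:.1f}, p = {p_value:.1e}")
```

With these made-up counts the frequency difference is nominally significant but falls well short of the conventional genome-wide threshold of 5e-8, which illustrates why very large samples are needed to detect small differences in allele frequency.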

It is important to note that these studies use a sample of single nucleotide polymorphisms (SNPs) to tag variation across the genome (on the basis of blocks of low recombination or linkage disequilibrium (LD)). If a single SNP shows an association with the disease, this does not necessarily mean that the SNP itself is involved; the association could be due to any of the other variants in LD with it. As we will see, these can include tagged rare variants [27]. The goal of these studies is thus not necessarily to identify causal alleles but to point to loci that might harbor them.

By that criterion, GWAS have been extremely successful for many complex disorders. They have pointed to numerous candidate loci for type 1 and type 2 diabetes, Crohn's disease, coronary artery disease, schizophrenia, bipolar disorder, multiple sclerosis and many other diseases. Some of these loci were previously implicated as sites of known rare mutations that cause obviously Mendelian forms of the disease in question, whereas others are novel findings that implicate new genes in disease processes. For some disorders, the findings converge on particular biochemical processes or pathways, such as beta-cell dysfunction or insulin action in type 2 diabetes [28, 29]; natriuretic peptide signaling in high blood pressure and cardiovascular disease risk [30]; immune system genes in multiple sclerosis [31]; innate immunity and inflammation in Crohn's disease [32]; and neural development in schizophrenia [33, 34]. These studies have also revealed some shared genetic risk across multiple disorders, including various autoimmune disorders (type 1 diabetes, Crohn's disease, multiple sclerosis and others) [35] and between schizophrenia and bipolar disorder [33, 34, 36].

The general trend across these studies is that the SNPs that give statistically significant association signals have tiny effects on disease risk, with odds ratios typically in the region of 1.05 to 1.2 (which means that, for a rare disease, carrying such an allele increases your risk approximately 1.05- to 1.2-fold). This is exactly as predicted under a polygenic model: individual variants are not expected to have large individual effects. The aggregate risk caused by all the identified variants considered together is, however, also still relatively small. In most cases, the SNPs that meet the criteria for genome-wide significance can collectively explain only a small percentage of the genetic variance of the disorder [37] and hardly any of the familial risk [38]. There is also no statistical evidence for the kind of epistatic interactions that might be expected: combinations of alleles simply increase the overall effect additively (for example, for height [39, 40] and body-mass index [41]).
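To give a sense of scale, an odds ratio can be converted into an absolute risk for carriers once a baseline risk is specified. The sketch below assumes that the baseline is the risk in non-carriers and uses a hypothetical value of 1%; the function name is likewise illustrative.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Absolute risk in carriers, given the risk in non-carriers and an odds ratio."""
    carrier_odds = baseline_risk / (1.0 - baseline_risk) * odds_ratio
    return carrier_odds / (1.0 + carrier_odds)

# Hypothetical baseline risk of 1% and an odds ratio at the top of the usual range.
carrier_risk = risk_from_odds_ratio(0.01, 1.2)
print(f"carrier risk = {carrier_risk:.4f}")  # about 0.012, that is roughly 1.2%
```

Under these assumptions, a carrier's risk rises from 1% to about 1.2%, so the overwhelming majority of carriers of any single such allele remain unaffected.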

There are several, not mutually exclusive ways to interpret the positive signals that have emerged from GWAS of complex disorders (Figure 2). First, they represent the actual effects of either the genotyped SNP or a common variant that is in LD with it. Given the power of the studies and the lack of overall variance explained by the identified variants, one could further conclude that, while real, the overall effect of such common variants is very modest [42, 43]. This would be consistent with a model in which such variants act as modifiers of rare mutations, but, even in aggregate, do not cause disease in the absence of some such mutation.

Figure 2

Interpreting positive GWAS signals. (a) GWAS can point to chromosomal loci that may harbor causal variants. (b) The associated SNP will act as a marker of multiple additional common variants in an LD block. The marker SNP or any of these other common variants could be the causal variant. The very low odds ratio across the population might represent a tiny effect of one of these variants in every individual or a large effect that arises only in the context of some rare mutation. (c) Alternatively, at least some common SNP signals could actually be tagging rare variants of large effect in the population, which are in strong LD with it (stars). If these occur, by chance, more prevalently on one haplotype than on another, this will lead to a slightly increased frequency of one allele in cases when compared to controls (that is, an association signal).

Second, the common variants found represent merely the tip of the iceberg. Many other variants exist that have even smaller effect sizes, which current studies are underpowered to detect. Collectively, these could explain a sizeable fraction of the overall genetic variance - much more than actually observed, due to incomplete LD with the causal variants - leaving little need to invoke rare mutations (for example, [33]). Simulations exploring this possibility are discussed in Box 2.

Third, the common SNPs that show association signals are actually tagging rare mutations that segregate in the sampled populations [23, 27]. Whenever a rare mutation arises, it necessarily occurs on the background of some ancient haplotype. If such a mutation predisposes to disease with relatively high penetrance, explaining even a small percentage of cases in a population, then the common haplotype will be slightly increased in frequency among cases when compared to controls. The odds ratio of the associated common SNP, which suggests a very modest increase in risk, could thus actually signal a highly penetrant variant on a fraction of the chromosomes with that haplotype. This kind of effect will be especially prevalent in studies from small, defined populations. Though one might expect it to be diluted out when multiple populations are combined (because different rare mutations in the same gene will occur on different haplotypes), it has been argued that synthetic associations with a single SNP allele can arise by chance due to multiple rare variants in the same locus [23, 27], though others contend this is unlikely [21, 22].
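The arithmetic behind this third interpretation can be sketched with a deliberately simplified model. It assumes that chromosomes can be treated independently (a haploid simplification), that every copy of the rare mutation sits on the same common haplotype, and that the disease is rare enough for control allele frequencies to match those of the population. All names and values are hypothetical.

```python
def synthetic_odds_ratio(haplotype_freq, mutation_freq, relative_risk):
    """Allelic odds ratio at a common SNP that tags a rare, penetrant mutation.

    haplotype_freq: population frequency of the common tagging allele
    mutation_freq:  population frequency of the rare mutation (all on that haplotype)
    relative_risk:  risk multiplier for chromosomes carrying the mutation
    """
    # Weight each class of chromosome by its relative contribution to cases.
    tagged_cases = mutation_freq * relative_risk + (haplotype_freq - mutation_freq)
    all_cases = mutation_freq * relative_risk + (1.0 - mutation_freq)
    case_freq = tagged_cases / all_cases
    case_odds = case_freq / (1.0 - case_freq)
    control_odds = haplotype_freq / (1.0 - haplotype_freq)
    return case_odds / control_odds

# Hypothetical values: a 30% haplotype, a mutation on 0.1% of chromosomes,
# and a 20-fold increase in risk for carriers.
tag_or = synthetic_odds_ratio(0.30, 0.001, 20.0)
print(f"odds ratio at the tagging SNP = {tag_or:.3f}")  # about 1.06
```

In this toy case a mutation conferring a 20-fold increase in risk appears at the tagging SNP as an odds ratio of about 1.06, which is indistinguishable, at the level of the association signal, from a common variant of tiny effect.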

It is not possible to determine definitively which of these interpretations is correct from the GWAS data themselves. In particular, the strongly worded claim that GWAS signals provide strong support for a polygenic architecture of complex disorders, involving large numbers of common variants of small effect in each individual [33], is not justified [18, 27, 44] (Box 2). GWAS simply cannot determine whether the alleles responsible for the positive associations are common or rare, nor can aggregate scores or genome-partitioning models [45, 46] tell how many alleles are involved, either across the population or in each affected individual (especially if the assumption that the signals from the SNPs in question are independent is not valid (D Goldstein, personal communication)).

Fortunately, we do not have to rely on statistical simulations to answer these questions. There is direct empirical evidence that rare mutations play the predominant role in the inheritance of such disorders. This evidence includes the nature of the spectrum of genetic variation in humans and the growing number of examples of identified, rare, disease-causing mutations.

The spectrum of human genetic variation

There is a common view that the human genome can be divided into bases that are pretty much constant across all people and those that are polymorphic. The logical extension of this idea is that heritable phenotypic differences between people must be caused by the particular combinations of polymorphisms that they inherit at the variable sites (for example, [47]). Recent data from whole-genome sequencing efforts show just how wrong this view is.

Far from most of the genome being effectively constant, it seems that every position in the genome has been mutated many, many times over in the human population [48–51]. Each of us carries thousands of very rare variants, including hundreds of novel mutations [52–56]. Recent, rare mutations are far more likely to have a deleterious effect on protein production or function and much more likely to cause disease than common variants [54–59].

New mutations may spread in the pedigree or population in which they arise for some time, depending largely on whether they have a deleterious effect on fitness or not [51, 60, 61]. Mutations that do have a deleterious effect will be quickly selected against, though the recent human population explosion could allow less penetrant or recessive alleles to persist for some time at low frequency [50, 60]. It is, however, highly paradoxical to suppose that variants that predispose to serious diseases would ever rise to a high frequency [51, 62, 63]. The casual invocation of balancing selection as a mechanism to maintain disease-causing alleles at high frequency is not supported by any evidence [64]. If a disorder is associated with reduced fitness, then the distribution of alleles affecting it will be expected to shift towards very rare ones that are highly deleterious [65] (and not, as previously concluded for disorders for which it was assumed that no correlation with fitness exists, towards slightly deleterious alleles at moderate frequency [66]).

Identifying rare mutations

The best evidence that so-called common disorders really encompass many distinct genetic disorders is the growing number of rare, highly penetrant mutations causing such disorders that are now being identified. Examples of single mutations causing disorders such as autism, schizophrenia, diabetes, epilepsy and many other common diseases have long been known. As long as these could be identified in only a small proportion of cases, however, they could be disregarded as exceptions to the generality of the disease (for example, [67]): they did not cause 'real schizophrenia' or 'real autism'. But what if there is no such thing? What if all cases are due to some rare mutation? The growing number of cases explained by such examples makes this view more and more difficult to argue against.

Such cases include copy number variants (CNVs; that is, deletions or duplications of sections of chromosomes, which often affect more than one gene), as well as point mutations. CNVs have become more easily detected, using genomic microarray and sequencing technologies, and have been found to contribute significantly to the total number of cases of a range of psychiatric and neurological disorders, including schizophrenia [68–76], autism [73, 77–80], attention deficit-hyperactivity disorder [81–84], Tourette syndrome [85], developmental delay and mental retardation [14], and epilepsy [86].

Whole-genome or whole-exome sequencing strategies are now also identifying many point mutations that predispose with high penetrance to various disorders. Studies on psychiatric disorders have again led the way here [13, 87–93], but recent reports have also identified single mutations causing neonatal diabetes mellitus [94], coronary artery disease [95] and Crohn's disease [96].

Real sources of complexity in linking genotype to phenotype

If complex disorders really arise due to rare mutations, then why is their inheritance not more obviously Mendelian? There are a number of factors that contribute to the complexity of inheritance of these disorders. I have argued that much of the complexity is simply apparent, due to lumping together what are actually distinct disorders under one umbrella term. Certain pathophysiological states could arise due to mutation in any of a large number of different genes. There are clearly many ways to cause psychosis or seizures or poor control of blood sugar level, just as there are many different ways, genetically, to cause blindness or deafness or mental retardation. More common diseases are likely to reflect a larger mutational target (that is, more genes will be involved in the affected underlying processes) [97].

This can only be part of the answer, however. Even within single families, the inheritance patterns of these phenotypes are usually not simple. What other factors might contribute to this complexity? First, the definition of the phenotype is probably very imprecise. A major finding that has emerged from recent studies is that specific mutations do not respect the boundaries of diagnostic categories - their effects can manifest in many different ways, leading to different symptoms and diagnoses in different carriers [15, 17, 98–103]. Analyzing segregation patterns on the basis of overly specific diagnostic categories could thus be highly misleading.

Second, any particular mutation could be required but not sufficient to cause disease in individual carriers [17]. It is extremely common, the norm actually, for Mendelian mutations to be modified by additional variants in the genetic background [104, 105]. Some highly penetrant mutations will be clearly responsible for disease, by themselves. For diseases that have a strong effect on fitness, these will be almost immediately selected against and therefore enriched in sporadic cases that are caused by a novel occurrence of the mutation. Mutations that have lower penetrance might require additional 'hits' to result in the disease phenotype. There are numerous examples of compound heterozygosity or digenic inheritance underlying complex disorders [51, 106–110]. Oligogenic interactions, involving several mutations, might also be important in some cases. Epistatic interactions between multiple mutations could be highly complex and unpredictable [111–114], or even paradoxical, as in cases where two disease mutations suppress each other's effects [115, 116].

Finally, non-genetic factors must also be important in the emergence of disease phenotypes, given the incomplete concordance in the phenotypes of monozygotic twins for many of these diseases. Incomplete penetrance and variable expressivity could result from environmental factors and also from intrinsic developmental variation. The latter is especially important for neurodevelopmental disorders, where a certain probability that a pathogenic route will be followed could be inherited but the actual outcome of development would be strongly influenced by stochastic events [117].

Concluding remarks

The apparent complexity of common disorders arises to a large extent because of our poor ability to discriminate between what are in reality many distinct genetic disorders. Most cases of such disorders are likely to be the result of a rare, recent mutation that has a strong biological effect, or of interactions between a small number of such mutations. GWAS point to loci that might be involved in complex disorders but the population-based metrics that they provide say little about the number or type of causal alleles in individuals. Modifying effects of common alleles are certainly possible, though the evidence for these remains indirect. This change in paradigm is already having an impact in the clinic, as more and more cases of complex disorders are clinically defined on the basis of a genetic diagnosis, indexing the primary cause of the disease and not merely the surface symptoms [94–96, 118, 119].

Box 1. Falconer's fantasy

The multifactorial threshold model has some intuitive appeal, especially as an expression of an interaction between some genetic predisposition or vulnerability and the effects of an environmental stressor. This is not, however, how it is used in the context of the genetic architecture of complex disorders. Here, it is purely the genetic components of variance that are being modeled (Figure 1).

Visscher and colleagues [67, 120] have, rather surprisingly, used height as an example to illustrate the liability threshold model. Height is clearly continuously distributed in the population. They nevertheless imagine a disease called 'loftiness', which afflicts those above some arbitrary height threshold. In this scenario, even families that 'we consider tall might not have many individuals passing the threshold into loftiness', supposedly paralleling the situation in a complex disorder like schizophrenia. As it happens, height is a perfect example to illustrate why the threshold model makes no biological sense. There is no such threshold. An increasing burden of height risk alleles does not push people into gigantism - single mutations do (for example, [121]; and the same is true for dwarfism at the other end of the spectrum [122, 123]). In fact, the aggregate effects of the multiple common variants that affect height are remarkably linear [39, 40].

Complex systems are typically robust to the cumulative effects of small variations; in fact, they must be so in order to withstand the inherent noise in biochemical systems and effects from variables outside the system [124, 125]. The supposed tiny effects on expression level of common variants are highly unlikely to have a large effect precisely because the system has such fluctuations on a moment-to-moment basis as a constant (and essential) feature [126, 127]. In particular, the small-world architecture of complex networks is robust to many small changes but paradoxically vulnerable to 'attack' of certain nodes [128]; these networks can fail catastrophically in response to large changes in specific components.

Box 2. Aggregate polygenic scores

GWAS for various disorders have identified a number of SNPs that are above the threshold for genome-wide statistical significance as disease risk factors. Collectively, these loci explain a small fraction of risk. It remains possible, however, that additional SNPs are also associated with risk but that their signals are buried in the noise and cannot meet the burden of multiple testing correction with current sample sizes. To attempt to measure whether additional associated SNPs exist, various researchers have generated aggregate scores for the top x percentage of SNPs based on a ranking of P-values in a discovery sample. The scores for each individual in a replication sample are then used to see if they can distinguish cases and controls. In a study on schizophrenia, such an aggregate score, involving thousands of SNPs, was indeed significantly increased in cases versus controls. However, this score accounted for only 3% of the variance [33]. This effect has since been replicated in a family study, where it explained 5% of the variance [129], ruling out population stratification concerns, and in a much larger GWAS, where it explained 6% of the variance [34].
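The scoring procedure described above can be sketched as follows. This is a minimal illustration, not the published method: it assumes that each selected SNP is weighted by the natural logarithm of its discovery-sample odds ratio, and the function name, SNP identifiers and all numbers are hypothetical.

```python
import math

def aggregate_score(genotypes, snp_stats, top_fraction):
    """Aggregate score for one individual in a replication sample.

    genotypes:    dict of SNP id -> number of risk alleles carried (0, 1 or 2)
    snp_stats:    dict of SNP id -> (discovery P-value, discovery odds ratio)
    top_fraction: fraction of SNPs to keep, ranked by discovery P-value
    """
    ranked = sorted(snp_stats, key=lambda snp: snp_stats[snp][0])
    n_keep = max(1, int(len(ranked) * top_fraction))
    selected = ranked[:n_keep]
    # ASSUMPTION: weight each risk allele by the log of its odds ratio.
    return sum(genotypes.get(snp, 0) * math.log(snp_stats[snp][1]) for snp in selected)

# Hypothetical discovery statistics and one individual's genotypes.
snp_stats = {"snp_a": (0.001, 1.2), "snp_b": (0.2, 1.05),
             "snp_c": (0.04, 1.1), "snp_d": (0.6, 1.0)}
genotypes = {"snp_a": 2, "snp_b": 1, "snp_c": 1, "snp_d": 0}
score = aggregate_score(genotypes, snp_stats, top_fraction=0.5)
print(f"aggregate score = {score:.3f}")
```

The score is a simple sum over SNPs, so it treats their signals as independent and carries no information about whether the underlying causal alleles are common or rare.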

These data indicate that there probably are additional SNPs that are associated with schizophrenia risk that could be detected with larger samples. Indeed, the most recent GWAS mega-analysis for schizophrenia [34] reports additional significant SNPs that were not found in earlier studies. Taken at face value, however, they also suggest that the overall contribution of common variants to the genetic variance that affects the disorder is very modest (less than 10%). But based on the idea that the real signals might be swamped out by non-associated SNPs in these aggregate scores, and that the linkage between the genotyped and the putative causal SNPs was probably imperfect, the authors performed simulations from which they suggest that the overall impact of common SNPs was in fact much higher (as much as 33% of the total variance). Despite the claims of convergence onto a narrow range of values, seven models (out of 560 tested), involving combinations of multiple parameters such as allele frequency, effect size, number of SNPs involved and others, actually give wildly different estimates of the total true variance explained (from 34% to 98%) and the number of SNPs contributing (from 6% to 100%). The actual lack of convergence does not inspire much confidence in the overall claim that these results 'provide molecular genetic evidence for a substantial polygenic component to the risk of schizophrenia involving thousands of common alleles of very small effect.'

That claim assumes (circularly) that the associations of common SNPs actually reflect the biological effects of common variants, as opposed to associations due to tagged rare variants [23, 27]. It also assumes that the signals carried by the SNPs used in the score are independent: if not, then no information can be deduced as to the actual number of underlying causal variants (D Goldstein, personal communication). For now, the conclusion that many loci are involved in the genetics of schizophrenia across the population is uncontested. The conclusion that thousands of loci are causally involved in the inheritance of the disorder in each individual is not justified.