1 Introduction

The wealth of data generated by the Human Genome Project, together with the closely related 1000 Genomes Project and Human Genome Diversity Project, ushered in a new era for genomics and greatly expanded our understanding of human variation. The accompanying surge in genotyping technology has enabled a closer look at human genetics and disease. Association studies, developed to identify causative factors of disease, have proved invaluable in broadening our understanding of the biological mechanisms through which disease manifests. Early studies uncovered highly penetrant causative genes for rare Mendelian diseases such as Huntington’s disease (Group, T.H.S.D.C.R 1993; MacDonald et al. 1992) and cystic fibrosis (Kerem et al. 1989), discoveries that were revolutionary for the clinical field. These studies used linkage analysis to find rare causative variants of large effect that explain disease phenotypes. However, this approach proved largely unsuccessful when applied to many common diseases such as cardiovascular disease, cancer, schizophrenia and bipolar disorder. It is thought that the architecture of such diseases is far more complex and that their manifestation depends on a multitude of genetic and environmental factors of small effect. For example, some family members carrying risk variants for certain mental disorders may express the phenotype while others may not, owing to the random segregation of numerous causative alleles that interact in complex ways to produce a disease phenotype (McCarroll et al. 2014). As a result, an alternative to simple linkage analysis was required to study common complex diseases and traits.

A Genome-Wide Association Study (GWAS) is a population-level approach that aims to determine, at genome-wide statistical significance, the risk of disease manifestation based on the number and pattern of specific common genetic markers that co-occur with a specific disease phenotype (Bush and Moore 2012; McCarthy et al. 2008; McCarthy and Hirschhorn 2008a, b). GWASs adopt a hypothesis-free approach that assumes that any region of the genome may be responsible for the phenotype under study. These studies highlight susceptibility loci rather than identify specific causative variants. Once these regions of the genome are identified, further genotyping and sequencing are required to comprehensively catalog causative alleles, and the biological mechanisms linking these loci to the phenotype are explored. GWASs strive to elucidate not only individual risk based on the presence of susceptibility loci but also the underlying biology of disease on which prevention and treatment could be based.

2 Fundamental Concepts of GWAS

2.1 SNPs and Linkage Disequilibrium

Single nucleotide polymorphisms appear at high frequencies in the human population and typically have two commonly occurring base-pair possibilities for the corresponding genetic location, otherwise known as alleles. The frequency of an SNP is generally given as an allele frequency, either minor or major, depending on which allele appears more frequently in the population. Common SNPs (Minor Allele Frequency ≥ 5%) appear to have low mutation rates which make them good markers of human genetic history (Sachidanandam et al. 2001). In GWASs, these common SNPs are used as genetic markers to “tag” loci that may contain causative variants.

GWASs are based on the co-segregation or linkage disequilibrium (LD) of un-genotyped disease-causing variants with genotyped marker SNPs (Bush and Moore 2012; McCarthy et al. 2008; Visscher et al. 2012). Because most SNP alleles arise through rare mutational events along an ancestral genetic background, which is then inherited across generations, SNPs tend to be associated with each other along closely spaced sites on a single chromosome. Two loci are said to be in LD if the specific alleles are correlated on the population scale and are inherited together, thereby allowing one SNP to be used for predicting the presence of another. The term LD was originally coined to describe changes in genetic variation within a population over time. Recombination events will break apart segments of contiguous chromosome, which through generations, in a population undergoing random mating, will result in eventual linkage equilibrium of alleles, or the total independence of each allele (Fig. 4.1). Linkage disequilibrium, and its decay, is dependent on various factors such as population size, number of founding chromosomes in the population and the number of generations for which the population has existed (Bush and Moore 2012). It should be noted that linkage equilibrium does not occur over one generation, but instead, is the result of events accumulated through many generations over an extended period of time. This is because the appearance of novel SNPs is a relatively rare event, and the limited number of recombination events in not so distant generations are usually inadequate to separate a variant from the ancestral background on which it arose (Manolio et al. 2008). Additionally, it has been shown that recombination events do not occur at random, but are predominant at specific recombination “hot spots” (McVean et al. 2004). All of these elements contribute to various degrees and patterns of LD in different human populations and subpopulations. 
To illustrate this, African, European and Asian populations may be considered. African populations represent the most ancient human genetic lineages and have thus accumulated a substantially larger number of recombination events, resulting in smaller regions of LD. In contrast, populations of European and Asian descent have stronger LD due to the founder effect, in which genetic variation is lost when a new population is established from a small number of individuals drawn from a larger (African) population (Bush and Moore 2012). SNPs on the same chromosome are inherited in blocks, and the pattern of SNPs in a block is a haplotype. The length of blocks possessing strong LD can vary across the genome and across populations. In populations of European or Asian ancestry, the estimated mean size of a haplotype block is 22 kb, while in populations of African ancestry it is estimated to be 11 kb. Because humans are considered a relatively young species that has not yet undergone enough recombination events to separate many variants from their ancestral background, LD between common genotyped SNPs and un-genotyped causal variants can be exploited in GWASs (Christensen and Murray 2007).

Fig. 4.1
Three stages of DNA across generations. The ancestral DNA has a single mutation x. After a few generations, mutation x is carried on segments of the ancestral sequence of variable length. After many generations, mutation x is carried on shorter segments of variable length.

The decay of linkage disequilibrium is a very slow and gradual process, which enables SNPs to be used as genetic markers in GWASs. Here, the mutation x, introduced into ancestral DNA, is inherited through generations with segments of genetic code of variable length. The extent of recombination is one of the factors defining the genomic region remaining in LD with the original mutation.

There are many definitions of LD, each depending on the specific features of the association being studied. For biallelic markers, LD is often expressed as $r^2$, the squared correlation coefficient between two SNPs, which is derived from the allele and haplotype frequencies at the two loci (Reprinted from VanLiere and Rosenberg 2008 with permission from Elsevier; VanLiere and Rosenberg 2008):

$$ r^2 = \frac{\left(P_{AB} - P_A P_B\right)^2}{P_A \left(1 - P_A\right) P_B \left(1 - P_B\right)} $$
(4.1)
  • $P_{AB}$ = Frequency of haplotypes having allele A at locus 1 and allele B at locus 2

  • $P_A$ = Frequency of allele A

  • $P_B$ = Frequency of allele B

According to this relationship, if two polymorphisms have equal allele frequencies and are in total disequilibrium, $r^2$ will be 1. If the two SNPs are in linkage equilibrium, completely independent of each other, $r^2$ will be 0 regardless of allele frequency. However, a low $r^2$ value may also be observed when one allele appears at a higher frequency than the other. This situation may arise if a recent mutation is introduced at one locus, and not the other, in a small subset of a population. For example, suppose that in a given population allele A is in LD with, and appears at equal frequency to, allele B. If a recent mutation (b) occurs at the second locus in only a small portion of the population, a GWAS that samples the larger, now admixed, population will observe both the frequent haplotype AB and the rare haplotype Ab. In other words, within the population, the presence of the first allele (A) is not necessarily indicative of the second, rarer allele (b); however, the presence of the rare allele (b) is always associated with the first allele (A). As a result, when using SNPs as genetic tags, $r^2$ is considered to ensure that the chosen marker is truly representative of its surrounding genetic environment.
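
As a rough illustration of Eq. 4.1, the following sketch computes $r^2$ from one haplotype frequency and two allele frequencies. The function name `ld_r2` and all frequencies are hypothetical; they were chosen only to reproduce the three situations described above (total disequilibrium, equilibrium, and a rare allele found only alongside a common one).

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation r^2 between two biallelic loci (Eq. 4.1).

    p_ab: frequency of haplotypes carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b  # deviation of the haplotype frequency from independence
    return d * d / denom


# Equal allele frequencies and A always found with B: total disequilibrium.
perfect = ld_r2(p_ab=0.3, p_a=0.3, p_b=0.3)

# Haplotype frequency equals the product of allele frequencies: equilibrium.
independent = ld_r2(p_ab=0.3 * 0.6, p_a=0.3, p_b=0.6)

# A rare allele (frequency 0.05) that only occurs alongside a common allele
# (frequency 0.5): the rare allele predicts the common one, but not vice versa.
unequal = ld_r2(p_ab=0.05, p_a=0.5, p_b=0.05)

print(round(perfect, 3), round(independent, 3), round(unequal, 3))
```

Under these assumed frequencies the first two cases give 1 and 0, while the third gives a value of roughly 0.05 even though the rare allele never occurs without the common one, which is why a marker with low $r^2$ is a poor tag.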

When a high $r^2$ indicates that one allele of the first SNP is consistently observed with one allele of the second SNP, only one of the two markers needs to be genotyped to capture variation (Bush and Moore 2012; Slatkin 2008). Thus, tagging causal variants requires only a subset of the SNPs on each chromosome, so a comparatively small number of markers suffices to scan the entire human genome. In fact, despite the presence of over 10 million common SNPs, only about 500,000–1 million markers are needed to detect variation in non-African populations (Bush and Moore 2012). When designing a GWAS, the selection of this limited number of “best” tag SNPs sufficient to identify a haplotype block requires precise mapping of LD patterns (Takeuchi et al. 2005). The International HapMap Project identifies common variation across the genome and characterizes correlations between variants by mapping haplotype blocks and tag SNPs. A consortium of HapMap researchers produced a human haplotype map using 270 genotyped samples from four populations of diverse ancestry: Yoruba from Ibadan, Nigeria (YRI); Utah, USA, residents of northern and western European ancestry (CEU); Japanese from Tokyo, Japan (JPT); and Han Chinese from Beijing, China (CHB) (Manolio et al. 2008). Phase I genotyped approximately 1 million SNPs, while Phase II covered 3 million (Manolio et al. 2008). The project has since expanded (Bush and Moore 2012; Slatkin 2008). The HapMap Project revealed that while many tag SNPs are transferable between populations, LD patterns vary to some extent between populations, notably in those of African descent. As a result, GWASs are conducted with tag SNPs chosen to reflect the LD patterns of the sample population.

More recently, the 1000 Genomes Project has emerged as a more objective and comprehensive resource for identifying SNPs and LD. Using 1092 human genomes from 14 different populations, the 1000 Genomes Project provided a haplotype map of 38 million SNPs, 1.4 million short insertions and deletions and more than 14,000 large deletions. Additionally, this study was able to map low-frequency variants, including those located outside of coding regions, with greater accuracy (Genomes Project, C, et al. 2012).

2.2 GWAS Data Interpretation

2.2.1 Association Model: Determining Genotype – Phenotype Associations

Once genetic markers have been carefully chosen and the sample populations assembled and genotyped using these tag SNPs, each genetic locus is tested for association with the phenotype. There are three different types of associations that can arise from GWASs: direct, indirect and spurious (Clarke et al. 2011). Direct associations occur when the genotyped marker is itself the causative variant. Indirect associations occur when the genotyped tag SNP is in LD with the causal variant. Finally, spurious associations are those that result from biases introduced through study design and statistical analysis. Great care is taken when designing and conducting a GWAS, and subsequently translating its genotyping data into meaningful associations, so as to avoid conclusions based on spurious results (this is discussed further in this chapter). Various methods are used to quantify and characterize associations depending on the study design (case-control vs cohort), genotype model and type of phenotype (binary vs continuous). Because the incomplete discovery of all factors contributing to a phenotype creates a distinction between biological and statistical associations, the most closely representative statistical models are chosen.

In case-control GWASs, where sample groups are selected based on disease status and the sample sizes are determined by the investigators, SNP associations are often reported as odds ratios. Odds ratios provide an estimate of the odds of a phenotype manifesting given a specific genotype compared to the odds of the phenotype occurring without the specified genotype (Szumilas 2010; Pepe et al. 2004). By comparing the odds of disease in an individual carrying allele A to the odds of disease in an individual carrying allele a, the allelic odds ratio describes the association between disease and allele. In contrast, the genotypic odds ratio describes association by comparing the odds of disease in an individual carrying one genotype with the odds of disease in an individual carrying another genotype (Table 4.1). Odds ratios are estimated using statistical techniques such as logistic regression and contingency tables and are coupled with p-values (genome-wide significance: p-value ≤ $5.0 \times 10^{-8}$) generated through chi-squared or Fisher exact tests, with adjustment for variables that may induce spurious associations.
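
To make the allelic odds ratio concrete, the sketch below computes it from a 2 × 2 table of allele counts, together with an uncorrected Pearson chi-squared test. The function names and the allele counts are hypothetical, and no adjustment for covariates or population structure is attempted; a real analysis would typically rely on logistic regression in a dedicated GWAS package.

```python
import math


def allelic_odds_ratio(cases_A, cases_a, controls_A, controls_a):
    """Odds of disease for allele A relative to allele a, from allele counts."""
    return (cases_A * controls_a) / (cases_a * controls_A)


def chi_squared_2x2(cases_A, cases_a, controls_A, controls_a):
    """Pearson chi-squared statistic (1 degree of freedom) and its p-value."""
    n = cases_A + cases_a + controls_A + controls_a
    numerator = n * (cases_A * controls_a - cases_a * controls_A) ** 2
    denominator = (
        (cases_A + cases_a)
        * (controls_A + controls_a)
        * (cases_A + controls_A)
        * (cases_a + controls_a)
    )
    chi2 = numerator / denominator
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts from 1000 cases and 1000 controls (2000 alleles each).
counts = dict(cases_A=900, cases_a=1100, controls_A=700, controls_a=1300)
odds_ratio = allelic_odds_ratio(**counts)
chi2, p_value = chi_squared_2x2(**counts)

print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.1f}, p = {p_value:.1e}")
print("genome-wide significant:", p_value <= 5.0e-8)
```

With these made-up counts the odds ratio is about 1.5 and the p-value falls below the genome-wide threshold, illustrating how a modest effect can still reach significance when the sample is large enough.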

Table 4.1 Two forms of odds ratios

In case-control GWASs, the data are characterized, and subsequent odds ratios are derived, based on one of five genotype models that define SNP associations on the statistical level: dominant, over-dominant, recessive, multiplicative and additive (Horita and Kaneko 2015). Because these models consider two alleles for a given SNP but make different assumptions about their contribution to the phenotype, the conclusions derived from each can be significantly different. As a result, it is important to carefully choose the most accurate and representative genotype model. For example, take the alleles A (major) and a (minor) and their relative contribution to risk (Table 4.2). The dominant model compares AA + Aa vs aa and assumes that having one or more copies of A (Aa or AA genotypes) increases the risk of an altered phenotype in comparison with aa. The over-dominant model assumes that the combination of A and a (the Aa genotype) increases risk over AA or aa. The recessive model compares AA vs Aa + aa and assumes that two copies of A (the AA genotype) are required to alter risk. The multiplicative model assumes that if the risk for Aa is $k$, then the risk for AA is $k^2$. Alternatively, the additive model implies that if there is a risk of $k$ for Aa, then there is a risk of $2k$ for AA. The additive and multiplicative models rank genotypes from highest to lowest impact on risk according to the number of copies of A they carry: two (AA), one (Aa) or none (aa) (McCarthy et al. 2008; Attia et al. 2003). While dominant and recessive models can be assessed at the individual level, multiplicative and additive models typically require evaluation at the allelic level (Clarke et al. 2011). It should be noted, however, that because the additive model does not fit well into a logistic regression model, the multiplicative model is more widely used.
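
The genotype models can be sketched as different ways of turning a genotype into a number. In the example below, the 0/1 indicator coding for the dominant, recessive and over-dominant models is a common convention assumed here rather than taken from the text, and the function names and the value of `k` are hypothetical. The multiplicative and additive risks follow the $k$, $k^2$ and $2k$ relationships stated above, with aa treated as the reference genotype.

```python
def encode_genotype(genotype: str, model: str) -> int:
    """Indicator (0 or 1) for the risk group of a genotype under a given model."""
    copies = genotype.count("A")  # number of A alleles: 0, 1 or 2
    if model == "dominant":       # AA + Aa vs aa
        return int(copies >= 1)
    if model == "recessive":      # AA vs Aa + aa
        return int(copies == 2)
    if model == "over-dominant":  # Aa vs AA + aa
        return int(copies == 1)
    raise ValueError(f"unknown model: {model}")


def genotype_risk(genotype: str, k: float, model: str) -> float:
    """Risk relative to aa when the Aa genotype carries risk k."""
    copies = genotype.count("A")
    if copies == 0:
        return 1.0  # aa is the reference genotype (an assumption of this sketch)
    if model == "multiplicative":
        return k ** copies  # Aa -> k, AA -> k^2
    if model == "additive":
        return k * copies   # Aa -> k, AA -> 2k
    raise ValueError(f"unknown model: {model}")


for g in ("AA", "Aa", "aa"):
    indicators = {m: encode_genotype(g, m) for m in ("dominant", "recessive", "over-dominant")}
    risks = {m: genotype_risk(g, 1.5, m) for m in ("multiplicative", "additive")}
    print(g, indicators, risks)
```

Running the loop makes the practical difference visible: the same three genotypes are grouped differently by each indicator model, so the same data can yield different odds ratios depending on the model chosen.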

Table 4.2 Genotype models and disease risk

Alternatively, cohort studies that take into consideration the time to disease onset compare the risk of developing a disease in one group with a specific genotype relative to another group with a different genotype. This additive model is used with prospective/retrospective cohort studies in which groups are sampled and monitored over an extended period of time. Associations in these studies are usually expressed as a relative risk ratio (incidence rate ratio). With rare diseases, the odds ratio and relative risk ratio often convey similar information; however, significant differences are seen when investigating common complex diseases. Statistical bridges can be made between the odds ratio and the relative risk ratio when necessary, and the choice between them will depend on the type of information investigators hope to acquire from their population study (Waltoft et al. 2015).
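
The divergence between the two measures can be checked with simple arithmetic. In this sketch the disease risks for carriers and non-carriers are hypothetical numbers, and the function name is an assumption; both scenarios have the same relative risk of 2.

```python
def risk_and_odds_ratio(risk_carriers: float, risk_noncarriers: float) -> tuple:
    """Return (relative risk, odds ratio) for two groups with known disease risk."""
    relative_risk = risk_carriers / risk_noncarriers
    odds_carriers = risk_carriers / (1 - risk_carriers)
    odds_noncarriers = risk_noncarriers / (1 - risk_noncarriers)
    return relative_risk, odds_carriers / odds_noncarriers


# Rare disease: 0.2% of carriers vs 0.1% of non-carriers are affected.
rare = risk_and_odds_ratio(0.002, 0.001)

# Common disease: 40% of carriers vs 20% of non-carriers are affected.
common = risk_and_odds_ratio(0.40, 0.20)

print(f"rare:   RR = {rare[0]:.2f}, OR = {rare[1]:.2f}")
print(f"common: RR = {common[0]:.2f}, OR = {common[1]:.2f}")
```

For the rare disease the odds ratio is almost identical to the relative risk, whereas for the common disease it is noticeably larger (about 2.67 against 2), so the two cannot be read interchangeably.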

2.2.2 Classifier Model: Disease Risk Prediction

While odds ratios can provide valuable insight into genotype-phenotype associations and, subsequently, the etiology of the disease, one of the most sought-after applications of GWAS is disease risk prediction. Though both use similar SNP data, the goals of association studies and risk prediction models are quite different. Association studies aim to reproducibly detect disease-associated loci and often consider a single SNP at a time. In contrast, risk prediction strives to evaluate biomarker use in facilitating clinical decisions for individual patients. As a result, high predictive power is required. This means that although a variant may be deemed causative due to phenotype association, this does not necessarily equate to good predictive power (Jostins and Barrett 2011).

There are many different prediction models currently in use, one of which is polygenic scoring (Abraham and Inouye 2015). This is a very simple method that sums the estimated effects of a limited number of known risk alleles. While polygenic scores have worked well for some diseases such as Crohn’s disease, they were insufficient in evaluating complex diseases like coronary heart disease, emphasizing that no one standard prediction model is generally applicable (Abraham et al. 2013). Other, more complex prediction models, and later evolutions of polygenic scoring, take into consideration the actual effect size of each SNP and the possible interactions among SNPs, in addition to their interaction with external factors such as environmental contributions, known biological mechanisms, clinical risk factors and age (in relation to morbidity and/or mortality) (Chatterjee et al. 2016). Data may also be included from additional family-based and prospective/retrospective cohort studies that may provide more information about contributions to a specific disease’s onset. The design of the predictive model depends on various factors such as the genetic architecture of a disease, currently available genotyping data for the phenotype and the sample size used to acquire this data. However, regardless of the model chosen, the validity and robustness of the model are important. This is often evaluated using training datasets for which the SNP associations are known to be present for the phenotype in question (e.g., GWAS) (Abraham and Inouye 2015).
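
A minimal sketch of a polygenic score is given below, assuming that each risk allele's estimated effect is its per-allele log odds ratio and that effects simply add up. The SNP identifiers, odds ratios and genotype are invented for illustration, and `polygenic_score` is a hypothetical helper rather than a function from any published tool.

```python
import math


def polygenic_score(risk_allele_counts: dict, effects: dict) -> float:
    """Sum of risk-allele counts (0, 1 or 2) weighted by per-allele effect sizes."""
    return sum(effects[snp] * risk_allele_counts.get(snp, 0) for snp in effects)


# Hypothetical per-allele odds ratios, converted to log odds ratios.
effects = {
    "snp1": math.log(1.20),
    "snp2": math.log(1.10),
    "snp3": math.log(1.35),
}

# Hypothetical individual: two risk alleles at snp1, none at snp2, one at snp3.
person = {"snp1": 2, "snp2": 0, "snp3": 1}

score = polygenic_score(person, effects)
print(f"score = {score:.3f}, implied odds relative to zero risk alleles = {math.exp(score):.2f}")
```

Because the score is a plain sum, it ignores interactions between SNPs and with environmental or clinical factors, which is one reason the more elaborate models mentioned above were developed.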

One way the power of a predictive model can be assessed using a training dataset is through receiver operating characteristic (ROC) curves. This classifier test evaluates how well a model can distinguish between two distinct states, diseased (phenotype) and non-diseased (no phenotype), using collected GWAS data; for example, the number of individuals with a specific SNP (or genotype) who exhibit the disease phenotype and the number of individuals with the genotype who do not express the phenotype. The performance of a given risk prediction model is given in terms of a true positive fraction (with disease; sensitivity) and a false positive fraction (without disease; 1 − specificity), which, for continuous traits, are functions of a predetermined cut-off or threshold value. The threshold value, based on the distribution of the data and decided by the investigators, determines which data points are classified as positive and which are not. The shape of the curve is determined by the overlap between the distributions of the two groups; the less the overlap, the more concave, and informative, the ROC curve (Fig. 4.2). The accuracy of the test is measured by calculating the area under the curve (AUC) of the ROC plot. The AUC provides an overall summary of the ROC curve and can be interpreted as the probability that a randomly chosen case subject is ranked as more likely to be diseased than a randomly chosen control subject. An AUC of 0.5 indicates no discrimination; the model, and investigated genotype, predicts disease vs non-disease phenotype with an accuracy equal to pure chance. Likewise, 0.5 < AUC ≤ 0.7 is considered less accurate, 0.7 < AUC ≤ 0.9 is considered moderately accurate, and 0.9 < AUC < 1 is considered highly accurate. An AUC of 1 indicates a perfect test, one that detects all true positives without any false positives (Jostins and Barrett 2011; Greiner et al. 2000).
However, in GWASs, each causative SNP is only one small fraction of the contributing factors that interact to induce a complex disease phenotype. Thus, analyzing data collected on only a small subset of SNPs using an ROC curve may produce low AUC values. Compiling information from all SNPs of genome-wide significance or all discovered SNPs from a GWAS may improve the AUC and provide a better predictor of a disease (Jakobsdottir et al. 2009).
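
The probabilistic reading of the AUC can be sketched directly: compare every case with every control and count how often the case receives the higher predicted risk. The function name `auc` and the risk scores below are hypothetical, and counting ties as one half is a common convention assumed here.

```python
def auc(case_scores, control_scores) -> float:
    """Fraction of case-control pairs in which the case has the higher score."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute one half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for four cases and four controls.
cases = [0.90, 0.80, 0.60, 0.55]
controls = [0.70, 0.50, 0.40, 0.30]

print(auc(cases, controls))      # partly overlapping distributions
print(auc(controls, controls))   # identical distributions: no discrimination
```

In this toy data 14 of the 16 case-control pairs are ranked correctly, giving an AUC of 0.875, while scoring a group against itself gives 0.5, the value expected from pure chance.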

Fig. 4.2
A graph illustrates the distribution overlap between true positive rate and false-positive rate. The graph plots three models. One model plots a rising line labeled poor model, the second model plots a less concave down increasing curve labeled good model, and the third model plots a more concave down increasing curve labeled the best model.

Receiver Operating Characteristic (ROC) curves can provide insight into the predictive power and strength of an SNP association with a defined phenotype. The less the distribution overlap between true positives and false positives, the more concave the ROC curve, with the Area Under the Curve (AUC) approaching 1, which is indicative of a more accurate test. In some cases, compiling GWAS data on multiple causative SNPs may improve the AUC and the predictive power of the model being tested using ROC curves.

Despite the common use of odds ratios and ROC curves in reporting associations and predicting individual risk, respectively, there are a multitude of biological and statistical contributions to these data outputs that can drastically alter their significance and concurrence. The clinical applicability of the AUC relies on the specific purpose of the test, that is, whether it requires high sensitivity or high specificity. For example, in general population screening for ovarian or pancreatic cancer, often only the portion of the ROC curve corresponding to high specificity (98–100%) is relevant (Feng 2010). This is because the vast majority of the general population will not have these rare diseases. In contrast, when evaluating a high-risk population, it is important to perform a test at high sensitivity. Because association studies do not take into consideration the clinical context of genotype-phenotype associations, SNPs with high odds ratios may still prove to be poor risk predictors. Likewise, two SNPs with equivalent odds ratios may perform quite differently in ROC curves. As a result, risk prediction models are now more often built on data collected from specifically tailored studies that deviate in design from typical GWASs based on the distinct clinical question they attempt to address (Feng 2010).
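
The need for high specificity when screening for a rare disease follows from simple counting. The sketch below uses a hypothetical population size, prevalence and sensitivity (none of which are estimates for any real cancer) and an assumed helper named `screening_counts` to compare the expected number of false positives at two specificities.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Expected (true positives, false positives) when a whole population is tested."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity
    false_positives = healthy * (1 - specificity)
    return true_positives, false_positives


# Hypothetical screening programme: 100,000 people, 1 in 2000 affected,
# and a test that detects 90% of affected individuals.
for specificity in (0.90, 0.99):
    tp, fp = screening_counts(100_000, 0.0005, 0.90, specificity)
    print(f"specificity {specificity:.2f}: "
          f"about {tp:.0f} true positives and {fp:.0f} false positives")
```

Under these assumptions roughly 45 affected individuals are detected in both scenarios, but the number of unaffected people flagged drops from about ten thousand to about one thousand when specificity rises from 90% to 99%, which is why only the high-specificity end of the ROC curve is useful for this purpose.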

3 GWAS Applications and Impact

3.1 Expanding the Scientific Landscape: Elucidating the Origins and Mechanisms of Disease Manifestation

One of the first GWAS discoveries was a polymorphism in the complement factor H gene that was identified as a risk factor in age-related macular degeneration and soon became a valuable therapeutic target (Troutbeck et al. 2012; Edwards et al. 2005). The first GWASs were published in 2005 (Klein et al. 2005) and 2006 (Dewan et al. 2006); however, the first large-scale, well-designed GWASs for complex diseases using established technology were conducted by the Wellcome Trust Case Control Consortium (WTCCC) in 2007. This study used 14,000 cases and 3000 controls to investigate the genetic associations of 7 common diseases: bipolar disorder, coronary artery disease, Crohn’s disease, hypertension, rheumatoid arthritis, type 1 diabetes, and type 2 diabetes (Wellcome Trust Case Control, C 2007). Twenty-four independent associations of high significance were discovered in this study: 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn’s disease, 3 in rheumatoid arthritis, 7 in type 1 diabetes and 3 in type 2 diabetes. Additional associations were also discovered at loci that are likely to harbor causative variants.

To date, GWASs have uncovered thousands of loci associated with numerous complex diseases such as diabetes (Type 1 and 2), inflammatory bowel disease, prostate cancer, breast cancer, multiple sclerosis (MS), ankylosing spondylitis (AS), rheumatoid arthritis (RA), Crohn’s disease and many more. As a result, GWASs have provided insight into the biological mechanisms of many of these diseases (McCarthy et al. 2008; McCarthy and Hirschhorn 2008a; Visscher et al. 2012). For example, MS was previously attributed to the Human Leukocyte Antigen (HLA) locus; however, GWASs have expanded the number of MS-associated loci to include approximately 40 non-HLA loci (Bashinskaya et al. 2015; International Multiple Sclerosis Genetics 2013). Nearly all of these loci are involved in autoimmune pathways, including genes coding for components of the cytokine pathway, costimulatory molecules and signal-transduction molecules. Similarly, the IL-23R pathway and IL-17-producing proinflammatory cells were implicated in AS with the identification of associations at IL23R and IL12B in GWASs. Additionally, polymorphisms in ERAP1, a gene involved in peptide processing before HLA class I presentation, were found to be associated with HLA-B27-positive AS. These findings provide valuable insight into the mechanisms through which HLA-B27 may induce AS, knowledge that has thus far eluded scientists (Tsui et al. 2014; International Genetics of Ankylosing Spondylitis, C, et al. 2013).

GWASs have also brought to light the concept of pleiotropy, in which one variant is implicated in multiple phenotypes (Solovieff et al. 2013). For example, Type 2 diabetes and myocardial infarction share the gene CDKN2A/B, while Crohn’s disease and Type 2 diabetes share CDKAL1 (Welter et al. 2014). Numerous publications summarize GWAS findings for various traits and diseases (Visscher et al. 2012; Manolio et al. 2008; Solovieff et al. 2013; Daly 2010); however, an inventory of all genome-wide significant SNP associations discovered through GWASs can be found in the NHGRI-EBI Catalog (Welter et al. 2014; Burdett et al.). In addition to a thorough inventory of significant SNP associations, the consortium frequently updates and publishes an interactive diagram detailing the outcomes of these GWASs (Fig. 4.3). These findings have provided valuable insight into biological pathways and molecular interactions that were previously unknown and, as a result, have greatly expanded the field of scientific research.

Fig. 4.3
An interactive map of 24 human genomes with chromosomal locations. The chromosomal locations in each genome are color-coded for various diseases, physical traits, drug responses, and biological processes.

All currently known SNP associations of genome-wide significance (p-value ≤ $5.0 \times 10^{-8}$) are cataloged by the NHGRI-EBI (Burdett et al.). This interactive map is updated regularly with the publication of new GWASs and is color coded for various diseases, physical traits, drug responses and biological processes.

3.2 Elevating Patient Care: Identifying Novel Therapeutic Targets, Improving Individual Risk Assessment and Harnessing Personalized Medicine

Although GWASs have uncovered over 2000 robust associations with more than 300 complex diseases and traits, their clinical translation has been met with skepticism (Manolio 2013). This uncertainty is discussed later in this chapter; however, it should be noted that GWASs have made several significant contributions to medicine. Firstly, GWASs have been able to identify biological pathways and individual components of disease manifestation that are viable therapeutic targets. Several loci identified for MS are thought to be good candidates for therapeutics development and are being explored. Furthermore, anti-IL-17 therapeutics are being tested for AS. The identification of IL23R in association with inflammatory bowel disease has led to the development of IL23 inhibitors (Duerr et al. 2006; Sarin et al. 2011).

Secondly, clinicians hope to use population-level GWASs to evaluate individual risk. Due to many limitations discussed later in this chapter, individual risk assessment based on GWASs can be difficult; however, in the unique case of Type 1 diabetes mellitus (T1D) it has been quite successful (Jostins and Barrett 2011; Polychronakos and Li 2011; Clayton 2009). Previously, T1D had been primarily attributed to the major histocompatibility complex (MHC) locus. With the advent of GWASs, 50 additional loci were discovered. Together, these variants were found to account for approximately two-thirds to three-fourths of the familial risk of T1D. Risk prediction models using all currently known loci achieve an AUC of approximately 0.9 (the highest of any complex disease), allowing for more accurate individual risk assessment. GWASs have also allowed for a better understanding of certain diseases such as cardiovascular disease and maturity-onset diabetes of the young (MODY), which has allowed clinicians to reclassify patients into more relevant risk groups, thus enabling more efficient treatment and prevention (Manolio 2013).

Finally, GWASs have enabled a greater understanding of pharmacogenomics and, with it, the concept of personalized medicine (Daly 2010). GWASs are used to identify loci that can impact drug response and/or susceptibility to adverse reactions. These causative variants are thought to be at higher frequencies (thus easier to find in GWASs) since selection pressure(s) may not have influenced these alleles due to the recent introduction of therapeutics into the human environment. These studies tend to focus on, but are not limited to, genes governing metabolic enzymes, drug-targeted receptors, immune response and mitochondrial functions. In a drug specific manner, these studies have identified several contributing loci that have allowed clinicians to tailor treatments based on a patient’s genetic background. Examples include coumarin anticoagulant drugs (drug response) warfarin and acenocoumarol, anti-viral ribavirin (drug response/adverse reactions), and lipid-lowering simvastatin (adverse reactions) (Daly 2009; Cooper et al. 2008). Though not all GWASs investigating drug response and adverse reactions are clinically translatable, personalized medicine has become a fast growing field.

3.3 Commercial: Direct-to-Consumer Personal Genotyping

A final contribution from the emergence of GWASs is the commercialization of genetic testing. Due to the widespread availability of low-cost technology, businesses are marketing ancestry and health-related genetic testing directly to the consumer. Many of these tests include screening for causative alleles associated with numerous physical traits, drug responses, genetic risk factors and inherited conditions such as Alzheimer disease (APOE), celiac disease, cystic fibrosis and various cancers (e.g., BRCA1/2), among others. It should be mentioned, however, that the results of these tests (and the recommendations provided by the companies) must be interpreted with great caution. At present, substantial controversy surrounds the sale of health-related genetic testing due to, in many instances, the lack of FDA approval of such tests and the absence of accompanying clinical guidance in the interpretation of their results (Jakobsdottir et al. 2009). In 2010, the FDA informed several genetic testing companies that their tests may be considered medical devices and may need approval or clearance prior to marketing (FDA. Letters to Manufacturers Concerning Genetic Tests). While some argued that consumers were not negatively impacted by genetic test results reporting disease risk factors, others cited the many limitations of genetic testing and the scientifically unsupported claims about the tests that the consumer may not be aware of or fully comprehend (Green and Farahany 2014; Baudhuin 2014; Janssens 2015; Institute, N.H.G.R 2016). To address concerns about the potential consequences of consumers receiving inaccurate health assessments, companies could provide evidence of analytical and clinical validity to support their marketing claims. Some chose to remove health/risk assessment reports from their services.

4 GWAS Limitations and Controversy

4.1 Missing Heritability

Despite the success of GWASs, many are skeptical of their translatability into the clinical setting due to their numerous limitations. One of the most often cited limitations is that these studies are underpowered to discover all causative elements, genetic and environmental, that contribute to a disease. When they began, the primary model dominating GWASs was the common disease-common variant (CDCV) hypothesis, which states that complex diseases are the result of a number of common variants of small effect, each of which contributes a fraction of the risk in a given population (Schork et al. 2009; Gibson 2011). These variants typically occur at moderate frequency and were assumed to explain a large proportion of disease occurrence. However, the multitude of data gathered from GWASs seems to explain only about 10–20% of heritability, thus bringing into question the validity of these association-based studies. This is reflected in the low penetrance and low odds ratios recovered from GWASs, which many state can be explained more readily by population stratification than by true association. As a result, several hypotheses have been formulated to explain this “missing heritability”, including rare alleles, structural variants and imperfect tagging (a weak signal because the causal variant lies some distance away from the tag SNP) (Manolio 2013). Due to missing heritability, the statistical associations measured through data analysis will be inaccurate, to varying degrees, in explaining biological associations of complex diseases, thus complicating the clinical translation of GWAS discoveries.

Part of the debate stems from the frequency spectrum of disease-causing variants. Three models have been proposed: a large number of small-effect common variants (infinitesimal model), a large number of large-effect rare variants (rare allele model) and the combination of genetic, environmental and epigenetic interactions (broad sense heritability model) (Gibson 2011). In reality, it is likely that each of these genetic architectures, and other variables, interact and contribute to different diseases to variable degrees. However, GWASs are underpowered and not designed to detect any of these models with consistent accuracy. Because GWASs rely on the strength of the association between selected common SNPs and un-genotyped causal variants, it is not surprising that these studies do not recover very many rare alleles, which inherently occur at low frequency and are in low LD with nearby common variants. Typically, the larger the deleterious impact of the variant, the lower its frequency due to various selection pressures. As a result, common SNPs identified in association with a phenotype in GWASs are unlikely to explain all genetic variance (Fig. 4.4) (Visscher et al. 2012). Furthermore, the environmental contribution to genetic variation is difficult to decipher. Using large populations of diverse ancestral backgrounds may help to elucidate these factors. Studies attempting to incorporate gene-environment interaction are in progress in light of these questions (Thomas 2010).
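The point about rare alleles being in low LD with common variants can be illustrated with a short calculation. The sketch below is only an illustration, not a method taken from the studies cited: it uses the standard definition of the LD coefficient r² for two biallelic sites and asks how large r² can possibly be given the two allele frequencies. The function name and the example frequencies (0.01 and 0.30) are assumptions chosen for the example.

```python
def max_r_squared(p_a, p_b):
    """Largest r^2 achievable between two biallelic sites.

    p_a and p_b are the frequencies of one allele at each site.
    r^2 = D^2 / (p_a * (1 - p_a) * p_b * (1 - p_b)), where D is bounded
    by the allele frequencies themselves.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # Largest positive D: the two alleles always travel together when possible
    d_positive = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    # Largest negative D (in absolute value): the two alleles avoid each other
    d_negative = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d = max(d_positive, d_negative)
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


if __name__ == "__main__":
    # Hypothetical rare causal allele (1%) tagged by a common SNP allele (30%)
    print(round(max_r_squared(0.01, 0.30), 4))
    # Two sites with matched frequencies can be in perfect LD
    print(round(max_r_squared(0.30, 0.30), 4))
```

Under these assumed frequencies the best possible r² is roughly 0.02, so even a perfectly placed common tag SNP would carry only a small share of the rare variant's signal.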

Fig. 4.4
A bubble graph and a scatter plot labeled a and b illustrate effect size versus allele frequency and predictive capacity versus largest GWAS sample size, respectively. In graph a, few alleles and variants are plotted as bubbles along a falling line's top, center and bottom parts. In graph b, different categories of diseases are plotted.

Much of the controversy surrounding GWASs stems from whether common SNPs are well suited to identify causative variants that are predictive of individual risk. Because these variants exhibit small to modest effects, prediction models based on GWAS discoveries can be insufficient and inaccurate without a thorough understanding of all contributing factors. As a greater understanding of GWASs emerges, so do more comprehensive, alternative methods to collect and analyze genotype data that will produce more accurate prediction models (Figure adapted by permission from Macmillan Publishers Ltd from Manolio et al. 2009).

Another hypothesis that may explain part of the missing heritability is the contribution of structural variants such as insertions, deletions, inversions and copy number variants (CNV). In fact, CNVs have been implicated in susceptibility of both infectious and complex diseases such as HIV, autism, schizophrenia, several autoimmune diseases, and asthma among others (McCarroll et al. 2014; Lee et al. 2012; McCarroll and Altshuler 2007; Ionita-Laza et al. 2009). These structural variants are segments of the genome, approximately 1 kb and longer in size that encompass one or more genes and exist in different quantities, or copy numbers in the population (Henrichsen et al. 2009). It is thought that CNVs may influence disease phenotype by altering gene expression. In fact, a study conducted by the Wellcome Trust using 14,925 HapMap transcripts with SNPs and CNVs discovered that CNVs captured a surprising 17.7% of total genetic variation in gene expression, while SNPs captured 83.6% (Stranger et al. 2007). There are several proposed mechanisms through which it is believed CNVs regulate gene expression such as modifying gene structures or gene dosage sensitivity. Dosage sensitivity is considered an essential evolutionary mechanism that regulates gene dispensability and duplicability. Although the underlying causes of this sensitivity are poorly understood, several explanations have been put forth. For example, the balance hypothesis states that the proportional expression of interacting proteins (or subunits) must be maintained to allow for proper macromolecule functionality (Schuster-Bockler et al. 2010). Modification of these expression levels due to the presence (or deletion) of CNVs may induce disease phenotypes as a result of absent or dysfunctional protein complexes.

Approaches to incorporate CNVs into the genetic study of disease have been developed; however, several barriers hinder these types of investigations because CNVs are not as thoroughly explored as SNPs. In addition to the difficulty of mapping and quantifying CNVs, because these structural variants can encompass multiple genes, it is often difficult to pinpoint whether one gene or the interaction of several is responsible for a disease phenotype. Furthermore, because heterozygosity is prevalent in natural populations, the expression of genes encompassed in CNVs could also be largely masked in heterozygous individuals. Despite these limitations, several CNVs have been mapped, which has allowed for the incorporation of these variants into GWASs. One such study investigating 8 common diseases was conducted by the Wellcome Trust Sanger Institute using 16,000 cases and 3,000 shared controls (Wellcome Trust Case Control, C, et al. 2010). This study brought to light many difficulties in genotyping CNVs; however, it was also able to confirm 3 loci where CNVs were associated with disease: IRGM and HLA for Crohn’s disease, HLA for rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetes. However, it was found that these loci had previously been discovered through SNP-based studies. It may be possible to use SNPs to tag specific CNVs of interest by genotyping the CNV first using HapMap or a similar tool. However, determining whether this is generally applicable to all CNVs will require extensive genotyping and mapping of these structural variants. In an effort to better understand CNVs and how best to incorporate them into GWASs, organizations such as the Wellcome Trust Sanger Institute have dedicated projects to address these questions (Institute, W.T.S).

4.2 Other Limiting Factors

Translating GWAS findings to the clinical setting can be complicated even when true causative loci are discovered. One reason for this is that pinpointing the exact risk variant(s) can be very difficult. For example, a 206 kb locus containing 19 genes was identified as associated with childhood asthma. Extensive study has implicated two tightly linked genes (ORMDL3 and GSDMB) as potentially causative; however, deciphering an exact therapeutic target has been difficult (Manolio 2013). This process is further complicated by the implication of individual or interacting genes with currently unknown function. Additionally, GWASs have revealed that a great number of loci lie in noncoding or regulatory regions, whose clinical translation is hindered for many reasons. Many regulatory elements and other noncoding regions are unannotated, and many of the mechanisms through which these components function may still be unknown to the scientific community (Ward and Kellis 2012). Though studies investigating noncoding regions are underway, it may be some time before a more accurate understanding of their contributions to complex diseases is uncovered.

Unlike highly penetrant Mendelian disorders like cystic fibrosis, risk predictions made for complex traits can only be probabilistic assumptions, thereby, mitigating their impact on clinical response. Indeed, because the effects of most SNPs are subtle, estimating their impact with accuracy is problematic; it is difficult to predict the probability of risk based on the presence of SNPs of unknown influence. Considering all discovered causative variants and their genotype scores when assessing individual risk has been shown to improve prediction accuracy. However, due to the vast amount of unknown contributing factors that introduce discrepancy between biological and statistical associations, it is very likely that any risk prediction of complex traits will be flawed to some degree. It should be remembered that data collected from GWASs are limited to the specific population used and variables explored in the study. As a result, when translating findings to a different population or individual, changes in allele frequencies, phenotypic effect sizes, disease incidences, etc. may drastically impact risk prediction (Manolio 2013).
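The idea of combining several associated variants into a single genotype score can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a validated risk model: it assumes that variants act additively and independently on the log-odds scale, and the odds ratios and genotypes are invented for the example. The function names `genotype_score` and `relative_odds` are hypothetical.

```python
import math


def genotype_score(risk_allele_counts, odds_ratios):
    """Weighted genotype score: sum of risk-allele counts times log odds ratios.

    risk_allele_counts: 0, 1 or 2 copies of the risk allele at each SNP.
    odds_ratios: per-allele odds ratio reported for each SNP.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per genotyped SNP")
    return sum(count * math.log(odds) for count, odds in zip(risk_allele_counts, odds_ratios))


def relative_odds(score, reference_score=0.0):
    """Odds of disease relative to a reference genotype, under the additive assumption."""
    return math.exp(score - reference_score)


if __name__ == "__main__":
    # Invented per-allele odds ratios for three SNPs and one person's genotypes
    example_odds_ratios = [1.10, 1.25, 0.90]
    example_genotypes = [2, 1, 0]
    score = genotype_score(example_genotypes, example_odds_ratios)
    print(round(relative_odds(score), 4))
```

Because each weight is estimated in a specific study population, the same score can be miscalibrated when it is applied to a population with different allele frequencies or effect sizes, which is the limitation described in the paragraph above.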

As a greater understanding of these limitations is gained, techniques to improve GWASs are being explored. Many of these challenges may be addressed with deeper sequencing-based characterization of genetic variants, imputation, fine mapping, systems genetics, clinic-based study designs and denser SNP arrays in addition to CNV-SNP hybrid arrays (Manolio 2013; Civelek and Lusis 2014). Much of the controversy surrounding the clinical translatability of GWAS findings has been exacerbated by the expectation of fast-paced discovery and optimistic scenarios for clinical use. However, GWASs are still a relatively young method, and developing and validating techniques to enhance their performance is a slow and continuous process. As a result, the data generated by these association studies must be carefully weighed before clinically relevant conclusions can be drawn. Despite this, GWASs have uncovered hundreds of thousands of disease-associated loci that were previously unknown. Additionally, these association studies have brought to light genetic mechanisms, such as pleiotropy (one variant can affect multiple phenotypes), that were previously unknown or unclear. These discoveries have led to great strides in research and therapeutics for several diseases such as multiple sclerosis, ankylosing spondylitis, rheumatoid arthritis, inflammatory bowel disease, and type 1 diabetes. One must keep in mind that human genetics is a vast and exceedingly intricate field, dominated by undiscovered phenomena. GWASs have made undeniably significant contributions to the scientific and clinical fields and have enabled us to place a foot in the door towards understanding the genetics of disease.

5 Discovering True SNP-Associations: Factors to Consider

Numerous factors can bias GWAS results. As a result, great care has to be taken at every stage, from study design to data analysis and validation. Each stage should be carefully evaluated for all possible variables, and strategies should be used to discover, reduce, eliminate and correct for biases that may produce spurious SNP associations.

5.1 Population Sampling

Typically, GWASs have been conducted using a case-control design. These studies primarily compare a “case” group of individuals selected for a specific phenotype (presumed to have a high prevalence for susceptibility alleles for that trait) and a “control” group who does not possess the phenotype (thought to have a lower prevalence of causative alleles) (McCarthy et al. 2008). Careful consideration is taken when selecting individuals for the above mentioned groups to improve the power and credibility of the study and eliminate bias. Various factors can impact GWAS results based on the selection criteria for case and control groups (Bush and Moore 2012; McCarthy et al. 2008).

One of the most important factors is sample size: in general, the larger the sample size, the more comprehensive the study. Another factor is selection bias, which arises when the individuals acquired for the study are not representative of the larger population from which they are drawn. This is important to both case and control groups and is inclusive of misclassification bias and population substructure. Misclassification bias results from failing to assign an individual to the correct case or control group. This may occur when a stringent definition of the phenotype is not used for subject selection. Often the consequences of misclassification are small when using large sample sizes; however, strictly defined phenotypes are required when studying very common traits such as obesity. Population substructure includes both population stratification and cryptic relatedness. Population stratification is the disproportionate representation of different ancestral and demographic backgrounds between case and control groups. As discussed earlier, SNP allele frequencies and LD may vary greatly between different human populations (e.g., European vs. African descent). Thus, in order to avoid ethnicity-specific genetic influences on GWAS associations, populations included in case and control groups should be broadly similar. Finally, cryptic relatedness refers to kinship between case and control subjects that is unknown to the investigators at the time of selection and segregation into sampling groups. This can produce spurious results by introducing genetic bias. Fortunately, individual samples may be eliminated from analysis upon discovery of these population substructures. Several statistical tools exist to detect and adjust for these biases (Bush and Moore 2012; McCarthy et al. 2008).
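A small numerical sketch can show how population stratification alone creates a spurious association. All counts below are invented for illustration: two hypothetical populations differ in both risk-allele frequency and in how many cases they contribute, yet within each population the allele is unrelated to disease. The helper names are assumptions, and the allelic odds ratio is used here only because it is simple to compute by hand.

```python
def allelic_odds_ratio(table):
    """Odds ratio for carrying the allele of interest, cases versus controls.

    table maps "cases" and "controls" to (allele_count, other_allele_count).
    """
    case_allele, case_other = table["cases"]
    ctrl_allele, ctrl_other = table["controls"]
    return (case_allele * ctrl_other) / (case_other * ctrl_allele)


def pool(tables):
    """Add allele counts across populations, ignoring ancestry."""
    pooled = {}
    for group in ("cases", "controls"):
        allele = sum(t[group][0] for t in tables)
        other = sum(t[group][1] for t in tables)
        pooled[group] = (allele, other)
    return pooled


# Hypothetical population 1: allele frequency 0.2, contributes mostly cases
POP_1 = {"cases": (160, 640), "controls": (40, 160)}
# Hypothetical population 2: allele frequency 0.6, contributes mostly controls
POP_2 = {"cases": (120, 80), "controls": (480, 320)}

if __name__ == "__main__":
    print("population 1:", allelic_odds_ratio(POP_1))
    print("population 2:", allelic_odds_ratio(POP_2))
    print("pooled:", round(allelic_odds_ratio(pool([POP_1, POP_2])), 3))
```

Within each population the odds ratio is exactly 1, but the pooled odds ratio is far from 1 because ancestry is confounded with case status. This is why cases and controls should be broadly similar in ancestry, or ancestry should be adjusted for in the analysis.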

Originally restricted to single-population cohorts, GWASs have more recently utilized trans-ethnic populations, which have shown increased power due to increased sample size, enhanced mapping of causal variants due to the inclusion of populations with shorter LD, and higher discovery rates for rare variants that are persistent even in the presence of differing LD patterns. The use of these diverse groups also minimizes the effects of biases such as population substructure seen in single-population studies (Li and Keating 2014). Though these studies introduce variable, sometimes substantially variable, genetic histories that can complicate both study design and data analysis, they may provide a more comprehensive understanding of the genetics of disease (Rosenberg et al. 2010).

5.2 Technology

Investigators may choose their genotyping platform based on their specific study design and financial constraints. Today, there are several companies, such as Affymetrix and Illumina, that offer both custom and universal microarray platforms for SNP genotyping (Perkel 2008). These microarray platforms, which contain DNA sequences that recognize specific alleles, allow for each genome to be scanned for thousands of variants at once. Because LD and associated SNPs differ between populations of different ancestral and demographic backgrounds, it is important to select or create an accurate and comprehensive microarray composed of genetic markers that are specific to the target population(s) (McCarthy et al. 2008; Manolio et al. 2008). Many companies offer population-specific arrays and/or the option to customize arrays according to a specific study design. Additionally, researchers may include CNVs in their analysis by using SNP-CNV hybrid arrays (Perkel 2008).

5.3 Data Quality

Several steps are taken following data collection to ensure that only high quality data are analyzed. Significant differences in data extraction between cases and controls must be avoided, and poorly performing assays should be assessed using genotyping performance metrics. Furthermore, the accurate identification of true SNPs and the proper conversion from raw data to meaningful genotypes is essential (Nielsen et al. 2011; Nielsen et al. 2012). The next quality control step is evaluating the distribution of SNPs considering the relationship between genotype and allele frequencies. Often, SNPs that show drastic departures from Hardy-Weinberg Equilibrium (HWE) in control groups are discarded. The HWE states that, in the absence of evolutionary forces such as mutation, genetic drift and selection, genotype and allele frequencies will remain in equilibrium, constant from generation to generation. For a biallelic SNP with allele frequencies p and q, the expected genotype frequencies are p², 2pq and q². Because these evolutionary forces do have a practical impact on the human population, the HWE describes an ideal, hypothetical scenario. Therefore, caution is exercised when using this metric as a quality control measure to ensure that true genotyping errors are eliminated while true associations displaying disequilibrium are retained (McCarthy et al. 2008).
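The HWE check described above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on the three genotype counts of a SNP in controls. This is a minimal sketch rather than the procedure of any particular GWAS pipeline: exact tests are often preferred when counts are small, the function name is an assumption, and the genotype counts in the example are invented.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        return 0.0, 1.0  # monomorphic site: nothing to test
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df, via the normal tail
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Invented control-group counts: the first fits HWE, the second lacks heterozygotes
    for counts in [(360, 480, 160), (400, 400, 200)]:
        chi2, p_value = hwe_chi_square(*counts)
        print(counts, round(chi2, 3), p_value)
```

In practice the p-value cutoff used to discard a SNP is a study-specific choice, and, as noted above, it is applied with caution so that real signals are not removed along with genotyping errors.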

5.4 Data Analysis

GWAS data analysis is another step during which biases may be introduced that can produce spurious associations between SNPs and disease phenotypes. Data can be analyzed in different ways; one way is to perform single-locus statistical tests, which examine each high quality SNP for association with the phenotype. For case-control study designs, it is typical to utilize contingency tables (Cochran-Armitage test) or logistic regression models (Bush and Moore 2012). These tests are used in categorical data analysis to assess the presence of association between variables, and logistic regression additionally allows for adjustment for covariates such as population substructure. A genotype model (typically multiplicative and/or additive) must also be chosen, followed by the correction of GWAS data for known confounding or interacting variables such as sex, age and clinical covariates. Correcting for factors that may influence the trait being studied will reduce spurious associations due to sampling biases and other artifacts of study design (Bush and Moore 2012). Additionally, each single-locus analysis generates a p-value that defines the significance, or deviation from the null hypothesis (there is no association between genotype and phenotype), of each SNP association. However, because GWASs generate a multitude of data and carry a high likelihood of false positives, these p-values are corrected for multiple testing, taking into consideration both the power of the study and linkage disequilibrium. Thus, Bayesian approaches are often employed to estimate a false positive report probability (FPRP) (McCarthy et al. 2008). FPRP takes into consideration the observed p-value, the prior probability that the association between the SNP and phenotype is real and the statistical power of the test to evaluate whether there is a true deviation from the null hypothesis. Other methods also exist to correct for multiple testing, such as permutation testing and genome-wide significance thresholds (Bush and Moore 2012).
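The single-locus workflow above can be sketched with three small functions: a Cochran-Armitage trend test under an additive model, a Bonferroni-style genome-wide threshold, and the FPRP in the form in which it is commonly written, α(1 − π) / (α(1 − π) + (1 − β)π), where α is the significance level, 1 − β the power and π the prior probability of a true association. This is a sketch under stated assumptions: the additive weights (0, 1, 2), the function names and every number in the example are illustrative, and covariate adjustment (which would call for logistic regression) is not shown.

```python
import math


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts (aa, Aa, AA).

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    trend = n * sum_wr - n_cases * sum_wn
    spread = n * sum_w2n - sum_wn ** 2
    if spread == 0:
        raise ValueError("no genotype variation in the sample")
    chi2 = n * trend ** 2 / (n_cases * n_controls * spread)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def fprp(alpha, power, prior):
    """False positive report probability for a 'significant' finding."""
    false_positive = alpha * (1 - prior)
    true_positive = power * prior
    return false_positive / (false_positive + true_positive)


if __name__ == "__main__":
    # Invented genotype counts (aa, Aa, AA) for one SNP
    chi2, p_value = armitage_trend_test(cases=(100, 200, 200), controls=(200, 200, 100))
    threshold = bonferroni_threshold(0.05, 1_000_000)  # assumes one million tests
    print(round(chi2, 2), p_value, p_value < threshold)
    # Assumed prior of 1 in 1000 and 80% power at alpha = 1e-4
    print(round(fprp(alpha=1e-4, power=0.8, prior=0.001), 3))
```

The FPRP example shows why a small p-value alone is not enough: with a low prior probability of true association, a noticeable share of nominally significant findings is still expected to be false.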

Evaluating interacting effects between multiple loci can be a daunting task when one considers the sheer volume of data generated from single-locus analysis. However, it is more often likely that many genes, each of small effect, contribute to the phenotypic outcome of a disease. As a result, multi-locus analysis is essential. There are several approaches to undertake this task in a reasonable manner that reduces time and computational power. One way is to select a subset of genetic variants from the single-locus analysis that exceed a set, arbitrary significance threshold. Alternatively, the subset of variants can be selected based on functional biological pathways that suggest the likelihood of specific genes interacting to contribute to disease. More recently, scientists have promoted the simultaneous analysis of all SNPs in order to maximize the discovery of associations in GWASs. This is because numerous genes of unknown function, small effect and low significance can be correlated; it may not be sufficient to derive gene-gene interactions from a small subset of SNPs (Hoggart et al. 2008). Analysis methods that consider the distribution of all SNP effects in the genome, using Bayesian methods for example, exist and are used in GWASs of animals and plants. Many believe it would be more prudent to utilize these techniques to evaluate all genetic and phenotypic information simultaneously in humans. Whole genome approaches focused on estimating genetic variation considering all SNPs simultaneously find that a larger proportion (one third to one half) of additive variation can be discovered in GWASs (Wray et al. 2013).

5.5 Data Validation

There are various criteria for validating and replicating GWAS findings. First, when conducting a replicate GWAS, one must consider using a larger sample size than the original sample population. This is to account for the “winner’s curse” which implies that the originally detected effect is likely stronger in the original GWAS sample than the general population (McCarthy et al. 2008). That being said, the replication sample should be broadly similar to the original population to account for population substructure. If the effect is successfully replicated, studies should be conducted using populations from different ancestries to evaluate both replication and the influence of ethnicity on the genetic variants. Replication studies should be conducted using the same phenotype criteria and data collection and analysis protocols as the original study. Alternatively, researchers have employed multi-stage study designs to reduce the cost of full-scale replication while retaining power. In these studies, promising signals from the first stage are used to identify a subset of SNPs to retype in the next stage (Ioannidis et al. 2009).
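The “winner’s curse” can be made concrete with a small simulation. The sketch below is illustrative only: it assumes that every discovery study estimates the same true effect with normally distributed error, and the true effect, standard error, significance cutoff and random seed are all arbitrary choices made for the example.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.10, se=0.05, z_threshold=2.5,
                           n_studies=5000, seed=1):
    """Compare the mean effect estimate in all studies with the mean among 'winners'.

    A study is a winner when its estimate divided by its standard error
    exceeds z_threshold, i.e. when it would have been reported as significant.
    Returns (mean_of_all_estimates, mean_of_winning_estimates, number_of_winners).
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    winners = [b for b in estimates if b / se > z_threshold]
    if not winners:
        raise RuntimeError("no simulated study passed the threshold")
    return statistics.mean(estimates), statistics.mean(winners), len(winners)


if __name__ == "__main__":
    mean_all, mean_winners, n_winners = simulate_winners_curse()
    print("mean estimate, all studies:", round(mean_all, 3))
    print("mean estimate, significant studies only:", round(mean_winners, 3))
    print("studies passing the threshold:", n_winners)
```

Averaged over all simulated studies the estimate is close to the assumed true effect, but the studies that pass the threshold overestimate it by construction. A replication study powered for the inflated estimate would therefore be too small, which is the reason for using a larger replication sample.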

Using the criteria mentioned above, an association may be confirmed as true if a similar effect (magnitude and direction) is seen in the same genomic region (the same SNP or a SNP in high LD) as the originally identified genetic variant. Several factors can explain an unreproducible effect. The most straightforward is that the initially observed results were wrong. An alternative explanation is that there is an unknown source of heterogeneity between the original and replicate studies. The causes of heterogeneity are numerous, including variable patterns of LD and differences in the impact of environmental factors between the original and replicate sample populations. Some contributions to heterogeneity, such as LD patterns, can be tested for, while others, such as environmental factors, are much more difficult to decipher.

5.6 Data Meta-Analysis

The discovery of causative loci of both modest and large effect size will be invaluable to the understanding of disease manifestation. However, as individual GWASs are underpowered, only a fraction of disease-associated loci, those that exhibit the largest effect, are discovered. As a result, researchers have investigated the use of meta-analysis, which combines many GWASs to improve power at low cost. Furthermore, meta-analysis can also serve as another step of replication and validation. For example, several novel susceptibility alleles were discovered and previously known variants of modest effect size were confirmed when three type 2 diabetes GWASs were combined (4700 cases and 5700 controls) (McCarthy et al. 2008). There are several factors to consider when combining GWASs: first, each individual study should essentially be exploring the same hypothesis; second, the general design and analysis protocols should be similar, and quality control measures should be standardized along with any covariate adjustments. Additionally, all study results should be reported relative to a common genomic build and reference allele. Because not all of these criteria can be met fully, statistical quantification of heterogeneity is necessary to evaluate the degree to which the studies vary. Additionally, when analyzing the same allele across multiple sets of studies, differences between genotyping platforms, which cover different markers, may prove to be an obstacle. In this case, GWAS datasets may be imputed to generate results for common sets of SNPs across studies using known LD patterns and haplotype frequencies from references such as HapMap or the 1000 Genomes Project. In this way, genotypes for SNPs not directly genotyped in the study can be computationally estimated (Bush and Moore 2012). However, imputation can introduce biases and, therefore, selection of a reference panel that reflects the same general population as the study is important (McCarthy et al. 2008; Ioannidis et al. 2009).
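Combining studies and quantifying their heterogeneity can be sketched with an inverse-variance fixed-effect meta-analysis together with Cochran's Q and the I² statistic. The text does not name a specific method, so this choice is an assumption, as are the function name and the per-study effect sizes (log odds ratios) and standard errors in the example. The sketch also assumes that all studies report the effect for the same reference allele.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis of one SNP across studies.

    Returns (pooled_beta, pooled_se, cochran_q, i_squared).
    """
    if len(betas) != len(standard_errors) or len(betas) < 2:
        raise ValueError("need matching estimates from at least two studies")
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled effect
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of the variation attributed to heterogeneity rather than chance
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled_beta, pooled_se, q, i_squared


if __name__ == "__main__":
    # Invented log odds ratios and standard errors from three studies
    pooled_beta, pooled_se, q, i_squared = fixed_effect_meta(
        betas=[0.10, 0.20, 0.15], standard_errors=[0.05, 0.10, 0.05]
    )
    print(round(pooled_beta, 4), round(pooled_se, 4), round(q, 3), round(i_squared, 3))
```

The pooled standard error is smaller than that of any single study, which is the gain in power described above. A large Q or I² would suggest that the studies are not estimating the same effect, in which case a fixed-effect summary may not be appropriate.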

5.7 Follow-Up Analysis of Confirmed Signals

Following GWASs, validation, replication and, when possible, meta-analysis, recovered true effects can be investigated and validated further. These genomic regions should be re-sequenced and subjected to fine mapping in an effort to identify the specific causal variants. These genetic elements can then be evaluated for biological functionality in order to understand the mechanisms through which they induce disease (Ioannidis et al. 2009). These findings can then be translated to the clinical setting through various avenues such as drug development against causal variants, individual risk assessment and personalized medicine.

6 Concluding Remarks

GWASs have enabled researchers to explore the genetics of complex diseases and gain a greater understanding of how the genome works and how it interacts with the environment during expression of genetic information. These SNP-based association studies have made a significant impact on both research and clinical fields and have introduced individual risk assessment and personalized medicine as well as genetics-based drug development as routes through which to improve public health. Additionally, GWASs have also enabled us to explore the history of the human race and what defines many of our characteristics. Though much controversy surrounds the validity of using common SNPs to study complex diseases, researchers continue to explore, fine-tune and expand on this method as evident in the organization of a multitude of consortia dedicated to various aspects of GWASs, the growing number of cohort studies and establishment of various biobanks (Chatterjee et al. 2016). With continued effort, these studies are sure to yield a more conclusive, clinically relevant understanding of human genetics.

7 Notes

  1. Genome Wide Association Studies (GWASs) are a set of approaches to identify genetic variants, in different individuals and/or populations, associated with a particular disease or trait.

  2. GWASs utilize microarray/SNP genotyping technology to identify associations between a specific phenotype (i.e., a disease state or a trait) and genetic variants across the entire genome.

  3. GWASs are based on the co-segregation or linkage disequilibrium of un-genotyped disease-causing variants with genotyped marker SNPs. Disease-causing genes can be mapped by linkage analysis without prior knowledge of the function and product of the gene. A positive association arises when a genetic variant is present at a greater frequency in individuals with a disease than in control/unaffected individuals.

  4. There are three different types of associations that can arise from GWASs: direct, indirect and spurious. Great care must be taken when designing and conducting a GWAS and subsequently translating its genotyping data into meaningful associations so as to avoid conclusions based on spurious results. A correct determination of disease phenotype and the choice of a GWAS genotype model play an important role in this analysis. Appropriate statistical methods reduce the risk of false positive results. One of the most important factors affecting the outcomes of a GWAS is sample size. A large study population is required.

  5. GWASs detect association, not causation; thus, further investigation and validation of the results and underlying molecular mechanism(s) is required.

  6. GWASs have enabled a greater understanding of the origin of many diseases, the identification of novel therapeutic targets and the development of novel drugs. They have also improved individual risk assessment and personalized treatment approaches.

  7. Despite the obvious success, translating GWAS findings to the clinical setting can be complicated even when true causative loci are discovered.