Human Genetics, Volume 130, Issue 1, pp 41–58

Type 2 diabetes and obesity: genomics and the clinic


  • Mary E. Travers
    • Oxford Centre for Diabetes, Endocrinology and Metabolism, Churchill Hospital, University of Oxford
    • Oxford Centre for Diabetes, Endocrinology and Metabolism, Churchill Hospital, University of Oxford
    • Wellcome Trust Centre for Human Genetics, University of Oxford
    • Oxford NIHR Biomedical Research Centre, Churchill Hospital
Review Paper

DOI: 10.1007/s00439-011-1023-8

Cite this article as:
Travers, M.E. & McCarthy, M.I. Hum Genet (2011) 130: 41. doi:10.1007/s00439-011-1023-8


Type 2 diabetes (T2D) and obesity represent major challenges for global public health. They are at the forefront of international efforts to identify the genetic variation contributing to complex disease susceptibility, and recent years have seen considerable success in identifying common risk-variants. Given the clinical impact of molecular diagnostics in rarer monogenic forms of these diseases, expectations have been high that genetic discoveries will transform the prospects for risk stratification, development of novel therapeutics and personalised medicine. However, so far, clinical translation has been limited. Difficulties in defining the alleles and transcripts mediating association effects have frustrated efforts to gain early biological insights, whilst the fact that variants identified account for only a modest proportion of observed familiality has limited their value in guiding treatment of individual patients. Ongoing efforts to track causal variants through fine-mapping and to illuminate the biological mechanisms through which they act, as well as sequence-based discovery of lower-frequency alleles (of potentially larger effect), should provide welcome acceleration in the capacity for clinical translation. This review will summarise recent advances in identifying risk alleles for T2D and obesity, and existing contributions to understanding disease pathology. It will consider the progress made in translating genetic knowledge into clinical utility, the challenges remaining, and the realistic potential for further progress.

Introduction: genetic diseases of global impact

The rising prevalences of obesity and type 2 diabetes (T2D) indicate a crisis in global health (IDF Diabetes Atlas 2010). Worldwide, there are more than 400 million adults with a body mass index (BMI) exceeding 30 kg/m² (defining “obesity”) and 220 million with T2D (Fig. 1), figures which are projected to rise to 700 million and 366 million, respectively, by 2030 (World Health Organisation 2010; International Diabetes Federation 2010). Both diseases have substantial implications for mortality: in 2004, over 112,000 deaths in the United States were attributed to increased cardiovascular disease (CVD) resulting from obesity (Flegal et al. 2007), and in the same year, diabetes-related complications were estimated to account for 5% of all global mortality (World Health Organisation 2010). In 2006, for the first time, more people died as a result of being overweight than underweight, whilst treatment of T2D and obesity-related complications was responsible for 8% of all healthcare costs in the European Economic region (World Health Organisation 2010).
Fig. 1

Prevalence of type 2 diabetes and ‘overweight’ by country (as of 2010). a Colour intensity represents percentage of individuals aged 20–79 with diabetes (fasting plasma glucose > 7.0 mmol/L). b Colour intensity represents percentage of females aged 15–100 with BMI > 25 kg/m². c Colour intensity represents percentage of males aged 15–100 with BMI > 25 kg/m² (“overweight” data from World Health Organisation 2010; diabetes data from International Diabetes Federation)

Despite these sobering figures and their ramifications for individuals, families and healthcare systems, current understanding of the basic pathophysiology of T2D and obesity remains rudimentary. Whilst both diseases have monogenic and syndromic counterparts—including maturity onset diabetes of the young (MODY), neonatal diabetes and Prader–Willi syndrome (PWS)—which have enjoyed significant progress in the characterisation of causal mechanisms, the overwhelming majority of cases of T2D and obesity have a more complex aetiological basis. Typically, an individual’s risk of disease development reflects the intersection of inherited variation at many genetic sites and exposure to modern environmental stressors, including increased energy intake and decreased physical activity (Stumvoll et al. 2008).

Clearly, the current explosion in T2D and obesity prevalence must be due primarily to environmental change—the timescales involved are far too short for shifts in susceptibility variant frequency. But not everyone exposed to the increasingly pervasive “obesogenic” environment seems at equivalent risk. Quantifying the genetic component of complex disease is not straightforward, but most estimates place the heritability of T2D at around 25% (Poulsen et al. 1999) and that of obesity between 50 and 80% (Maes et al. 1997). Heritability estimates are likely to fluctuate with time and space (reflecting changes in environmental variance and overall disease prevalence), but there is little empirical evidence of an attenuation of heritability in more contemporary studies (Wardle et al. 2008).

The clinical benefits of genomics: lessons from monogenic obesity and diabetes

Thanks to their high penetrance, the alleles responsible for rare, monogenic forms of non-autoimmune diabetes and obesity were relatively easily identified through linkage analysis (reviewed in Owen and Hattersley 2001; O’Rahilly and Farooqi 2006). These discoveries have led to molecular classifications of disease with demonstrable prognostic and therapeutic relevance. For example, individuals with maturity onset diabetes of the young (MODY) due to mutations in HNF1A respond particularly well to treatment with sulfonylureas, whilst those with mutations in glucokinase (GCK) can often come off medication entirely given their relatively benign prognosis (Schnyder et al. 2005; Pearson et al. 2003). Infants with neonatal diabetes due to mutations in the KCNJ11 gene, conventionally treated with insulin, have typically shown substantial improvements in diabetes control when oral sulfonylureas are substituted (Pearson et al. 2006; Gloyn et al. 2004). Meanwhile, identification of mutations in the leptin gene (LEP) causing severe early-onset obesity (Montague et al. 1997) has resulted in the development of recombinant leptin therapy as a life-saving treatment for affected children (Farooqi et al. 1999).

As a consequence of such advances in genetic understanding and classification, molecular diagnostics and personalised therapy are now standard components of clinical care for patients with these monogenic forms of disease and for their families. The ambition, then, is for an improved understanding of the genetic basis of common forms of T2D and obesity to inspire similar insights into disease biology and to underpin future developments in clinical care.

A brief history of susceptibility gene discovery

The first and unavoidable step towards genomic medicine lies in the identification of genetic variants robustly associated with the disease of interest. However, the multifactorial nature of complex diseases such as T2D and obesity has rendered even this initial stage an enormous challenge.

Family-based linkage approaches, so successful in identifying the mutations responsible for monogenic and syndromic subtypes of obesity and T2D, proved poorly suited to revealing the variants of lower penetrance implicated in more typical forms of the diseases. By 2006, the Human Obesity Gene Map (Rankinen et al. 2006) listed 253 loci “linked” to obesity, but very few of these had been replicated in multiple studies. A meta-analysis of linkage data from >31,000 individuals and ~10,000 families failed to reveal any convincing BMI-influencing loci (Saunders et al. 2007).

Attention turned instead to association approaches in larger, unrelated sample sets (Merikangas and Risch 2003). Association analyses, however, rely upon typing the causal variant or a closely correlated proxy, and hence, initial efforts were constrained by practical limitations of genotyping cost and capacity to the evaluation of variants within pre-defined candidate genes. Nonetheless, this approach heralded the first wave of robustly associated variants. For T2D, non-synonymous variants in genes encoding the targets of two drugs widely used in T2D management [P12A in PPARG (Altshuler et al. 2000) for thiazolidinediones and E23K in KCNJ11 (Gloyn et al. 2003) for sulfonylureas] showed consistent, though modest (per-allele odds ratios of ~1.2), evidence of association with disease risk. For obesity, variants within two genes already known to harbour mutations implicated in monogenic obesity, MC4R (V103I, I251L) and PCSK1 (N221D, Q665E-S690T), were shown to be associated with common obesity risk (Heid et al. 2005; Geller et al. 2004; Benzinou et al. 2008).
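To give a feel for what a per-allele odds ratio of ~1.2 means for an individual, the following is a minimal sketch rather than a method taken from the studies cited. It assumes a multiplicative (log-additive) model in which each risk allele multiplies the odds of disease by the same factor. The 10% baseline risk is a hypothetical value chosen purely for illustration, and the function name `risk_with_alleles` is likewise an invention of this example.

```python
def risk_with_alleles(baseline_risk, per_allele_or, n_risk_alleles):
    """Disease probability for a carrier of n risk alleles.

    Assumes a multiplicative (log-additive) model: each risk allele
    multiplies the odds of disease by per_allele_or.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    # Convert probability to odds, scale by the odds ratio, convert back.
    odds = baseline_risk / (1.0 - baseline_risk)
    odds *= per_allele_or ** n_risk_alleles
    return odds / (1.0 + odds)


# Hypothetical baseline risk of 10% for non-carriers (illustrative only).
for n in (0, 1, 2):
    print(n, round(risk_with_alleles(0.10, 1.2, n), 4))
```

Under these assumptions, carrying two risk alleles moves an individual from a 10% risk to a little under 14%, which illustrates why single variants of this size carry little predictive weight on their own.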

However, the candidate gene approach is restricted by its intrinsic reliance upon prior knowledge and expectation. When, as with T2D and obesity, our understanding of disease pathogenesis is imperfect, there is a manifest need to extend the search for susceptibility variants across the entire genome in an unbiased, hypothesis-free manner. The first gene to be implicated in T2D susceptibility without prior biological candidacy was TCF7L2, discovered following systematic association analysis across a region of previously identified linkage (Grant et al. 2006). The most strongly associated variants at this locus have the greatest effect on T2D susceptibility of any common variant so far identified.

The advent of genome-wide association studies (GWAS) proved transformative for the field. For T2D, the first wave of GWAS in 2007 confirmed the known loci at PPARG, KCNJ11 and TCF7L2, but added a further six novel loci including signals near CDKAL1, HHEX, SLC30A8, IGF2BP2 and CDKN2A (Frayling et al. 2007; Saxena et al. 2007; Scott et al. 2007; Sladek et al. 2007; Zeggini et al. 2007; Steinthorsdottir et al. 2007; WTCCC 2007). At the sixth locus, near FTO (Frayling et al. 2007), association with T2D was predicated entirely on case–control differences in adiposity, serendipitously revealing FTO as the first common variant signal for body mass index and risk of obesity.

Successive rounds of GWA meta-analysis (Zeggini et al. 2008; Willer et al. 2009; Voight et al. 2010; Speliotes et al. 2010; Loos et al. 2008; Heid et al. 2010) have brought the count of confirmed common variant signals for T2D to more than 40, and for BMI and obesity to over 30 (McCarthy 2010). As expected, the improvement in power derived from increasing sample size has been particularly beneficial in exposing variants of smaller effect and more extreme risk allele frequency (Fig. 2). For T2D, the Diabetes Genetics Replication and Meta-analysis (DIAGRAM) Consortium first combined data from three published GWAS to reveal six novel loci (Zeggini et al. 2008) and subsequently aggregated data from an additional five GWAS to capture a further 12 signals (Voight et al. 2010). The Genomic Investigation of Anthropometric Traits (GIANT) Consortium first combined data from 15 GWAS cohorts to reveal six new loci contributing to variation in BMI, as well as replicating the by-then established common variant signals at FTO and MC4R (Willer et al. 2009). Almost in parallel, the deCODE group reported ten BMI-influencing loci (Thorleifsson et al. 2009). The synthesis of these two efforts, involving genetic analysis of almost 250,000 individuals, confirmed 14 existing loci and revealed 18 novel signals for BMI and obesity (Speliotes et al. 2010). Common copy number variants (CNVs), as opposed to SNPs, have only been implicated at one of these loci, NEGR1, where a 45-kb deletion upstream of the gene is in perfect LD with a strongly associated SNP (Willer et al. 2009). The role of rare CNVs in obesity has not been well examined so far, but rare deletions at chromosome 16p11.2 have been shown to have high penetrance for obesity and mental retardation (Walters et al. 2010).
Fig. 2

Allele frequencies and effect sizes of known BMI-influencing and T2D-susceptibility loci. Risk allele frequencies and effect sizes are shown in European populations unless otherwise stated. Note how most recently identified loci (in red, filled markers) from ever-larger and better-powered meta-analyses have revealed loci with smaller effect sizes (bottom of plot) and/or more extreme risk allele frequencies (far left and right of plot). Gene names represent best candidate transcripts on the basis of location and biological plausibility, not necessarily proven to be causal. a BMI-influencing loci, coloured and filled by year of identification and with symbols according to experimental method of discovery. Note that NEGR1, TMEM18, SH2B1 and KCTD15 were identified simultaneously by single GWAS (Thorleifsson et al. 2009) and GWAS meta-analysis (Willer et al. 2009). b T2D-susceptibility loci, coloured and filled by year of identification and with symbols according to experimental method of discovery. KCNQ1: two SNPs in low LD in European populations are likely to represent independent association signals. UBE2E2, C2CD4A/B, KCNQ1 (odds ratio ~1.4): odds ratios from Japanese populations and allele frequencies from HapMap-JPT. SRR, PTPRD: odds ratios from Chinese populations and allele frequencies from HapMap-CHB. DUSP8: odds ratio shown across both maternally and paternally inherited alleles—there is evidence that the risk effect may be larger from paternally derived alleles and protective from maternally derived alleles

Most of the early GWAS involved individuals of European descent, but a growing number of discoveries are being made in other ethnic groups. For example, BMI-associated variants in MC4R have been confirmed in Indian Asians (Chambers et al. 2008; Been et al. 2010), whilst GWAS in East Asians have revealed several novel associations for T2D (Yasuda et al. 2008; Unoki et al. 2008; Tsai et al. 2010; Yamauchi et al. 2010). By and large, although differences in allele frequency and effect size (combined with chance) may mean that initial discoveries are more likely to be made in one or other ethnic group (Myles et al. 2008), almost all the common variant signals for T2D so far examined have a consistent effect on diabetes risk across multiple ethnic groups (Waters et al. 2010). This provides strong evidence that common variant signals are driven by causal variants that are themselves frequent (and of rather ancient origin), rather than by “synthetic” effects of multiple rare causal variants, as has sometimes been proposed (Dickson et al. 2010). Further evidence of the utility of studies in diverse ethnic groups is provided by fine-mapping of obesity association at the FTO locus in African-derived populations. Thanks to the weaker linkage disequilibrium patterns in these populations, the list of potentially causative variants has begun to be narrowed (Hassanein et al. 2010; Adeyemo et al. 2010).

The majority of GWAS have approached T2D as a binary disease phenotype, comparing allele frequencies between case and control groups. In contrast, most studies have tackled obesity through its cognate quantitative trait, BMI. The few studies that have taken the complementary approach have been especially instructive. The ‘Meta-Analyses of Glucose and Insulin-related traits Consortium’ (MAGIC; Dupuis et al. 2010) investigated the genetic basis of normal physiological variation in continuous glycaemic measures such as fasting glucose and insulin. Evaluating the resulting common variant signals with respect to T2D risk showed a wide disparity in consequences, ranging from relatively strong (e.g. MTNR1B) to negligible (e.g. G6PC2) effects on diabetes susceptibility. This range demonstrates that the mechanisms influencing physiological and pathophysiological variation in glucose homeostasis are only partially overlapping. Similarly, case–control studies of extreme obesity (Scherag et al. 2010; Meyre et al. 2009) have identified loci that seem quite distinct from those shown to influence population-level variation in BMI.

The first step to translation: from associated variants to biological mechanisms

The discoveries of the past few years have necessarily focused, for both theoretical and practical reasons, on the identification of common susceptibility-alleles, with emphasis rightly placed on obtaining robust statistical support for association. From a clinical perspective, the value of these discoveries lies primarily in the opportunities they provide for enhanced understanding of disease biology, and the major challenge lies in unlocking the mechanisms whereby these associated variants influence disease progression. Success in this endeavour is not unduly influenced by the relatively modest effect sizes of many of the loci identified: in the case of KCNJ11 and PPARG, for example, the therapeutic consequences of manipulating protein function (with sulfonylureas or thiazolidinediones, respectively) are out of all proportion to the limited effects of T2D-susceptibility variants within their encoding genes.

For most of the common variant loci revealed by GWAS, the transition from association signal to causal mechanism has proved far from straightforward. It has been challenging, because of the extensive linkage disequilibrium in most human populations, to tie the association signal to a single causal allele. With so many signals mapping to non-coding sequence, it has been difficult to define which of many regional transcripts is likely to be mediating the association. Both issues represent obstacles to the translation of genetic discoveries into an improved understanding of disease biology.

Fortunately, there are exceptions. At a subset of T2D-susceptibility loci, including GCKR, PPARG and SLC30A8, there is substantial statistical and biological evidence to support particular coding sequence variants as causal. For example, the T2D association signal on chromosome 2p23 (Saxena et al. 2007) can, through a combination of genetic and functional approaches, reasonably be attributed to the P446L variant in the glucokinase regulatory protein GCKR, one of 17 genes mapping to the original 420 kb interval of association (Orho-Melander et al. 2008). Functional characterisation has shown that the T2D-risk allele alters fructose-6-phosphate-mediated regulation of GKRP, with consequences for glycolytic flux which explain the variant’s effects on both glucose and lipid metabolism (Beer et al. 2009). The R325W variant in SLC30A8 (Sladek et al. 2007) represents a further functionally active missense polymorphism which appears causal for its local association. SLC30A8 encodes a zinc transporter, ZnT8, known to be expressed in the pancreatic islet and implicated in the proper function of β cell insulin granules (Nicolson et al. 2009). In mice, β cell-specific knockouts of Znt8 are glucose intolerant and display defects in insulin production, crystallisation, packaging and secretion (Wijesekara et al. 2010), whilst the variant protein shows reduced zinc transport activity (Nicolson et al. 2009).

However, even the most obvious candidate variants must be treated with some caution, as the story of the T2D-associated E23K variant in KCNJ11 demonstrates (Gloyn et al. 2003). All the right indications were there: a missense mutation in an excellent biological candidate gene, encoding a subunit of the KATP channel which is central to glucose-stimulated insulin secretion. However, fine-mapping efforts have demonstrated that in European, West African and East Asian populations, E23K is in perfect LD with a second nonsynonymous variant (A1369S) within the adjacent gene ABCC8, which happens to encode the second protein component of the same KATP channel (Florez et al. 2004). Recent functional studies suggest that the ABCC8 variant may, in fact, be the stronger candidate for mediating a T2D-risk effect (Hamming et al. 2009).

Progress has also been made at loci for which no obvious causal coding variant can be identified. At TCF7L2, fine-mapping studies have converged upon the intronic SNP rs7903146 as the most compelling candidate variant (Helgason et al. 2007). ChIP-Seq (chromatin immunoprecipitation sequencing) studies have shown that this variant maps within a region of islet-specific open chromatin, and that the two alleles differ in their capacity to achieve or maintain this state (Gaulton et al. 2010). Whilst alterations in chromatin state have not yet been shown to alter TCF7L2 transcription, this seems the most likely mechanism. TCF7L2 mRNA levels in human pancreatic islets increase with number of risk alleles and are fivefold higher in human islets isolated from T2D patients than those isolated from controls (Lyssenko et al. 2007). Over-expression of TCF7L2 leads to reduced glucose-stimulated insulin secretion (Lyssenko et al. 2007) and, perhaps contradictorily, reduced apoptosis (Shu et al. 2008).

However, relatively few T2D- and obesity-susceptibility loci have been so obliging. The T2D association signal on chromosome 10q, for example, contains three genes—HHEX, KIF11 and IDE—each with biological credibility for a role in T2D pathogenesis. Even the much-studied BMI association signal near FTO comprises a 47-kb linkage disequilibrium block which may be involved in regulation of the adjacent gene, RPGRIP1L, as well as FTO itself (Stratigopoulos et al. 2011). Whilst there is evidence from rodents that manipulation of Fto expression does indeed influence adiposity (Church et al. 2009, 2010), data from human studies are less persuasive (Meyre et al. 2010; Boissel et al. 2009). The T2D-susceptibility signal on chromosome 9p maps some 200 kb from the coding sequence of CDKN2A and CDKN2B, genes considered, on the basis of known biology, the most plausible candidates (Krishnamurthy et al. 2006; Harismendy et al. 2011). Whilst the discovery of a noncoding RNA (variously called ANRIL or CDKN2BAS), transcribed from the region of maximal diabetes association and thought to influence CDKN2B expression (Pasmant et al. 2007), highlights a potential disease mechanism, empirical evidence is so far not available. To be so reliant upon prior biology can feel a somewhat uncomfortable necessity for an experimental approach which explicitly set out to be biology-agnostic.

Despite challenges in establishing the relevant mechanism at each associated locus, the unbiased genome-wide approach has offered considerable insight into the broad pathophysiological processes of disease pathogenesis. Analyses in normoglycaemic individuals have shown that most T2D-associated loci exert their primary effects on disease risk through reduced insulin secretion rather than increased insulin resistance (Perry and Frayling 2008; Voight et al. 2010), helping to address the long-standing debate over the relative roles of these processes in diabetes pathogenesis. Genes implicated in cell cycle regulation are overrepresented within T2D-associated regions, adding weight to the notion that control of β cell mass is central to the maintenance of normal glucose homeostasis (Voight et al. 2010). Amongst loci where the T2D-risk alleles are associated with reduced insulin sensitivity, only the signal at FTO seems to be driven by an effect on body mass index. At the other loci, including KLF14 and ADAMTS9, association evidence has the potential to highlight entirely novel mechanisms connecting adipocyte and hepatic dysfunction to insulin resistance and diabetes.

Equivalent insights into the pathogenesis of obesity and the regulation of body mass index have been harder to derive. The fact that many of the most obvious positional candidates at BMI and obesity-associated loci have proven or suspected roles in the function of the central nervous system (CNS) is consistent with the known role of the hypothalamus in appetite regulation (Willer et al. 2009; Speliotes et al. 2010). For example, NEGR1 is involved in neuronal growth (Marg et al. 1999), whilst SH2B1 is involved in hypothalamic leptin signalling (Ren et al. 2007). Sh2b1 knockout mice are obese, but the phenotype can be rescued by targeted expression of Sh2b1 in neurons (Ren et al. 2007). These findings reinforce the view of common obesity as a disorder of behaviour rather than metabolism, mediated through hypothalamic dysregulation. In contrast, equivalent studies of fat distribution, rather than overall adiposity (Lindgren et al. 2009; Heid et al. 2010; Heard-Costa et al. 2009) have highlighted candidate transcripts implicated in the regulation of adipocyte development and function.

The growing power of techniques for genetic and functional evaluation of regional targets is likely to catalyse further successes in characterising causal variants and connecting them to the genes, pathways and networks they modulate. For example, transethnic fine mapping approaches, particularly in samples of African origin, should help to pin down the causal variants within common GWAS signals. Resequencing efforts will reveal novel loci, as well as rare, obviously functional, coding alleles in transcripts close to known loci that can be causally linked to the disease of interest (Nejentsev et al. 2009). At the same time, ever richer genomic characterisation of key tissues will help tie disease-associated variants to local transcript expression and other regulatory functions, whilst functional studies in cellular systems and animal models will expand the ability to explore the physiological effects of genetic variation.

Nevertheless, for diseases such as diabetes and obesity, limited access to the tissues most obviously implicated in disease pathogenesis—the pancreatic β cell and hypothalamus, respectively—represents a serious obstacle to such studies. Advances in stem cell science offer the exciting prospect of overcoming this limitation through re-differentiation of patient-derived induced pluripotent stem (iPS) cells to generate authentic cellular models of key tissues. In parallel, ongoing large-scale sequencing studies are likely to reveal novel low frequency and rare risk alleles in coding sequence, some with larger effects than those encountered by existing GWAS. The expectation is that these will be inherently more amenable to experimental follow-up, accelerating the pace of functional discovery and delivering biological insights that will underpin the development of novel diagnostic and therapeutic options.

The second step: from biology to clinical practice

Public funding for genetic research is made available on the premise that the knowledge gained will improve our capacity to prevent and treat the conditions that afflict us, and it is against such translational advances that the success of research will ultimately be judged. Given that the interval between initial discovery and subsequent translational implementation typically exceeds 20 years (Contopoulos-Ioannidis et al. 2008), and since so many of the major genetic discoveries have emerged in the past few years, any attempt to draw up a “translational scoresheet” must be regarded as premature and provisional.

Broadly speaking, there are two principal ways in which genetic discoveries, and the biological insights they engender, can foster translational benefits. The first, and arguably more important, lies in using an improved understanding of disease pathogenesis as the basis for development of novel approaches for the diagnosis, monitoring, treatment and prevention of diabetes and obesity. With respect to drug targets, the fact that variants within PPARG and KCNJ11 (encoding the targets of glitazones and sulfonylureas, two of the major classes of diabetes therapeutics) consistently emerge from genome-wide association scans provides confirmation that unbiased genome-wide discovery efforts can reveal pathways capable of useful clinical manipulation. It is of course premature to expect newly discovered loci to have made this transition, not least because for most we are still unclear about the responsible transcript, but some promising candidates have emerged and excited interest from the pharmaceutical and biotechnical communities.

For example, variation in the melatonin receptor 1B gene (MTNR1B) is associated with insulin secretion, fasting glucose and T2D risk (Lyssenko et al. 2009; Prokopenko et al. 2009). MTNR1B expression is localised to the β cell within human islets and shows altered expression in islets from type 2 diabetic donors, whilst the receptor it encodes mediates the inhibitory effect of melatonin on glucose-stimulated insulin response (Lyssenko et al. 2009). Inhibition of this melatonin-ligand receptor system is therefore an attractive therapeutic option for T2D, particularly given that a dopamine receptor agonist which regulates melatonin content, bromocriptine (Zawilska and Iuvone 1990), is already approved for T2D therapy. In a similar vein, the demonstration that variants within the SLC30A8 gene (encoding the islet Zn transporter ZnT8) are associated with reduced insulin secretion has highlighted the importance of zinc as a modulator of islet function and prompted efforts to explore the potential for pharmaceutical and public health approaches to treatment and prevention (Sun et al. 2009).

The second route to translational advance lies in the capacity to use genetic variation as a tool to explore inherited predisposition at the level of the individual and to use such information to deliver personalised medicine through more accurate diagnosis, better prognostication and/or therapeutic optimisation. To date, successful applications of personalised medicine in the clinical management of patients with diabetes and obesity are restricted to the highly familial, monogenic forms of disease.

The fundamental problem for efforts to build equivalent diagnostic and prognostic tools for more typical forms of diabetes and obesity lies in the modest effect sizes of the common variants so far discovered and therefore the limited proportion of heritable variance which they explain. FTO, the largest effect locus for obesity, is associated with an increase in adiposity of 0.3 BMI units (kg/m²) per risk allele and explains only 0.34% of overall heritable variation (Willer et al. 2009). Consideration of all 32 currently known BMI-influencing loci increases this figure to only 1.45% (Speliotes et al. 2010). In combination, the 40 or so common variant loci implicated in T2D susceptibility explain a sibling relative risk of ~1.2 (Voight et al. 2010), well below epidemiological estimates that range between two and three (Köbberling 1982) and therefore define little more than 10% of the observed familial aggregation.

For useful clinical prediction, genetic testing must be both sensitive and specific in discriminating between those who will and will not develop diabetes or become obese on follow-up. The standard measure of this discriminative capacity is provided by receiver operating characteristic (ROC) curves, which plot the performance of a given diagnostic test in terms of those two factors. An area under the curve (AUC) of 0.5 indicates that the test performs no better than chance, whilst an AUC of 1.0 describes a perfect test: in many clinical settings, an AUC of 0.8–0.9 is considered a “good” test. On these criteria, the discriminative performance of currently known markers is distinctly unimpressive (Tables 1, 2). For T2D, an AUC of approximately 0.60 for genetic information alone is typical and compares unfavourably with values close to 0.80 for conventional risk factors—age, BMI, gender—alone (Lango et al. 2008). Adding genetic information to a model that already incorporates those conventional risk factors leads to little further improvement in discriminative performance (Cornelis et al. 2009). The situation for obesity is little better: genetic factors alone produce an AUC of around 0.57 and add little to age and sex (Li et al. 2010).
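The AUC described above has a simple probabilistic reading: it is the chance that a randomly chosen individual who develops disease receives a higher risk score than a randomly chosen individual who does not. The sketch below computes it directly from that definition. The scores and outcome labels are hypothetical and are not drawn from any of the studies in Tables 1 and 2, and the function name `roc_auc` is an invention of this example.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    Equals the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores (e.g. a weighted count of risk alleles) and
# outcomes: 1 = developed disease on follow-up, 0 = did not.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(roc_auc(scores, labels))  # 0.75: three of the four case-control pairs are ranked correctly
```

On this reading, an AUC of 0.60 means that a genetic score ranks a future case above a future non-case in only 60% of such pairs, against 50% for a coin toss.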
Table 1

Studies assessing the contribution of genetic risk variant information to prediction of T2D disease status, as quantified by area under the curve (AUC) in receiver-operating characteristic (ROC) curve studies

Columns: Study type; Sample size; Variants included; Clinical data included; AUC clinical data only; AUC genetic variants only; AUC clinical plus genetic

Weedon et al. (2006): Warren 2 and others; Caucasian (UK)

Cauchi et al. (2009): Caucasian (French). Clinical data included: Age, sex, BMI

Lango et al. (2008): Caucasian (Scotland). Clinical data included: Age, sex, BMI

Lyssenko et al. (2008): MPP, Botnia. Study type: Prospective (23.5 yrs). Clinical data included: Age, sex, BMI, diabetes family history, blood pressure, triglycerides, fasting glucose

Meigs (2008): Framingham offspring; European ancestry (US). Study type: Prospective (28 yrs). Clinical data included: Age, sex, BMI, diabetes family history, fasting glucose, blood pressure, HDL cholesterol, triglycerides

Vaxillaire et al. (2008): Caucasian, age 30–65. Study type: Prospective (9 yrs). Variants included: 3 (out of 19 tested). Clinical data included: Age, sex, BMI

van Hoek et al. (2008): Rotterdam study; Caucasian, age 55+. Study type: Prospective (10.6 years). Clinical data included: Age, sex, BMI

Cornelis et al. (2009): European ancestry (US). Clinical data included: Age, sex, BMI, diabetes family history, smoking, alcohol intake, physical activity

Lin et al. (2009): Clinical data included: Age, BMI, family history, WHR, triglyceride/HDL-cholesterol ratio

Miyake et al. (2009): Variants included: 11 (out of 23 tested). Clinical data included: Age, sex, BMI

Schulze et al. (2009): Study type: Prospective (7 years). Clinical data included: Age, weight, height, lifestyle factors, A1C, fasting glucose

Sparso et al. (2009): Clinical data included: Age, sex, BMI

de Miguel-Yanes et al. (2011): Framingham offspring; European ancestry (US). Study type: Prospective (34 years). Clinical data included: Age, sex, BMI, diabetes family history, fasting glucose, triglycerides, blood pressure, HDL cholesterol
Table 2

Studies quantifying the contribution of genetic risk variant information to prediction of obesity (BMI ≥ 30 kg/m2) status, as quantified by area under the curve (AUC) in receiver-operating characteristic (ROC) curve studies

Columns: Study type; Sample size; Variants included; Clinical data included; AUC clinical data only; AUC genetic variants only; AUC clinical plus genetic

Renstrom et al. (2009)

Cheung et al. (2010)a: CRISPS and others

Li et al. (2010): European (UK). Clinical data included: Age, sex

Sandholt et al. (2010): Clinical data included: Age, sex, diet, physical activity, smoking, education, employment, obesity drugs

Speliotes et al. (2010): Clinical data included: Age, sex

Peterson (2011): European and African-American. Clinical data included: Age, sex, ethnicity

a Obesity defined as BMI ≥ 27.5 kg/m2

b Including variants of suggestive as well as genome-wide significance from Willer et al. (2009) and Thorleifsson et al. (2009)

However, it is clear that genetic variants so far identified do not substantially improve the discriminative accuracy of disease prediction based on clinical characteristics. Even genetic models which incorporate thousands of additional putative common variant association signals are likely to offer limited improvement (Evans et al. 2009). In some studies, genetic prediction has been shown to be slightly more effective in certain groups, such as the young (Meigs et al. 2008; de Miguel-Yanes et al. 2011) or those with increasing duration of follow-up (Lyssenko et al. 2008), but at the individual level, accuracy still falls well below any reasonable threshold for clinical utility. Enumeration of the extent and nature of any statistical interactions between genetic variation and environmental factors may allow for some improvement in prediction. However, the evidence to date for gene–environment interaction is limited, and arguably, genome-wide association meta-analysis will favour the detection of loci without appreciable heterogeneity of effects as a result of gene–environment interactions. Similarly, as data from a wider range of ethnicities become available, an awareness of the interaction of differing genetic and environmental risk factors will help to reveal how relative risk effects may vary across populations (Helgadottir et al. 2006).

If individual prediction is not yet feasible, it is certainly true that risk allele scores have the capacity to highlight groups at either end of the risk distribution. Individuals carrying more than 12 T2D-risk alleles (in the highest quintile of the population according to loci known at the time) have twice the disease risk that would be predicted on the basis of BMI alone (Lyssenko et al. 2008). There is a fourfold difference in T2D risk between the top and bottom 1% of individuals in terms of risk allele score (Lango et al. 2008), and the 7–10% of children with three or four FTO and MC4R risk alleles have a threefold increased risk of childhood obesity compared to the 20–24% of individuals who carry no risk alleles (Cauchi et al. 2009). Consequently, existing genetic tests may have some utility in providing risk stratification at the group level, leading, for example, to design of more efficient clinical trials (Schork and Topol 2010). Since lifestyle interventions are often seen to be beneficial even for those at highest genetic risk (Hivert et al. 2011; Florez et al. 2006), identification of high risk groups could support targeting of preventative public health measures.
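The group-level stratification described above can be sketched as a simple unweighted risk allele count. This is an illustration only: the cited studies may weight alleles or define groups differently, and the function names, the locus keys and the label for the intermediate group are assumptions introduced here. The two outer groups mirror the FTO and MC4R comparison in the text (three or four risk alleles versus none).

```python
from typing import Dict


def risk_allele_score(genotypes: Dict[str, int]) -> int:
    """Unweighted score: total number of risk alleles carried.

    genotypes maps a locus name to the count of risk alleles at that locus,
    which for a biallelic variant must be 0, 1 or 2.
    """
    for locus, count in genotypes.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk allele count at {locus}: {count}")
    return sum(genotypes.values())


def fto_mc4r_group(fto: int, mc4r: int) -> str:
    """Assign a child to a group by combined FTO and MC4R risk allele count.

    The outer groups follow the comparison described in the text; the
    intermediate label is a hypothetical placeholder.
    """
    score = risk_allele_score({"FTO": fto, "MC4R": mc4r})
    if score == 0:
        return "no risk alleles"
    if score >= 3:
        return "three or four risk alleles"
    return "one or two risk alleles"


# Hypothetical genotypes for three children.
groups = [fto_mc4r_group(0, 0), fto_mc4r_group(1, 0), fto_mc4r_group(2, 2)]
```

A score of this kind identifies the tails of the risk distribution, which is what makes it useful for group-level purposes such as trial enrichment, even though it discriminates poorly between individuals near the middle.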

However, the main hope for improved prognostic and diagnostic precision, as for biological insight, lies with more complete enumeration of the genetic component of predisposition and the integration of such data with relevant information from other sources (including pertinent environmental exposures and epigenetic changes). In particular, a great deal of hope rests on the expectation that resequencing studies will reveal causal variants in the low (MAF 0.005–0.05) and rare (MAF < 0.005) frequency ranges. Whilst causal alleles in the lowest MAF ranges will demand alternative discovery, analysis and translational approaches (Gloyn and McCarthy 2010), optimism remains that a proportion of the variants within them will have larger effect sizes than the common variant signals found to date and will therefore provide valuable boosts to predictive performance. Whereas over 400 common (MAF of 0.3) variants with allelic odds ratios between 1.05 and 1.10 would fail to provide a discriminative test with an AUC > 0.8, the inclusion of just ten rarer variants (MAF of 0.05) with odds ratios of 3.0 would achieve this benchmark (Janssens et al. 2006).

It is worth remembering that the most effective risk stratification strategies may not require genetic testing at all. In situations where genetic discoveries reveal a pathogenetic process that can be captured through a serum or urine biomarker, there may be considerable advantage (for instance, the ability to bypass the challenges to molecular diagnostics of locus and allelic heterogeneity) in focusing clinical attention on that instead. A measurement of serum LDL-cholesterol levels is far easier and has better predictive power than obtaining sequence and genotype data at the many loci now known to influence lipid levels. An early example of this approach comes from the identification of C-reactive protein (CRP) levels as a diagnostic marker for HNF1A-MODY, a specific subtype of monogenic diabetes. GWA studies (Ridker et al. 2008; Reiner et al. 2008) had shown that common variants near HNF1A have a strong influence on CRP levels, raising the possibility that CRP levels might be more dramatically disturbed in individuals carrying rare, large-effect HNF1A mutations causal for MODY. Confirmation that this is indeed the case (Owen et al. 2010) has exposed CRP as a diagnostic tool that can be used to screen individuals with early-onset diabetes and increase the confidence with which HNF1A-MODY patients can be passed for definitive molecular diagnostic testing.

If clinical prediction has not yet achieved general clinical utility, what of the hopes for therapeutic optimisation? Monogenic forms of diabetes and obesity already provide some of the most dramatic success stories for pharmacogenetics: the transfer of children with permanent neonatal diabetes caused by mutations in KCNJ11 and ABCC8 from insulin to high-dose sulfonylureas (Sagen et al. 2004; Rafiq et al. 2008) and the treatment of morbid obesity due to leptin deficiency with recombinant leptin (Farooqi et al. 1999).

To date, pharmacogenetic studies in common forms of obesity and T2D have not offered such dramatic applications. The field is in its infancy, and with some notable exceptions, much of the work to date has been candidate-driven and small-scale, and has not displayed the inferential rigour seen in discovery GWAS. Even variation within CYP2C9, a long-established rate-limiting enzyme for the metabolism of sulfonylureas, has been difficult to tie down to an effect upon sulfonylurea treatment outcomes (reviewed in Pearson 2009).

However, there have been some successes. In T2D, the presence of common polymorphisms in known diabetes drug targets has presented obvious candidates for pharmacogenetic analysis. Evidence of a relationship between ABCC8/KCNJ11 genotype and sulfonylurea response is encouraging. Recent analyses in large cohorts have reported, for example, a 45% increased risk of glibenclamide treatment failure amongst risk-allele homozygotes compared to non-risk-allele homozygotes (Sesti et al. 2006) and a greater decrease in fasting plasma glucose following gliclazide treatment amongst risk allele carriers (Feng et al. 2008). An effect upon gliclazide response is consistent with functional data which demonstrate that the risk variant KATP channel has 3.5 times increased sensitivity to gliclazide inhibition (Hamming et al. 2009).

Studies relating effects of PPARG variants on thiazolidinedione treatment have yielded inconsistent results (reviewed in Pearson 2009). More convincing are data demonstrating that genotype at the TCF7L2 T2D risk locus is associated with variation in response to sulfonylurea treatment. In a retrospective observational study, patients carrying two risk alleles at TCF7L2 were almost twice as likely to fail treatment objectives as those carrying no risk alleles, with an intermediate effect for heterozygotes (Pearson et al. 2007).

For metformin, polymorphisms within organic cation transporter 1 (OCT1) and the multidrug and toxin extrusion (MATE) 1 protein (SLC47A1) have been significantly associated with drug response (Shu et al. 2007; Jablonski et al. 2010; Becker et al. 2009). Most recently, a GWA study provided compelling evidence that variants near the ATM gene are significantly associated with glycaemic response to metformin therapy (Zhou et al. 2011). As with the TCF7L2 association, the effect sizes are too modest to suggest clinical utility from the perspective of individual therapeutic optimisation, not least because they only explain 2.5% of total variance in metformin response (Zhou et al. 2011), but implication of ATM may provide valuable clues to the mechanisms through which metformin exerts its beneficial effects on glucose metabolism.

Conclusion: where to from here?

In the past few years, human genetic research has begun to make progress in characterising the genetic basis of common forms of diabetes and obesity. In fact, the scale and power of the most recent association studies and meta-analyses mean that, as far as common variants are concerned, it is likely that future discoveries will be limited to alleles of rather small effect size: there are almost certainly no FTOs or TCF7L2s left to find, in European subjects at least. The first challenge to be faced lies in extending these genome-wide surveys to encompass a wider diversity of ethnic groups and a more complete range of variation types, examined across the full allele frequency spectrum. Thanks to ever-growing collaboration, and the falling costs and growing capacity of resequencing technologies, such studies are already well underway. We will soon be in possession of a far more complete and systematic view of the relationship between DNA sequence variation and individual predisposition to diseases such as diabetes and obesity. Application of these same tools within prospective studies and clinical trials should provide equivalent insights into the genetic basis of individual differences in the speed of disease progression, risk of disease-specific complications and response to therapeutic or preventative interventions.

This information brings with it new challenges. A list of associated loci and SNPs is of little value unless we can synthesise those findings into an improved model of disease biology. That means identifying the causal variants, and unravelling both the proximal (i.e. molecular) and distal (i.e. cellular and physiological) mechanisms whereby they execute their effects on disease predisposition. Given the large number of loci likely to be discovered, obtaining these mechanistic insights will require far better integration between high-throughput science (providing, for example, tissue-specific functional annotation of regulatory sequence, or large-scale siRNA manipulation of transcripts of interest) and the more detailed knowledge of “molecule-specific” domain experts.

All of this activity should be motivated by the need for clinical translation, particularly for diseases such as obesity and diabetes which represent major causes of global morbidity and mortality, and for which currently available therapeutic and preventative options are manifestly inadequate. In time, it is reasonable to expect that the more complete model of disease pathogenesis which is emerging from human genetics discovery will lead to novel therapeutic, diagnostic and preventative approaches that are of widespread benefit.

The future of personalised medicine is less secure. At this stage, we simply do not know the circumstances under which knowledge of an individual’s genome sequence will be of genuine clinical benefit. The answer is likely to depend on the genetic architecture of the disease in question and the relative contributions of inherited genetic variation (as opposed to other factors such as environment, somatic mutations and epigenetic changes) to predisposition. It will also depend on the health status of the individual: the value of a genome sequence in a perfectly healthy 40-year-old is likely to be very different to that in a child afflicted by some serious genetic disease. Fortunately, such considerations need not be the subject of conjecture for too many years since the gathering pace of sequence data accumulation will soon provide empirical resolution.

Reservations about the current value of genetic testing have not, of course, prevented the commercialisation of personal genome analysis. Whilst decisions regarding access to such personal data are best left in the hands of the individuals concerned, the limitations of currently available tests (for now based around common variant arrays) are all too obvious. For instance, individual estimates of disease risk are not stable and often change as additional variants are discovered and incorporated into risk models (Mihaescu et al. 2009). In some ways, the relatively poor performance of current reagents has served to protect individuals and their health-care contacts from the more difficult decisions that are likely to come with access to sequence-level data. These include the appropriate personal response to the identification of rare alleles, particularly those that lie in genes which may invite medical action (e.g. BRCA1) or those with significant prognostic importance but for which therapeutic options are limited (e.g. APOE or HTT).

In our view, such concerns provide no justification for proscriptive regulation of access, but they do call for greater dialogue between the relevant parties and for aggressive educational efforts at all levels. In a recent survey, 48% of doctors in the United States reported that they were very likely to use an FDA-approved genetic test for a complex disease, 39% even before published evidence of clinical efficacy (Grant et al. 2009). Such enthusiasm sits poorly with the limited training of most current medical practitioners in the interpretation of genetic findings and the small numbers of the specialists (clinical geneticists, genetic counsellors) that might be expected to fill this gap.

From the perspective of the population at large, the omens seem good. In the same US study, over 70% of respondents said a “high risk” or “good responder” result was very likely to improve their motivation to adopt lifestyle changes and adhere to medication regimes, whilst only 1.3% said they would be less motivated by a “low risk” result. These claims deserve to be treated with some degree of scepticism given that very sound diet and lifestyle advice is frequently ignored, but it is also possible that genetic data are more psychologically influential. Surveys of those who have sought and obtained genetic risk information based on common variant array genotyping have generally reported a positive experience, with little evidence of any of the adverse outcomes that some have predicted (Grant et al. 2009; Bloss et al. 2011).

In short, we have just begun a phase of genetic discovery which has the potential to transform the ways in which we manage diseases of global impact such as diabetes and obesity. The precise ways in which this transformation will play out are difficult to predict, but the ever-accelerating pace of human genetics discovery will reveal the landscape on which these developments will take place.


MET is funded by a Medical Research Council studentship. MMC is supported by the Oxford NIHR Biomedical Research Centre.

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag 2011