Introduction

Obesity results from a chronic surplus of energy intake compared to energy expenditure, which leads to storage of excessive amounts of triglycerides in adipose tissue [1]. The adverse metabolic effects caused by obesity may result in increased risk of type 2 diabetes, many forms of cancer, fatty liver disease, hormonal disturbances, hypertension, cardiovascular disease (CVD) and increased mortality [2–5]. Body mass index (BMI) is commonly used as a surrogate measure of overall obesity and, accordingly, the World Health Organization classifies a person with a BMI ≥ 30 kg/m² as obese and a BMI ≥ 40 kg/m² as extremely obese [6].
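The classification above can be sketched in a few lines of Python. This is a minimal illustration that applies only the two cut-offs quoted in the text; the function names and the example subject are hypothetical, and the formula BMI = weight (kg) / height (m)² is the standard definition rather than something stated in this review.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 (standard definition: weight / height squared)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_obesity_class(bmi_value: float) -> str:
    """Apply only the two cut-offs quoted in the text (>= 30 and >= 40 kg/m^2)."""
    if bmi_value >= 40:
        return "extremely obese"
    if bmi_value >= 30:
        return "obese"
    return "not obese"


if __name__ == "__main__":
    # Hypothetical adult: 95 kg, 1.75 m tall.
    value = bmi(95.0, 1.75)
    print(round(value, 1), who_obesity_class(value))
```

Because BMI does not distinguish fat from lean mass or capture where fat is stored, it is best read as the surrogate measure the text describes.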

In the United States, the current prevalence of obesity among adults is about 33% [7], compared to estimates four decades earlier showing a prevalence of about 13%. Similarly, in the United Kingdom, prevalence has increased to about 23% [8] and these prevalence figures are reflected throughout the rest of the world [9–12]. Current estimates suggest that by 2015 more than 700 million individuals worldwide will be obese [6]. Prevalence is likely increasing as a result of changes in lifestyle, decreased physical activity, and socioeconomic development, among others. An important concern is that obesity rates are also increasing in children and adolescents all over the world [6, 7], predisposing them to poor health from an early age. Obesity is now recognized as an epidemic [6], and despite current intensive efforts to reduce obesity by diet, exercise, education, surgery, and drug therapies, an effective, long-term solution to this epidemic is yet to be provided.

The Genetic Etiology of Obesity

Obesity is commonly classified into subgroups depending on suspected etiology: monogenic obesity (extremely severe obesity in the absence of developmental delays), syndromic obesity (clinically obese subjects additionally distinguished by mental retardation, dysmorphic features, and organ-specific developmental abnormalities), and polygenic or common obesity, which affects the general population (but may have associated health risks, such as increased risk of CVD).

The first single gene defect causing monogenic obesity was described in 1997, and to date, there are about 20 single gene disruptions that result in an autosomal form of obesity [1]. Interestingly, all these mutations position the leptin/melanocortin pathway in the central nervous system (CNS) as critical in the regulation of whole-body energy homeostasis [13], and obesity in these cases appears to be the result of increased appetite and diminished satiety. Syndromic obesity arises from discrete genetic defects or chromosomal abnormalities at several genes, and can be autosomal or X-linked. One of the most well-known forms of syndromic obesity is Prader-Willi syndrome (PWS), which is caused by a chromosomal abnormality of an imprinted region on chromosome 15q11-q12. PWS is characterized by early-onset obesity resulting from hyperphagia caused by CNS dysfunction [14]. Because both monogenic and syndromic forms of obesity tend to have high penetrance, detection of causal genetic variants has been quite fruitful [15]. The remainder of this review will focus on the genetics of common forms of obesity.

It is worth noting that sex and age are associated with differences in obesity and body composition. For instance, women tend to store more fat subcutaneously rather than in visceral adipose tissue, so at the same BMI, women will tend to carry more body fat than men [7]. Fat distribution follows two general patterns: android (adipose deposition in the abdominal area) and gynoid (adipose deposition around the hips). Android fat distribution is an established, independent risk factor for CVD and type 2 diabetes [16], whereas the gynoid pattern is thought to be protective or inversely correlated [16]. To account for these differences in fat distribution, waist-to-hip ratio (WHR = waist circumference [WC]/hip circumference) is commonly used; BMI and WHR are correlated (r² ~ 0.6) [17••].

The Epidemiology and Heritability of Common Obesity

Epidemiologic studies of common obesity have shown that concordance for obesity decreases in parallel with the degree of relatedness, pointing to a genetic component in obesity susceptibility. For instance, the concordance rate between monozygotic twin pairs is more than twice that of dizygotic pairs (~ 0.68 vs ~ 0.28) [18–21]. Likewise, in adoptee studies, the BMI of the adopted individual correlates most closely with the biological parents’ BMI, further highlighting the role of genetic factors over the shared familial environment [22, 23]. The high heritability (h²) of different measures of obesity, including BMI (h² = 0.4–0.7) [24], subscapular skinfold thickness (h² ~ 0.77) [25], WC (h² ~ 0.76) [21, 26], and WHR (h² ~ 0.45) [26], highlights the effect of genetics in increasing the risk of obesity. Nevertheless, exposure to an obesogenic environment is necessary for the development of obesity. One hypothesis is that genes that once provided an evolutionary advantage (by allowing maximum efficiency of nutrient storage) are severely challenged when exposed to obesogenic environments [27], although this remains to be formally tested and proved.
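One classical way to turn a twin comparison into a rough heritability figure is Falconer's formula, h² ≈ 2(r_MZ − r_DZ). The formula is not taken from the studies cited here, and it strictly applies to twin correlations for a quantitative trait rather than to concordance rates, so the sketch below is an illustration of the reasoning only. The function name is hypothetical, and the inputs are the approximate values quoted above, used as stand-ins for correlations.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: heritability is twice the MZ-DZ difference.

    Assumes r_mz and r_dz are twin correlations for a quantitative trait and
    that both twin types share their environment to the same degree.
    """
    return 2.0 * (r_mz - r_dz)


if __name__ == "__main__":
    # Approximate values from the text, treated here as if they were correlations.
    print(round(falconer_h2(0.68, 0.28), 2))
```

An estimate obtained this way is crude and can fall outside the ranges reported by formal variance-component analyses, which is one reason the published heritability estimates cited in the text should be preferred.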

Finding Genes for Common Obesity

The search for obesity susceptibility variants was initially carried out using candidate-gene association studies or linkage analysis. Although these were exceptional tools in the discovery of genes for syndromic and monogenic obesity [15], the success of finding genes for common obesity susceptibility was limited, with very few reproducible results [15]. Association studies (basically comparing allele frequencies between cases and controls) were affected by lack of power in small size study cohorts, limited knowledge of the biology and physiology affecting the candidate gene selection, heterogeneity in the samples, poor phenotyping, and the high cost and effort of genotyping. Numerous linkage studies using affected, related individuals were carried out using microsatellite markers to identify regions of linkage to obesity [15]. However, despite intense efforts, no linkage regions were unambiguously replicated, even after meta-analysis of 37 studies (including ~ 31,000 individuals) [28], demonstrating the limitations of the linkage approach.
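The core of a case-control association test, comparing allele frequencies between cases and controls, can be sketched as a chi-square test on a 2 × 2 table of allele counts. This is a minimal, standard-library illustration and not the method of any particular study cited here; the function name and the counts are hypothetical, and real analyses would also adjust for covariates, population structure, and multiple testing.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 df) on allele counts, plus the allelic odds ratio.

    Returns (chi2, p_value, odds_ratio). Counts are alleles, not individuals,
    so each genotyped person contributes two alleles.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)
    return chi2, p_value, odds_ratio


if __name__ == "__main__":
    # Hypothetical counts: 500 cases and 500 controls (1,000 alleles each).
    chi2, p, odds = allelic_association(600, 400, 500, 500)
    print(f"chi2={chi2:.2f}, p={p:.2e}, OR={odds:.2f}")
```

With effect sizes as small as those later found for obesity loci, the same test applied to a few hundred subjects would rarely reach significance, which illustrates the lack of power described above.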

Insights into the Genetics of Common Obesity

The first substantial advances in the discovery of obesity susceptibility loci were made in 2007 [29••, 30]. This was possible through technical and analytical developments allowing for genome-wide association studies (GWAS). GWAS capitalize on the realization that common genetic markers can be inherited together as “blocks” due to linkage disequilibrium [31]. This allowed investigators to capture about 80% of all common variation (>14 million variants [32]) using as few as 500,000 carefully chosen single nucleotide polymorphisms (SNPs) [33, 34]. Genotyping such selected sets of markers is an attractive approach that allows the capture of most common variation in a single array. The initial success of finding the first robustly associated obesity susceptibility locus, FTO (fat mass and obesity associated), in 2007 [29••, 30, 35••–3840••] revealed the small effect sizes of obesity genes and propagated the notion that increasing the power and sample size of studies was necessary to identify further susceptibility loci. As a result, even larger and better powered studies have followed, as have multicenter collaborative studies and meta-analyses (Table 1), which have accumulated more than 20 replicated obesity loci (Fig. 1).
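How well one genotyped marker "tags" an ungenotyped neighbor is usually summarized by the squared correlation r² between the two sites. The sketch below uses the standard definition, D = p_AB − p_A·p_B and r² = D² / [p_A(1 − p_A)·p_B(1 − p_B)], which is general population-genetics background rather than something specified in this review; the function name and the frequencies are hypothetical.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at site 1 and B at site 2
    p_a:  frequency of allele A at site 1
    p_b:  frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical haplotype frequencies: partial and complete tagging.
    print(round(ld_r2(0.40, 0.50, 0.50), 2))  # partially correlated sites
    print(round(ld_r2(0.30, 0.30, 0.30), 2))  # alleles always inherited together
```

An r² close to 1 means the genotyped SNP is an almost perfect proxy for its neighbor, which is why a carefully chosen subset of SNPs can stand in for a much larger number of common variants.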

Table 1 Details of genetic association studies
Fig. 1 Genes associated with obesity-related anthropometric measures. BMI body mass index, WC waist circumference, WHR waist-to-hip ratio. a Indicates type 2 diabetes association. b Indicates association with monogenic obesity

Importantly, in most cases we do not know which gene(s) in these loci are contributing to obesity, and the identity of the causative variants is currently unknown. However, in the majority of cases, we can identify the most physiologically likely candidate-gene within/near the associated SNP and further research is needed to verify whether these are the obesity-associated ones or not. Overall, many genes within the associated regions have been reported to fall within two broad categories: genes affecting CNS function and those that are suggested to operate peripherally, often through adipose tissue.

Several of the genes located within or near the associated regions are highly expressed in the CNS, particularly in the hypothalamus [36••], and appear to be involved in appetite, satiety, energy expenditure, and behavior. This suggests that similar mechanisms would be affected in common forms as in monogenic forms of obesity. FTO has a potential role in nucleic acid demethylation and is highly expressed in parts of the brain that govern energy balance [41••] and feeding behavior [42, 43]. MC4R (melanocortin 4 receptor) is reported to be responsible for up to 6% of early-onset or severe adult obesity cases [44]. In addition, disrupted MC4R in mice causes hyperphagia, hyperinsulinemia, and hyperglycemia [45]. SH2B1 (Src-homology-2 [SH2] domain-containing putative adaptor protein-1) is associated with increased serum leptin [46]. BDNF (brain-derived neurotrophic factor) and NRXN3 (neurexin 3) are linked to substance abuse and reward behavior, probably interfering with dopamine neurotransmission in pathways involved in reward effects, motivation, and decision making [47, 48]. NEGR1 (neuronal growth regulator 1) is a growth promoter that participates in the regulation of axon outgrowth in the developing brain [37••]. TMEM18 (transmembrane protein 18) enhances the migration capacity of cells [49] and is highly expressed in the hypothalamus [36••].

Another eight genes associated with obesity are highly expressed in the hypothalamus [36••] and are thought to be involved in obesity susceptibility via CNS-mediated effects: KCTD15 (potassium channel tetramerization domain-containing 15), GNPDA2 (glucosamine-6-phosphate deaminase-2), MTCH2 (mitochondrial carrier homolog 2), SDCCAG8 (serologically defined colon cancer antigen 8), FAIM2 (Fas apoptotic inhibitory molecule 2), ETV5 (ets variant 5), NPC1 (endosomal/lysosomal Niemann-Pick C1 gene), and PRL (prolactin).

Other associations observed in GWAS that affect susceptibility to obesity may operate peripherally (those primarily associated with fat distribution and/or central obesity). Among these, we include the following: TFAP2B (transcription factor activating enhancer-binding protein 2 β), preferentially expressed in adipose tissue, which is involved in glucose transport, lipid accumulation, and adiponectin expression [17••]; NCR3 (natural cytotoxicity triggering receptor 3) and PTER (phosphotriesterase related), which might mediate their effect through the hypothesized low-grade inflammation of adipose tissue [50]; and lastly c-MAF, a transcription factor involved in adipogenesis [38••]. In addition, the interesting female-only association with LYPLAL1 (lysophospholipase-like-1) involves a gene that encodes a lipase with increased expression in subcutaneous adipose tissue [17••]. This finding is of particular interest as it supports previous hypotheses that there are sex-specific genes contributing to variation of obesity-related traits and that genes account for more variance of fat distribution in women than in men [51••]. It is also worth noting that although it is involved in the production of leptin and is thus proposed to operate through the CNS, SH2B1 also has an effect on total fat [52] and may have a dual role in obesity susceptibility.

These first insights, gained through the results of the initial wave of GWAS, support the biology that monogenic disorders point to CNS regulation of overall obesity (BMI), whereas more peripheral effects operate on central obesity and fat distribution. However, it is noteworthy that these conclusions are based on likely candidate genes in associated regions, but many of the genes are still uncharacterized and can therefore not be disregarded as candidates. Furthermore, some of the significant associations are located in gene deserts and in non-coding regions. Dissecting these associations, identifying causal variation, and unravelling the functional role is a major challenge that lies ahead.

The Future of Genetic Studies in Obesity

Despite the initial success of the GWAS strategy, the established loci together explain less than 2% of the interindividual BMI variation [17••] and less than 1% of the interindividual WHR variation [36••]. With heritability estimates of 40% to 70% for BMI and 30% to 60% for WHR (even after adjusting for BMI) [26], there must be many more susceptibility loci to uncover. Different strategies for how to do this have been proposed and are currently being explored.
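The size of the gap can be made concrete with simple arithmetic: dividing the variance explained by the established loci by the heritability gives the share of the heritable component accounted for so far. The sketch below uses only the BMI figures quoted above; because the text gives "less than 2%", the results are upper bounds, and the function name is hypothetical.

```python
def share_of_heritability(variance_explained: float, heritability: float) -> float:
    """Fraction of the heritable variance accounted for by known loci."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    if not 0 <= variance_explained <= heritability:
        raise ValueError("explained variance must lie between 0 and heritability")
    return variance_explained / heritability


if __name__ == "__main__":
    # BMI: at most 2% of variation explained, heritability 40% to 70%.
    for h2 in (0.40, 0.70):
        share = share_of_heritability(0.02, h2)
        print(f"h2={h2:.2f}: at most {share:.1%} of heritability explained")
```

Even under the lower heritability estimate, the established loci therefore account for no more than about one twentieth of the heritable variation in BMI.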

Improving and Expanding GWA Studies

One approach is to continue the successful meta-analysis efforts thus far and increase the sample size and power. Such efforts are well advanced within the GIANT (Genetic Investigation of Anthropometric Traits) consortium, in which both BMI and WHR are currently being analyzed in more than 100,000 samples. This will add new common variants, with small effect sizes, that are robustly associated with these obesity phenotypes. Also, power to identify novel loci might vary between different populations as a result of differences in allele frequencies, and because the majority of studies have been done in samples of European ancestry, power could be increased by using samples from different ancestries. One example of this is the WHR association with a locus on chromosome 12q24 (Fig. 1), reported recently in a Korean population-based cohort (Table 1) [53••]. This association cannot be evaluated in current GWAS efforts in European samples because the variants are not present in the CEU samples from HapMap (and no imputation can be performed). Also, fewer samples were needed to identify the MC4R BMI locus in Indian Asian samples than in Europeans, as the risk allele frequency is higher in Indian Asians. Therefore, expanding efforts to studies using samples of other ethnicities should provide excellent opportunities to discover additional obesity susceptibility loci.
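The dependence of power on allele frequency can be sketched with a common approximation that is not taken from this review: for an additive variant with a small per-allele effect β on a standardized quantitative trait, the variance explained is about 2p(1 − p)β², and the sample size needed for a given power is roughly inversely proportional to that quantity. Assuming the same β in both populations, the ratio of required sample sizes then depends only on the two allele frequencies. The function name and the frequencies below are hypothetical and are not the observed MC4R frequencies.

```python
def relative_sample_size(freq_a: float, freq_b: float) -> float:
    """Approximate N needed in population A relative to population B.

    Assumes an additive model, the same small per-allele effect in both
    populations, and equal target power and significance threshold, so that
    required N scales as 1 / (2 * p * (1 - p)).
    """
    for freq in (freq_a, freq_b):
        if not 0 < freq < 1:
            raise ValueError("allele frequencies must be strictly between 0 and 1")
    return (freq_b * (1 - freq_b)) / (freq_a * (1 - freq_a))


if __name__ == "__main__":
    # Hypothetical risk allele frequencies: 0.25 in population A, 0.35 in B.
    ratio = relative_sample_size(0.25, 0.35)
    print(f"Population A needs about {ratio:.2f} times as many samples as B")
```

Under these assumptions a variant closer to a frequency of 0.5 is detectable with fewer samples, which is consistent with the observation above about populations in which the risk allele is more common.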

Improving Subject Selection and the Phenotype

Most studies to date have focused on anthropometric measures of obesity (eg, BMI, WC, WHR) because these are straightforward, noninvasive, and inexpensive surrogate measures of adiposity. Although these studies have been successful, they have required very large sample sizes for identification of robust associations (Table 1). More precise measures (eg, CT, dual-energy x-ray absorptiometry [DXA], or MRI scans) are more expensive and time consuming to collect, but are expected to reduce the number of samples needed to identify robust associations with adiposity. It is worth noting that considerable variation between machines, calibrations, operators, and sites will introduce noise even in these fine-tuned adiposity measures, and that the most successful studies will put emphasis on both careful subject selection and precise phenotyping.

Looking Beyond Common SNPs

Rare and Low Frequency Variants

The genotyping arrays currently used for GWAS are designed to provide excellent coverage of common variants, especially when combined with genotype imputation methods. Thus far, the risk allele frequencies of the obesity susceptibility variants identified through GWAS are all greater than 5%. Rare and low frequency variants (minor allele frequency <5%), however, are not well captured by these methods. It has been proposed that rare and low frequency variants would have higher penetrance and would thus explain more of the “missing heritability” than common variants have to date. So far, robust associations with obesity have been observed for low frequency, nonsynonymous variants in three loci (MC4R [54], prohormone convertase 1/3 [PCSK1] [55], and BDNF [56]) in samples of European ancestry, using a traditional candidate gene approach. Fuller, genome-wide evaluation of such variants is imminent with advances in high-throughput sequencing technologies and the large efforts of the 1000 Genomes project (a massive collaborative effort to carry out a deep characterization of genomic variation in over 1000 individuals derived from a number of populations worldwide). These efforts will aid in cataloguing variants of lower frequency and might lead to new genotyping arrays and imputation methods that will capture a fuller allele frequency spectrum than has been possible until now.

Copy Number Variants

It has recently been suggested that common copy number variants (CNVs) are unlikely to contribute substantially to the genetic basis of common human diseases [57]. The extent to which CNVs might contribute to “missing heritability” of common traits and disorders is currently not clear [58]. A large fraction of human CNVs arise from common, diallelic polymorphisms [57, 59] and most of these CNVs are in linkage disequilibrium with adjacent SNPs, so their contribution to phenotypes can be assessed via these SNPs. This strategy recently proved its value when a SNP significantly associated with BMI (NEGR1, rs2815752; Fig. 1) was also found to tag a 45-kb deletion upstream of NEGR1. The deletion is a causal candidate for the association signal, but this needs further fine-mapping and functional work to be proven [36••].

Further evidence that CNVs contribute to the genetic architecture of human obesity came with the finding that large, rare chromosomal deletions on chromosome 16p11.2 are associated with severe, highly penetrant obesity [60••, 61•]. The deletions span a large number of genes including SH2B1, which is known to be involved in leptin and insulin signaling [62]. The carriers of the deletions exhibit hyperphagia and severe insulin resistance, which resemble the phenotype in rodents with the deletion of Sh2b1 [63].

As common variants at the same locus near SH2B1 have been associated with BMI (used as a quantitative trait) in GWAS (Fig. 1; [36••, 37••]), the data also provide further evidence that we might expect to find more loci where common variants are associated with common forms of obesity as well as rare variants that are associated with severe, extreme forms of obesity. Thus, further evaluations of the full range of CNVs are needed to fully estimate their impact on obesity.

Non-Coding RNA

Even more of the heritability of obesity might be explained by determining the role of non-coding RNAs, such as microRNAs (miRNAs). These are short, non-coding RNA molecules that regulate gene expression post-transcriptionally by binding to the 3′UTR of their target genes and limiting translation into protein. Each miRNA is suspected to target several genes. miRNAs are known regulators of adipocyte differentiation, obesity, insulin resistance, and appetite regulation [64]. Thus, dysregulation of miRNA expression (caused by genetic or environmental “drivers”) can potentially influence obesity susceptibility. A better understanding of miRNA physiology could be exploited to elucidate the etiology of the disease, and miRNAs could be manipulated to reduce obesity, as has already been done in animal studies for microRNA-122 [65].

Epigenetic Modifications

Modifications that affect gene expression but do not alter the DNA sequence are termed epigenetic modifications [66]. These include methylation and histone modifications, which are likely to have key roles in the inheritance and susceptibility to obesity [67], by affecting the expression of associated genes. Also, intrauterine environment during specific developmental stages can alter the epigenetic profile of an individual and may serve as a primer for the onset of obesity and other phenotypes in later age [68]. Next-generation sequencing and ChIP-Seq technology can be applied to discern the epigenetic profile of the genes already associated with obesity and may also be applied in a genome-wide approach and allow integration with existing GWAS and genome sequencing data.

Differences Between the Metabolically Deleterious and the Metabolically Healthy Forms of Obesity

Although the majority of individuals with overall obesity develop metabolic complications, a proportion (~ 10% to 25%) remains metabolically healthy [69]. It has been speculated that this is due to retained insulin sensitivity, despite the metabolic challenges that overall obesity presents. Importantly, recent, robust genetic associations with obesity phenotypes can thus be used to explore the differences between the metabolically deleterious and the metabolically healthy forms of obesity, which might aid in the distinction between obesity loci that lead to metabolic dysregulation and those that do not [70].

Identifying the Causal Variant/Gene

After identifying susceptibility loci, caution in the interpretation of the results and associations is still necessary because associated genetic variants are not always within known genes and associations can sometimes span large areas containing a number of genes. Thus, in most cases, we cannot say for certain that we have identified the “smoking gun.” Large-scale fine-mapping and resequencing efforts are needed to catalogue and evaluate the genetic variation from the full allele spectra in associated regions. Exploiting the different linkage disequilibrium patterns in samples from different ethnicities in these efforts is an obvious strategy for homing in on regions with a higher likelihood of containing the etiologic variant. Dissecting function and proving causation of genetic variants is not necessarily a straightforward task, even for relatively simple cases of monogenic diseases, and the research community faces the challenge of robustly translating these findings into a characterization of function and of consequences for physiology.

Conclusions

The current obesity epidemic does not have a purely genetic basis, although genetics do play a large role in susceptibility. Changes to lifestyle over the past century have created an “obesogenic environment” in which underlying individual genetic factors contributing to risk can be exposed. With the advent of GWAS, we have finally started to detect robust associations between common genetic variants and obesity. Many of the earliest genetic associations hinted that susceptibility to obesity might act through CNS action, and that the response to obesogenic environment exposure may be neurobehaviorally driven. In addition, other evidence suggests that some genetic variants act peripherally (eg, in adipose tissue).

Although GWAS have been successful in identifying obesity loci, these only explain a small fraction of the interindividual variation, so additional genetic factors remain to be detected. However, these findings will only yield useful therapeutic interventions once functional variants are exposed and further molecular and physiologic characterization of the genes and pathways involved is performed.