Introduction

Type 2 diabetes (T2D) is a metabolic disorder characterised by hyperglycemia due to β cell dysfunction and insulin resistance [1, 2]. T2D affects 8.3% of the adult population worldwide and is one of the most common non-communicable diseases of current times [3, 4]. Aetiologically, T2D arises in response to a combination of genetic predisposition and environmental or lifestyle factors. The genetic origins of T2D have long been supported by family and twin studies [5]. The most recent GWAS meta-analysis in T2D identified > 400 genetic risk variants explaining 15–18% of the heritability of the disease [6, 7]. Most of the T2D-risk variants identified to date act through effects on β cell function [5, 8] and to a lesser extent through effects on insulin resistance and obesity [1].

Epigenetic modifications include DNA methylation (DNAm), post-translational modification of histone proteins, and non-coding RNAs (ncRNAs) [9,10,11]. More rarely studied modifications include 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) which are sequentially produced by oxidation during enzymatic demethylation [12]. Epigenetic modifications act at the interface between the environment and coordinated transcriptional control and may also contribute to T2D disease risk. This may be partly via genetic influences on epigenetic modifications, but epigenetic response to the influence of environmental and lifestyle exposures is likely to be predominant. Evidence supporting this comes from greater epigenetic variation observed in population studies compared with that in discordant monozygotic twins [13].

The most common epigenetic modification analysed in epidemiological studies of complex diseases is DNAm [10, 11, 14], mainly found at CpG dinucleotides [9,10,11]. Growing evidence supports an association between T2D and DNA methylation variation measured before [15] and after disease onset [16•]. However, it is unknown whether epigenetic markers (particularly those identified in non-target tissues) play a causal role in the development of T2D, are a consequence of disease status, or reflect residual confounding [9]. Moreover, samples that contain mixed cell types carry a composite epigenetic signature that can influence associations, and this is not fully accounted for in all observational studies despite currently available biostatistical tools [17,18,19,20].

This review presents an overview of current evidence around T2D and epigenetics, with a primary focus on DNA methylation and epidemiological studies in humans. It describes inherent limitations of epigenetic studies in ascertaining causality and provides examples of studies that have attempted to address these limitations by implementing causal inference methods such as Mendelian randomisation.

General Considerations in the Design of DNA Methylation Studies

The use of DNAm in epidemiological studies imposes some constraints, particularly when the aim is to identify evidence for a causal role of DNAm on disease. Common challenges include tissue specificity, confounding, effect modification, and statistical power to identify methylation variable sites with enough interindividual variation to be informative between comparison groups [21, 22]. Additionally, technological advances have moved the field of epigenetics from the study of a few CpG sites within specific genes, towards the genome-wide assessment of variation in DNAm at single-base resolution [23•]. Genome-wide studies of DNAm allow ascertainment of disease-relevant variation more comprehensively than candidate gene studies. However, they also increase the multiple testing burden (~ 400–800 K sites analysed), requiring larger samples to robustly identify small effect sizes [10, 24]. In addition, current array-based methods for genome-wide assessment only cover < 2% of the total methylation sites available in the genome [11]. Prediction of DNAm at unmeasured sites is more difficult to achieve than for common genetic variants (SNPs) measured using arrays due to the more complex correlation structure of DNAm over specific genomic regions and CpG site densities, and due to the temporal and tissue-specific variation of DNAm [25].
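To make the multiple testing burden described above concrete, the following minimal sketch computes a Bonferroni-corrected per-site significance threshold. The function name is hypothetical, and Bonferroni correction is assumed here only as one common, conservative choice; it is not necessarily the correction applied in the studies cited. The site counts are taken from the approximate range quoted in the text.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Illustrative numbers of CpG sites, following the ~400-800 K range in the text.
    for n_sites in (400_000, 800_000):
        print(n_sites, bonferroni_threshold(0.05, n_sites))
```

Under this assumption, doubling the number of sites analysed halves the per-site threshold, which is why genome-wide designs need larger samples to detect the same effect size.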

In principle, when the aim is to study aetiology, DNAm markers should be studied in disease-relevant tissues (i.e. insulin-responsive tissues or insulin-producing β cells) to validate their role in the causal pathway to disease [26]. For T2D, such tissues might include pancreatic islets, liver, adipose tissue, or skeletal muscle. However, internal tissues are more difficult to access than peripheral blood, especially at scale [23•, 27•]. In addition, tissue contamination due to ongoing inflammatory processes in obese patients with T2D (i.e. infiltration of blood cells into adipose tissue) or due to sample derivation should be considered. Cellular heterogeneity can influence the comparison of DNAm across tissues. Currently, cell-type deconvolution methods are being developed to estimate the proportion of cells in tissues different from peripheral blood [28]. These methods could aid in discriminating the inflammatory proportion in internal target tissues. Conversely, when the aim is primarily for prediction (irrespective of mechanistic role), it may be appropriate to study more accessible tissues like blood, saliva, buccal cells, cells in urine, skin cells, and faeces, which are commonly collected in large-scale epidemiological studies [11, 14]. Replication of signals detected in accessible tissues within internal target tissues may provide further evidence of their biological role in T2D pathophysiology and their potential use in diagnostics or therapeutics.
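The idea behind reference-based cell-type deconvolution mentioned above can be illustrated with a deliberately simplified sketch. It assumes only two cell types (for example, a target cell type and infiltrating blood cells), known reference methylation profiles for each, and a linear mixing model; published methods handle many cell types and are considerably more elaborate. The function name and all values are hypothetical.

```python
from typing import Sequence


def estimate_two_cell_mixture(
    observed: Sequence[float],
    ref_a: Sequence[float],
    ref_b: Sequence[float],
) -> float:
    """Least-squares estimate of the proportion of cell type A in a two-type mixture.

    Assumed model: observed[j] = p * ref_a[j] + (1 - p) * ref_b[j] + noise,
    where each value is a methylation beta value (0-1) at CpG site j.
    """
    if not (len(observed) == len(ref_a) == len(ref_b)):
        raise ValueError("all inputs must cover the same CpG sites")
    numerator = 0.0
    denominator = 0.0
    for y, a, b in zip(observed, ref_a, ref_b):
        diff = a - b  # how strongly this site separates the two cell types
        numerator += (y - b) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("reference profiles are identical; proportion not identifiable")
    # Clip to [0, 1] so that the estimate is a valid proportion.
    return min(1.0, max(0.0, numerator / denominator))


if __name__ == "__main__":
    ref_a = [0.9, 0.1, 0.5]  # hypothetical reference profile, cell type A
    ref_b = [0.1, 0.8, 0.5]  # hypothetical reference profile, cell type B
    mixed = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]
    print(estimate_two_cell_mixture(mixed, ref_a, ref_b))
```

Sites where the two reference profiles coincide (the third site here) contribute nothing to the estimate, which reflects why deconvolution relies on CpG sites that discriminate between cell types.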

Methylation may be influenced by environmental factors related to the disease being studied, by the disease itself or disease treatment, and this possibility of reverse causation means that it can be difficult to discern causality in cross-sectional studies [11, 14, 22]. Studies that measure DNAm before clinical detection of T2D can be valuable in this regard [14, 23•, 26], as they reduce the likelihood of confounding by reverse causation (i.e. disease influencing DNAm variation). However, longitudinal studies are expensive and uncommon when compared with cross-sectional studies [11, 14, 23•] and do not completely eliminate risk of reverse causation due to subclinical manifestations of the disease [10]. In T2D, several studies have replicated markers of predisposition detected longitudinally, in cross-sectional case-control studies [27•, 29•]. This suggests that variation in methylation detected prior to disease onset is not necessarily indicative of causality since the observed associations can still be influenced by unmeasured environmental or genetic confounders [10]. Alternatively, the analysis of glycemic and other T2D-related traits has been useful to identify markers of predisposition and potential prediction of T2D in disease-free participants [15, 30,31,32].

Another challenge that can hamper the identification of epigenetic mechanisms in T2D is sample size. Adequate sample sizes required to detect associations in epigenetic epidemiology studies have been estimated to be in the region of ~ 1000 samples, whereas the vast majority of published literature in this field falls well below this threshold [24]. Studies of the epigenetic epidemiology of T2D have tended to be small-scale (< 100 participants) and therefore suffer from low statistical power. More recently, replication and meta-analysis of associations across studies have become more common [11, 22]. Replication and meta-analysis are facilitated by the emergence of large consortia of cohorts that use similar profiling methods for DNAm and standardised protocols for data pre-processing and analysis [10, 11, 22].
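The link between effect size, significance threshold, and sample size discussed above can be sketched with a standard normal-approximation formula for a two-group comparison at a single CpG site. This is an illustrative calculation only, not the method behind the ~1000-sample estimate in [24]; the function name, the assumed methylation difference, and the assumed standard deviation are all hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(mean_diff: float, sd: float, alpha: float, power: float = 0.8) -> int:
    """Approximate n per group for a two-sided, two-sample z-test at one CpG site."""
    if mean_diff == 0 or sd <= 0:
        raise ValueError("mean_diff must be non-zero and sd must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) * sd / mean_diff) ** 2)


if __name__ == "__main__":
    # Assumed scenario: 1% methylation difference, SD of 5%, array-wide threshold.
    array_wide_alpha = 0.05 / 400_000
    print(samples_per_group(mean_diff=0.01, sd=0.05, alpha=array_wide_alpha))
```

Because the required sample size scales with the inverse square of the effect size, small methylation differences at a stringent genome-wide threshold quickly push requirements beyond the sizes typical of early studies.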

Summary of Current Knowledge from Human DNA Methylation Studies in T2D

Candidate Gene Analyses

Epigenetic studies based on candidate loci rely on previous knowledge in order to select the genomic region(s) to study [33]. Several candidate gene studies have been conducted to date that use T2D relevant tissues such as human pancreatic islets [34,35,36,37] or skeletal muscle biopsies [38] from T2D donors and appropriately selected controls. These studies have identified differential methylation at genes related to insulin activity (INS and GLP1R) [35, 37], β cell function (PDX1) [36], and energy balance (PPARGC1A) [34]. In addition, there have been several candidate gene studies aimed at identifying methylation variable loci using peripheral blood DNA [39,40,41,42,43] (Table 1), and they have been reviewed in detail elsewhere [23•, 33, 44]. One recent example is the study conducted by Seman et al. [45] looking at differential methylation at the promoter of SLC30A8, a pancreas-specific zinc efflux transporter [23•]. The authors identified hypermethylation of five CpG sites in SLC30A8 in T2D cases (n = 509) versus controls (n = 441) in a large Malay study [45].

Table 1 Characteristics of candidate gene studies and genome-wide studies associated with type 2 diabetes.

Overall, results of the candidate gene approach have shown that differential methylation at the promoter regions of well-established genetic loci for T2D is associated with T2D risk. Two hypotheses arise from this observation: (i) that genetic and epigenetic perturbation at the locus of interest may both contribute to disease risk (an additive effect), or (ii) that DNAm might be important in mediating the effect of known genetic variants on T2D (a mediating effect). However, recent genetic studies of DNAm using peripheral blood have provided limited evidence that DNAm mediates the effect of known T2D-SNPs on T2D risk [46, 47•], with the exception of methylation at the T2D candidate loci KCNJ11, WFS1 [47•], and KCNQ1 [46].

Epigenome-Wide Approaches

Epigenome-wide association studies (EWAS) of T2D have been conducted to identify novel markers of disease incidence or prevalence (Table 1) using longitudinal and cross-sectional studies, respectively.

Blood DNA Methylation as a Marker of Incident T2D

One of the earliest genome-wide studies of T2D was conducted by Toperoff et al. using a microarray-based technology [48]. This study identified several differentially methylated regions (DMRs) associated with T2D that were enriched in genetic loci previously reported for T2D. In a second prospective cohort, the authors identified that hypomethylation of one of the associated regions (in FTO) was observed in young individuals who later progressed to T2D, relative to the individuals who stayed healthy [48].

More recently, Chambers et al. conducted the largest study to date looking at differential DNAm in association with future T2D. This multiethnic longitudinal study included samples of Indian Asian (discovery) and European (replication) origin [16•]. Hypomethylation at CpG sites in TXNIP (cg19693031), PHOSPHO1 (cg02650017), and SOCS3 (cg18181703), and hypermethylation at SREBF1 (cg11024682) and ABCG1 (cg06500161) were associated with greater risk of developing T2D over the ~ 8.5-year follow-up [16•] (Table 1). In addition, these five CpG sites were combined into a methylation risk score that predicted a 3.51-fold (95% CI = 2.79–4.42) increased risk of future T2D among Indian Asians. This association was independent of adiposity and the homeostasis model assessment for insulin resistance (HOMA-IR) [16•]. Due to the short time elapsed between sample recruitment and disease detection (mean = 8.5 years), it is possible that some individuals with subclinical disease could have been misclassified as disease-free at baseline in this study. However, an independent study by Dayeh et al. [54] replicated associations at ABCG1 and PHOSPHO1 using samples from the Botnia prospective family-based study. In this study, unaffected participants were followed up for an average of 8.1 years until clinical detection of T2D. Dayeh et al. also demonstrated that methylation of ABCG1 and PHOSPHO1 was associated with other metabolic risk factors [54]. To further support a mechanistic role of methylation at ABCG1 and PHOSPHO1 in T2D, Dayeh et al. compared the association at these sites using target tissues for T2D, identifying consistency in the direction of effect between blood and adipose tissue for ABCG1, and between blood and skeletal muscle for PHOSPHO1 [54]. Lastly, gene expression of ABCG1 was inversely correlated with methylation of ABCG1 in muscle but not in peripheral blood, whilst no correlation between these traits was identified for PHOSPHO1 in any of the tissues interrogated [54].
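A methylation risk score of the kind described above is, in general terms, a weighted combination of methylation values across selected CpG sites. The sketch below is a generic illustration and not the scoring procedure used in [16•]: the function name is hypothetical, and the weight magnitudes are invented. Only the signs of the weights follow the directions of association reported in the text (negative where hypomethylation was associated with greater risk).

```python
from typing import Mapping

# Hypothetical weights: signs follow the reported directions of association,
# magnitudes are made up for illustration only.
ILLUSTRATIVE_WEIGHTS = {
    "cg19693031": -1.0,  # TXNIP (hypomethylation associated with higher risk)
    "cg02650017": -1.0,  # PHOSPHO1
    "cg18181703": -1.0,  # SOCS3
    "cg11024682": 1.0,   # SREBF1 (hypermethylation associated with higher risk)
    "cg06500161": 1.0,   # ABCG1
}


def methylation_risk_score(
    betas: Mapping[str, float],
    weights: Mapping[str, float] = ILLUSTRATIVE_WEIGHTS,
) -> float:
    """Weighted sum of methylation beta values (0-1) over the CpG sites in `weights`."""
    missing = [cpg for cpg in weights if cpg not in betas]
    if missing:
        raise KeyError(f"missing methylation values for: {missing}")
    return sum(weights[cpg] * betas[cpg] for cpg in weights)


if __name__ == "__main__":
    participant = {cpg: 0.5 for cpg in ILLUSTRATIVE_WEIGHTS}  # hypothetical beta values
    print(methylation_risk_score(participant))
```

In practice the weights would be effect estimates from a discovery sample, and the score would be evaluated in an independent sample to avoid overfitting.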

Blood DNA Methylation as a Marker of Prevalent T2D

The vast majority of T2D EWAS have used a cross-sectional case-control design to compare DNAm in diagnosed T2D cases and controls who are presumed to be disease-free [15, 27•, 29•, 48, 49, 52, 55, 56]. Some of the CpG sites identified in studies of prevalent T2D (TXNIP, ABCG1, and SREBF1) have also been reported in studies of incident T2D [16•]. This might be explained by misclassification of subclinical T2D as disease-free (as discussed above), or it might also reflect a causal effect of DNAm at these sites on T2D, which persists once the disease is established. It could also reflect persistent confounding factors, including underlying genetic effects on T2D and DNAm.

In EWAS of T2D using peripheral blood, the CpG site that has most commonly been associated with T2D, independently of body mass index (BMI), is TXNIP (cg19693031) [15, 27•, 29•, 55, 56•]. For example, in a case-control study of ~ 1500 adults, Florath et al. [29•] identified 39 T2D-associated CpG sites in a discovery cohort, with replication of methylation differences in a second subset of the cohort for a signal mapping to TXNIP. At this site, T2D cases are consistently hypomethylated compared with controls according to studies in Europeans [16•, 29•, 55], Indian Asians [16•], Mexican Americans [15], Arabs [49], and Ghanaians [56•]. The generalisability of the association at TXNIP across populations supports the potential clinical use of this site as a biomarker of T2D risk. In addition, methylation of TXNIP appears to be inversely associated with HbA1c [29•, 55, 56•] and fasting glucose [29•], leading to the hypothesis that sustained hyperglycemia may be one of the factors driving hypomethylation of TXNIP [57•]. TXNIP is the thioredoxin interacting protein, which is responsive to glucose concentrations in the cell. The protein is overexpressed in humans and animals with T2D [23•], and its function has been linked to vascular complications by modulating angiogenesis and inhibiting the vascular endothelial growth factor (VEGF) [23•].

In addition to TXNIP, reproducible CpG sites in T2D have been reported at ABCG1 (cg06500161), C7orf50 (cg04816311), and CPT1A (cg00574958) [56•], and more population-specific sites have been discovered at DQX1 (cg06721411), TPM4 (cg07988171), and MSI2 (cg23586172) in samples of Qatari [49], Ghanaian [56•], and Korean origin [58], respectively.

Considering the growing evidence of DNAm as a marker of T2D predisposition and state, Walaszczyk and colleagues evaluated the replicability of the CpG sites most recently reported in the literature in association with T2D, HbA1c, and fasting glucose [27•]. Associations considered for replication were CpG sites that had been previously reported across ethnic groups and tissues [27•]. The target sample for replication was a case-control subsample (n = 200, cases = 100, controls = 100) of the LIFELINES prospective population-based study from the Netherlands with availability of whole blood DNAm [27•]. Replication was achieved for T2D-associated CpG sites in ABCG1, LOXL2, TXNIP, SLC1A5, and SREBF1, and for fasting glucose-associated CpG sites in ABCG1 and CCDC57 (Table 1). Additionally, the authors reported poor cross-tissue consistency in T2D-associated CpG sites, as none of the associations previously reported in the liver, pancreas, and adipose tissue were replicated in blood.

EWAS of prevalent and incident T2D using peripheral blood DNA have also demonstrated that methylation variable loci in T2D do not overlap with previous GWAS loci for the disease. Thus, DNAm may influence biological mechanisms of tissue response to hyperglycemia different from those implicated by genetic studies, which appear to be primarily associated with β cell function and insulin activity.

EWAS of T2D in Disease-Relevant Tissues

EWAS of T2D have also been conducted in disease-relevant tissues and have recently been reviewed by Davegårdh et al. [57•]. Sample sizes used in these studies tend to be smaller due to limited tissue or cell availability. Nevertheless, replicated associations between DNAm and T2D, or T2D-related traits (e.g. obesity, BMI, fat distribution, diet, exercise), have been identified in adipose tissue [50, 51, 59,60,61,62,63,64], islets [53, 65,66,67], skeletal muscle [59, 68, 69], and liver tissue [70, 71] (Table 1). There has been little overlap between CpG sites identified in EWAS of these tissues and EWAS of blood, except at ABCG1, which was hypermethylated in blood and in adipose tissue of T2D cases [54], and MSI2, which was hypomethylated in blood and in pancreatic islets of T2D donors [58]. Conversely, unlike in EWAS of T2D in blood, some of the methylation loci identified in EWAS of disease-relevant tissues overlap with GWAS loci for T2D [60,61,62, 67].

The Genetics of Epigenetics and Using Mendelian Randomisation as a Method to Infer the Causal Role of Methylation Variation in T2D

Disease-associated methylation variation may be causal or consequential [10, 72]. Several mechanisms explain how variation in methylation arises prior to disease onset, for example via stochastic changes during development, or in response to environmental exposures at any stage of the life course [12]. However, variation in methylation detected prior to disease onset is not always an indicator of causality [12]. Because observational studies do not allow us to distinguish between causal and consequential epigenetic variation, following robust replication of findings, triangulation of methods for assessing causality is increasingly informative [73, 74]. Methods include parental negative control studies [75], cross-cohort comparisons [73], matched within sibship designs [76], and Mendelian randomisation (MR) [9, 11, 74, 77,78,79]. MR is increasingly widely applied in epigenetic studies and is reviewed here.

MR uses germline genetic variants as instrumental variables to assess the causal relationship between a modifiable exposure (in this case, DNAm) and a related outcome (in this case, T2D) in observational epidemiology [9, 80,81,82]. Because genetic variants are randomly transmitted from parents to offspring and fixed at conception, they are not influenced by behavioural, socioeconomic, or physiological factors that commonly affect observational associations, or by the disease itself through reverse causation [9, 80, 81]. MR can be applied to epigenetic studies in a number of different ways (Fig. 1): (a) to seek causal evidence of an effect of an exposure (e.g. smoking, alcohol intake, obesity) on methylation variation [78]; (b) to seek causal evidence of a mediating role of methylation variation between an exposure and a disease outcome (e.g. smoking, methylation change, lung cancer) [83]; or (c) to assess the directionality of an observed association when reverse causation is suspected [32•].
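The core instrumental-variable logic described above can be illustrated with the Wald ratio, the simplest MR estimator for a single genetic variant. The sketch below assumes summary-level effect estimates for the SNP-exposure (e.g. SNP-DNAm) and SNP-outcome (e.g. SNP-T2D) associations, uses a delta-method approximation for the standard error that ignores any covariance between the two estimates, and takes the usual MR assumptions as given. The function name and input values are hypothetical.

```python
from math import sqrt
from typing import Tuple


def wald_ratio(
    beta_snp_exposure: float,
    se_snp_exposure: float,
    beta_snp_outcome: float,
    se_snp_outcome: float,
) -> Tuple[float, float]:
    """Return (causal estimate, standard error) for a single-SNP instrument."""
    if beta_snp_exposure == 0:
        raise ValueError("SNP-exposure effect must be non-zero")
    estimate = beta_snp_outcome / beta_snp_exposure
    # Delta-method variance of a ratio, assuming the two estimates are uncorrelated.
    variance = (
        se_snp_outcome ** 2 / beta_snp_exposure ** 2
        + beta_snp_outcome ** 2 * se_snp_exposure ** 2 / beta_snp_exposure ** 4
    )
    return estimate, sqrt(variance)


if __name__ == "__main__":
    # Hypothetical summary statistics for one meQTL.
    print(wald_ratio(0.5, 0.05, 0.1, 0.02))
```

The estimate is interpreted as the change in the outcome per unit change in the exposure, in whatever units the two summary associations were reported.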

Fig. 1

Example of how Mendelian randomisation can be applied to ascertain causality in epigenetic studies of T2D. a Investigate the causal role of known risk factors for T2D on variation in DNA methylation using EWAS evidence. Genetic proxies for the risk factor are extracted from the largest GWAS meta-analyses. These genetic variants should be independent of known confounders of the main association. b Use of MR to interrogate the mediating role of DNA methylation variation in the association between established risk factors and T2D. This design is known as a two-step epigenetic MR. The first step of the analysis calculates the causal effect of a risk factor on variation in DNA methylation based on EWAS findings and using GWAS loci to proxy variation in the exposure. The second step calculates the causal effect of DNA methylation (mediator) on T2D using independent methylation quantitative trait loci (meQTL acting in cis or trans) to proxy for variation in DNA methylation. meQTL are extracted from large studies of meQTL catalogues. Lastly, the mediated effect is calculated by multiplying the intermediate causal effects between the risk factor and DNA methylation, and between DNA methylation and T2D. c Applying bidirectional MR to investigate the causal direction of an observational association identified between DNA methylation and T2D in an EWAS
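The final step of the two-step design in panel b, multiplying the two intermediate causal effects, can be sketched as a product-of-coefficients calculation. The standard error below uses a first-order delta-method approximation and assumes the two step estimates are independent; the function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from typing import Tuple


def mediated_effect(
    effect_risk_factor_on_dnam: float,
    se_risk_factor_on_dnam: float,
    effect_dnam_on_t2d: float,
    se_dnam_on_t2d: float,
) -> Tuple[float, float]:
    """Return (mediated effect, standard error) from the two MR steps."""
    a, se_a = effect_risk_factor_on_dnam, se_risk_factor_on_dnam
    b, se_b = effect_dnam_on_t2d, se_dnam_on_t2d
    product = a * b
    # First-order delta method for the variance of a product of independent estimates.
    standard_error = sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return product, standard_error


if __name__ == "__main__":
    # Hypothetical step 1 (risk factor -> DNAm) and step 2 (DNAm -> T2D) estimates.
    print(mediated_effect(0.4, 0.1, 0.5, 0.2))
```

Because the uncertainty of both steps propagates into the product, a mediated effect can be imprecise even when each step is individually well estimated.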

Sources of genetic instruments to conduct MR studies are GWAS of relevant exposures or traits, and studies identifying methylation quantitative trait loci (meQTL), which detect common genetic variants (SNPs) associated with variation in DNAm at CpG sites [9, 11, 84]. Due to the nature of DNAm, meQTL need to be identified in a temporal and tissue-specific manner, ideally consistent with the time-point and tissue where the epigenetic association was observed [9, 11, 84]. Large-scale meQTL studies have been conducted by Gaunt et al. (www.mqtldb.org) [84], Bonder et al. (BIOS QTL browser, https://genenetwork.nl/biosqtlbrowser/) [85], and most recently, by the genetics of DNAm consortium (GoDMC, www.godmc.org.uk/). Collectively, these studies provide a catalogue of known meQTL. However, they have the limitation that meQTL are exclusively derived from peripheral blood DNA. To date, the two largest consortia for the study of the genetics of T2D and glycemic traits are the Diabetes Genetics Replication and Meta-analysis (DIAGRAM, www.diagram-consortium.org) and the Meta-analysis of Glucose and Insulin-related traits consortium (MAGIC, www.magicinvestigators.org/).

Special considerations for the design of MR studies have been described elsewhere [80,81,82]. Based on the source of data used to derive effect estimates of the association between the genotype, the modifiable exposure, and the outcome, the MR approach can be a single sample MR (estimates from a single sample with individual-level data) or a two sample MR (estimates from two independent samples with summary data) [81, 82]. Previously, MR studies in T2D have been performed to understand the causal role of adiposity, blood lipids, and inflammatory risk factors on the disease [86]. However, as outlined above, MR can also be extended to study the causal role of DNAm as a mediator in the exposure-outcome association, or as the exposure or outcome of interest [77]. In either case, causality needs to be supported by identifying SNPs in strong association with methylation at the CpG site(s) of interest [11, 47•, 77, 78, 87].
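When several independent instruments are available, two sample MR estimates are commonly combined by inverse-variance weighting (IVW) of summary statistics. The following minimal sketch implements a fixed-effect IVW estimator under the assumptions that the SNPs are uncorrelated and that uncertainty in the SNP-exposure estimates is negligible. It is an illustration of the general technique rather than the analysis pipeline of any study cited here; the function name and data are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect inverse-variance weighted MR estimate and its standard error."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have one entry per SNP")
    if not beta_exposure:
        raise ValueError("at least one SNP is required")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2  # inverse variance of the SNP-outcome estimate
        numerator += bx * by * weight
        denominator += bx ** 2 * weight
    if denominator == 0.0:
        raise ValueError("all SNP-exposure effects are zero")
    return numerator / denominator, sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summary statistics from two independent samples.
    snp_exposure = [0.1, 0.2, 0.3]     # sample 1: SNP effects on the exposure
    snp_outcome = [0.03, 0.06, 0.09]   # sample 2: SNP effects on the outcome
    snp_outcome_se = [0.01, 0.01, 0.02]
    print(ivw_estimate(snp_exposure, snp_outcome, snp_outcome_se))
```

This is equivalent to a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, so SNPs with stronger exposure effects and more precise outcome estimates contribute most.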

Despite continued efforts to increase sample size, the power of current meQTL studies only allows identification of a small number of independent SNPs strongly associated with DNA methylation levels at CpG sites of interest [88]. This imposes limitations when using meQTL as instruments in MR studies, because the meQTL capture only a small proportion of the variance in methylation (i.e. weak instrument bias), and because having few instruments precludes further sensitivity analyses to rule out confounding by horizontal pleiotropy [88]. Evidence to date supports a highly polygenic architecture of DNAm. Future meQTL datasets are expected to provide stronger instruments for a larger number of CpG sites, and will need to be accompanied by methods that allow multiple meQTL to be combined in a single instrument whilst accounting for their likely correlation with each other. Approaches such as multiple trait colocalisation have proven useful in strengthening causal inference [88], but further methodological development is warranted. The risk of false positives can be reduced by in-depth inspection of the identified associations, drawing for example upon various sources of tissue-specific reference data.
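Instrument strength is often screened with an F-statistic, which for a single SNP can be approximated from summary data as the squared ratio of the SNP-exposure effect to its standard error. The sketch below applies the conventional rule of thumb that F below 10 flags a potentially weak instrument. The function names and inputs are hypothetical, and the threshold is a convention rather than a guarantee against bias.

```python
def instrument_f_statistic(beta_snp_exposure: float, se_snp_exposure: float) -> float:
    """Approximate single-SNP F-statistic from summary statistics."""
    if se_snp_exposure <= 0:
        raise ValueError("standard error must be positive")
    return (beta_snp_exposure / se_snp_exposure) ** 2


def is_weak_instrument(
    beta_snp_exposure: float,
    se_snp_exposure: float,
    threshold: float = 10.0,
) -> bool:
    """Flag instruments whose approximate F-statistic falls below the threshold."""
    return instrument_f_statistic(beta_snp_exposure, se_snp_exposure) < threshold


if __name__ == "__main__":
    # Hypothetical meQTL effect estimates on methylation at a CpG site.
    print(is_weak_instrument(0.5, 0.05))   # strong association
    print(is_weak_instrument(0.03, 0.02))  # imprecise association
```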

In comparison with GWAS of complex traits that include large sample sizes, studies with availability of genetic and DNAm data are generally modest in size [47•]. In principle, having a small sample size can limit the use of DNAm in a single sample MR analysis, but this can be circumvented in a two sample MR design, where associations are retrieved from summary data using two independent and well-powered samples [47•].

Interaction Between Genetic and Epigenetic Variation in T2D

The role of the epigenome in regulating gene function is not independent of the genotype, as SNPs can influence methylation variance at CpG sites that also have a component of environmental variance [77]. In some instances, SNPs can affect methylation directly by introducing or removing a CpG site in the context of CpG-SNPs [23•, 89, 90], which have been identified in blood [84] and in T2D relevant target tissues [90,91,92]. Despite identifying SNP-DNAm associations at candidate loci for T2D [46, 47•, 90] and obesity [93], it is still unclear whether genetic variants affect both traits, DNAm and the disease, simultaneously or independently. In a study conducted by Elliott et al. [46], the principles of MR were used to ascertain the role of DNAm as a mediator in the association between the genotype (i.e. established GWAS SNPs for T2D) and future liability to T2D, based on methylation profiled in unaffected young participants [46]. Multiple CpG sites associated with T2D-SNPs were identified as potential non-causal biomarkers for T2D [46]. However, only for one site (mapping to the KCNQ1 gene) was there any evidence that DNAm was on the causal pathway to disease in later life [46].

Instead of using T2D-SNPs as causal anchors to identify CpG sites associated with liability to T2D, Richardson et al. used meQTL as instrumental variables to ascertain the causal role of DNAm as a mediator in the genotype-T2D and genotype-glycemic trait associations [47•]. meQTL were extracted from the mQTL database [84], while associations with the outcome were extracted from the latest GWAS meta-analysis in T2D [94] and glycemic traits [95, 96]. Analyses were performed using a two sample MR, and after multiple testing correction, a causal role of DNAm on T2D (at p < 1.39 × 10⁻⁸) was identified at the CpG sites cg04198914 (HNF1B), cg03864215 (KCNJ11), cg23956648 (IGF2BP2), and cg25064352 (WFS1) [47•], as well as at cg15453836 (PEAK1) and cg01883759 (JAZF1) [47•]. With respect to the glycemic traits, a causal effect of methylation on fasting proinsulin was detected at five CpG sites (in PDE2A, PTPMT1, STARD10, and ARAP1), and on HbA1c at seven CpG sites (in G6PC2, TBCD, and FN3K) [47•]. The use of colocalisation methods further indicated that the same causal variant was explaining variation in DNAm and T2D at KCNJ11 and WFS1, while for the remaining loci, associations were explained by two different but correlated instruments [47•]. To ascertain the true direction of effect, a reverse MR (T2D ➔ DNAm) was conducted for associations with previous evidence of colocalisation [94]. In this analysis, 25 SNPs identified in a recent GWAS meta-analysis for T2D were used as genetic instruments. Compared with results of the forward MR (DNAm ➔ T2D), results of the reverse MR showed weaker evidence (p > 1.39 × 10⁻⁸) that T2D was causally determining changes in DNAm at KCNJ11 and WFS1 [47•]. Overall, the study by Richardson et al. illustrates how MR methods can be used to prioritise DNAm markers with potential influence on T2D and related traits. However, CpG sites identified in this causal analysis cannot be regarded as true mediators of the SNP-T2D and SNP-glycemic trait associations, as possible horizontal pleiotropic effects (i.e. SNP-T2D association independent of DNAm) could not be completely ruled out, even after incorporating colocalisation methods.

Causal Effect of DNA Methylation on T2D and Related Traits Based on EWAS Findings

MR can also be applied to associations detected observationally using EWAS, e.g. for BMI [32•, 78, 87], although this approach has yet to be formally adopted for T2D. For BMI, MR analyses have demonstrated that changes in methylation are more likely to be a consequence of BMI rather than the cause [32•, 87].

A logical extension of this causal evidence (that the disease state impacts methylation and not vice versa) is that methylation variable loci may be informative in prediction of future comorbidities. In a study by Wahl et al. [32•], a methylation risk score generated using 11 CpG sites prospectively associated with BMI was able to predict future T2D risk (relative risk = 2.3, 95% CI = 2.07–2.56 per 1SD increase in the score) [32•].

Considering the growing evidence of methylation variable loci associated with T2D based on well-powered EWAS and meta-analyses of EWAS of T2D, it is necessary to strengthen evidence of the causal role of these signals using triangulation of causal inference methods, including MR, to prioritise candidate methylation loci for the early detection, adequate subtyping, and treatment of T2D. Even if they are not causal, CpG sites detected prospectively in association with T2D can be used as biomarkers based on the replicability of these associations across studies, and on their relevance in revealing new biological mechanisms of disease.

Conclusion

Epigenetic studies of T2D offer a new avenue to discover novel biological mechanisms implicated in T2D aetiology alongside biomarkers of disease that are potentially informative for disease prediction. A number of loci have been detected in large-scale studies measured predominantly in blood, including TXNIP, ABCG1, CPT1A, and SREBF1. Methods to establish causality of epigenetic markers in T2D aetiology are becoming commonplace. Because observational studies do not allow differentiation between causal and consequential epigenetic variation, triangulation of methods for assessing causality is increasingly informative. MR is a frequently used method for assessing causality that we have reviewed here. Ultimately, understanding the causality of epigenetic markers in T2D aids prioritisation of CpG sites as earlier biomarkers to detect disease, or in drug development to target epigenetic mechanisms in order to treat patients.