Background

According to the International Diabetes Federation (IDF), “Diabetes Mellitus is one of the fastest growing global health emergencies of the twenty-first century” [1]. The global prevalence of type 2 diabetes mellitus (T2DM) was 536.6 million in 2021 and is projected to increase to 642.7 million by 2030 and 783.2 million by 2045, which is almost 46% increase in the prevalence [1]. It is estimated that the highest percentage increase will be in middle-income countries compared to high- and low-income countries [1]. The highest prevalence of diabetes in people aged 20–79 years is reported in the Middle East and North African region (MENA) (18.1%). In contrast, the African region has the lowest prevalence (5.2%) which was attributed to comparatively low levels of urbanization and low levels of obesity [1]. China (140.9 million), India (74.2 million) and Pakistan (33 million) have the most significant number of adults with T2DM and are expected to remain the same in 2045. Almost one in two adults with diabetes is undiagnosed, and 87.5% of the undiagnosed are in middle- and low-income countries. In 2021, excluding the mortality associated with the COVID-19 pandemic, almost 6.7 million between the age of 20–79 years died due to diabetes-related complications which is almost 12.2% of the global deaths from all causes [1]. Among this, almost 32.6% of the deaths occurred in working-age people [1]. Diabetes-related costs have increased by 316% over the past 15 years. This highlights the urgent need to improve the ability to prevent the development of T2DM at an early stage.

The increasing burden of T2DM is not completely understood as the aetiology of diabetes is multifactorial, including genetic factors coupled with environmental factors such as rapid urbanization, urban migration, and lifestyle changes [2]. Decades of rapid urbanization and associated socio-economic transformation have resulted in healthier lifestyles and dietary preferences shifting to unhealthy practises [3]. Environmental factors play a significant role in the development of diabetes, but they do not impact everyone in the same way. Even with the same environmental exposures, some are more susceptible to developing diabetes than others, and this increased risk is considered to be inherited [4].

Currently, there are non-clinical and clinical measures for diabetes prevention and management such as lifestyle modifications (dietary modifications, physical activity, behavioural modifications), medical nutrition therapy (MNT), bariatric surgeries, treatment using medicinal plants [5, 6] and clinical measures such as oral anti-diabetic drugs and insulin [7]. Although the benefits of lifestyle modification in diabetes prevention and the effectiveness of pharmacological treatments are well approved, there continues the increase in the prevalence of T2DM. Research suggests that the T2DM has a critical genetic predisposition. Evidence indicates that Indians are more susceptible to insulin resistance than Europeans of similar age and body mass index, suggesting the significant possibility of population-specific genetic risk factors [4, 8, 9]. Furthermore, studies have identified that South Asians have a greater tendency for visceral fat deposition, higher total body fat percentage and insulin resistance compared to other ethnic groups at similar levels of body mass index [4]. Epidemiological studies have reported that migrant Asian Indians living in different parts of the world show a much higher prevalence of diabetes than the residents of countries [4].

Genetic factors play an important role in the pathogenesis of diabetes and thus are an essential element in understanding the cause of the disease and possible prevention methods. Advances in genotyping and sequencing have led to the identification of SNP as genetic variants associated with type 2 diabetes or related glycaemic traits [9]. Combined genetic risk scores composed of the weighted sum of the risk alleles at these loci have been tested for their ability to predict diabetes in individuals beyond the information provided by clinical risk factors [10]. Genome-wide association mapping is a concept well utilized for identifying new risk alleles/loci. Developed countries like the USA and Nordic countries have initiated the Precision Medicine Initiative (PMI) for some major non-communicable diseases such as cancer and T2DM [11]. The emerging field of precision medicine requires understanding the risk alleles of each population. This review aims to analyse the known genetic factors of T2DM in the global population, including India, and identify the significant risk alleles.

Materials and methods

In this review, we included original research and meta-analysis studies that assessed the genetic determinants of T2DM among people in the Indian subcontinent and globally, published to date (the year 2021). We included studies that explored marker–trait associations from observational studies (n = 59) and genome-wide association studies (GWAS) (n = 6). Those studies that reported the criteria of diagnosis for type 2 diabetes mellitus using World Health Organization (WHO) [12] or American Diabetes Association (ADA) criteria [13] only were included in the review. Genome-wide association study reports are from peer-reviewed published works where the samples were taken based on the standard procedure and traits.

Search strategy and study selection

Data sources such as PubMed, Google and Google Scholar were used to identify the studies. The keywords used were “Genetics of T2DM”, “SNPs”, “India” and “Global”. The flowchart of data extraction and review is given in Fig. 1. We classified the SNPs associated with T2DM according to regions such as South East Asian region (India, Sri Lanka), Western Pacific region (Japan, Australia), Eastern Mediterranean (Jordan), African region (Western Africa, Ghana, Nigeria and Kenya) and European countries and American regions (Mexico, Latin America, the USA).

Fig. 1
figure 1

Flowchart of data extraction

Genetics of T2DM

Type 2 diabetes mellitus is polygenic, and over 100 genes have already been reported [4]. Three primary methods are adopted to identify the genetic predisposition of T2DM, which primarily focuses on linkage peaks from family studies, candidate genes on a biological basis and genome-wide association analysis.

Family and twin studies have indicated 20–80% of inheritability of diabetes [14]. First-degree relatives of individuals with T2DM were three times more likely to develop the disease than individuals without a positive family history [14]. Studies have reported that individuals born to affected parents were more likely to develop T2DM [odds ratio (OR) = 6.1, 95% CI = 2.9–13] compared to people with unaffected parents (OR = 3.4–3.5). Although maternal and paternal diabetes conferred risk for developing diabetes, the Framingham offspring study reported that offspring with maternal diabetes had a slightly more chance for abnormal glucose tolerance than those with paternal diabetes (OR = 1.6, CI = 1.1–2.4) [1]. Multiple twin concordance studies in T2DM reported a higher concordance rate in monozygotic twins (OR: 0.29–1.00) than in dizygotic twins (OR: 0.10–0.43), indicating a significant genetic component of the disease [14].

A candidate gene is a gene whose chromosomal location is associated with a trait of interest. Because of its location, the gene is suspected of causing the disease or other related phenotype [15]. Candidate gene association studies focussed on the association of pre-specified genes of interest and the disease. The genes that were found to be associated with T2DM include peroxisome proliferator-activated receptor gamma (PPARG), insulin receptor substrate 1 (IRS1) and IRS-2, potassium inwardly rectifying channel, subfamily J, member 11 (KCNJ11), Wolfram syndrome 1 (wolframin) (WFS1), hepatocyte nuclear factor-1 alpha (HNF1A), HNF1 homeobox B (HNF1B) and HNF4A [14]. The genes, including Rap guanine nucleotide exchange factor 1 (RAPGEF1) and tumour protein 53 (TP53) were identified using an algorithm that prioritized candidate genes for complex human traits based on trait-relevant functional annotation but had not been consistently replicated in later studies [4]. Other candidate genes are tyrosine-protein kinase (LYN), DENN domain-containing protein 1B (DENND1B), mitochondrial ribosomal protein (MRPL30) 3-hydroxyisobutyrate dehydrogenase (HIBADH) [16]. PPARG and KCNJ [11] were the most validated diabetes-associated genes identified through functional candidate analysis [17]. With the rapid improvements in the genotyping technology of SNPs and the Hap Map project, the methods for identifying susceptibility genes have changed dramatically [18]. GWAS identified more than 70 genetic variants associated with T2DM [4]. These gene variants were related to different metabolic pathways of the disease. Studies conducted among European communities have identified 41 SNPs associated with T2DM and found that genes associated with glucose homeostasis, insulin pathway and pancreatic development pathways were the candidate genes associated with T2DM [19]. SNPs at high mobility group box 1 pseudogene 1 (HMG1L1)/CCCTC-binding-like factor (CTCFL), paired box 4 A4 (PLXNA4), cleavage-activating protein (SCAP), chr5p11 and a novel locus at 13q12 at sarcoglycan gamma (SGCG) were associated with T2DM [20]. Table 1 provides a comprehensive list of marker–trait association of T2DM, and includes significant findings related to the genes of T2DM in the global population, including India. Table 2 describes the functional classification of major genes related to T2DM and their related morbidity.

Table 1 Comprehensive list of marker–trait association of T2DM: the studies covering the global population, including India are taken and the significant findings related to the genes of T2DM are listed here
Table 2 Functional classification of major genes related to T2DM and their related morbidity

Genetic studies in the Indian population

South Asians have higher rates of T2DM compared to other ethnic populations. Migrant studies have also reported the same [17]. The most investigated functional candidate genes in South Asians include PPARG, TCF7L2, insulin-like growth factor 2 MRNA-binding protein 2 (IG2BP2), adiponectin, C1Q and collagen domain containing (ADIPOQ) and alpha-ketoglutarate-dependent dioxygenase (FTO) [4]. A study in Sri Lanka replicated the 36 SNPs associated with Europeans. Out of the 36 SNPs, 31 were significantly associated with T2DM. The strongest effects were seen at TCF7L2 and solute carrier family 30, member 8 (SLC30A8) [18]. The Ala gene of PPARG was found to lower the 2-h plasma glucose among the Caucasians, while no effect was seen among the populations in Chennai. Sanghera et al. identified this gene's protective effect among the Sikh community of India [4]. A study conducted in seven geographically distinct areas of India explored 91 SNPs of 55 candidate genes and identified five genes associated with T2DM such as TCF7L2 (rs7903146, rs12255372), insulin-degrading enzyme (IDE) (rs1887922), haematopoietically expressed homeobox protein (HHEX) (rs1111875, rs5015480), ectonucleotide pyrophosphatase/phosphodiesterase 1 ENPP1 (rs1044498) and FTO (rs9939609, rs3751812) [12]. These genes play a major role in the metabolic pathways of diabetes pathobiology. The study also identified an increased risk (OR = 2.44, 95% CI = 1.67–3.59) when TCF7L2, HHEX, ENPP1 and FTO were combined [14]. KCNJ11 rs5210 and potassium voltage-gated channel subfamily Q member 1 (KCNQ1) rs2237895 variants were found to be significantly associated with risk of T2DM in the Indian population but were found insignificant in the South Indian population [21, 22].

A protective-odds (OR = 0.28, 95% CI = 0.19–0.43) was identified with a genotypic combination of IDE, HHEX, ENPP1 and FTO among controls. A study conducted among the Indo-European individuals in Delhi and Pune identified strong association at rs7903146 of TCF7L2 with OR 1.67 [16]. A study by Radha et al. identified the association of rs4810424 and rs736823 of HNF1A gene with T2DM. Genome-wide studies have mapped a susceptibility locus for T2DM to 3q27, where ADIPOQ gene is situated. SNPs of this gene have been studied, and two SNPs, a silent T to G substitution in exon 2 and a G to T substitution in intron 2, were found to be associated in the Japanese population [20]. A study identified that + 10211T/G polymorphism in the adiponectin gene was associated with T2DM in the Asian Indian population [23]. In Hyderabad, South India, a study mapped 3 SNPs associated with T2DM rs7903146, rs12255372 and rs11196205. Among them, rs7903146 was more at risk for T2DM [24]. Initial European studies on the FTO gene identified rs9939609 as associated with high body mass index (BMI). In contrast, among South Indians, rs9939609 was associated with T2DM independent of body mass index (BMI). TCF7L2 is the most widely studied gene, which has been positively associated with T2DM in Europeans [25]. The Chennai Urban Rural study showed similar results where rs12255372 and rs7903146 were associated with T2DM. The “T” allele of these SNPs showed association with non-obese participants. The variants rs9939609 T/A and rs7193144 C/T of FTO were associated with obesity in Asian Indians [26]. Recently, six variants—rs9940128, rs7193144, rs8050136 (intron 1), rs918031, rs1588413 (intron 8) and rs11076023 (3´UTR (unique transaction reference number)), across three regulatory regions of the FTO gene with obesity and T2D in a South Indian population showed that the rs9940128 A/G, rs1588413 C/T and rs11076023 A/T variants were associated with T2D but not with obesity [26]. The C/A variant of rs8050136 was associated with T2DM mediated through obesity. The haplotype “ACCTCT” of this SNP conferred a lower risk of T2DM in the South Indian population [26].

A study in Kerala assessed SNPs of Retinoic acid-inducible gene (STRA6) (rs974456, rs351224, rs736118 and rs4886578), retinol-binding protein 4 (RBP4) (rs3758538, rs36014035 and rs34571439) and glucose transporter type 4 (GLUT4) (rs5412, rs5418 and rs5435). The SNPs of STRA6 were associated with T2DM, while no association was found in RBP4 and GLUT4 17]. Figure 2 shows the SNPs identified across the states of India.

Fig. 2
figure 2

Mapping of SNPs associated with T2DM in India

Genetic distribution of T2DM across regions of the world

Genetic studies in diverse populations are essential for several reasons. Identifying a population-specific variant associated with T2DM can help identify subjects at high risk in that population who could be selected for lifestyle or therapeutic, preventive intervention. Further, discovering causal genes in these populations can expand our understanding of T2DM or lead to a potential therapeutic target that could be valuable even in populations where the genetic variant that prompted the discovery is not present [14].

European region

The initial studies on the genes associated with T2DM were conducted among Europeans [27]. A study by Barroso et al. analysed 71 candidate genes based on their known or putative role in glucose metabolism. The selected genes were subdivided into three broad groups based on their function such as (1) genes primarily involved in pancreatic β-cell function; (2) genes primarily influencing insulin action and glucose metabolism in the main target tissues, muscle, liver and fat; and (3) other genes [28]. Twenty SNPs in 11 different genes showed statistically significant association with disease status (p < 0.05). The strongest statistical evidence for disease association was for genes such as son of sevenless homologue 1 (SOS1), phosphoinositide-3-kinase regulatory subunit 1 (PIK3R1), ATP-binding cassette subfamily C member 8 (ABCC8), insulin receptor (INSR) and KCNJ11 [28]. GWAS among the Europeans have identified T2DM susceptibility loci at PPARG (rs1801282) [29], KCNJ11 (rs5219) [30], WFS1 (rs10010131) [31], IGF2BP2 (rs4402960) [32], SLC30A8 (rs13266634) [33], CDKN2A/B (rs10811661) [33, 34], HHEX/IDE (rs1111875) [32], FTO (rs8050136) [35], neurogenic locus notch homolog protein 2 (NOTCH2) (rs10923931) [36], thyroid adenoma-associated (THADA) (rs7578597) [36], KCNQ1 (rs231362) [36], prospero homeobox protein 1 (PROX1) (rs340874) [36], B cell lymphoma/leukaemia 11A (BCL11A) (rs243021) [30], glucokinase regulator (GCKR) (rs780094) [37], TCF7L2 (rs7903146) [38]. The Pro12Ala variant of PPARG showed protective effects in Finnish, Czech and Scottish ancestries [39].

African regions

Pirie et al. concluded that the risk polymorphisms identified in Caucasian populations were not associated with type 2 diabetes in South African subjects of Zulu descent, except for rs7903146 (TCF7L2) [40]. The study analysed rs1801282 (PPARG), rs5215 (KCNJ11), rs12255372 (TCF7L2), rs7903146 (TCF7L2) rs9939609 (FTO) and rs1111875 (HHEX) which were found to be significant among the European ancestry. At the locus TCF7L2, homozygosity for the C allele (CC) was less frequent in the subjects with type 2 diabetes. Heterozygosity (CT) at rs7903146 (TCF7L2) occurred more frequently in the subjects with type 2 diabetes. No difference was found between subjects with type 2 diabetes and controls for the TT genotype at rs7903146 [40]. This variant of TCF7L2 was also associated with T2DM in the Western African population [40, 41].

Furthermore, Pirie et al. identified that the Africans have only a homozygous variant of KCJN11, unlike American and European ancestry with heterozygous and homozygous variants. The K variant found significant in European ancestry was rare or non-existent and absent in the Africans [30, 40]. A genome-wide association study of 5000 Africans from Ghana, Nigeria and Kenya identified a novel locus zinc finger RANBP2-type-containing 3 (ZRANB3) gene for T2DM [42]. ZRANB3 is a protein-coding gene with nucleic acid binding and endonuclease activity. The ZRANB3 transcript targets nonsense-mediated decay (NMD) and is expressed in tissues relevant to T2D, including adipose tissue, skeletal muscle, pancreas, and liver [42].

Studies among African Americans showed considerable differences in genetic and non-genetic risk factors (including lifestyle and behavioural factors) with the native African population. African Americans had approximately 20% European admixture [41]. Studies among African Americans showed 30% to 40% higher risk for T2DM among those with the highest tertile of African ancestry [43].

American continent

A study by Mercader & Florez, 2017 among the Latino population, solute carrier family 16 Member 11 (SLC16A11) (rs77086571), HNF1A (rs483353044) and insulin-like growth factor (rs149483638) was found to be significantly associated with T2DM [44]. This variant of insulin-like growth factor was present at approximately 17% in the Mexican population but was rare in European and other populations [44]. The rs483353044 of HNF1A gene was associated with T2DM and was found in 0.36% of individuals without T2D but in 2.1% of participants with the disease [24]. Among the Mexican Americans, ATP-binding cassette transporter (ABCA1), adrenoceptor beta 3 (ADRB3), calpain 10 (CAPN10), CDKAL1, CDKN2A/2B, C-reactive protein (CRP), engulfment and cell motility protein 1 (ELMO1), FTO, HHEX, IGF2BP2, insulin receptor substrate 1 (IRS1), zinc finger protein 1 (JAZF1), KCNQ1, LOC387761 (a hypothetical gene), lymphotoxin alpha (LTA), neurexophilin 1 (NXPH1), sirtuin 1 (SIRT1), SLC30A8, TCF7L2 and tumour necrosis factor-alpha (TNF-α) genes were found to be associated with T2DM [44]. A multi-ethnic study (European Americans, African Americans, Latinos, Hawaiians, Japanese Americans) in America identified rs7578597 of THADA as positively associated with European Americans and Native Hawaiians (OR = 1.65, 95% CI = 1.01–2.70) [45]. The rs1801282 of the PPARG gene was associated with African Americans. The rs4402960 of IGF2BP2 was associated with African Americans and Japanese Americans. rs10010131 of wolframin ER transmembrane glycoprotein (WFS1) was associated with Latin Americans and Hawaiians. The most commonly studied TCF7L2 (rs7903146) was associated with all ethnic groups except the Hawaiians [45].

Eastern Mediterranean

A study conducted in Jordan's Circassian and Chechen communities identified two novel SNPs at Jagged canonical Notch ligand 1 (JAGI) (rs6134031) and MLX-interacting protein-like (MLXIP) (rs4758690) [46]. These two were tested among the Europeans. The SNP, rs6134031 in the Jordan analysis, demonstrated a nominally significant association with T2DM among the Europeans (P = 0.012) and the same direction of effect. Serum adiponectin and SNPs in ADIPOQ gene were found to be associated with T2DM in Jordanian population in which the serum adiponectin lowered the risk for prediabetes. At the same time, the GT genotype of rs1501299 increased the risk of prediabetes as well as the TT genotype [47]. A recent study among the Arab population, where consanguineous marriages are more, has identified ribosomal protein S6 kinase B1 (RPS6KA1) gene, rs487321 (recessive, intronic, calcium-dependent secretion activator (CADPS)), rs707927 (additive, intronic in valyl-tRNA synthetase (VARS)) and rs12600570 (additive, intronic, DExH-Box Helicase 58 (DHX58)). Of these three suggestive markers, the CADPS and VARS are associated with increased fasting plasma glucose [48]. A systematic review of the Iranian population identified KCNJ11 and TCF7L2 which are strongly associated with T2DM [49].

Western Pacific region

A study analysed 14 SNPs at HHEX, CDKAL1, cyclin-dependent kinase inhibitor 2B CDKN2B, SLC30A8, KCNJ11, IGF2BP2, PPARG, TCF7L2, FTO, KCNQ1, insulin receptor substrate 1 (IRS1), GCKR, ubiquitin-conjugating enzyme E2 D2 (UBE2E2), C2 calcium-dependent domain-containing 4A (C2CD4A/B) in the Japanese population [50]. Among the 14 SNPs from 14 loci, 4 SNPs (rs7756992 in CDKAL1, rs10811661 near CDKN2B, rs13266634 in SLC30A8 and rs2237892 in KCNQ1) were found to be significantly associated with T2DM. The association of rs2237892 in KCNQ1 was the strongest in the Japanese sample, and rs4402960 in IGF2BP2, rs2943641 near IRS1, rs780094 in GCKR, rs7172432 in C2CD4A/B and rs5219 in KCNJ11 showed a positive association with T2DM. In contrast, no association was seen in rs7903146 (TCF7L2), rs1111875 (HHEX), rs1801282 (PPARG), rs8050136 (FTO) and rs7612463 (UBE2E2) [51], while a genome-wide study among the Australian aboriginals identified association with TCF7L2, potassium inwardly rectifying channel, subfamily J, member 6 (KCNJ6) and melanocortin 4 receptor (MC4R) [52].

Conclusions

Much of our efforts on diabetes prevention are focused on modifiable behavioural risk factors such as physical inactivity, unhealthy diet and tobacco use. In epidemiological studies, the high-risk population is identified at the community level through risk scores consisting of behavioural risk factors and anthropometric measures such as high body mass index and increased waist circumference. More often, the pathophysiological process would have begun once these risk factors set it. The primarily advocated lifestyle modification for T2DM prevention requires positive reinforcement and a conducive environment for implementation. Although lifestyle modification strategies have been shown to have moderate long-term effects on diabetes prevention, it often requires a favourable non-obesogenic environment for adherence. In this context, understanding the genetic determinants can identify the risk groups prior to the onset of these risk factors.

In this review, we found commonalities in marker–trait associations of specific genes to diabetes (e.g. PPARG, TCFL2) in specific geographical regions. However, it cannot be generalized to all populations as these were found to be population specific.

The SNP rs7903146 of the TCF7L2 gene is the most significant genetic marker associated with type 2 diabetes risk in all the ethnicities. This gene is a transcription factor that influences the transcription of several genes, thereby exerting a large variety of functions within the cell. This might be why the gene is significant in almost all the ethnic groups. PPARG is yet another gene found to be significant in all ethnic groups which regulates fatty acid storage and glucose metabolism. Studies have shown that free fatty acids mediate insulin resistance and impaired glucose tolerance associated with central obesity. PPARG has shown both protective and risk associations with T2DM in several regions. Animal studies have shown that PPARG protects from high-fat diet-induced insulin resistance. A Pro12Ala polymorphism has been detected in humans. This polymorphism might cause a reduction in the transcriptional activity of PPARgamma, leading to decreased insulin resistance and decreased risk of type 2 diabetes. This substantiates that the expression of genes is population specific. FTO gene is associated with obesity, and it has been identified as a risk for the development of T2DM in Indians, Europeans, Africans, Western Pacific and American regions. ABCC8 gene is risky in European as well as Indian populations. Genome-wide association studies have reported that IGF2BP2 disrupts insulin secretion. IGF2BP2 was a risk for T2DM in the Western Pacific, Americas and European ethnicities with no significant role in the Indian population. KCJN11 was associated with T2DM among Western Pacific, Africa and European region but not in Indian population.

India has the second-largest number of people living with diabetes, contributing to high mortality and disability adjusted life years. In our review, we could find evidence of marker–trait associations with type 2 diabetes mellitus from only nine states out of 29 states and seven Union Territories in India. The Indian State of Kerala, despite having the highest prevalence of diabetes in the country, has reported only one study on genetic traits of type 2 diabetes research [15]. This warrants future research on genetic markers of diabetes in India and other regions for developing and identifying biomarkers for screening, prevention and precision medicine.

One of the major limitations we found was that the studies were conducted in non-representative groups within geographical regions, with inherent limitations such as smaller sample sizes or looking into only specific marker–trait associations. Genome-wide association studies using data from large prospective studies are required worldwide to establish the genetic determinants of type 2 diabetes mellitus. This urgently needs to be done in regions with the highest burden of mortality and morbidity related to T2DM. We also need future research to understand genes' special effect and interaction in predicting diabetes mellitus and other comorbidities leading to the highest burden of diseases.