Introduction

India currently has ~50.8 million adults with type 2 diabetes (T2D), a number estimated to rise to 87.0 million by 2030 (IDF Diabetes Atlas 2009). Most of this increase is expected to occur in urban areas (Sicree et al. 2006; Mohan et al. 2008; Mohan and Pradeepa 2009; Ramachandran et al. 2010; Unwin et al. 2010). Another trend is the earlier onset of T2D, which has shifted from ≥40 to 30 years of age (Mohan et al. 2007). The pathophysiology of T2D is complex, involving an array of molecular pathways that further interact with various environmental factors. Efforts to identify the genetic architecture of T2D using candidate gene analysis and genome-wide association studies (GWAS) have led to a dramatic expansion in the number of known T2D genes (~40) in the last few years. However, none of these approaches has resulted in a complete understanding of T2D, and only 10% of the heritability can be explained by the genetic risk loci identified so far (McCarthy 2010). Genetic factors alone cannot explain the sudden T2D epidemic in India. The rise can be attributed to the increasing mechanization and reduced physical activity that accompany the new urban lifestyle in Indian society (Mohan and Pradeepa 2009; Unwin et al. 2010). The combination of urbanization and a sedentary lifestyle with an increase in energy intake, due to the easy availability of high-fat, energy-dense fast foods at reduced cost, has been instrumental in increasing obesity and metabolic abnormalities in the population (Ramachandran 2003; Astrup et al. 2008). Urban residents are also known to exhibit higher levels of stress hormones, which raise blood levels of fat, sugar, and insulin and lead to insulin resistance and eventually T2D (Ozcan et al. 2004).

Environmental factors such as age, diet, and maternal nutrition, which are key modifiers of lifestyle diseases such as diabetes, obesity, dyslipidemia, hypertension, and cardiovascular diseases (Lillycrop and Burdge 2011), lead to epigenetic modifications of the genome. Epigenetic changes that regulate gene expression have emerged in the last few years as a potentially important contributor to disease risk. In the absence of strong genetic predictors, epigenetic changes in the genome may be a better readout of environmental influence and of individual predisposition to T2D in populations with different mutational and demographic histories.

India, which encompasses one-sixth of the world's population and the second largest pool of T2D patients, with unique risk phenotypes and rapid socio-economic transitions, provides a unique resource for dissecting the pathogenesis of T2D (Indian Genome Variation Consortium 2008) at an epigenetic level. We therefore initiated a nation-wide collaborative effort by establishing the INdian DIabetes COnsortium (INDICO) for systematic, comprehensive, and large-scale studies towards the understanding of T2D in India.

The primary goal of INDICO is to build resources and generate information to understand the role of genetic, epigenetic, and environmental factors in the pathogenesis of T2D. The consortium considered building this resource essential for uncovering the risk factors that account for the high vulnerability of Indians to T2D, by identifying pre-diagnostic markers at the genetic, epigenetic, metabolite, and proteomic levels for managing T2D and other related lifestyle disorders. The collaborative work of the consortium has led to a repository of samples (DNA, plasma, and serum) from more than 17,000 subjects belonging to two major ethnic groups in India (Indo-European and Dravidian), who have been well characterized for anthropometric and clinical markers (Fig. 1). The subject recruitment procedures and ascertainment criteria are detailed in the Supplementary Material. The anthropometric and clinical characteristics of the recruited subjects are provided in Supplementary Table 1.

Fig. 1

Bio-repository of INDICO, its recruitment and ascertainment process. DBP Diastolic blood pressure, DR Dravidian, FPG Fasting plasma glucose, HbA1c Glycosylated hemoglobin, HC Hip circumference, HDL-C High density lipoprotein cholesterol, hsCRP high sensitivity C-reactive protein, IE Indo-European, LDL-C Low density lipoprotein cholesterol, PD Pre-diabetic subjects, SBP Systolic blood pressure, T2D Type 2 diabetic patients, TC Total cholesterol, TG Triglycerides, WC Waist circumference, 2 hr PG 2 hr post-load plasma glucose

Future goals and developments

INDICO brings together the expertise and experience of clinicians, scientists, and young researchers involved in diabetes research from all over the country (Supplementary Figure 1), which will lead to capacity building for high-end genetic, epigenetic, and proteomic studies in India in terms of sample repository, infrastructure, and skilled manpower. The resources developed by INDICO can support: (1) GWAS to discover genes associated with T2D and related metabolic traits, (2) studies of the effect of lifestyle transitions by re-sampling from the same areas at regular intervals, (3) gene expression profiling of different tissues to identify genes differentially expressed in T2D, (4) deep re-sequencing to discover rare variants with high penetrance, (5) proteomic and metabolite studies to develop pre-diagnostic markers for T2D and related traits, (6) metagenomics to understand the differences in gut microbiota between T2D and normal individuals, and (7) exchange of knowledge and experience across the globe to counter the disease burden.

Moreover, the consortium is actively engaged in the identification of novel genes and pathways for T2D through different strategies, including in silico candidate gene prioritization, in silico function prediction, and in vitro functional analyses (Sharma et al. 2010). The consortium monitors the effects of urbanization on a population in transition by re-sampling at regular intervals from the same geographical locations. The period of sample collection (2003–2011) coincides with the rapid Gross Domestic Product (GDP) growth of the Indian economy (World Development Indicator 2011). We have already observed an apparent clinical shift in the major biochemical parameters (lipid profile and HbA1c) in congruence with the rise in GDP (Fig. 2), indicating an epidemiological transition in progress. This gives us an unprecedented opportunity to probe the correlation, if any, between genetic and epigenetic changes in the genome and rapid changes in lifestyle.
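The kind of comparison described above, a yearly average of a biochemical parameter set against a yearly economic indicator, can be sketched as follows. This is only an illustrative sketch and not the analysis performed by the consortium: the function names (`yearly_means`, `pearson_r`, `trend_correlation`) are hypothetical, the choice of a Pearson correlation is an assumption, and the numbers in the usage example are placeholders rather than INDICO measurements or real GDP figures.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def yearly_means(records):
    """Average a measured parameter per collection year.

    records: iterable of (year, value) pairs, e.g. (2005, 6.1) for one
    subject's HbA1c. Returns {year: mean value}, ordered by year.
    """
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def trend_correlation(records, indicator_by_year):
    """Correlate yearly parameter means with a yearly indicator (e.g. GDP).

    Only years present in both inputs are used.
    """
    means = yearly_means(records)
    years = [year for year in means if year in indicator_by_year]
    return pearson_r(
        [means[year] for year in years],
        [indicator_by_year[year] for year in years],
    )


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT INDICO or World Bank data).
    hba1c_records = [(2003, 5.8), (2003, 6.0), (2004, 6.1), (2005, 6.3), (2005, 6.5)]
    indicator = {2003: 1.0, 2004: 1.2, 2005: 1.5}
    print(trend_correlation(hba1c_records, indicator))
```

A correlation computed this way describes only a co-occurring trend across a handful of years. It does not by itself establish that economic growth drives the clinical shift, which is why the text above speaks of an "apparent" shift.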

Fig. 2

Change in average cholesterol levels and HbA1c in the sample population over the period of INDICO sample collection, alongside Gross Domestic Product (GDP) growth of the Indian economy

The INDICO bio-repository opens up new avenues for large-scale studies that can unravel new risk factors and pre-diagnostic markers at the genetic, epigenetic, metabolite, and proteomic levels for T2D and its associated traits. The DNA repository will serve as a valuable resource for understanding the synergistic role of genetic and epigenetic factors in the pathogenesis of T2D through large-scale studies. Our previous candidate gene studies, which identified a couple of novel putative T2D genes (Tabassum et al. 2008, 2010), suggest that exploring genetic factors in the Indian population can lead to a better understanding of T2D. This large DNA repository also provides an opportunity to discover rare variants for T2D, which would support the “mosaic model” (Sharma et al. 2005); this model proposes that T2D results from interactions among a large number of rare alleles, a smaller number of common alleles, and the environment. We have recently completed the genotyping for the first type 2 diabetes GWAS in the Indian population using the 610 Quad chips. We expect this GWAS not only to reveal novel loci for T2D but also to serve as a dataset for GWA analysis of a large number of anthropometric and quantitative metabolic traits such as BMI, blood pressure, plasma glucose, and lipids.

In addition, the whole-genome genotyping data of the 1,250 well-characterized controls can be used as a control dataset for a number of other complex disorders, such as type 1 diabetes, chronic pancreatitis, hypertension, and polycystic kidney disorder. Proteome analysis at different pre-disease and disease states can lead to the identification of protein and metabolic biomarkers that may be predictive of disease manifestation. Differential gene expression and methylation profiles would be key to mechanistically dissecting the development of the disease and identifying novel drug targets. To aid this, the tissue repository will allow tissue-specific epigenetic and gene expression studies for T2D. Moreover, superimposing the GWAS data on this exhaustive epigenetic and proteomic profiling will help us understand the correlation between the genome, proteome, and epigenome, thus revealing examples of gene-environment interaction. This will be crucial in developing a complete picture of the disease, leading to better treatment and healthcare policy.

Summary

The consortium considered building the INDICO resource essential for uncovering the risk factors that account for the high vulnerability of Indians to T2D, by identifying pre-diagnostic markers at the genetic, epigenetic, metabolite, and proteomic levels for managing T2D and other related lifestyle disorders. We also hope that INDICO will eventually be of value not just to the diabetes research community but will also contribute towards improved understanding, diagnosis, and prevention of numerous complex human disorders.