Background

Type 2 diabetes mellitus (diabetes) is an international pandemic with high morbidity and mortality and a large economic impact. An alarming 80 % of the global burden lies in low- and middle-income countries (LMICs) [1]. In urban areas of LMICs, diabetes is recognized as a public health priority and is often well studied; however, recent prevalence data suggest that diabetes is an increasing problem among rural populations as well [2]. In India, while regional differences exist, national rural prevalence has quadrupled in the past 25 years [3]. A recent large-scale study in a rural community in south India found a diabetes prevalence of 7.8 %. The same study found that the prevalence of pre-diabetes (impaired fasting glucose and/or impaired glucose tolerance) was 7.1 %, representing a population at heightened risk of conversion to overt diabetes [4]. Despite evidence of a growing rural diabetes burden, few studies have examined putative risk factors and their associations with glucose intolerance and diabetes in rural areas. ‘Modernization’ and ‘globalization’ are often blamed, but precise pathways of causation are unclear and understudied [2]. Moreover, the rising national prevalence of diabetes is often attributed to urbanization, which results in a disproportionate focus on urban areas and ignores the epidemiological environment of rural regions [3]. Given the inherent diversity of livelihoods, ethnicities, cultures, food habits, and lifestyles within India, there is a need for research on diabetes and its risk factors from different regions across the entire urban/rural continuum to broaden our understanding of the epidemic [5].

We report here the findings of a cross-sectional study in a rural population of northern Tamil Nadu. The study had two principal objectives: first, to establish prevalence rates of impaired fasting glucose (IFG), impaired glucose tolerance (IGT), and diabetes in the research site; and second, to assess associations between diabetes risk and a number of anthropometric, socioeconomic, lifestyle, and dietary factors through multivariate linear and logistic regressions. Such analyses identify potential risk factors that can be further analyzed in case–control and cohort studies. Identifying risk factors of diabetes is crucial for resource prioritization in public health efforts, especially considering the shortage of personnel and resources in rural areas.

Methods

Ethics, consent, and permissions

We obtained clearance for the study from a Canadian university ethics board. Permission for the study was granted by the High Commission of India in Ottawa, Canada. Upon arrival at the research site, and prior to the recruitment process, we approached local authorities (panchayat councils, local police officials, and hospital medical staff) and obtained their permission to carry out the study. Informed verbal consent was obtained from all research participants prior to enrollment.

Study design and sample

The sampling frame consisted of the entire adult population (>19 years old) of two rural panchayat wards (Anchetty panchayat and Madakkal panchayat), in the Krishnagiri District of Tamil Nadu. The region is comprised of several small villages surrounding the central market village of Anchetty.

All villages within the sampling area were included in the study. We mapped each village, numbered the houses consecutively, and then randomly selected 8 % of households from which to sample one adult individual. Our target was to sample 800 participants, based on a sample size calculation with an estimated diabetes prevalence of 8 %, a precision of 0.02, and a confidence of 95 %. Upon approaching a household, we employed World Health Organization’s (WHO) Kish method to recruit an individual for the study [6]. If an individual declined to participate, we removed them from the household list and employed the Kish method again. If all adults in the household refused to participate, we approached a neighbouring household and repeated the process. Upon receiving consent, we collected basic census details, including age, sex, education, and occupation.
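The sample size calculation above can be illustrated with a short sketch. The text does not state which formula was used, so this assumes the standard normal-approximation formula for estimating a single proportion, n = z² p (1 − p) / d², with no design effect or finite population correction; the function name is hypothetical.

```python
import math


def sample_size_for_proportion(p, d, z=1.96):
    """Minimum n to estimate a proportion p within +/- d.

    Assumption: normal-approximation formula n = z^2 * p * (1 - p) / d^2,
    with z = 1.96 for 95 % confidence. The study's exact formula is not
    given in the text.
    """
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Inputs quoted in the text: expected prevalence 8 %, precision 0.02, 95 % confidence.
n_required = sample_size_for_proportion(p=0.08, d=0.02)
print(n_required)
```

Under these assumptions the minimum comes out below the stated target of 800, which would be consistent with a target that leaves some margin above the calculated minimum.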

Socioeconomic status (SES) was assessed using a wealth index composed of a subset of 13 of 29 questions taken from the Standard of Living Index used by the second round of the National Family Health Survey (NFHS-2) [7]. We selected those questions believed to be most relevant for our study population, including both household and village characteristics. Each attribute was weighted to give a maximum score of 36. Weights of items were developed by the International Institute for Population Sciences in India based on a priori knowledge about the relative significance of the items in determining SES [8]. Due to the rural subsistence nature of the local economy, this asset-based score is considered a more appropriate indicator of SES than education, income, or occupation [8].

We assessed dietary patterns using a validated interviewer-administered semi-quantitative food frequency questionnaire (FFQ) [9]. Individuals were asked to estimate the frequency of consumption (number of times per day/week/month/year) and usual serving size of dishes and food items in the FFQ. A ‘food atlas’ served as a visual prompt for foods and serving sizes. FFQ data were analyzed using EpiNu, a software program that combines frequency of food consumption with corresponding nutrition information to provide detailed data on caloric consumption and average daily macro- and micronutrient intake [10]. Detailed descriptions of this FFQ and data on reproducibility and validity, including a comparison of FFQ estimates to 24-h dietary recalls, have been published elsewhere [9]. All dietary variables except energy intake (kcal/day) were scaled to g/1000 kcal to account for differences in energy intake between participants.
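The energy adjustment described above is a nutrient-density scaling. A minimal sketch follows; the function name and the example intake values are hypothetical and are not taken from the study data.

```python
def per_1000_kcal(nutrient_g_per_day, energy_kcal_per_day):
    """Express a daily nutrient intake (g/day) as g per 1000 kcal of energy."""
    if energy_kcal_per_day <= 0:
        raise ValueError("energy intake must be positive")
    return nutrient_g_per_day * 1000.0 / energy_kcal_per_day


# Hypothetical participant: 15 g/day of a nutrient at 2000 kcal/day.
density = per_1000_kcal(15.0, 2000.0)
print(density)  # 7.5 g/1000 kcal
```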

Physical activity was assessed using the WHO Global Physical Activity Questionnaire (GPAQ) [11]. The GPAQ evaluates individuals’ physical activity behaviour in four domains: work, travel, recreation, and sedentary time. The GPAQ was paired with locally relevant photographs depicting ‘moderate’- and ‘vigorous’-intensity work and recreation activities. Physical activity scores were calculated using WHO’s GPAQ Analysis Guide, which provides a total measure of physical activity in Metabolic Equivalent of Task minutes per week (MET-min/week) [11]. We scaled values to 1 h/day of moderate physical activity for ease of interpretation. Sedentary time was calculated as h/day spent relaxing, watching television, and performing sedentary work duties (e.g. desk work). Television time was assessed as h/day spent watching television.
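The scoring can be sketched as follows. The MET weights (8 for vigorous activity, 4 for moderate activity and for travel) follow the WHO GPAQ Analysis Guide. The rescaling constant is an assumption about how the scaling to 1 h/day of moderate activity was done: 60 min × 7 days × 4 METs = 1680 MET-min/week. The function names and example values are hypothetical.

```python
MET_VIGOROUS = 8.0  # GPAQ weight for vigorous-intensity activity
MET_MODERATE = 4.0  # GPAQ weight for moderate-intensity activity and travel

# Assumption: one "unit" equals 1 h/day of moderate activity for a full week.
MET_MIN_PER_UNIT = 60 * 7 * MET_MODERATE  # 1680 MET-min/week


def gpaq_met_minutes(work_vig, work_mod, travel, rec_vig, rec_mod):
    """Total MET-min/week; each argument is (days_per_week, minutes_per_day)."""
    def dose(days_minutes, met):
        days, minutes = days_minutes
        return days * minutes * met

    return (dose(work_vig, MET_VIGOROUS) + dose(work_mod, MET_MODERATE)
            + dose(travel, MET_MODERATE)
            + dose(rec_vig, MET_VIGOROUS) + dose(rec_mod, MET_MODERATE))


# Hypothetical participant: moderate work 6 days x 120 min, travel 7 days x 30 min.
total = gpaq_met_minutes((0, 0), (6, 120), (7, 30), (0, 0), (0, 0))
scaled = total / MET_MIN_PER_UNIT  # in units of 1 h/day of moderate activity
print(total, round(scaled, 2))
```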

All previous epidemiological studies in India known to the authors define households as either rural or urban based on a dichotomous typology, which fails to adequately capture the variability inherent in the urban/rural continuum [12]. In order to better capture rurality, we created a rurality index based on that of Weinert and Boik (1995), adapted for use in India [13]. The index was generated by summing the standardized values of (1) distance to primary health center (in kilometers, assigned half weight), and (2) number of households in home village (assigned full weight). Values were standardized to a mean of zero and a standard deviation of one. The rurality index was therefore a unitless quantitative value assigned to each individual, with a higher value indicating a greater degree of isolation and/or a lower population density of household location.
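A minimal sketch of such an index is given below. Two details are assumptions, because the text does not specify them: the standardized household count is entered with a negative sign so that smaller villages score as more rural (consistent with higher values indicating lower population density), and the sample standard deviation is used for standardization. The function names and input values are hypothetical.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize to mean 0 and SD 1 (assumption: sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def rurality_index(distance_km, households):
    """Weighted sum of standardized components, one value per individual.

    Assumption: household count is sign-reversed so that a higher index
    means greater isolation and/or lower population density.
    """
    z_dist = zscores(distance_km)
    z_hh = zscores(households)
    return [0.5 * zd - 1.0 * zh for zd, zh in zip(z_dist, z_hh)]


# Hypothetical individuals, ordered from near the market village to remote hamlet.
index = rurality_index(distance_km=[1, 5, 12], households=[400, 120, 30])
print([round(v, 2) for v in index])
```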

Height, weight, waist circumference, and hip circumference measurements were taken by a trained nurse using standardized techniques [14]. We calculated body mass index (BMI) and waist-to-hip circumference ratios (WHR). Blood pressure was recorded using a portable OMRON BP-760 electronic blood pressure machine (Omron Healthcare, Hoofddorp, Netherlands). We took two measurements within 5 min on the right arm in the sitting posture, and the mean of the two measurements was recorded.

Glucose tolerance and diabetes status were determined using an oral glucose tolerance test [15]. After an 8-h minimum overnight fast, we measured fasting capillary blood glucose (CBG) using a One Touch Ultra glucometer (Johnson and Johnson, Milpitas, CA, USA). Oral glucose (82.5 g glucose dissolved in 250 ml water, equivalent to 75 g anhydrous glucose) was administered and consumed within 5 min. A 2-h post-load CBG was then collected [14]. CBG was adopted instead of venous plasma glucose estimation due to the unavailability of quality-controlled laboratories and the difficulties associated with transporting refrigerated blood samples to a central laboratory. Studies have also found higher nonresponse rates associated with venous blood draws [16, 17]. Priya et al. (2011) compared CBG to venous plasma glucose estimation and reported a Pearson’s correlation coefficient of 0.681 (p < 0.001) in the fasting state and 0.897 (p < 0.001) 2-h post-load [16]. Thus, CBG was viewed as an accurate and acceptable alternative to venous plasma glucose estimation for diabetes screening.

Definitions

High blood pressure was defined as systolic blood pressure of 140 mmHg or more and diastolic blood pressure of 90 mmHg or more [18]. Diabetes was defined as a physician diagnosis for which the individual could provide proof and/or a fasting CBG ≥7 mmol/L (≥126 mg/dL) and/or a 2-h post-load CBG ≥12.2 mmol/L (≥220 mg/dL) [15, 19]. Impaired fasting glucose (IFG) was defined as a fasting CBG ≥6.1 mmol/L (≥110 mg/dL) and <7 mmol/L (<126 mg/dL) and a 2-h post-load CBG <8.9 mmol/L (<160 mg/dL) [15]. Impaired glucose tolerance (IGT) was defined as a fasting CBG <7 mmol/L (<126 mg/dL) and a 2-h post-load CBG ≥8.9 mmol/L (≥160 mg/dL) but <12.2 mmol/L (<220 mg/dL) [15]. Pre-diabetes was defined as having either or both IFG and IGT [19]. Underweight was defined as BMI <18.5 kg/m², overweight as BMI ≥23.0 kg/m² but <25 kg/m², obesity class I as BMI ≥25 and <30 kg/m², and obesity class II as BMI ≥30 kg/m² [20]. Smoking status was categorized as ‘nonsmoker’ or ‘current smoker’. Tobacco consumers were defined as current smokers and/or current consumers of paan (smokeless tobacco).
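The glycemic definitions above can be restated as a decision rule. This is an illustrative sketch using the mmol/L thresholds quoted in the text, not the authors' code; the function name and category labels are hypothetical. Under the thresholds as stated, IFG and IGT are mutually exclusive, so each individual receives exactly one label, and pre-diabetes corresponds to a label of IFG or IGT.

```python
def classify_glycemia(fasting_mmol, two_hour_mmol, physician_diagnosed=False):
    """Assign one glycemic category from fasting and 2-h post-load CBG (mmol/L)."""
    if physician_diagnosed:
        # Prior physician diagnosis with proof of diagnosis.
        return "pre-diagnosed diabetes"
    if fasting_mmol >= 7.0 or two_hour_mmol >= 12.2:
        return "newly-diagnosed diabetes"
    if two_hour_mmol >= 8.9:
        # Fasting < 7.0 and 8.9 <= 2-h value < 12.2.
        return "IGT"
    if fasting_mmol >= 6.1:
        # 6.1 <= fasting < 7.0 and 2-h value < 8.9.
        return "IFG"
    return "normal"


def is_prediabetes(category):
    return category in ("IFG", "IGT")


print(classify_glycemia(6.5, 8.0))   # IFG
print(classify_glycemia(5.8, 10.1))  # IGT
```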

Statistical analysis

We identified outliers, cleaned the data, and completed summary statistics in Microsoft Excel 2010. All other statistical analyses were performed in STATA version 13.0 [21]. Prevalence rates were age- and sex-standardized using state-level age and sex data from the 2011 national census [22]. In separate models, univariate interactions between sex and glucose tolerance were analyzed. There were no significant associations between sex and outcome variables, and we therefore present the results for men and women combined. We calculated mean values of descriptive characteristics, socioeconomic position, and dietary intake across categories of glucose tolerance: normal, pre-diabetes, newly-diagnosed diabetes, and pre-diagnosed diabetes. Values were expressed as mean ± standard deviation or as percentages. One-way analysis of variance (for continuous variables) and the χ² test (for binary variables) were used to test differences across outcome groups.
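As an illustration of the standardization step, the sketch below applies direct standardization, in which stratum-specific prevalences are weighted by the standard population's stratum sizes. The text does not state whether direct or indirect standardization was used, so the method is an assumption; the function name, strata, prevalences, and population counts are all hypothetical.

```python
def direct_standardized_rate(stratum_rates, standard_population):
    """Weighted average of stratum rates, weighted by the standard population.

    Assumption: direct standardization; both dicts share the same keys
    (for example, age-sex strata).
    """
    total = sum(standard_population.values())
    weighted = sum(stratum_rates[k] * standard_population[k]
                   for k in standard_population)
    return weighted / total


# Hypothetical strata (not census or study values).
rates = {"20-39": 0.04, "40-59": 0.12, "60+": 0.20}
standard = {"20-39": 500, "40-59": 300, "60+": 200}
print(round(direct_standardized_rate(rates, standard), 3))
```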

We employed a backwards elimination process to build two linear regression models assessing the associations between putative risk factors and 2-h post-load CBG and fasting CBG. All factors were first analyzed in univariate linear regressions (see Appendix 1 for full list of factors). Variables meeting the screening threshold (p-value <0.2) in univariate analyses were included in an initial multivariate linear regression model. We then methodically eliminated non-significant variables (p-value ≥0.05) from the multivariate model, assuming a lack of confounding if the coefficients of all remaining variables did not change by more than 20 % after removal of the potential confounder. Quadratic terms and interaction terms were assessed if there was a biological or practical reason to believe they may be significant. BMI and WHR values were standardized to a mean of zero and a standard deviation of one for ease of comparability. Where linear regressions violated the assumptions of linear models, including normality of residuals and homoscedasticity, various outcome transformations were tested using the Box Cox function in STATA. Following this, models were adjusted if necessary [23]. If no transformations improved the assumptions of the model, we used robust standard errors when assessing the significance of each predictor, as suggested by Pires and Rodrigues (2007) [24].
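The change-in-estimate rule used to screen for confounding can be sketched as a comparison of coefficients before and after a candidate variable is dropped. The model fitting itself is omitted; the sketch assumes the relative change is measured against the coefficient in the model that still contains the candidate, which the text does not specify. The function names and coefficient values are hypothetical.

```python
def max_relative_change(coefs_full, coefs_reduced):
    """Largest absolute relative change among coefficients present in both models."""
    changes = [
        abs(coefs_reduced[name] - b_full) / abs(b_full)
        for name, b_full in coefs_full.items()
        if name in coefs_reduced and b_full != 0
    ]
    return max(changes) if changes else 0.0


def is_confounder(coefs_full, coefs_reduced, threshold=0.20):
    """True if dropping the candidate shifts any remaining coefficient by > 20 %."""
    return max_relative_change(coefs_full, coefs_reduced) > threshold


# Hypothetical coefficients with and without a candidate variable "x".
with_x = {"bmi": 0.50, "activity": -0.20, "x": -0.10}
without_x = {"bmi": 0.52, "activity": -0.26}
print(is_confounder(with_x, without_x))  # activity shifts by about 30 %
```

A variable flagged this way would be retained in the model even if its own p-value exceeded the elimination threshold.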

In a second step, a multivariate multinomial logistic regression model was fitted to examine associations between putative risk factors and pre-diabetes and newly-diagnosed diabetes (i.e. those individuals diagnosed by our study). Normoglycemic individuals were the referent group. Individuals with pre-diagnosed diabetes were excluded to reduce potential reverse causation (i.e. behavioural changes resulting from diagnosis). We employed a backwards elimination model-building process that closely mirrored the methods described above. All putative risk factors were analyzed first in univariate logistic analyses and included in the initial multivariate model if they met the screening threshold (p-value <0.2). Variables were eliminated if, in the final model, they were not associated with either pre-diabetes or diabetes (p-value ≥0.05 for both outcomes). A variable was identified as a confounder if its removal from the model altered the remaining coefficients by more than 20 %. If a variable was identified as a confounder, it was forced into the final model.

Results

Of the 812 individuals recruited for the study, 753 participated (341 men and 412 women). Of those recruited, 752 (92.6 %) completed an FFQ and 749 (92.2 %) consented to blood sampling. The response rate was 87.4 % among men and 99.2 % among women (p < 0.05). The disparity in response rate was primarily due to migration among local men and their consequent unavailability at the time of sampling. The mean age was 47 ± 14.7 years and the literacy rate was 35.1 %.

The unadjusted prevalence of diabetes was 11.7 %, of which 56.4 % were previously undiagnosed. The overall age- and sex-standardized prevalence of diabetes was 10.8 %. Age- and sex-standardized prevalence of IFG was 3.9 % and IGT was 5.6 %, and overall standardized prevalence of pre-diabetes was 9.5 %. None of the individuals with pre-diabetes were previously diagnosed as such.

Baseline characteristics for the study population by diagnostic category are displayed in Table 1. A significant difference was seen across categories in several descriptive characteristics and wealth attributes. However, the average wealth index and most dietary variables were not significantly different between diagnostic categories. Energy intake was significantly lower for individuals with diagnosed diabetes, perhaps indicating lower energy requirements (due to lower rates of physical activity), or post-diagnosis changes in diet due to doctor recommendations.

Table 1 Baseline characteristics of a sample of individuals in rural south India by diagnostic category following an oral glucose tolerance test

Table 2 displays the results of the multivariate linear regression model with 2-h post-load CBG as the dependent variable. BMI and WHR were positively associated with 2-h CBG. Physical activity, rurality index, and intake of polyunsaturated fatty acids were negatively associated with 2-h CBG. The untransformed model was heteroscedastic according to a Cook-Weisberg test and lacked normality of standardized residuals according to a Shapiro–Wilk test. We applied a negative inverse transformation to the outcome variable, which resolved the heteroscedasticity, but the residuals still lacked normality. We therefore used robust standard errors when assessing the significance of each predictor [24]. The adjusted R-squared value for the final model was 0.18.
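For reference, a negative inverse transformation maps each outcome y to −1/y. The short sketch below uses hypothetical glucose values and shows one practical property of the transformation: for positive values it preserves rank order while compressing the upper tail, so the sign of a regression coefficient retains its usual interpretation.

```python
def negative_inverse(values):
    """Apply y -> -1/y; defined for strictly positive outcomes."""
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    return [-1.0 / v for v in values]


# Hypothetical 2-h glucose values in mmol/L, in ascending order.
transformed = negative_inverse([5.0, 8.0, 20.0])
print(transformed)  # ascending order is preserved
```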

Table 2 Factors associated with 2-h post-glucose-load capillary blood glucose during an oral glucose tolerance test in a sample of individuals in rural south India

Table 3 displays the results of the multivariate linear regression model with fasting CBG as the dependent variable. BMI, WHR, and tobacco consumption were positively associated with fasting CBG, while higher physical activity was associated with lower fasting CBG (Table 3). The untransformed model was heteroscedastic and lacked normality of standardized residuals. Using the Box Cox function in STATA and testing various transformations yielded no improvement. We therefore present the original untransformed model with robust standard errors [24]. The adjusted R-squared value for the model was 0.10.

Table 3 Factors associated with fasting capillary blood glucose for a sample of individuals from rural south India

The following variables were associated (p-value <0.05) with pre-diabetes (IFG and/or IGT) in univariate logistic regression analyses: standardized BMI [OR 1.62, 95 % CI 1.25, 2.10], standardized WHR [OR 1.55, 95 % CI 1.20, 2.01], physical activity [OR 0.90, 95 % CI 0.84, 0.98], rurality index [OR 0.67, 95 % CI 0.56, 0.79], sedentary time [OR 1.09, 95 % CI 1.01, 1.20], TV time [OR 1.21, 95 % CI 1.02, 1.45], hypertension [OR 1.88, 95 % CI 1.11, 3.19], livestock ownership [OR 0.55, 95 % CI 0.32, 0.94], family history of diabetes [OR 3.1, 95 % CI 1.53, 6.27], pucca house [OR 2.11, 95 % CI 1.11, 4.03], and fat intake [OR 0.92, 95 % CI 0.86, 0.99].

The following variables were associated (p-value <0.05) with newly-diagnosed diabetes in univariate logistic regression analyses: age [OR 1.02, 95 % CI 1.00, 1.04], standardized BMI [OR 2.18, 95 % CI 1.65, 2.89], standardized WHR [OR 2.02, 95 % CI 1.53, 2.65], physical activity [OR 0.78, 95 % CI 0.70, 0.87], rurality index [OR 0.71, 95 % CI 0.60, 0.85], sedentary time [OR 1.20, 95 % CI 1.09, 1.32], tobacco consumption [OR 2.08, 95 % CI 1.16, 3.71], in-house tap water [OR 2.5, 95 % CI 1.10, 5.71], hypertension [OR 2.8, 95 % CI 1.55, 4.95], livestock ownership [OR 0.45, 95 % CI 0.24, 0.86], and Muslim religion [OR 3.2, 95 % CI 1.03, 10.11].

After multivariate adjustment, several risk factors were associated with odds of pre-diabetes and newly-diagnosed diabetes (Table 4). Physical activity was associated with lower odds of pre-diabetes and diabetes. Family history of diabetes was associated with increased odds of pre-diabetes but not diabetes. BMI and WHR were both independently associated with increased odds of diabetes. Higher rurality indices were associated with lesser odds of both pre-diabetes and diabetes. Tobacco consumption was associated with increased odds of diabetes. Intake of polyunsaturated fatty acids (PUFA) was associated with decreased odds of pre-diabetes and diabetes. PUFA consumption also confounded the association between family history and both pre-diabetes and diabetes, based on a change in coefficient of greater than 20 % after its removal from the model. This confounding effect suggests that there may be an interaction with family history, possibly due to household clustering of dietary factors.

Table 4 Factors associated with pre-diabetes and type 2 diabetes among a sample of individuals in rural south India, normoglycemia as referent

Discussion and conclusion

The age- and sex-standardized prevalence of diabetes in the research site was 10.8 %, which is one of the highest recorded prevalence rates in rural India [3]. Vijayakumar et al. (2009) found a higher prevalence of diabetes in a rural region of neighbouring Kerala based on fasting plasma glucose [25]. Chow et al. (2006) also found a higher prevalence in a rural area of neighbouring Andhra Pradesh; however, their sample comprised only individuals >30 years of age [26]. Similar prevalence rates have been found in peri-urban villages in Tamil Nadu [27]. Other recent cross-sectional studies in rural Tamil Nadu found prevalence rates between 5.1 and 8.4 %, slightly lower than in the current study [19, 28, 29]. The high prevalence in our sample may be indicative of local disparities and/or a continued increase of rural diabetes as predicted by Misra et al. [3]. Our study adds to the growing body of evidence suggesting that diabetes is no longer confined to urban areas of India and is a serious concern in rural regions as well. This is particularly troubling considering that over 70 % of the population of India lives in rural regions and that these areas are often characterized by widespread poverty and poor access to health care services [30].

Over half (56.4 %) of individuals with diabetes were previously undiagnosed, of whom only 10 % were aware of the disease. The ratio of known to newly diagnosed diabetes was similar to that in a recent study in rural Tamil Nadu conducted by Anjana et al. (2011) (48 % undiagnosed) [19], but markedly lower than that in a different large-scale cross-sectional study by Misra et al. (2011) (25 % undiagnosed) [32], indicating a low level of diabetes awareness among the study population and discrepancies in the coverage of screening programs.

Age- and sex-standardized prevalence rates of impaired fasting glucose and impaired glucose tolerance were 3.9 and 5.6 %, respectively, which sum to a pre-diabetes prevalence of 9.5 %, similar to other recent studies in rural Tamil Nadu [19, 28, 31]. Balagopal et al. (2008) found a markedly higher prevalence of pre-diabetes (13.5 % based on fasting glucose only), indicating regional disparities [29]. Fifty percent of individuals with pre-diabetes will develop overt diabetes within 10 years [32]. Consequently, high rates of pre-diabetes point to the potential for a continued rise of diabetes in the coming years, which may be of particular concern for already overburdened health systems in rural areas.

Previous studies have reported on various causes of diabetes in India, including urbanization, the ‘nutrition transition’, decreased physical activity, and emotional stress [33]. The changing dietary profile of Indian populations has received attention recently, including, most notably, increased consumption of ‘westernized’ foods and declining popularity of traditional coarse cereals [34, 35]. However, to our knowledge, this is the first cross-sectional study to assess associations between a wide range of dietary and lifestyle risk factors and diabetes outcomes among new diagnoses in a rural region. While other studies have attempted to correlate risk factors with diabetes outcomes, most have been in an urban environment [36, 37], only included self-reported diabetes [38], or failed to account for multiple variables and confounders in a multivariate statistical model [39–41]. This research is therefore important to elucidate the unique epidemiological environment of rural areas and identify the distinct factors associated with rural diabetes.

We found that higher physical activity was associated with lower post-glucose-load CBG, lower fasting CBG, and lesser odds of diabetes [OR 0.79]. This finding is consistent with previous research on the protective effects of physical activity against obesity, cardiovascular disease, and the metabolic syndrome [42]. The associations remained even after controlling for anthropometric measures, indicating that physical activity may have a direct impact on risk of diabetes apart from its association through overweight or obesity [43].

Higher BMI and WHR were independently associated with higher post-glucose-load CBG, fasting CBG, and greater odds of diabetes. While overweight and obesity have been substantiated as risk factors for cardiovascular disease and the metabolic syndrome [44, 45], ours is the first cross-sectional study in rural India to identify significant associations of BMI and WHR independent of each other. Currently, there is debate as to whether BMI or WHR better reflects risk of non-communicable diseases [46]. Some recent studies have found that the association between BMI and diabetes becomes non-significant when adjusting for WHR and other variables, indicating that WHR is a better predictor of risk for Asian populations [47–49]. By contrast, a meta-analysis by Vazquez et al. (2005) found no significant difference in the abilities of BMI and WHR to predict incident diabetes; however, they did find regional differences in pooled risk ratios, indicating disparity due to ethnicity [46]. In our analysis, both BMI and WHR were significant in all multivariate models, and their coefficients and ORs did not differ significantly. We therefore suggest that general obesity (i.e. BMI) and central obesity (i.e. WHR) may both be important risk factors for diabetes among Indian populations and should be considered independently and concurrently in research and clinical settings [50].

The prevalence of tobacco consumption (smoked and/or smokeless) was 48.7 % among men and 30.6 % among women. This corresponds with most previous studies, which have found higher rates of use among men than women in India, although overall prevalence differs between regions [51]. Tobacco consumption was associated with higher fasting CBG and greater odds of diabetes. Smoking has long been associated with glucose intolerance and diabetes. A systematic review by Willi et al. (2007) found that among 25 observational studies, all but one identified a positive association between smoking and diabetes [52]. Cohort studies have also found a dose–response relationship between cigarettes/day and incidence of diabetes [53, 54]. Smokeless tobacco use among the study population is also concerning. A study in Sweden showed that smokeless tobacco users were three times more likely to develop diabetes than non-users [55]. In addition, most users in India mix smokeless tobacco with betel nut (Areca catechu), which is independently associated with risk of diabetes [56]. With over one-third of the study population currently consuming tobacco, its contribution to chronic disease burdens should not be underestimated. Reducing tobacco consumption must form an integral component of all prevention programs for diabetes, cardiovascular disease, cancers, and other non-communicable diseases [57].

Urban status is associated with diabetes risk in India [8, 28, 58]. Previous population-based chronic disease studies in India have maintained the dichotomous urban and rural definitions employed by the Census of India [59]. This dichotomous typology oversimplifies the urban/rural continuum and fails to capture the range of variation within rural areas [12]. While the entire study population was classified as ‘rural’ as per the Census of India definition, we aimed to examine the variability within the study region by employing a rurality index. Surprisingly, rurality was one of the strongest predictors of diabetes risk outcomes. Lower rurality was associated with higher post-glucose-load CBG and higher odds of pre-diabetes and diabetes. We therefore posit that the rurality index captured variability in socioeconomic, lifestyle, and dietary factors not fully represented by other variables. More specifically, we suggest that wealth, lifestyle, diet, and culture differ with proximity to the market village and size of home village, which in turn affects an individual’s risk of diabetes. Moving forward, we recommend that researchers and policymakers examine the rural/urban interface more closely.

Another interesting finding was the association between intake of PUFAs and diabetes outcomes. PUFA intake appeared to confound the association between family history and both pre-diabetes and newly-diagnosed diabetes, likely due to clustering of dietary intake variables within households. We must therefore be careful in interpreting these variables in the regression models. Nonetheless, intake of PUFAs was associated with a lower 2-h post-load CBG and lower odds of pre-diabetes and diabetes. This was the only dietary factor that had a significant association with any outcome in the multivariate models. PUFAs are primarily found in natural vegetable oils, whole grains, nuts, seeds, and fish. While more research is needed in this area, this finding corresponds with a review by Hu et al. (2001), which concluded that substituting unhydrogenated unsaturated fats for saturated fats and trans-fats could lower risk of diabetes and other chronic diseases [60]. Apart from this study, little research has examined the association between dietary fat intake and diabetes. Our results suggest that further investigations that better control for household clustering of both genetic and dietary factors are necessary.

The lack of significant associations between dietary variables and diabetes outcomes is notable, especially considering the overwhelming evidence of the importance of dietary risk factors in determining risk of obesity, diabetes, and the metabolic syndrome [3, 5, 8, 33, 35, 36, 38, 61]. This may expose limitations in the ability of the FFQ to adequately assess dietary factors due to recall bias, interviewer bias, or location-specific anomalies. However, it is more likely that results reflect a lack of sufficient variability in the rural diet to adequately assess the associations between dietary intakes and glucose tolerance within the study population. This is an important finding, as it may indicate that, while dietary changes are driving the diabetes epidemic in other locations, such as urban India, the rising rural prevalence of diabetes is due to other risk factors, such as physical inactivity, rising obesity, and genetic predisposition. Alternatively, individual dietary intakes may be changing uniformly among populations in rural India, perhaps due to effects of the nutrition transition, thus increasing risk of pre-diabetes and diabetes across the population as a whole [36, 37]. Thus, future observational studies examining dietary factors should ensure that the study population is heterogeneous in its dietary intake, and should take into account the shifting nature of diet and nutrition in rural south India.

Our study has a number of limitations. Most importantly, the cross-sectional nature of the study precludes distinguishing causes from effects. However, we hope that excluding individuals with pre-diagnosed diabetes from the models reduced reverse causation. Due to limited access to laboratories and transportation constraints, we used CBG instead of venous plasma glucose; CBG has a wider coefficient of variation [19]. However, there is good correlation between CBG and venous plasma estimations [16]. In addition, there has been an increasing emphasis on the associations between diabetes and stress, including co-morbid depression [62, 63]. Unfortunately, due to staffing and time constraints, we were unable to assess emotional stress or depression during the questionnaire, and therefore no measure of stress was included in the analyses. This is a notable omission, as stress may be a modifier or confounder for other putative risk factors such as socioeconomic position, obesity, rurality, and diet. Finally, our study sample comprised a small population of adults living in a rural region of Tamil Nadu. It is not nationally representative, and thus findings are not generalizable to the entire population of India.

Research and public health implications of the present study are considerable. Despite 70 % of India’s population living in rural areas, this portion of the population has largely been ignored by diabetes researchers, who focus their efforts on the urban ‘hot-spots’, where research is logistically easier and prevalence rates are higher. Our study contributes to the growing body of research suggesting that diabetes is rapidly growing in rural areas and must be addressed. In addition, we have identified a number of potential risk factors, including physical activity, family history, general obesity, central obesity, rurality, polyunsaturated fat consumption, and tobacco consumption, which are all associated with diabetes indicators, pre-diabetes, and/or diabetes in the research site and warrant future investigation. Research such as this contributes to decision-making regarding the allocation of scarce public health resources and public health education programs to reduce diabetes burdens in rural regions of India.