Introduction

An estimated 371 million people worldwide have diabetes mellitus, and the number is rising every year [1]. Impaired insulin sensitivity is a key abnormality underlying the development of type 2 diabetes, and mean insulin sensitivity is lower in individuals with type 2 diabetes than in healthy controls. Although there is no cut-off value to distinguish healthy from insulin-resistant individuals, measuring insulin sensitivity is important for identifying individuals at risk of developing diabetes and for evaluating diabetes-focused interventions [2, 3]. ‘Insulin sensitivity’ is an umbrella term for many different physiological processes. The most important elements of insulin sensitivity are glucose clearance in peripheral tissues (i.e. peripheral insulin sensitivity) and insulin-mediated suppression of hepatic glucose production (i.e. hepatic insulin sensitivity).

The purpose of this meta-analysis was to compare different measurements of peripheral insulin sensitivity and determine which technique is the most appropriate for large-scale clinical studies. The hyperinsulinaemic–euglycaemic clamp (HEC) is considered the ‘gold standard’ measure of peripheral insulin sensitivity, although it does not simulate the physiological state of continuously changing glucose and insulin levels or hepatic insulin extraction, nor the feedback mechanism between glucose and insulin [4].

During the HEC, insulin is infused at a constant rate and the amount of glucose needed to maintain euglycaemia provides a measure of insulin sensitivity. Since muscle takes up over 70% of infused glucose, the HEC is the standard test for peripheral insulin sensitivity [5]. However, HEC is costly, time consuming and invasive, and requires trained staff; surrogate measures of insulin sensitivity are therefore necessary in epidemiological or large-scale intervention studies. These surrogate measures are based on insulin and/or glucose samples taken either in the fasting state or during the OGTT.

Surrogate indices based on fasting glucose and fasting insulin primarily reflect hepatic insulin sensitivity [6, 7]. In most individuals, hepatic insulin sensitivity is closely related to peripheral insulin sensitivity. Therefore, fasting surrogate measures show at least moderate correlations (r > 0.5) with the HEC [7, 8].

Surrogate indices based on changes in insulin and glucose during the OGTT incorporate both peripheral and hepatic insulin sensitivity: hepatic glucose production changes most during the first hour of the OGTT, and peripheral glucose uptake is best measured during the second hour [7]. Gastric emptying, glucose absorption, insulin secretion and incretin hormones also influence OGTT results.

We undertook a meta-analysis to determine which surrogate measure(s) of insulin sensitivity should be used in clinical studies. We also aimed to establish whether fasting surrogate measures can be used instead of the more time-consuming OGTT-based measures.

Methods

Data sources and searches

We searched Medline from 1979 until 2 March 2012 via PubMed. We chose 1979 as the start date because the HEC was first described by DeFronzo in that year [4]. We used the following Medical Subject Headings in our search: glucose clamp technique; insulin resistance; and humans. We identified other potentially eligible studies by browsing the reference lists of suitable articles.

Study selection

We read the abstracts of all articles and then read the entire article if the study seemed eligible. We included studies that reported bivariate correlation coefficients between the HEC reference method and surrogate measures for insulin sensitivity based on blood samples. The article was included in our analysis if the HEC glucose infusion rate M value was reported as M BW, M BW/I, M LBM, M LBM/I, M BW/(G × ΔI), M LBM/(G × ΔI), M BW/I, M BW/G or M BSA (see main abbreviations list for definitions). We included studies that reported surrogate measures based on the OGTT or on fasting samples (fasting glucose, fasting insulin or both). We excluded articles not published in English, animal studies and studies conducted in children. Some of the relevant articles were intervention studies. If the articles contained correlation coefficients before and after the intervention, we used the mean of both for our meta-analysis. If a study reported correlation coefficients for two different insulin doses, we used the stronger of the coefficients. In several articles, the authors reported correlation coefficients with an opposite sign to the one expected (e.g. r = 0.5 instead of r = −0.5). If the opposite correlation coefficients were significant or stronger than 0.2, we contacted the authors (n = 12) via email. Four authors replied that they wanted only to describe the strength of the correlation (i.e. without the sign) or that the reported sign was incorrect. In those cases, we changed the sign in the relevant meta-analyses. The remaining eight authors did not respond. We therefore also performed a separate subgroup analysis excluding the correlation coefficients with signs opposite to those expected.

Data extraction and quality assessment

We extracted the following information from each retrieved article: (1) name of the first author; (2) year of publication; (3) country in which the study was performed; (4) subject category (type 2 diabetes, impaired glucose tolerance, normal glucose tolerance, other disease, healthy); (5) number of study participants; (6) proportion of men and women; (7) age (mean and SD); (8) BMI (mean and SD); (9) insulin dose of the HEC; (10) target blood glucose of the HEC; (11) duration of the HEC; (12) how insulin sensitivity measured by the HEC was expressed (e.g. M LBM/I); (13) the method of determining correlation (Pearson’s r or Spearman’s ρ); and (14) the correlation coefficients between the HEC and the different surrogate measures (see electronic supplementary material [ESM] Table 1). We used the following questions to assess the quality and identify possible confounders of the different studies: (1) Is the study population evenly distributed between both sexes? (2) Are study participants with different levels of insulin sensitivity included? (3) Do study participants have medical conditions that might interfere with the measurement of insulin sensitivity? (4) Is diabetes diagnosed using OGTT, fasting glucose or medical history? (5) Is the insulin dose of the HEC appropriate for the study population? (6) How long was the HEC? (7) Was glucose during OGTT/fasting samples analysed as blood glucose or plasma/serum glucose (ESM Table 1)?

Data synthesis and analysis

Combining Pearson’s r with the more conservative Spearman’s ρ could have introduced bias into the meta-analysis. We therefore converted Spearman’s ρ into Pearson’s r according to Rupinski and Dunlap using the following formula: r = 2sin(ρ × π/6) [9]. If a study reported R² (together with a graph indicating the direction) or a standardised (beta) coefficient instead of a correlation coefficient, we calculated the correlation coefficient. We combined the following ways of measuring correlation into a single meta-analysis: (1) x; (2) logx; and (3) 1/x (the sign of the correlation coefficient was converted for the latter). The logarithmic base used was not known in all studies. Stumvoll and Mari used regression analysis to derive the Stumvoll insulin sensitivity index (Stumvoll ISI), the Stumvoll metabolic clearance rate (Stumvoll MCR) and oral glucose insulin sensitivity (OGIS) indices [10, 11]. We did not include these correlation coefficients in our meta-analysis because using the same population for both the development and validation of the index would introduce bias. Both of these papers report additional correlation coefficients based on a validation analysis, and we used these for our meta-analysis.
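The conversions described above can be sketched in a few lines of Python. This is an illustration only, not the procedure actually run for the meta-analysis: the function names are hypothetical, and the R² conversion assumes a regression with a single predictor, in which case r is the square root of R² with the sign read from the accompanying graph.

```python
import math


def spearman_to_pearson(rho: float) -> float:
    """Convert Spearman's rho to Pearson's r using r = 2 * sin(rho * pi / 6)."""
    return 2.0 * math.sin(rho * math.pi / 6.0)


def r_from_r_squared(r_squared: float, positive_direction: bool) -> float:
    """Recover r from R^2 (assumption: one-predictor regression).

    The direction must be supplied separately, e.g. from the published graph.
    """
    r = math.sqrt(r_squared)
    return r if positive_direction else -r


def flip_sign_for_reciprocal(r: float) -> float:
    """Reverse the sign of a correlation reported against 1/x instead of x."""
    return -r
```

The conversion leaves ρ = 0 and ρ = ±1 unchanged and slightly increases the magnitude of intermediate values.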

To assess the differences between types of clamp measurement (M BW, M LBM, M BW/I and M LBM/I), we performed a separate subgroup analysis for each. To compare these different measures, we identified all surrogate measures (fasting and OGTT based) that were reported in 15 or more articles and in which the pooled correlation with the HEC was at least moderate (r > 0.5 or <−0.5). For those surrogate measures (Matsuda, AUC insulin, QUICKI, fasting insulin and HOMA-IR), we conducted separate meta-analyses for all clamp measurements reported in at least two articles.

In another subgroup analysis, we compared surrogate measures in individuals with normal glucose tolerance (NGT), impaired glucose tolerance (IGT) and type 2 diabetes. For this comparison, we report surrogate measures only when correlation coefficients were available in at least one article for each of the subgroups.

We used the random effects model of DerSimonian and Laird for our meta-analyses [12]. Before conducting the meta-analyses, we transformed all correlation coefficients to Fisher’s z scale (zr) to stabilise the variances [13]. After the meta-analyses, we transformed the z-values back to correlation coefficients before plotting them in the graphs. We considered the random effects model to be most appropriate because characteristics (e.g. age and weight) differed among the study participants. We were therefore unable to assume that the effect size would be the same across studies, which is a required assumption for the fixed effect meta-analysis.
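The pooling procedure can be illustrated with a minimal Python sketch. It is not the SPSS syntax used for the analyses, and it rests on stated assumptions: the sampling variance of each Fisher-transformed coefficient is taken as 1/(n − 3), the between-study variance is estimated with the DerSimonian and Laird moment estimator, and the 95% CI uses a normal approximation. The function name and the input values in the usage comment are hypothetical.

```python
import math


def pool_correlations_random_effects(rs, ns):
    """Pool correlation coefficients on Fisher's z scale (random effects).

    rs: correlation coefficients, one per study
    ns: sample sizes, one per study (each must exceed 3)
    Returns (pooled_r, ci_low, ci_high, tau_squared).
    """
    zs = [math.atanh(r) for r in rs]        # Fisher's z transformation
    vs = [1.0 / (n - 3) for n in ns]        # assumed variance of each z
    ws = [1.0 / v for v in vs]              # fixed effect weights
    sum_w = sum(ws)

    # Fixed effect estimate and Q, needed for the moment estimator of tau^2
    z_fixed = sum(w * z for w, z in zip(ws, zs)) / sum_w
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, zs))
    df = len(zs) - 1
    c = sum_w - sum(w * w for w in ws) / sum_w
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance
    ws_re = [1.0 / (v + tau_squared) for v in vs]
    sum_w_re = sum(ws_re)
    z_pooled = sum(w * z for w, z in zip(ws_re, zs)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)

    # Back-transform the estimate and its CI limits to the r scale
    low, high = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    return math.tanh(z_pooled), math.tanh(low), math.tanh(high), tau_squared


# Hypothetical input: three studies reporting r = 0.55, 0.62 and 0.48
# pooled_r, low, high, tau2 = pool_correlations_random_effects(
#     [0.55, 0.62, 0.48], [40, 25, 60])
```

When all studies report the same coefficient, the estimated between-study variance is zero and the pooled value equals that coefficient, so the random effects and fixed effect results coincide.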

We used the Q-statistic, which follows a χ² distribution, to test for heterogeneity [14]. If there is significant heterogeneity, then a fixed effect meta-analysis is not recommended. We next used the I² statistic to quantify heterogeneity. A high I² (>75%) indicates that a large proportion of the observed variance is caused by a genuine difference between the correlation coefficients of the included studies, and a low I² (<25%) means that most of the observed variance is the result of random error.
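As an illustration of the two heterogeneity statistics, the following Python sketch computes Q from inverse-variance weights and derives the I² statistic as the percentage of Q in excess of its degrees of freedom. The function names are hypothetical, and the p value for Q (from the χ² distribution with k − 1 degrees of freedom) is omitted because it would require a statistics library.

```python
def cochran_q(effects, variances):
    """Weighted sum of squared deviations from the fixed effect estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """Percentage of observed variance attributed to genuine differences.

    Computed as 100 * (Q - df) / Q with df = n_studies - 1, truncated at 0.
    """
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

For example, Q = 8 from three studies (two degrees of freedom) gives a value of 75%, the boundary of the high range described above.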

We used Begg and Mazumdar’s rank correlation test [15] and funnel plots [16] to estimate publication bias. In the funnel plots, we plotted z-transformed correlation coefficients (zr) on the x-axis and SE on the y-axis. Asymmetry in a funnel plot reveals publication bias if small studies (i.e. with large SEs) exhibit strong correlation coefficients and there is a lack of small studies with weak correlation coefficients.
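The rank correlation test can be sketched as follows. This is a simplified illustration rather than the syntax used for the analyses: it standardises each effect against the fixed effect estimate, computes Kendall's rank correlation between the standardised effects and their variances, and obtains a two-sided p value from a normal approximation without tie or continuity corrections. The function names are hypothetical. For correlation coefficients, the effects would be the Fisher-transformed values, with variances assumed to be 1/(n − 3).

```python
import math


def kendall_s(xs, ys):
    """Number of concordant pairs minus number of discordant pairs."""
    s = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                s += 1
            elif product < 0:
                s -= 1
    return s


def begg_mazumdar(effects, variances):
    """Rank correlation between standardised effects and their variances.

    Returns (tau, z, two_sided_p). Requires at least two studies.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum_w

    # Variance of (effect - pooled) is v_i - 1 / sum of weights
    deviates = [
        (e - pooled) / math.sqrt(v - 1.0 / sum_w)
        for e, v in zip(effects, variances)
    ]

    n = len(effects)
    s = kendall_s(deviates, variances)
    tau = s / (n * (n - 1) / 2.0)
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18.0)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return tau, z, p
```

A strongly positive tau means that studies with larger variances (i.e. smaller studies) tend to report larger effects, which is the pattern that produces an asymmetrical funnel plot.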

We conducted the meta-analysis, the heterogeneity test and Begg and Mazumdar’s rank correlation test in IBM SPSS Statistics for Mac, Version 20.0 (Armonk, NY, IBM Corp.) using syntax written by Field and Gillett [17, 18]. The funnel plots were produced in R (version 2.15.1, R Foundation for Statistical Computing, Vienna, Austria) using code written by Vevea and Woods [19] and a script from Field and Gillett to feed SPSS data into R [18]. We visualised meta-analysis results using PRISM 5.0b (GraphPad Software, San Diego, CA).

Results

We retrieved 1,753 articles from our database search and included 120 of these in our meta-analysis. ESM Fig. 1 shows details of the selection process. Characteristics of the studies included in the meta-analysis are summarised in ESM Table 1.

OGTT-based and fasting surrogate measures compared with the HEC

We pooled a total of 120 studies that included correlations between surrogate measures and the HEC. We report in detail only meta-analyses based on five or more articles. ESM Fig. 2 shows the random effects meta-analyses for correlations between the HEC and surrogate measures based on the OGTT. ESM Fig. 3 depicts correlations between fasting surrogate measures and the HEC. Details of study participants and their characteristics can be found in ESM Table 1. Results of meta-analyses of fewer than five articles are included in ESM Tables 2 and 3. Table 1 contains the mathematical formulas of the relevant surrogate indices.

Table 1 Mathematical formulas of surrogate measures of insulin sensitivity based on the oral glucose tolerance test and fasting blood samples
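For orientation, four of the indices discussed in this article can be sketched in Python using their commonly published forms. These are not reproduced from Table 1, and the function names, argument names and unit conventions (glucose in mg/dl for QUICKI and Matsuda and in mmol/l for HOMA-IR, insulin in µU/ml, NEFA in mmol/l) are assumptions of this sketch that should be checked against the original publications before use.

```python
import math


def homa_ir(glucose_mmol_l, insulin_uu_ml):
    """HOMA-IR: fasting glucose (mmol/l) x fasting insulin (uU/ml) / 22.5."""
    return glucose_mmol_l * insulin_uu_ml / 22.5


def quicki(glucose_mg_dl, insulin_uu_ml):
    """QUICKI: 1 / (log10 fasting insulin + log10 fasting glucose)."""
    return 1.0 / (math.log10(insulin_uu_ml) + math.log10(glucose_mg_dl))


def revised_quicki(glucose_mg_dl, insulin_uu_ml, nefa_mmol_l):
    """Revised QUICKI: adds log10 fasting NEFA to the QUICKI denominator."""
    return 1.0 / (
        math.log10(insulin_uu_ml)
        + math.log10(glucose_mg_dl)
        + math.log10(nefa_mmol_l)
    )


def matsuda(g0_mg_dl, i0_uu_ml, mean_glucose_mg_dl, mean_insulin_uu_ml):
    """Matsuda index: 10,000 / sqrt(G0 x I0 x mean OGTT glucose x mean OGTT insulin)."""
    return 10000.0 / math.sqrt(
        g0_mg_dl * i0_uu_ml * mean_glucose_mg_dl * mean_insulin_uu_ml
    )
```

HOMA-IR rises as insulin sensitivity falls, whereas the other three indices rise with insulin sensitivity, which is why the expected sign of the correlation with the HEC differs between indices.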

Each meta-analysis resulted in a pooled correlation coefficient. Of the OGTT-based surrogate measures, the Stumvoll MCR, OGIS, Matsuda, Stumvoll ISI and Gutt indices exhibited the strongest correlations with the HEC (ESM Fig. 2). Other OGTT-based indices exhibited weaker correlations with the HEC [e.g. insulin (120 min)]. Figure 1 summarises the results of all meta-analyses (based on five or more studies) in a single graph to compare the pooled correlation coefficients and their respective 95% CIs.

Fig. 1

Summary of all meta-analyses. The figure shows the strength of the pooled correlations; negative pooled correlations are shown on the positive side to facilitate comparison. Figures in parentheses indicate the papers that first described the surrogate measure (e.g. Matsuda) or that first measured the correlation with the HEC (e.g. fasting insulin)

Of the fasting surrogate measures, the QUICKI, revised QUICKI, HOMA-IR, computer-generated HOMA of insulin sensitivity (HOMA-%S) and fasting insulin exhibited the strongest pooled correlations with the HEC and narrow 95% CIs (Fig. 1 and ESM Fig. 3). The QUICKI exhibited a stronger correlation than the HOMA-IR with the HEC. A separate meta-analysis revealed that the QUICKI and the log-transformed HOMA-IR had equally strong pooled correlations with the HEC (log HOMA-IR r = −0.60 [95% CI −0.66, −0.53], n = 22; QUICKI r = 0.61 [0.56, 0.66], n = 36). Similarly, the correlation between fasting insulin and the HEC was stronger if log-transformed insulin was used (log fasting insulin r = −0.56 [−0.60, −0.51], n = 17; fasting insulin r = −0.53 [−0.56, −0.49], n = 71).

When we compared fasting surrogate measures to OGTT-based measures, we observed that only the correlation between the revised QUICKI and the HEC was as strong as those of HEC with the OGTT-based surrogate indices Stumvoll MCR, OGIS, Matsuda, Stumvoll ISI and Gutt (Fig. 1).

Heterogeneity analysis

There was significant heterogeneity (p < 0.05) in all meta-analyses based on five or more articles, except for correlation coefficients between the Gutt index and the HEC (p = 0.85). We therefore did not continue with fixed effect meta-analyses, but relied on the random effects meta-analyses. The reasons for such high levels of heterogeneity may be the different insulin doses used for the HECs, heterogeneous study populations with different levels of insulin sensitivity and, in some meta-analyses, outliers with correlation coefficients with signs opposite to those expected.

Publication bias

The Begg and Mazumdar rank correlation test did not reveal statistically significant publication bias, except in the meta-analyses of the correlations between the HEC and the fasting insulin/fasting glucose ratio (I0/G0) and between the HEC and the Gutt index (p = 0.02 and p = 0.04, respectively). Although the funnel plot of the Gutt meta-analysis was slightly asymmetrical, this was caused by the inclusion of several small studies with weaker correlation coefficients, which argues against publication bias. The funnel plot for the meta-analysis between the HEC and I0/G0 also exhibited clear asymmetry, which in this case was caused by three studies that reported positive correlation coefficients. Outliers of this type also caused clear asymmetry in other funnel plots [HOMA-IR, fasting insulin resistance index (FIRI), G0/I0, insulin (120 min)], but we did not interpret this as publication bias. There was some evidence of publication bias in the funnel plots of the revised QUICKI, HOMA-%S and glucose (120 min) meta-analyses. However, the Begg test did not reveal publication bias in these three meta-analyses. All funnel plots are shown in ESM Fig. 4.

Quality assessment

Because these conditions interfere with the measurement of insulin sensitivity, we excluded study participants with type 1 diabetes, insulinoma and renal failure (ESM Fig. 1). Furthermore, we found that the insulin dose used during the HEC substantially influenced study quality. In our opinion, eight studies report an insulin dose that is too low for the population studied (ESM Table 1). However, we decided to keep these studies in our meta-analysis because all other studies that examined individuals with different levels of insulin sensitivity had used an inappropriate insulin dose for part of their study population.

We assessed whether the quality of the eight studies included in the meta-analysis of the revised QUICKI differed from the quality of all 120 studies combined (ESM Table 1). The only difference was that more studies in the revised QUICKI meta-analysis had an even sex distribution in the study population (i.e. as many men as women in four of seven studies). Only 33% of all 120 studies had evenly distributed study populations. The studies included in the revised QUICKI meta-analysis were comparable with the combined group of 120 studies in all other aspects of quality assessment.

Subgroup analyses

After our attempts to contact the relevant authors, several meta-analyses still included outliers (the sign of the correlation coefficient was opposite to that expected, e.g. r = 0.5 instead of r = −0.5). We therefore conducted additional meta-analyses without these outliers (data not shown). The pooled correlation coefficients did not change for QUICKI or fasting insulin when the outliers were removed. However, the removal of outliers resulted in slightly larger pooled correlation coefficients for HOMA-IR, insulin (120 min), FIRI, fasting glucose, I0/G0 and G0/I0. The remaining meta-analyses did not contain outliers.

Another subgroup analysis showed that the Matsuda, AUC insulin, QUICKI and fasting insulin exhibited stronger correlations with M values normalised to insulin (M BW/I and M LBM/I) than with those not normalised to insulin (M BW and M LBM; ESM Figs 5–8). Only for HOMA-IR was the M LBM value correlation stronger than that of the M LBM/I estimate (ESM Fig. 9). However, when we conducted a meta-analysis without the outliers (i.e. positive signs for all correlation coefficients), both M values for HOMA-IR normalised to insulin exhibited stronger correlations with the HEC compared with M values not normalised to insulin (data not shown).

We also conducted meta-analyses to test our hypothesis that the HEC correlates more strongly with surrogate measures for insulin sensitivity in individuals with IGT than in those with NGT or type 2 diabetic patients (ESM Table 4). According to our analyses, this is the case for several surrogate measures (e.g. Matsuda, Stumvoll MCR, Stumvoll ISI, revised QUICKI), but not for others (e.g. OGIS, QUICKI, HOMA).

Discussion

Several surrogate markers for insulin sensitivity have been developed for use in clinical studies when the accepted reference method, HEC, is not practicable. We have correlated these surrogate measures of insulin sensitivity with the HEC in a meta-analysis. Based on this analysis, we recommend that the primary choice of surrogate index for estimating insulin sensitivity should be the revised QUICKI. This index is easy to perform because only fasting blood samples are needed and the correlation with the HEC is approximately as strong as that of OGTT-based indices. However, NEFA analyses are needed to construct this index. If NEFA analyses are not available, then any of the following OGTT-based indices could be employed as a second-choice measurement: Stumvoll MCR, OGIS, Matsuda, Stumvoll ISI and Gutt. If estimations must rely on fasting levels without NEFA analysis, then the best choice would be the QUICKI, the log-transformed HOMA-IR or the HOMA-%S.

However, the strength of correlation of the different surrogate measures with the HEC is moderate at best. This leads us directly to cost–benefit considerations. When conducting a small study, the HEC reference method should be used. However, in epidemiological or large-scale intervention studies, practical considerations will determine which of the aforementioned indices should be used. The overlapping CIs of the different meta-analyses suggest that there is no difference in correlation strength between these indices.

Some surrogate measures show a poor correlation with the HEC [e.g. I0/G0, G0/I0, insulin (120 min) and fasting glucose]. When using the I0/G0 or G0/I0 ratio, high insulin and high glucose levels will cancel each other out, so indices containing G0 × I0 or G0 + I0 are always preferred. Although insulin during the OGTT [e.g. insulin (120 min)] exhibits a fairly good correlation with the HEC in a healthy population, it is of less value in a diabetic population because it is strongly influenced by islet dysfunction (ESM Table 4). Fasting glucose shows only a very limited variation in a healthy population and is regulated by several factors besides insulin sensitivity, such as islet function and hepatic glucose release. It is therefore a poor index to distinguish between various degrees of insulin sensitivity among healthy subjects.

After excluding study participants with medical conditions that might interfere with the measurement of insulin sensitivity, the quality of the remaining studies was still variable. This was probably caused by a disproportionately low insulin dose being administered during the HEC. During the HEC, the aim is to completely suppress hepatic glucose production to enable an accurate estimation of peripheral insulin sensitivity. Incomplete suppression will lead to the underestimation of insulin sensitivity. If the same dose of insulin is used for healthy/normal weight and type 2 diabetic individuals, then some participants will be examined with an inappropriate dose (because overweight and diabetic individuals need higher insulin doses to suppress hepatic glucose production) [4, 20], which indirectly affects correlations with surrogate indices. Furthermore, despite the assumption of a steady state condition, the amount of infused glucose continues to rise even at the end of the examination [21–23]. This means that the same participant may exhibit better insulin sensitivity during a longer HEC than during a shorter one, which makes comparisons difficult among clamp examinations of different durations. In addition, the glucose infusion rate is higher at the second examination than at the first within the same individual [22]. One possible explanation is that reduced stress causes the increase in insulin sensitivity.

Surrogate measures of insulin sensitivity (both fasting and OGTT based) have previously been validated mostly by determining their correlation coefficients to the HEC, in line with our analysis [24, 25]. However, it should be emphasised that fasting indices mainly measure hepatic insulin sensitivity and the HEC mainly measures muscle insulin sensitivity, while OGTT-based indices measure both types of insulin sensitivity [7, 26]. There are other differences among these three methods of measuring insulin sensitivity. For example, although the HEC is the accepted reference measurement for insulin sensitivity, it does not reflect physiological conditions. Furthermore, arterialised blood is used during the HEC to measure glucose, while venous blood samples are taken for the surrogate measures. The same amount of glucose is always administered during the OGTT, in contrast to the fixed insulin dose but variable glucose infusion administered during the HEC. Although some consider the OGTT to be a more physiological examination in which the glucose load mimics that of a meal, this is disputed [27]. All of these factors help to explain why the correlation strength between surrogate measures and HEC is moderate at best.

Furthermore, there are several sources of error in insulin concentrations, which form part of most surrogate indices. For example, insulin assays show varying degrees of cross-reactivity with proinsulin and its partially processed products. While this is an important source of error in a radioimmunoassay, cross-reactivity to proinsulin is as low as 5.3% in the newer, specific insulin assays [28]. Insulin assays seem to exhibit most variability at low insulin levels, which could cause lower correlation coefficients in healthy individuals vs type 2 diabetic patients [29].

Matthews and collaborators originally recommended that three blood samples should be obtained for insulin analyses (one every 5 min, to account for the periodicity in insulin secretion) [30]. However, few of the studies included in our meta-analysis collected more than one sample for insulin analysis. Relying on a single sample may therefore introduce an error, especially for normal weight subjects, because pulsatile insulin secretion decreases with IGT and diabetes [31, 32]. Finally, insulin levels are also regulated by beta cell and liver function (via effects on insulin clearance), in addition to insulin sensitivity.

It has been suggested that surrogate measures of insulin sensitivity show weaker correlations with the HEC in healthy normal weight individuals than in insulin-resistant individuals [27]. In our meta-analysis, that was true for some of the surrogate indices but not for all. Generally, the strength of the correlation between a surrogate measure and the HEC in individuals with different levels of insulin sensitivity (NGT, IGT and type 2 diabetes) depends on the insulin dose administered during the clamp [29]. Lower insulin doses favour strong correlations for individuals with NGT, and higher insulin doses increase the correlation coefficient for insulin-resistant individuals. In many of the studies included in our meta-analysis, a lower insulin dose was used for all study participants. This favours stronger correlations for insulin-sensitive groups. Furthermore, lower insulin doses will measure hepatic insulin sensitivity instead of peripheral glucose uptake, which strengthens correlations with the fasting surrogate measures relative to OGTT-based indices [33]. We therefore cannot draw conclusions from our meta-analyses about the NGT, IGT and type 2 diabetes subgroups.

Our analysis did not reveal a difference between the M value normalised to body weight and the M value normalised to lean body mass. However, M values normalised to insulin (M BW/I and M LBM/I) exhibited stronger correlations with all tested surrogate measures compared with M values not normalised to insulin (M BW and M LBM). Related to this, Bokemark et al showed that clamp examinations 2 weeks apart in the same individual exhibit a stronger correlation if insulin sensitivity is calculated using M values normalised to insulin and not M values without normalisation [22]. In addition, the distribution of insulin sensitivity in the population studied increases if the M value is normalised to insulin because it then becomes possible to detect small differences in insulin sensitivity [2].

In summary, we recommend that either the revised QUICKI fasting surrogate measurement or the OGTT-based indices (Stumvoll, OGIS, Matsuda and Gutt) should be used in future clinical studies. However, these indices need further validation against the HEC and should be compared in a single study.