Introduction

The diagnosis and treatment of diabetes mellitus (DM) is based on measurement of glycated hemoglobin (HbA1c) in whole blood (B-HbA1c) or on venous plasma glucose concentrations (P-glucose). The International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) recommends that the glucose concentration be reported as P-glucose by measuring plasma directly or by measuring whole blood and correcting for hematocrit [1]. For the purpose of simplicity here we refer to both types as P-glucose. (In the study reported here the recommendations of the IFCC and the International Union of Pure and Applied Chemistry are used.) In terms of DM diagnosis, the World Health Organization (WHO), the American Diabetes Association (ADA), and the German Diabetes Association (DDG) have each defined thresholds for fasting P-glucose and postprandial P-glucose at 2 h following glucose loading (2-h P-glucose) obtained through the oral glucose tolerance test (OGTT). The latter two organizations have also defined additional thresholds for the diagnosis of DM based on random P-glucose values, as shown in Table 1 [2,3,4].

Table 1 Reference intervals for plasma glucose levels and whole blood levels of glycated hemoglobin in terms of the diagnosis of diabetes and impaired glucose tolerance

The use of point-of-care testing (POCT) devices for the diagnosis of DM is not generally accepted at the present time. In the USA, the joint guideline of the National Academy of Clinical Biochemistry, American Association for Clinical Chemistry, and the ADA recommends that venous P-glucose should be measured in a laboratory using an automated analytical procedure for diagnosing DM [5]. In contrast, regardless of whether P-glucose concentrations are determined by POCT or the core laboratory, the values are accepted for the diagnosis of DM by the DDG provided (1) the measuring system is claimed adequate for this purpose, (2) the system complies with the Guideline of the German Medical Association on Quality Assurance in Medical Laboratory Examinations (Rili-BAEK), and (3) the system shows a minimal difference (MD) for P-glucose of < 0.7 mmol/L at 7.0 mmol/L. Additionally, external quality assurance is mandatory for POCT for glucose used for the diagnosis of DM. In the case of gestational diabetes, handheld POCT devices are also explicitly allowed within accuracy limits defined by the German Institute for Standardization (Deutsches Institut für Normung) ISO 15197:2013 [6].

When the diagnosis relies on method-independent cutoffs, i.e., global recommendations as in the diagnosis of DM, even a slight bias between methods has the potential to affect the diagnosis. Not only will different measurement methods, such as unit-use reagents of POCT or core laboratory procedures, account for discrepant results, but other factors, such as sample type and pre-analytical processing, may also influence the measurements [7]. Consequently, it is of particular importance to comply with guidelines, such as the Clinical and Laboratory Standards Institute (CLSI) recommendation on study design [8] (see Electronic Supplementary Material Table S1 and Fig. S1).

In patient-centered care, glucose measurements for an individual patient obtained using POCT devices are generally used in addition to or alternating with glucose measurements from the core laboratory using different sample types: capillary whole blood and venous plasma. Therefore, for an individual patient, glucose results from different methods and sample types have to be assessed by the treating physician. In this study glucose concentrations obtained from POCT devices and core laboratory instruments for both sample types were assessed to match typical scenarios experienced in patient care.

In the study reported here we investigated two patient cohorts that had undergone OGTT. Samples from the individuals were analyzed for P-glucose concentration by two POCT methods (a handheld unit-use glucometer and a small bench-top analyzer with batch reagents, which is often used near patients instead of in core laboratories) and by two core laboratory procedures. The aim of the study was twofold: (1) to elucidate the diagnostic performance of different P-glucose procedures and (2) to demonstrate the effects of these different procedures on glucose concentrations in clinical terms by reporting the resultant diagnosis rates of impaired glucose tolerance (IGT) and DM based on the given fixed and method-independent cutoffs in relation to B-HbA1c diagnosis.

Methods

Study Population

Two cohorts were examined. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Participation was voluntary after written informed consent had been provided. Exclusion criteria were known DM and/or contraindications to OGTT. The study was approved by the Ethics Committee of the University Medicine of Greifswald.

An inpatient cohort consisting of 176 probands was recruited from patients admitted to the cardiology department within a 3-month period in 2009. The median age of the cohort was 65 years; 73 were women and 103 were men.

A random sample from the general population consisting of 55 probands was recruited from participants of the second follow-up of the Study of Health in Pomerania (SHIP) [9]; all were tested within a 1-month period in 2011. This cohort was considered to resemble an outpatient setting and is referred to as the outpatient cohort throughout the text. The median age of this cohort was 41 years; 37 were women and 18 were men.

Laboratory Methods and Instruments

P-glucose was measured in the core laboratory using the Dimension Vista (Vista) and Dimension RxL (RxL) (Siemens Healthcare Diagnostics, Eschborn, Germany) instruments. Both devices use a reaction catalyzed by hexokinase/glucose-6-phosphate dehydrogenase.

The two POCT devices used to measure P-glucose concentration were the StatStrip glucometer (StatStrip) (Nova Biomedical, Rödermark, Germany), which uses unit-use reagents, and the YSI 2300 STAT plus (YSI) device (Kreienbaum, Langenfeld, Germany), which utilizes batch reagents. Both devices employ a glucose oxidase-catalyzed reaction. The StatStrip is a handheld device, and the YSI is a small stationary benchtop analyzer designed for use in laboratories or POCT settings, such as physicians’ offices. The StatStrip meter automatically performs a correction for hematocrit when calculating P-glucose concentrations from whole blood samples, utilizing the hematocrit measured simultaneously on the test strip. In contrast, the P-glucose values obtained using the YSI device were corrected using hematocrit values that were measured separately in ethylenediaminetetraacetic acid (EDTA) whole blood samples on a model XN 9000 hematology meter (Sysmex, Norderstedt, Germany).

B-HbA1c was measured using the Tosoh G8 Analyzer (Tosoh Europe N.V., Berlin, Germany) by ion-exchange high performance liquid chromatography.

All measurements were carried out by trained operators. Quality controls were performed according to the Rili-BAEK guideline [10].

Sample Collection, Handling, and Measurement

Sample collection and handling followed the CLSI (National Committee for Clinical Laboratory Standards [NCCLS]) C30-A2 recommendations for monitoring evaluation studies, with the exception of duplicate measurements in the inpatient cohort [8]. The recommendations by Stahl et al. [11] and those of the WHO for performing P-glucose testing were also followed [12]. A flow chart of the identical processes for both cohorts and the add-ons for the outpatient cohort can be found in Fig. 1. Venous samples were taken using the butterfly catheter Safety-Lok™ and vacutainers (Becton–Dickinson, Heidelberg, Germany). Lithium heparin and fluoride plasma were used for the glucose concentration measurement and EDTA whole blood was used for the B-HbA1c measurement. Capillary samples were obtained by finger prick. The inpatient cohort used the automated single-use Microtainer Contact-Activated Lancet (Becton–Dickinson), and the outpatient cohort used Keller Medical Tenderlett incision devices (Keller Medical, Bad Soden, Germany). Capillary samples were collected in lithium heparin capillary collection tubes (KABE, Nümbrecht-Eisenroth, Germany).

Fig. 1

Flow chart of the study sample processing. Steps in blue boxes were carried out in both cohorts; steps in orange boxes were add-ons for the outpatient cohort. In the inpatient cohort, plasma glucose (P-glucose) measurements using the StatStrip glucometer in venous blood samples were performed with blood from the collection system since no venous lithium heparin samples were collected, as indicated by the superscript *1. At the time the outpatient samples were collected, the Dimension RxL device was no longer available in the core laboratory, as indicated by superscript *2. Processing steps inside the box were carried out immediately and in close proximity to the patient. EDTA Ethylenediaminetetraacetic acid, HbA1c glycated hemoglobin. See text for more details on the devices (StatStrip glucometer [StatStrip], YSI 2300 STAT plus [YSI], Dimension Vista [Vista], Tosoh G8 analyzer [Tosoh G8], model XN 9000 hematology meter [XN 9000])

Venous whole blood from the collection system was used in the inpatient cohort to measure P-glucose concentration on the StatStrip directly after venous puncture. Venous whole blood fluoride and EDTA samples were transported to the core laboratory within 2 h after sampling. Fluoride samples were centrifuged prior to the P-glucose measurement on the Vista and RxL devices. B-HbA1c was measured in EDTA whole blood upon arrival in the laboratory. All samples from the inpatient cohort were measured in singlets.

P-glucose in the outpatient cohort was measured in lithium heparin whole blood venous samples on the YSI and StatStrip in duplicate. Immediately thereafter, these samples were centrifuged and plasma aliquoted into 1.5-mL sample tubes (Biozym, Hessisch Oldendorf, Germany) at the on-campus study site. Cooled plasma aliquots were transported to the core laboratory within 1 h to measure P-glucose on the Vista device, in duplicate. Sodium fluoride and EDTA (B-HbA1c) whole blood samples were cooled and transported within 3 h to the core laboratory. All samples from the outpatient cohort were measured in duplicate, with the exception of B-HbA1c and P-glucose in sodium fluoride.

In both cohorts capillary samples were collected from participants’ fingers. The first drop of blood was discarded. Glucose measurements with the StatStrip device were performed in duplicate for the outpatient cohort. YSI capillary samples for this cohort were subsequently obtained using the aforementioned capillary collection tubes (duplicate measurements).

The 2-h OGTT according to standard WHO recommendations was performed after at least 8 h of fasting with an oral intake of 75 g of glucose dissolved in 200 mL of water [12].

Calculations

Our comparisons of instruments (Vista, RxL, StatStrip, YSI) and sample types (venous vs. capillary) are presented as scatterplots and difference plots, as described by Freckmann et al., with modifications [13]. Performance criteria defined by the U.S. Food and Drug Administration (FDA), CLSI, and Rili-BAEK guidelines were included in the difference plots. Scatterplots were based on either a single measurement for the inpatient cohort or the first measurement of duplicate measurements for the outpatient cohort. Ideally, duplicate measurements would be identical; however, this is usually not the case, and the differences between duplicates can be used to calculate method variability (e.g., in terms of a standard deviation or coefficient of variation [%CV]). This strategy was applied in our comparisons of duplicate measurements from different measuring systems. The duplicate glucose measurements used for this purpose are shown in Fig. 5.
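The use of duplicate differences to estimate method variability can be illustrated with a minimal sketch. It assumes the standard duplicate formula, SD = sqrt(sum of squared differences / 2n), which the text does not state explicitly; the function name and the example values are hypothetical and are not study data.

```python
import math


def duplicate_imprecision(pairs):
    """Estimate imprecision from duplicate measurements.

    pairs: list of (first, second) results for the same sample.
    Returns (sd, cv_percent), using SD = sqrt(sum(d^2) / (2 * n)),
    a common estimator for duplicates (assumed here, not taken from the study).
    """
    if not pairs:
        raise ValueError("at least one duplicate pair is required")
    n = len(pairs)
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    sd = math.sqrt(sum_sq_diff / (2 * n))
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    return sd, 100.0 * sd / grand_mean


# Hypothetical duplicate P-glucose results in mmol/L (illustration only).
example_pairs = [(5.0, 5.2), (7.0, 6.8), (9.0, 9.0), (11.0, 11.4)]
sd, cv = duplicate_imprecision(example_pairs)
print(f"SD = {sd:.3f} mmol/L, CV = {cv:.1f}%")
```

Because the estimate pools all pairs, it describes average imprecision over the measured concentration range rather than imprecision at a specific decision concentration.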

For each instrument, diagnosis rates of IGT and DM were compared based on the P-glucose and B-HbA1c concentrations, respectively. Box plots for P-glucose and B-HbA1c concentrations were related to the recommended decision concentrations (cutoffs) for each instrument and sample type.
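As a hedged illustration of how diagnosis rates follow from fixed, method-independent cutoffs, the sketch below classifies 2-h OGTT values and reports the percentage in each category. The cutoffs of 7.8 and 11.1 mmol/L are the widely published WHO values and are an assumption here; they should be checked against Table 1. The function names, the example values, and the constant offset applied to the second series are hypothetical.

```python
from collections import Counter

# Assumed 2-h OGTT cutoffs in mmol/L (widely published WHO values; verify against Table 1).
IGT_2H_CUTOFF = 7.8
DM_2H_CUTOFF = 11.1


def classify_2h_glucose(value_mmol_l):
    """Return 'NGT', 'IGT', or 'DM' for a single 2-h P-glucose value."""
    if value_mmol_l >= DM_2H_CUTOFF:
        return "DM"
    if value_mmol_l >= IGT_2H_CUTOFF:
        return "IGT"
    return "NGT"


def diagnosis_rates(values):
    """Return the percentage of values falling into each category."""
    if not values:
        raise ValueError("no values to classify")
    counts = Counter(classify_2h_glucose(v) for v in values)
    return {label: 100.0 * counts[label] / len(values) for label in ("NGT", "IGT", "DM")}


# Hypothetical results (mmol/L); the second series applies a constant positive offset.
venous = [6.1, 7.5, 8.2, 11.3]
capillary = [v + 0.8 for v in venous]

print("venous   ", diagnosis_rates(venous))
print("capillary", diagnosis_rates(capillary))
```

In this toy example the offset moves one sample across the IGT cutoff, which is the mechanism by which a small between-method bias changes diagnosis rates.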

All plots and calculations were performed using Prism 7 (GraphPad Software, La Jolla, CA USA).

Results

The imprecision of the P-glucose measurement methods, obtained from internal quality controls and from duplicate samples in the outpatient cohort, was comparable across methods and ranged from 2.3 to 4.1% in venous samples and from 1.6 to 5.0% in capillary samples. This translates to a minimal difference of between 0.24 and 0.46 mmol/L in venous samples, which is below the allowable maximum of 0.7 mmol/L recommended by the DDG at a concentration of 7.0 mmol/L. The distributions of P-glucose and B-HbA1c results are shown as box-plots in Fig. 2 and summarized in Tables 2 and 3. Figure 2 also shows the recommended decision concentrations (cutoffs) for the diagnosis of IGT and DM.

Fig. 2

Box-plots of P-glucose and glycated hemoglobin (HbA1c) in whole blood (B-HbA1c) measured at 2 h after glucose loading by oral glucose tolerance testing (OGTT). a Inpatient cohort, b outpatient cohort. Horizontal lines represent the cutoff for diagnosis of impaired glucose tolerance (IGT) and diabetes mellitus (DM). X-axis: c capillary samples, v venous samples

Table 2 Absolute and relative numbers of normal glucose tolerance, impaired glucose tolerance, and diabetes mellitus based on B-HbA1c and P-glucose measurements in the inpatient cohort at 2 h after glucose loading, as determined by the oral glucose tolerance test (relative numbers in parentheses)
Table 3 Absolute and relative numbers of normal glucose tolerance, impaired glucose tolerance, and diabetes mellitus based on B-HbA1c and P-glucose measurements in the outpatient cohort at 2 h after glucose loading, as determined by the oral glucose tolerance test (relative numbers in parentheses)

The relative number of IGT and DM diagnoses based on 2-h OGTT P-glucose concentration in venous plasma samples differed between the cohorts. In the inpatient cohort, the rates of IGT ranged from 34 to 51% and those of DM ranged from 7 to 27% (Table 2). In the outpatient cohort, the rates of IGT diagnosis ranged from 6 to 20% depending on the measurement system and sample type, and the rates of DM diagnosis ranged from 0 to 2% (Table 3). As illustrated in the box-plots of Fig. 2, in both cohorts the diagnosis rates of IGT and DM were consistently higher in capillary samples than in venous samples.

Diagnosis rates for IGT and DM in both cohorts were lowest when based on B-HbA1c, with no diagnosis of IGT or DM in the outpatient cohort. Based on B-HbA1c measurements, 42% of the inpatient cohort were diagnosed with IGT and 7% with DM.

Scatterplots and difference plots in Figs. 3 and 4 illustrate the comparisons of P-glucose results from different methods and sample types. Differences due to method imprecision are illustrated in Fig. 5 by comparing duplicate measurements in venous and capillary samples.

Fig. 3

Difference plots are constructed according to Freckmann et al. [13], with modification

Scatterplots and absolute difference plots comparing P-glucose in venous samples (Vista device) and point-of-care testing (POCT) unit use (StatStrip device) using capillary (a) and venous (b) samples from both cohorts. CLSI Clinical and Laboratory Standards Institute, FDA Food and Drug Administration, Rili-BAEK German Medical Association on Quality Assurance in Medical Laboratory Examinations.

Fig. 4

Difference plots according to Freckmann et al. [13], with modification

Scatterplots and absolute difference plots comparing P-glucose in inpatient venous samples measured using the Vista and Dimension RxL [RxL] core laboratory systems.

Fig. 5

Comparison of outpatient venous duplicate measurements displayed as scatterplots and accompanied by difference plots according to Freckmann et al. [13], with additional modification

In patient care, a core laboratory method (e.g., Vista) that utilizes plasma from venous samples will typically run in parallel with a unit-use POCT method (e.g., StatStrip) utilizing whole blood from capillary samples. The correlation between glucose concentrations measured using Vista (venous plasma) and StatStrip (capillary whole blood) was r² = 0.865 (Fig. 3), and a positive bias of 0.8 mmol/L was observed in the capillary samples. The bias was reduced to 0.2 mmol/L (r² = 0.954) when venous whole blood was used with the unit-use POCT method. In both comparisons, the modified difference plots show violations of all given limits.

In medical laboratories, changes in methods and/or instruments are commonly recurring events; these changes may have an influence on the measurement results. This is illustrated by our comparison of the venous P-glucose measurement results from the two core laboratory methods, namely, Vista (new method) and RxL (old method), both from the same manufacturer. The RxL method showed a negative bias of > 6% (0.5 mmol/L; r² = 0.993), exceeding all but the widest limits (CLSI) and thus risking underdiagnosis (Fig. 4). In practical terms, a sample with a P-glucose concentration of 7.0 mmol/L on the Vista device would be reported as approximately 6.5 mmol/L on the RxL device.
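The size of such a between-method bias can be estimated from paired results as the mean difference, expressed in absolute and relative terms. The following is a minimal sketch under that assumption; it is not the calculation performed in Prism for this study, and the function name and paired values are hypothetical. As a consistency check on the reported figures, 0.5 mmol/L at 7.0 mmol/L corresponds to roughly 7%, in line with the stated bias of > 6%.

```python
def paired_bias(reference, comparison):
    """Return (mean difference, mean difference in % of the reference mean).

    Differences are computed as comparison minus reference, so a negative
    value means the comparison method reads lower than the reference method.
    """
    if not reference or len(reference) != len(comparison):
        raise ValueError("need two non-empty series of equal length")
    diffs = [c - r for r, c in zip(reference, comparison)]
    mean_diff = sum(diffs) / len(diffs)
    mean_ref = sum(reference) / len(reference)
    return mean_diff, 100.0 * mean_diff / mean_ref


# Hypothetical paired venous P-glucose results in mmol/L (illustration only).
vista = [5.0, 7.0, 9.0, 11.0]
rxl = [4.6, 6.5, 8.4, 10.5]

bias_abs, bias_rel = paired_bias(vista, rxl)
print(f"bias = {bias_abs:.2f} mmol/L ({bias_rel:.1f}%)")
```

A mean difference summarizes constant bias only; a difference plot remains necessary to see whether the bias changes with concentration.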

The YSI device utilizes whole blood capillary samples, and its results are corrected for the measured hematocrit to yield a plasma-referenced P-glucose value. The YSI device does not employ unit-use reagents and instead uses batch reagents, similar to large core laboratory instruments. The P-glucose value obtained using this device correlated well with P-glucose concentrations in venous plasma obtained by the core laboratory method Vista (r² = 0.9571). A positive bias of 1.0 mmol/L was found for capillary samples when compared to Vista venous samples; this bias decreased to 0.3 mmol/L when venous whole blood was used.

The duplicate results from the outpatient cohort were used to calculate the method variability and also to illustrate their differences in scatterplots and difference plots (Fig. 5). The agreement, expressed as R² (coefficient of determination), ranged from 0.94 (StatStrip duplicates in venous samples) to 0.98 (Vista duplicates in venous samples). The regression functions are shown in the scatterplots of Fig. 5.

Discussion

In patient care, POCT for glucose levels is generally used in parallel with other testing methods. As such, the results are compared to glucose results obtained from core laboratory methods using different sample types, namely, capillary whole blood and venous plasma. In the present study we compared P-glucose concentrations obtained from different instruments, both core laboratory and POCT, using venous and capillary samples in an inpatient and an outpatient cohort. To illustrate the impact of these different devices on patient care, using international guidelines and cutoffs, we derived the diagnosis rates of IGT and DM based on the glucose level determined by the 2-h OGTT. These were also compared to diagnosis rates based on B-HbA1c measurements.

We observed biases in glucose concentrations between the studied measuring systems which would cause differences in the diagnosis rates of IGT and DM. Glucose concentrations and, consequently, diagnosis rates were highest when capillary samples were used, independent of the test system used, and lowest when based on B-HbA1c. This result is in line with the described overestimation of P-glucose concentrations, especially in postprandial capillary samples [14, 15]. However, when venous samples were used, the diagnosis rates were of the same magnitude on all instruments, including POCT, and differed only slightly among these instruments.

A physician may encounter a situation in which P-glucose results for an individual patient are obtained from both POCT and the core laboratory. In such a case, it is important that the physician be aware of a possible bias that may be large enough to cause discrepancies in the diagnosis. This typical patient care setting was simulated in this study, and we found a positive bias of approximately 0.8 mmol/L for capillary samples when using StatStrip as a POCT unit-use device and Vista as a core laboratory instrument using plasma from venous samples. The bias was reduced to approximately 0.2 mmol/L when venous blood was used for both devices, namely, POCT and the core laboratory system. This bias was of the same magnitude as that of the YSI (capillary samples) compared to Vista (venous samples). Accordingly, the 2-h OGTT P-glucose concentrations obtained with StatStrip and YSI were comparable and without bias.

The StatStrip measures hematocrit simultaneously with glucose and calculates a correction to achieve a P-glucose concentration. In the present study, the YSI results were corrected with the hematocrit result from the core laboratory to obtain plasma-referenced glucose. From a practical point of view, results from the YSI would depend on the availability of an external hematocrit value or, alternatively, rely on a fixed factor of 11%, as recommended by the IFCC [1], which may not result in optimally adjusted P-glucose values.
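The difference between a hematocrit-specific correction and the fixed 11% conversion (factor 1.11) can be sketched as follows. The sketch assumes that glucose distributes with water, with typical water concentrations of 0.93 kg/L in plasma and 0.71 kg/L in erythrocytes; these values and the resulting formula are assumptions made for illustration and are not necessarily the correction applied to the YSI results in this study. All names and example values are hypothetical.

```python
# Assumed typical water mass concentrations (kg water per L); illustration only.
PLASMA_WATER = 0.93
ERYTHROCYTE_WATER = 0.71
FIXED_FACTOR = 1.11  # fixed whole-blood-to-plasma conversion (the 11% factor)


def plasma_equivalent(blood_glucose, hematocrit):
    """Convert a whole-blood glucose concentration to a plasma-referenced value.

    hematocrit is a volume fraction (e.g. 0.43), not a percentage.
    """
    if not 0.0 <= hematocrit < 1.0:
        raise ValueError("hematocrit must be a fraction in [0, 1)")
    blood_water = PLASMA_WATER * (1.0 - hematocrit) + ERYTHROCYTE_WATER * hematocrit
    return blood_glucose * PLASMA_WATER / blood_water


# Hypothetical whole-blood glucose of 6.0 mmol/L at three hematocrit values.
for hct in (0.30, 0.43, 0.55):
    specific = plasma_equivalent(6.0, hct)
    fixed = 6.0 * FIXED_FACTOR
    print(f"Hct {hct:.2f}: specific {specific:.2f} mmol/L, fixed {fixed:.2f} mmol/L")
```

Under these assumptions the two approaches agree closely at a hematocrit of about 0.43 and diverge as the hematocrit moves away from it, which is why a fixed factor may not yield optimally adjusted values for every patient.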

In the process of maintaining and developing analytical quality, laboratories regularly modify and adjust measuring procedures, both of which may introduce a bias in relation to previous methods. As an example, we measured the P-glucose on two instruments from the same manufacturer (Vista and RxL) and found a bias of 0.5 mmol/L. Even though reference preparations (e.g., from the National Institute of Standards and Technology [NIST]) and reference methods (isotope dilution mass spectrometry) are available for P-glucose, external quality assessment schemes judge participants by rather wide limits of between ± 15 and ± 20% allowable deviations. According to Rili-BAEK, a deviation of up to ± 11% is allowed for internal quality controls for P-glucose measurements, and a deviation of ± 15% is allowed for external quality assessment, regardless of the method used, i.e., core laboratory or POCT. Of note, all measurements in this study complied with the national regulation of Rili-BAEK [10]. However, the observed differences between methods impacted diagnosis rates, indicating that regulations should be tightened to meet medical needs in the diagnosis of IGT and DM.

Our data suggest that for the investigated measuring systems the difference in sample type, i.e., capillary or venous, impacts the diagnosis of IGT and DM to a greater extent than the choice of the measuring system. Furthermore, venous samples measured directly after sampling at the point-of-care appeared to be the optimal sample type for P-glucose measurements. The results show that the investigated unit-use POCT device (StatStrip) may qualify for diagnosis and achieves diagnosis rates comparable to those of other well-accepted core laboratory systems or the YSI when venous samples are used.

Different core laboratory methods can show a noticeable bias without violating quality assurance rules and programs, as illustrated by our finding that both of the core laboratory instruments investigated, Vista and RxL, passed external quality assessment programs, as required by Rili-BAEK. However, the external quality assessment program samples based on reference method values used for Vista and RxL were not suitable for POCT. Current rules therefore need to be reconsidered and tightened to stimulate the harmonization of results. Given that POCT methods are accepted for the diagnosis of IGT and DM when venous whole blood samples are used, two challenges need to be addressed. The first is that collection systems need to be developed that allow venous samples to be easily introduced into POCT instruments. The second is that, for a wide variety of POCT devices to function appropriately, whole blood external quality assessment materials that allow evaluation based on a reference method value need to be identified [6].

Study limitations

The experimental set-up was established for the first cohort and added to for the second, smaller cohort which later included the YSI device as well as duplicate measurements. Consequently, YSI measurements and duplicate measurements are only available for the smaller cohort. The study was limited to these two cohorts, with a total of 231 participants, due to the demanding experimental set-up that included different sample and instrument types for P-glucose concentration measurements. Due to the study design and as a consequence of the limited number of participants, the range of glucose concentrations covered is also limited and may need further consideration in future studies. Additional POCT instruments need to be investigated, since only one was included in the present study. To elucidate bias of all methods, more comprehensive reference method values would be required. The study design does not include modern instant glucose stabilizers; therefore, the effect of glycolysis cannot be excluded completely. Although precautions to standardize and tightly control pre-analytical procedures were in place, such as centrifugation immediately after sample collection, separation and cooling of plasma samples, venous plasma samples were measured later than the StatStrip and YSI samples. The glycolysis inhibitor used, sodium fluoride, is known to be insufficient during the first 2 h after sampling.

Conclusion

Venous samples measured directly at the point-of-care represent the most suitable sample type for P-glucose concentration measurement in the 2-h OGTT. Our results demonstrate that the investigated P-glucose unit-use POCT method is suitable for the diagnosis of IGT and DM and comparable to core laboratory P-glucose methods when venous samples are used. We reached this conclusion without taking into account that sample handling and transport conditions in patient care are usually less ideal than in our study. Therefore, it is reasonable to assume that this study underestimates errors introduced by pre-analytical processes, emphasizing the recommendation to measure venous samples at the point-of-care. More importantly, changes in measurement procedures in core laboratories can introduce a bias in the P-glucose concentration measurement that could affect the diagnosis rates of IGT and DM, thereby emphasizing the need for tighter external quality assessment criteria.