Introduction

Estimating cardiac output at the bedside is a common preoccupation in critically ill patients. Many methods are available, some invasive and others not, some operator dependent and others not. The thermodilution cardiac output obtained through right heart catheterization has been the clinical standard for decades [1,2,3]. However, various metrological limitations are a source of inaccuracy [4,5,6]. The direct Fick method, or metabolic method, relies on the calculation of cardiac output as the ratio of oxygen uptake (V'O2) to the arteriovenous difference in oxygen content. It was originally used to validate the thermodilution method [7] and is often considered the 'physiological' gold standard. It cannot be taken as a clinical gold standard in intensive care practice because, although this has not been precisely assessed, there are possible causes of error specific to this setting, such as increased oxygen consumption in the lungs in the presence of acute respiratory distress syndrome or pneumonia [8]. In addition, measuring V'O2 is not easy when the inspired fraction of oxygen is high. Another means to estimate cardiac output at the bedside is the echocardiographic approach, particularly from the transesophageal route. By visualizing the heart directly, the echocardiographic approach alleviates several drawbacks of other methods, but it is strongly operator dependent and thus may not always be readily available.

Comparisons of thermodilution cardiac output and metabolic cardiac output have demonstrated statistically significant correlations [7,9,10,11,12,13,14,15], but this does not mean 'agreement' or 'clinical interchangeability'. More recently, a satisfactory agreement has been found between the two methods in stable children [16] and in stable patients with pulmonary hypertension [17]. Other studies, however, have suggested that discrepancies could appear in less stable situations such as exercise [18] or critical illness [19,20].

In the present study, we re-examined the concordance between thermodilution cardiac output and metabolic cardiac output, for several reasons. First, ventilatory management in the intensive care unit has evolved, with low tidal volume strategies now much more common than a few years ago. The corresponding permissive hypercapnia can have hemodynamic effects [21,22] and can interfere with the results of both the thermodilution and the metabolic methods. A second reason is that the controversy on the risk–benefit balance of right heart catheterization in critically ill patients [23,24] makes it important to gather knowledge about possible alternative methods. Finally, we wished to obtain data in a population of critically ill patients exhibiting indices of extreme severity, in whom cardiac output determination and manipulation are likely to be a more frequent issue than in other subsets of patients.

Materials and methods

Patients

Eighteen mechanically ventilated patients were studied (Table 1). Criteria for inclusion were: criterion 1, the presence of a flow-directed, balloon-tipped pulmonary artery catheter placed after the decision of the physician in charge of the patient; criterion 2, controlled mechanical ventilation without spontaneous respiratory activity; criterion 3, a stable level of inspired oxygen fraction and of positive expiratory pressure when present; and criterion 4, spontaneous or drug-induced clinical unresponsiveness. Criteria 2–4 were set to minimize the risk of variations in oxygen consumption due to nonhemodynamic factors. When the patients received vascular expansion or when a change in the infusion rate of catecholamines was decided, a 10 min period of stability (<10% changes in heart rate and arterial pressure) was required before the measurements were taken.

Table 1 Characteristics of the patients

The patients were recruited on a consecutive basis. The study was a byproduct of another study that relied on the same methods and fulfilled the French legal criteria for patient studies. With approval of the appropriate authority, informed consent was not sought because the study-related intervention was noninvasive and bore no risk of interference with the clinical management of the patients.

Measurements and calculations

Metabolic method

Oxygen consumption was determined from the measurements of carbon dioxide and oxygen concentrations in the inspired and expired gases, using a standard portable metabolic monitor (Deltatrac Metabolic Monitor™; Datex Instrumentation Corp., Helsinki, Finland) calibrated prior to each set of measurements with a 96% oxygen–4% carbon dioxide gas mixture. This monitor has been validated for accuracy, sensitivity and reproducibility over a wide range of conditions [18,25]. To retain a given measure for analysis, a 10 min 'metabolic' steady state was required (<5% change in the respiratory quotient [R], in V'O2, and in carbon dioxide production).

Blood gas analysis was performed on simultaneously drawn arterial and mixed venous samples (5 ml aliquots, with an AVL Omni 9™ analyzer; AVL Medical Instruments, Schaffhausen, Switzerland). The hemoglobin concentration and oxygen saturation were measured using the corresponding co-oximeter, as were the arterial and mixed venous oxygen contents (CaO2 and CvO2, respectively). Cardiac output determined using the metabolic (Fick) method (QT FICK) was calculated as the ratio of V'O2 to the CaO2 – CvO2 difference. Each of the QT FICK values used in the subsequent comparisons corresponded to 10 min measures of V'O2.
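The Fick calculation described above can be illustrated with a minimal Python sketch. The function names, the units (V'O2 in ml/min, oxygen contents in ml O2 per dl of blood) and the constants in the content formula (1.34 ml O2 per g of hemoglobin, 0.003 ml O2/dl per mmHg) are assumptions made for this example only; in the study the oxygen contents were provided by the co-oximeter.

```python
def oxygen_content(hb_g_dl, sat_fraction, po2_mmhg,
                   binding_capacity=1.34, solubility=0.003):
    """Oxygen content in ml O2 per dl of blood.

    Textbook approximation (an assumption here, not the study's method):
    hemoglobin-bound oxygen plus dissolved oxygen.
    """
    return binding_capacity * hb_g_dl * sat_fraction + solubility * po2_mmhg


def fick_cardiac_output(vo2_ml_min, cao2_ml_dl, cvo2_ml_dl):
    """Cardiac output in l/min from the direct Fick principle.

    vo2_ml_min : oxygen uptake (V'O2) in ml/min
    cao2_ml_dl : arterial oxygen content in ml O2/dl
    cvo2_ml_dl : mixed venous oxygen content in ml O2/dl
    """
    difference = cao2_ml_dl - cvo2_ml_dl
    if difference <= 0:
        raise ValueError("arteriovenous oxygen difference must be positive")
    # Multiply by 10 to convert the content difference from ml/dl to ml/l.
    return vo2_ml_min / (difference * 10.0)


# Illustrative (invented) values: V'O2 = 250 ml/min, CaO2 = 20, CvO2 = 15 ml/dl.
qt_fick = fick_cardiac_output(250.0, 20.0, 15.0)
```

With these invented values the arteriovenous difference is 5 ml/dl (50 ml/l), so the sketch returns 250 / 50 = 5 l/min. Because the difference appears in the denominator, a small absolute error in either oxygen content has a larger relative effect when the difference is narrow.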

Thermodilution

From a flow-directed, balloon-tipped pulmonary artery catheter positioned in a nondependent zone of the lung [26], the cardiac output determined using the thermodilution method (QT THERM) was measured by fast injections of a 10 ml bolus of 5% dextrose solution, at room temperature. All the measurements were performed by the same operator. Each injection was performed at end expiration. The thermal decay curve was visually inspected extemporaneously, and the data were rejected if the curves were obviously aberrant or showed waveform irregularities suggesting technical artifacts. Each of the QT THERM values used in the subsequent comparisons was derived from three successive measures normalized according to Poon [27].

Tricuspid regurgitation

Pulsed Doppler echocardiography (parasternal short-axis view) was used to qualitatively detect a regurgitant signal in the right atrium.

Data analysis

Forty-nine paired measurements of QT FICK and of QT THERM were performed either at baseline or after a therapeutic intervention, with a minimum of two sets of measurements in each patient. The statistical association between QT FICK and QT THERM was expressed in terms of the R coefficient of correlation with the 95% confidence interval. The agreement between the two techniques was studied using a graphical analysis according to Bland and Altman [28] and using the regression method described by Passing and Bablok [29]. This regression was first calculated using the whole data set. Data points lying far off the regression line were then tested for outlier status (a data point was considered an outlier if its value was above the mean + 3 SD of the data set not including this data point). Outliers so defined were removed from the data set and the regression was recomputed. The analysis was conducted in the whole study population (18 patients, 49 pairs of measurements), over restricted ranges of cardiac output, and after exclusion of the patients with tricuspid regurgitation (14 patients remaining, 41 pairs of measurements).
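The Bland and Altman analysis and the leave-one-out outlier rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the limits of agreement are taken as the mean difference ± 1.96 SD (the multiplier is not given in the text), and the quantity tested for outlier status is left to the caller (for example, the distance of each point from the regression line).

```python
from statistics import mean, stdev


def bland_altman(fick, therm, multiplier=1.96):
    """Return (bias, lower limit, upper limit) for the differences fick - therm.

    The 1.96 multiplier is an assumption; the source only refers to the
    Bland and Altman graphical method.
    """
    if len(fick) != len(therm) or len(fick) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    differences = [f - t for f, t in zip(fick, therm)]
    bias = mean(differences)
    spread = stdev(differences)  # sample standard deviation
    return bias, bias - multiplier * spread, bias + multiplier * spread


def is_outlier(values, index, n_sd=3.0):
    """Leave-one-out rule: the value at `index` is an outlier if it lies above
    the mean + n_sd * SD of all the other values."""
    values = list(values)
    others = values[:index] + values[index + 1:]
    return values[index] > mean(others) + n_sd * stdev(others)


# Invented example data in l/min, for illustration only.
example_fick = [4.0, 5.0, 6.0]
example_therm = [5.0, 5.0, 8.0]
bias, lower, upper = bland_altman(example_fick, example_therm)
```

With the invented data the differences are -1, 0 and -2 l/min, so the bias is -1 l/min and the limits of agreement are -1 ± 1.96 l/min. A negative bias with this sign convention means that the thermodilution values are, on average, higher than the Fick values.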

The data are expressed as the mean ± SD.

Results

Whole population

The values for QT FICK ranged from 2.2 to 11 l/min (mean ± SD = 5.2 ± 2.0 l/min), whereas the values for QT THERM ranged from 2.8 to 11.2 l/min (mean ± SD = 5.8 ± 1.9 l/min) (R = 0.84, 95% confidence interval = 0.73–0.91, P < 0.0001). After the removal of one data point meeting the outlier definition (see Materials and methods), the mean difference between QT FICK and QT THERM was -0.8 l/min, with a lower limit of agreement (magnitude of underestimation of QT FICK by QT THERM) at -2.3 l/min and an upper limit (magnitude of overestimation of QT FICK by QT THERM) at 0.8 l/min (Fig. 1a). The results of the Passing and Bablok regression of QT FICK against QT THERM are shown in Figure 1b. The 95% confidence interval of the intercept did not include 0 (-0.70 to -0.06) and the upper limit of the 95% confidence interval of the slope was equal to 1 (0.87–1.00), indicating the existence of a systematic difference between the two techniques [29].
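The way such confidence intervals are obtained and read can be illustrated with a compact Python sketch of Passing and Bablok regression. It follows the procedure as it is commonly described (median of pairwise slopes with an offset, rank-based confidence bounds) and is not the software used in the study; the function names and the z value of 1.96 are assumptions made for the example, and tied x values are handled only approximately through infinite slopes.

```python
import math
from statistics import median


def passing_bablok(x, y, z=1.96):
    """Return (slope, slope_ci, intercept, intercept_ci) for y regressed on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long series with at least 3 pairs")

    # All pairwise slopes; identical points and slopes of exactly -1 are skipped.
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue
            s = math.copysign(math.inf, dy) if dx == 0 else dy / dx
            if s != -1:
                slopes.append(s)
    slopes.sort()

    total = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset for the shifted median

    # Rank-based confidence bounds (m1 and m2 are 1-based ranks).
    c = z * math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    m1 = round((total - c) / 2)
    m2 = total - m1 + 1
    if m1 < 1 or m2 + k > total:
        raise ValueError("too few usable pairs for a confidence interval")

    if total % 2 == 1:
        slope = slopes[(total - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[total // 2 - 1 + k] + slopes[total // 2 + k])
    slope_ci = (slopes[m1 + k - 1], slopes[m2 + k - 1])

    def intercept_for(b):
        return median(yi - b * xi for xi, yi in zip(x, y))

    intercept = intercept_for(slope)
    # The upper slope bound gives the lower intercept bound, and vice versa.
    intercept_ci = (intercept_for(slope_ci[1]), intercept_for(slope_ci[0]))
    return slope, slope_ci, intercept, intercept_ci


def systematic_difference(intercept_ci, slope_ci):
    """True if the intercept CI excludes 0 or the slope CI excludes 1."""
    constant = not (intercept_ci[0] <= 0 <= intercept_ci[1])
    proportional = not (slope_ci[0] <= 1 <= slope_ci[1])
    return constant or proportional
```

Applied to the intervals reported here, `systematic_difference((-0.70, -0.06), (0.87, 1.00))` is true because the intercept interval excludes 0, even though the slope interval reaches 1.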

Figure 1

Comparison of cardiac output determined using the thermodilution method (QT THERM) and cardiac output determined using the metabolic (Fick) method (QT FICK) according to (a) the Bland and Altman graphic method [28], and (b) the Passing and Bablok regression method [29]. Determined using the whole set of data after removal of one data point identified as an outlier (48 pairs obtained in the 18 patients), irrespective of the cardiac output value and of the presence of a tricuspid regurgitation. CI, confidence interval; SD, standard deviation.

For QT THERM values ≤ 5 l/min (n = 17, range = 2.8–5 l/min, mean ± SD = 3.8 ± 0.7), the correlation between the two methods was strong (R = 0.93, 95% confidence interval = 0.81–0.97, P < 0.0001). The mean difference between QT FICK and QT THERM was -0.6 l/min, with a lower limit of agreement at -1.2 l/min and an upper limit at -0.1 l/min. The 95% confidence interval of the QT THERM versus QT FICK regression intercept included 0 (-0.89 to 0.32) and the 95% confidence interval of the slope included 1 (0.77–1.07), indicating the absence of a systematic difference between the two techniques over that range of values (Fig. 2a) [29]. The QT FICK values never exceeded the QT THERM values.

Figure 2

Passing and Bablok regression of cardiac output determined using the metabolic (Fick) method (QT FICK) against cardiac output determined using the thermodilution method (QT THERM) [29] restricted to (a) QT THERM values <5 l/min and (b) QT THERM values >5 l/min (after removal of one outlier). CI, confidence interval.

For QT THERM values >5 l/min (n = 34, range = 5.1–11.2 l/min, mean ± SD = 6.8 ± 1.3), the correlation between the two methods was weaker (R = 0.61, 95% confidence interval = 0.34–0.79, P < 0.0001) (Fig. 2b). After removal of the outlier, the mean difference between QT FICK and QT THERM was -0.9 l/min, with a lower limit of agreement at -2.7 l/min and an upper limit at 1.0 l/min.

Population restricted to patients without tricuspid regurgitation (n = 14)

The values for QT FICK ranged from 2.2 to 11 l/min (mean ± SD = 5.0 ± 1.9 l/min), whereas the values for QT THERM ranged from 2.8 to 11.2 l/min (mean ± SD = 5.8 ± 1.8 l/min) (R = 0.83, 95% confidence interval = 0.71–0.91, P < 0.0001). The mean difference between QT FICK and QT THERM was -0.8 l/min, with a lower limit of agreement (magnitude of underestimation of QT FICK by QT THERM) at -2.2 l/min and an upper limit (magnitude of overestimation of QT FICK by QT THERM) at 0.7 l/min (Fig. 3a). The Passing and Bablok regression of QT FICK against QT THERM (Fig. 3b) indicated a systematic difference between the techniques (confidence interval of the intercept = -0.70 to -0.21; confidence interval of the slope = 0.9–1.0). For QT THERM values <5 l/min, the mean difference between QT FICK and QT THERM was -0.6 l/min (range, -1.2 to -0.02 l/min). For QT THERM values >5 l/min, the mean difference between QT FICK and QT THERM was -0.8 l/min (-2.5 to 0.9 l/min).

Figure 3

Comparison of cardiac output determined using the metabolic (Fick) method (QT FICK) and cardiac output determined using the thermodilution method (QT THERM) according to (a) the Bland and Altman graphic method [28], and (b) the Passing and Bablok regression method [29]. Restricted to the patients in whom cardiac echography ruled out tricuspid regurgitation (14 patients, 40 pairs of measurements, after removal of one outlier). CI, confidence interval; SD, standard deviation.

Discussion

The present study, conducted in a pragmatic manner to stay close to clinical practice, shows that the bolus thermodilution method and the metabolic method can provide clinically interchangeable measures of low cardiac output values in mechanically ventilated, critically ill patients. Conversely, there are marked discrepancies between the two approaches for high cardiac output values.

Divergences between methods to estimate cardiac output in critically ill patients have been reported. Sherman et al. [19] found in 10 septic patients (average Acute Physiology and Chronic Health Evaluation [APACHE] II score = 18), as opposed to 10 nonseptic patients (average APACHE II score = 12), that the thermodilution cardiac output could overestimate the metabolic cardiac output by more than 6 l/min, or underestimate it by more than 3 l/min. In the study of Sherman et al., 17 out of 20 of the cardiac output values were >5 l/min.

Axler et al. [20] compared 45 pairs of measurements obtained in 13 patients of moderate severity (10 discharged alive from the intensive care unit, 3 deceased). In this series, transesophageal echocardiography, bolus thermodilution and the Fick method provided substantially different results. Although the thermodilution cardiac output values and the metabolic cardiac output values were not statistically different, their limits of agreement ranged from -2.7 to 4.8 l/min. From this, the authors insisted on the notion that clinical decision making could not rely on a cardiac output measurement alone, whatever the technique used to obtain it. In this series, only six metabolic cardiac output data points were <5 l/min.

The present study differs from the previous two studies by the extreme severity of the clinical status of the patients, as illustrated by high Simplified Acute Physiology Score II values and a poor outcome (Table 1). Such clinical contexts are generally associated with complex hemodynamic situations, which may serve as a justification for the decision to perform right heart catheterization. Preliminary data obtained in a cohort of about 600 such patients [30] suggest that this procedure is not associated with an increased mortality, as opposed to what has been suspected in less severe patients [23,24]. Dhingra et al. [31] recently published a study similar to the present one regarding motives, design and methods. In 18 mechanically ventilated, critically ill patients with high APACHE II scores, these investigators showed that the thermodilution method and the metabolic method had limits of agreement ranging from -3.30 to 2.96 l/min. For cardiac output values >7 l/min, these limits were -5.67 to 1.87 l/min.

As compared with the data of Sherman et al. [19] and those of Axler et al. [20], the extreme severity of the patients' condition probably explains the relatively large proportion of low cardiac output values in the present data (Fig. 1) and in the data of Dhingra et al. [31]. Although splitting the data set in two parts carries the risks inherent to all post hoc analyses, it can clearly be seen from Figures 1 and 2 that the discrepancies between QT THERM and QT FICK become major only for high cardiac outputs. The agreement between QT THERM and QT FICK at cardiac output values <5 l/min was almost as good as that reported by Capderou et al. in normal individuals [16] (range -0.8 to -0.3 l/min), and QT THERM never underestimated QT FICK. In the study by Dhingra et al. [31], looking at the data suggests that the thermodilution method and the metabolic method were probably interchangeable up to 6 l/min. From a set of 105 measurements, among which 90 provided values <5 l/min, Hoeper et al. [17] reported limits of agreement between -1 and 1.2 l/min.

It appears that, in severely ill patients and in stable patients, a thermodilution cardiac output value <5 l/min probably reflects 'adequately' what this value would have been with the metabolic method, and vice versa. It must be noted that the meaning of 'adequately' here is arbitrary. The Bland and Altman graphical approach to compare two methods of measurement of a given biological value does not determine whether the agreement found between these two methods is 'good'. This depends on the error magnitude that is, arbitrarily, considered clinically acceptable. It seems to us that the degree of agreement reported by ourselves and others is sufficient to render reasonable a decision-making process relying on a low cardiac output value, whatever the method used to obtain it. This is clinically relevant because, as emphasized by Dhingra et al. [31], "cardiac output manipulation is likely to have the greatest impact on outcome when cardiac output is low". It must be borne in mind, however, that the thermodilution method is notoriously unreliable when the cardiac output is very low. van Grondelle et al. [15] reported overestimates of cardiac output with the thermodilution method, reaching 35% of the measured value when the cardiac output was <2.5 l/min. Of note, we did not observe such low values in the present patients (Fig. 2).

The situation is different regarding the higher values of the cardiac output range that we observed. The acceptable agreement found at low values is clearly lost (Fig. 2). This is in line with the data of Sherman et al. [19], of Axler et al. [20] and of Dhingra et al. [31]. This is also in line with the results reported for cardiac output values >5 l/min by Koobi et al. [32] in stable adults in the context of a coronary artery bypass, and in line with the observations of Hsia et al. [33] in dogs and of Espersen et al. [18] in healthy humans, who described a dramatic decrease in agreement between the thermodilution method and the metabolic method when going from rest to exercise. The discrepancies between the thermodilution method and the metabolic method may be due to metrological limitations affecting both techniques, particularly in the intensive care setting. Of note, the presence of tricuspid regurgitation did not seem to have a major impact on the present results (Fig. 3), but it was relatively rare in our series.

We wish to emphasize that finding a low level of agreement between the thermodilution method and the metabolic method when the cardiac output is high does not necessarily mean that either of the two methods is closer to reality than the other. Indeed, many sources of error have been identified regarding the thermodilution method, and many publications have warned clinicians against them [6,15,34,35]. The metabolic method is also far from being free of criticism. In spite of the availability of easy-to-use metabolic carts, it remains difficult to use at the bedside. There is a risk of accumulating measurement errors (respiratory gas sampling and blood gas analysis). The reliability of the measurement of oxygen consumption can be decreased by metabolic instability, patient–ventilator dyssynchrony, high inspired oxygen fraction, circuit leaks, and so on. In addition, the metabolic method provides an accurate estimate of cardiac output only if the pulmonary artery flow, the mixed venous oxygen content, and the arterial oxygen content are reasonably constant [36], a condition that may not be fulfilled in hemodynamically compromised, mechanically ventilated patients. It is therefore not possible from the available data to designate a gold standard.

In summary, the present data concur with those of Dhingra et al. [31] to suggest that, in daily practice, a low thermodilution or metabolic cardiac output can reasonably be relied on to build a clinical decision, which is novel information. Conversely, both the present study and that of Dhingra et al. [31] confirm that, in critically ill patients, as in other types of patients, the methodological approach chosen to evaluate the cardiac output has an important influence on the result when cardiac output is high. High cardiac output values should thus be treated and used cautiously.

Key messages

  • This study confirms that the method chosen to evaluate cardiac output in critically ill patients can influence the results, and that this metrological dimension must be taken into account when interpreting clinical data

  • The good level of agreement between thermodilution measurement and metabolic measurement at low cardiac output suggests that such a value can be relied on to build a clinical decision, whatever the method used to determine it. This is novel information

  • Conversely, the divergence between methods for high cardiac output values prompts caution in the presence of such results