Introduction

Measurements of cardiac output (CO) are frequently performed in surgical and critically ill patients as part of optimization strategies [1]. Since the 1970s the use of the pulmonary artery (Swan–Ganz) catheter together with the intermittent thermodilution technique based on the Stewart–Hamilton equation has been considered the clinical standard in measurements of CO. It has been extensively investigated in many different clinical settings and has demonstrated a precision of less than 10 % when used in scientific investigations [2–6]. On the other hand, the technique has its limitations, including variability in serial measurements of CO and rare, but potentially serious, complications [7, 8].

Two-dimensional echocardiography is a non-invasive [transthoracic (TTE)] or semi-invasive [transesophageal (TEE)] method for the assessment of cardiac anatomy and function [9–12]. When the Doppler technique or the volumetric method (difference between end-diastolic and end-systolic volume) is added, it is possible to measure stroke volume (SV) and CO. The anatomic site at which velocity is measured is likely to affect the CO measurement, and the guidelines from the American Society of Echocardiography recommend that the left ventricular outflow tract (LVOT) be used for measurements of CO [13].

It is complex to validate a new technique for the measurement of CO. The new method has to be compared with an accepted clinical standard technique and the results analyzed using the Bland–Altman method, in which the agreement between the new and the established technique is evaluated [14, 15]. As no true gold standard exists for CO measurement, the Bland–Altman method uses the mean of the new and the reference technique as the best available estimate of the true value. The mean of all the differences between the techniques (bias) is plotted together with the limits of agreement (LOA), defined as ±1.96 standard deviations (SD) around the bias [14]. However, the acceptable LOA varies considerably, depending on the precision of the methods tested. Therefore, in any study attempting to validate a new method, it is imperative to determine the precision of the methods tested. Without knowledge of the precision of each method, it is impossible to determine the acceptable LOA and draw a sound and unbiased conclusion about the interchangeability of the two methods.
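The bias and LOA described above can be sketched in a few lines of Python. This is a minimal illustration, assuming one paired measurement per patient (no repeated measures); the function name and the example values are hypothetical and are not taken from any included study.

```python
import statistics

def bland_altman(reference, test):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    # Difference between the new and the reference technique for each pair
    diffs = [t - r for r, t in zip(reference, test)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired CO values in l/min (illustrative only)
thermodilution = [4.0, 5.0, 6.0, 5.0]
echo = [4.5, 5.0, 6.5, 5.0]
bias, lower, upper = bland_altman(thermodilution, echo)
print(f"bias {bias:.2f} l/min, LOA {lower:.2f} to {upper:.2f} l/min")
```

Studies with several measurements per patient would need a repeated-measures variant of the analysis, which this sketch does not cover.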

Erroneous CO measurements could potentially lead to wrong clinical decisions including inappropriate use of fluid or inotropic drugs. This may potentially harm patients. As echocardiography is widely used in the clinical setting and promoted by experts [16], it is imperative to ensure that its validation is based on high-quality studies using the Bland–Altman analysis when assessing the agreement. The aim of this systematic review was to examine the evidence of echocardiography for measurements of CO. The thermodilution technique was chosen as clinical standard, and data regarding the reproducibility (precision) of each method and agreement and trending ability between them were obtained.

Methods

We used the methodology of the PRISMA guidelines for the conduct of systematic reviews [17]. The objective was a systematic review of the literature investigating echocardiographic measurements of CO compared with the thermodilution technique with use of the Bland–Altman method. Potentially eligible studies were those assessing echocardiography-derived measurements of CO with the thermodilution technique as the reference method in adult hospitalized patients. We performed an electronic literature search in PubMed, Embase, and the Cochrane Library. The following search strategy was used to identify relevant studies:

  1. echocard* OR echo OR cardiac ultrasound OR TTE OR TEE OR transtho* OR transeso*.

  2. CO OR cardiac output OR SV OR stroke volume OR hemodynamic OR monitoring.

  3. #1 AND #2.

Study selection

The inclusion criteria were (1) hospitalized patients, (2) prospective clinical studies, systematic reviews, or meta-analyses, (3) adults (age >17 years), (4) the use of echocardiography-derived data to assess SV or CO, (5) the use of the thermodilution technique as the reference technique, and (6) the use of the Bland–Altman method to compare them. Two authors (M.W. and R.R.J.) independently screened the articles for inclusion. All potentially relevant studies were reviewed in full text. If a study did not fulfill all inclusion criteria, it was excluded. The search was performed in March 2014 and updated in August 2015.

We did not have predefined limitations regarding year of publication, language, populations of hospitalized patients, or predefined outcomes. Studies published only in abstract form were excluded if data allowing sufficient assessment could not be obtained. Any disagreements with respect to inclusion/exclusion of studies were resolved with the co-authors (H.M.S., A.P.).

Data extraction

Two authors (M.W. and R.R.J.) extracted relevant data from included articles. The extracted data included the year of publication, patient population, number of patients, use of mechanical ventilation, ventilator mode, heart rhythm, type of monitor devices, echocardiographic view(s), the presence of valvular heart diseases, and echocardiographic- and thermodilution-derived data.

Because the different ways of measuring CO by echocardiography may not be interchangeable, we registered the method used (imaging performed from two-dimensional echocardiography combined with Doppler or volumetric technique). In the Doppler technique, SV is estimated by multiplying the velocity time integral (VTi) of the blood flow by the cross-sectional area of an anatomic site (e.g., valve), and CO is obtained by multiplying SV by heart rate. In the volumetric method, SV is assessed by calculating the difference between end-diastolic and end-systolic volume [13, 18]. If more than one site was used, we obtained data from all the sites that were compared with the thermodilution technique and analyzed using the Bland–Altman method.
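As a rough sketch of the two calculations, and following the relation given in the legend of Fig. 3 (stroke volume as cross-sectional area times velocity time integral, CO as stroke volume times heart rate), the arithmetic could look as follows. The function names, the units, and the circular cross-section are assumptions made for illustration; the example values are hypothetical.

```python
import math

def doppler_cardiac_output(diameter_cm, vti_cm, heart_rate_bpm):
    """CO in l/min from a measured diameter, VTI, and heart rate."""
    # Assumes a circular cross-section at the measurement site
    csa_cm2 = math.pi * (diameter_cm / 2) ** 2
    stroke_volume_ml = csa_cm2 * vti_cm  # cm^2 * cm = cm^3 = ml
    return stroke_volume_ml * heart_rate_bpm / 1000.0

def volumetric_stroke_volume(edv_ml, esv_ml):
    """SV in ml as end-diastolic minus end-systolic volume."""
    return edv_ml - esv_ml

# Hypothetical values: 2.0 cm LVOT diameter, VTI 20 cm, 70 beats/min
co = doppler_cardiac_output(2.0, 20.0, 70)
sv = volumetric_stroke_volume(120.0, 50.0)
print(f"Doppler CO {co:.2f} l/min, volumetric SV {sv:.0f} ml")
```

Because the diameter enters the area as a square, a small error in the measured diameter has a disproportionate effect on the calculated CO.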

We extracted available data from the Bland–Altman analyses, including the precision of each technique, bias, LOA, percentage error (PE), and trending ability between the two techniques. The PE was defined as 1.96 SD of the differences divided by the mean CO of the two methods. If PE was not reported, we calculated it a posteriori when the necessary data were available.
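The PE definition above amounts to a one-line calculation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the second helper assumes that a study reports its LOA as a symmetric half-width of 1.96 SD around the bias, and the example values are invented.

```python
def percentage_error(sd_of_differences, mean_co_reference, mean_co_test):
    """PE in percent: 1.96 SD divided by the mean CO of the two methods."""
    mean_co = (mean_co_reference + mean_co_test) / 2
    return 100 * 1.96 * sd_of_differences / mean_co

def percentage_error_from_loa(loa_half_width, mean_co):
    """PE in percent when only the LOA half-width (1.96 SD) is reported."""
    return 100 * loa_half_width / mean_co

# Hypothetical example: SD of differences 0.75 l/min, mean CO 5.0 and 4.8 l/min
pe = percentage_error(0.75, 5.0, 4.8)
print(f"PE {pe:.1f} %")
```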

Results

The search resulted in 13,834 unique citations; 13,720 citations were excluded on the basis of the title. From the remaining 114 citations, 26 citations were excluded on the basis of the abstract and 88 articles were evaluated in full text. Full-text studies were excluded if the Bland–Altman method was not used, thermodilution was not used, animals were studied, or the research question was irrelevant (not comparative CO measurements studies, studies evaluating fluid responsiveness, no use of echocardiography, and echocardiographic assessment of cardiac anatomy only) (Fig. 1). Twenty-four studies were included.

Fig. 1

Screening, inclusion, and exclusion of studies. TEE transesophageal echocardiography, TTE transthoracic echocardiography

Characteristics of included studies

Of the 24 included studies, CO measurements were performed with TEE in 19 studies [5, 19–36] and with TTE in five studies [37–41]. The median number of included patients was 32 (range 8–65). The majority of studies were performed in cardiac and intensive care unit (ICU) populations. Fifteen of the TEE studies were performed during elective cardiac surgery under hemodynamically stable conditions and in patients undergoing mechanical ventilation [5, 19–26, 29, 31–33, 35, 36], while eight studies were performed in ICU populations including septic shock, acute myocardial infarction, pulmonary hypertension, subarachnoid hemorrhage, and after liver transplantation [27, 28, 30, 37–41]. In none of the included studies were details given regarding ventilation mode or settings.

The TEE studies most frequently assessed the LVOT [5, 26, 27, 32–35], the aortic valve [23, 31, 33, 34, 36], and the mitral valve [19, 21, 22, 30, 34]. In the TTE studies, the LVOT [38, 40, 41] and the aortic valve [37, 39] were the most often assessed.

The number of heartbeats used to calculate CO varied between 3 and 5 in the majority of studies [5, 19–22, 27, 29, 30, 32–38, 40]. The CO measurements with thermodilution were the mean of three measurements in most of the studies [20, 22–29, 31–37, 39, 41], while 4–5 measurements were used in four studies [5, 21, 30, 38].

Measurements of CO were performed in patients with sinus rhythm in most of the studies [5, 19–23, 25–27, 29–36, 39, 40]; in five studies it was not described whether patients with arrhythmia were included [24, 28, 37, 38, 41]. The majority of studies excluded patients with significant valvular diseases; however, mitral regurgitation [19, 22, 26, 38, 40] and tricuspid regurgitation [21, 26, 29, 34, 40] were observed in some studies.

Assessments of changes in CO were mainly performed at fixed time periods intra- and postoperatively [5, 19, 20, 22–24, 29, 31–37]. Other studies induced changes in CO by fluid loading or vasoactive drugs [5, 21, 28, 30, 36, 37]. In one study preload was corrected prior to CO measurements; patients received fluid pre- and postoperatively to increase cardiac filling pressure (defined by pulmonary artery diastolic pressure and right atrial pressure). Subsequently, the groups were randomized to receive 5 % albumin to maintain (normovolemic) or increase cardiac filling pressure (hypervolemic) before follow-up CO measurements [37].

In the majority of studies the investigations were blinded and off-line analyses were performed without randomization of the sequence of measurement [5, 19, 20, 23–27, 29, 30, 32–37, 39]; in three studies a random allocation process was used [19, 34, 36].

Echocardiographic assessment of cardiac output

Overall, the majority of studies showed small bias, wide LOA, and high PE between CO measured by echocardiography and thermodilution (Table 1; Figs. 2, 3). Percentage error was specified in four studies, and we could calculate PE a posteriori in an additional 11 studies. The PEs (specified and calculated a posteriori) ranged from approximately 16 to 69 % for LVOT [5, 26, 27, 31–35, 41], approximately 16 to 43 % for aortic valve [23, 33, 34, 36, 39], approximately 41 to 50 % for mitral valve [21, 30, 34], and from approximately 11 to 61 % for the right ventricular outflow tract (RVOT), pulmonary artery (PA), or pulmonary valve (PV) [29, 33, 34].

Table 1 Characteristics of the included studies

Fig. 2

Forest plot including bias, limits of agreement, and percentage error (PE) of data from studies comparing cardiac output as assessed by echocardiography vs. thermodilution. The data were derived from the Bland–Altman analysis of the included studies. Only studies reporting both bias and limits of agreement are represented in the forest plot. Percentage errors are given if these were reported or could be calculated from the reported data. If not, not assessable (NA) is given

Fig. 3

Assessment of cardiac output by echocardiography using Doppler. Stroke volume is estimated by measuring the diameter of a cardiac structure to obtain the cross-sectional area (CSA) and multiplying it by the velocity time integral (VTI) obtained by Doppler. The cardiac output is calculated by multiplying stroke volume by heart rate. aA ascending aorta, CO cardiac output, CSA cross-sectional area, d diameter, HR heart rate, LA left atrium, LV left ventricle, RV right ventricle, SV stroke volume, VTI velocity time integral

In only two studies [5, 28] was the precision of both echocardiographic CO and thermodilution CO assessed. Both studies used TEE; volumetric assessment was used in one [28] and Doppler at the LVOT in the other [5]. The precision of thermodilution was 8 and 4.8 % in the two studies [5, 28]; that of TEE was 8.6 % (LOA ± 4.3 l/min) [28] and 16 % (±1.51 l/min) [5]. In these two studies the PE between CO measured by echocardiography and thermodilution was 53 [28] and 39 % [5].

Left ventricular outflow tract

In 11 studies the LVOT was used to measure CO by Doppler (Table 1); most of these had bias less than 0.5 l/min and LOA less than ±1.5 l/min, but PE was over 20 % in the majority of studies [26, 32–35].

The seven TEE studies measuring CO at the LVOT (Table 1) showed a bias ranging from −0.42 to 3.39 l/min and LOA varying from ±0.86 to ±1.51 l/min [5, 26, 27, 32–35]. Percentage error ranged from 16 to 48 %.

The four TTE studies measuring CO at the LVOT (Table 1) showed bias ranging from −0.75 to 0.4 l/min and LOA varying from ±0.83 to ±2.87 l/min [37, 38, 40, 41]. The only available PE from TTE-derived LVOT was 69 % [41].

Aortic valve

In six studies the aortic valve was used to measure CO (Table 1); most of the studies had bias below 0.5 l/min and LOA around ±1 l/min. The PE ranged from 16 to 43 % for both the TEE and TTE studies [23, 33, 34, 36, 39].

The five TEE studies measuring CO at the aortic valve showed bias ranging from −0.21 to 0.45 l/min [23, 31, 33, 34, 36] and LOA varying from ±0.83 to ±1.93 l/min. The PE of the TEE studies ranged from 19 to 43 % [23, 31, 33, 34, 36].

The one TTE study measuring CO at the aortic valve showed bias of −0.6 l/min and LOA ±1.2 l/min with PE of 16 % [39].

Mitral valve

In five studies the mitral valve was used to measure CO. All were done using TEE and bias varied from −0.86 to 4.29 l/min with LOA over 2 l/min in the majority of studies. The PE ranged from 41 to 50 % [19, 21, 22, 30, 34].

Right-sided heart structures and vessels

In six studies CO was measured using echocardiographic sites of the right-sided chambers and vessels of the heart, including the RVOT, the pulmonary artery, or the pulmonary valve (Table 1). In all studies TEE was used and bias ranged from −0.01 to 1.96 l/min, LOA from ±0.45 to ±2 l/min [19, 20, 25, 29, 33, 34], and PE from 11 to 61 % [29, 33, 34].

Volumetric assessment of cardiac output

In three TEE studies [24, 25, 28] CO was measured with the volumetric method. One study assessed the precision of both TEE (9 %) and thermodilution (5 %); the bias between TEE and thermodilution was −0.5 l/min, LOA over ±4 l/min, and PE 53 % [28]. The studies evaluating the left ventricular areas found bias ranging from −0.2 to 0.47 l/min and LOA between ±2.6 and ±4.34 l/min [24, 25]. Percentage errors were not assessable for these two studies.

Trending ability

In 11 studies trending ability was investigated; in 10 studies TEE was used [5, 19, 20, 23, 24, 28–31, 35] and in one TTE was used [40]. In the majority of studies correlation analysis or receiver operating characteristic curves were used to assess changes in CO between echocardiography and the thermodilution technique. In five studies controlled therapeutic interventions were performed to induce changes in CO to investigate trending ability [5, 21, 28, 30, 36].

In only three studies trending ability was evaluated by four-quadrant plot including concordance analysis or polar plot [5, 23, 31], but in two of these no therapeutic intervention was performed to assess trending ability [23, 31].

Both the precision of the two techniques and trending ability were evaluated in only one study [5]. This study showed concordance between echocardiography (TEE using Doppler of LVOT) and thermodilution regarding the ability to detect serial changes in CO. However, the magnitude of changes in CO varied considerably. Furthermore, the polar plot showed a radial degree of 53.5°, corresponding to poor trending ability despite an acceptable precision of each of the techniques [5]. The study by Perrino et al. [31] showed comparable findings, but the precision of both methods was not assessed. In this study, the Doppler technique at the aortic valve detected directional changes in 97 % of the cases, but tended to underestimate the magnitude of changes in CO compared with thermodilution [31].

Overall, in the three TEE studies where four-quadrant or polar plot was used, agreement was observed between echocardiography and thermodilution in tracking directional changes in CO [5, 23, 31].

None of the TTE studies assessed the trending ability with use of four-quadrant plot or polar plot. In one TTE study changes in CO were evaluated but only with the use of correlation analysis [40].

Discussion

In this systematic review we included 24 studies comparing echocardiography with thermodilution as the clinical standard technique in measuring CO, but only two studies had a design that allowed a fully unbiased comparison of the two techniques. The PE was 39 and 53 % between CO measured by echocardiography and thermodilution in these two studies [5, 28]. In the majority of the remaining studies, PE was above 20 % [21, 26, 30–36, 40, 41]. Only one study had a design that allowed a fully unbiased comparison of trending ability between echocardiography and thermodilution, and this showed agreement between the two techniques in the directional changes in CO [5]. This observation was supported by those of two other studies with analyses allowing reasonable comparison regarding directional changes in CO [23, 31].

The LVOT and the aortic valve are generally considered to be the most accurate echocardiographic sites for the measurement of SV and CO, because these are easily visualized and have low flow turbulence under normal physiological conditions [13]. This notion was supported by our results. The LVOT and the aortic valve appeared to be the echocardiographic sites with the lowest bias; however, LOA and PE were still high. The accuracy may be affected by aortic regurgitation and obstruction of the LVOT, which may increase flow turbulence and change blood flow and thereby affect the accuracy of VTi measurements. However, these conditions were not reported in the populations included in our review.

Imprecision in echocardiographic CO measurements may be induced by patient, technical, or operator factors [42, 43]. Arrhythmias, low sedation levels, and changes in ventilator pressures and tidal volumes may affect the measurement of CO [44]. Furthermore, the assumption of laminar flow, constant cross-sectional area, and angle dependency of the probe may contribute to erroneous measurements. It has been shown by Hansen et al. [45, 46] that flow estimates of complex in vivo blood flow, such as cardiac flow, are inaccurate when obtained with two-dimensional ultrasound methods. The main source of bias is the assumption of axisymmetrical and parabolic flow profiles, which is not met in in vivo vessel geometry. Thus, the lacking third dimension of in vivo volume flow estimation is likely the most important confounder for a precise estimation. Whether volume flow is estimated from Vmax or Vmean obtained with TEE or TTE, assumptions of parabolic and axisymmetrical flow patterns are necessary, and these assumptions create major biases. Measurements of CO with Doppler ultrasound therefore carry a high risk of inaccuracy, as the assumptions applied in the calculation do not hold in complex cardiac flow. However, this also gives a plausible explanation of the apparent ability of Doppler ultrasound to track directional changes in CO, but not the magnitude, as changes in VTi may be maintained across the complex flow.

The variability in CO measured by echocardiography may be minimized by the use of mean values of repeated measurements [13]. In a study by Dubrey and Falk [47] it was necessary to measure around 13 heartbeats in patients with atrial fibrillation to obtain mean values of CO that varied less than 2 % compared with 3–5 heartbeats in patients with sinus rhythm. The studies included in our review measured CO in hemodynamically stable populations with sinus rhythm measured in 3–5 heartbeats.

Like any other CO measurement device, the thermodilution methods have variations in accuracy and precision. Errors in measurement of CO may be induced by the pathophysiology of the patient, biological variation, and technical problems [48–51]. Thermodilution is sensitive to tricuspid regurgitation and intracardiac shunts, which may result in falsely low CO measurements because of the recycling of indicator fluid across the tricuspid valve [52, 53]. The frequency of tricuspid regurgitation may be higher in mechanically ventilated than in spontaneously breathing patients. A study by Balik et al. [54] showed that a high degree of tricuspid regurgitation resulted in lowered accuracy of thermodilution and underestimated CO in mechanically ventilated patients. This could potentially have affected our results; however, one might assume that in studies comparing thermodilution with echocardiography the presence of tricuspid regurgitation or significant intracardiac shunts would be described. We only identified a few studies including mechanically ventilated patients with mild to moderate tricuspid regurgitation or intracardiac shunt [21, 26, 29, 34, 40]. Unfortunately, ventilation settings in these studies were not described further. Thermodilution is also sensitive to changes in injectate volume, fluid status, body temperature, and volume of injected fluid bolus [48, 55], which all may be sources of measurement error and variability. The inherent error was reduced in the studies we included by averaging serial measurements of CO for both thermodilution and echocardiography. Pulmonary artery catheter thermodilution can be performed as either intermittent or continuous measurement of CO.

Intermittent bolus thermodilution has been thoroughly investigated in many clinical settings and has been performed with a precision below 10 %, at least in scientific investigations [2, 5, 6, 56], and continuous thermodilution has shown equally good reproducibility [57]. The difference between the two methods is the sampling time: intermittent bolus thermodilution is most often performed as the mean of three consecutive measurements, and continuous thermodilution is an average value for CO measured over the previous 5–15 min [58, 59]. This makes the continuous method less sensitive to variations in stroke volume caused by tricuspid regurgitation but also less time-responsive when sudden changes in CO occur. In all the studies included in this review intermittent bolus thermodilution was used.

Knowing that the precision of the methods varies makes it even more important to determine the precision of the two techniques before comparing them [60]. The majority of studies included in our review did not assess the precision of the two techniques compared (echocardiography and thermodilution), so it is not possible to evaluate the acceptable agreement between them or to determine the interchangeability [14]. Some of the included studies accepted echocardiographic CO based on the often-used PE cutoff value of ±30 % proposed by Critchley and Critchley [61]. However, the conclusions of these studies are less valid, because the precision of the reference technique could be more or less than that assumed. Furthermore, we observed wide variations in the identified or calculated PE from the included studies. It is possible that PE varies depending on the CO value (e.g., lower CO may have a higher PE than higher CO), which could result in erroneous interpretation.
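To illustrate why the precision of the reference technique matters for any fixed PE cutoff, the sketch below combines the precision of two methods in quadrature. This rests on the assumption that the errors of the two techniques are independent, which is the reasoning usually given for the ±30 % cutoff (two methods with ±20 % precision each). The function names and example values are hypothetical.

```python
import math

def expected_percentage_error(precision_reference_pct, precision_test_pct):
    """Combined PE expected if the errors of both methods are independent."""
    return math.sqrt(precision_reference_pct ** 2 + precision_test_pct ** 2)

def implied_test_precision(observed_pe_pct, precision_reference_pct):
    """Precision of the test method implied by an observed PE."""
    remainder = observed_pe_pct ** 2 - precision_reference_pct ** 2
    if remainder < 0:
        raise ValueError("observed PE is smaller than the reference precision")
    return math.sqrt(remainder)

# Two methods with 20 % precision each give a combined PE of about 28 %
combined = expected_percentage_error(20.0, 20.0)
# With a more precise reference (10 %), a PE of 30 % implies a worse test method
implied = implied_test_precision(30.0, 10.0)
print(f"combined {combined:.1f} %, implied test precision {implied:.1f} %")
```

Under this assumption, the same observed PE corresponds to different test-method precision depending on how precise the reference actually was, which is why a fixed cutoff is hard to interpret without it.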

It is important that changes in SV or CO can be measured because clinicians work not only with single point values but also with changes, in particular following interventions. The Bland–Altman method assesses precision and accuracy, but not trending ability. Currently, the most accepted analytic approach in evaluating trending ability is the four-quadrant plot with concordance analysis or the polar plot [62, 63]. In only three studies in our review was trending ability evaluated and analyzed with the use of a four-quadrant plot or polar plot [5, 23, 31]. The included studies showed that the directional changes measured by echocardiography and thermodilution were in agreement. However, the magnitude of the changes differed between the techniques, and echocardiography was insensitive in detecting small changes in CO.
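A concordance analysis of the kind used with the four-quadrant plot can be sketched as below. This is a simplified illustration: the function name is hypothetical, the central exclusion zone of 0.5 l/min is an assumed value, and excluding only pairs in which both changes fall inside the zone is one of several possible conventions. The example data are invented.

```python
def concordance_rate(delta_reference, delta_test, exclusion_zone=0.5):
    """Fraction of paired CO changes that point in the same direction."""
    # Drop pairs where both changes are small (central exclusion zone)
    kept = [(r, t) for r, t in zip(delta_reference, delta_test)
            if abs(r) >= exclusion_zone or abs(t) >= exclusion_zone]
    if not kept:
        raise ValueError("no pairs outside the exclusion zone")
    # A pair is concordant when both changes have the same sign;
    # a change of exactly zero in one method counts as discordant
    concordant = sum(1 for r, t in kept if r * t > 0)
    return concordant / len(kept)

# Hypothetical changes in CO (l/min) between consecutive measurements
delta_thermodilution = [1.0, -0.8, 0.2, 0.9, -1.2]
delta_echo = [0.6, -0.5, -0.1, -0.7, -0.4]
rate = concordance_rate(delta_thermodilution, delta_echo)
print(f"concordance {100 * rate:.0f} %")
```

Note that the concordance rate only captures direction; in the example the echocardiographic changes are consistently smaller than the thermodilution changes, which the rate does not reveal.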

The strengths of this systematic review include the extensive literature search, performed systematically in multiple databases without language restriction, and the independent assessment of inclusions and exclusions by two authors. However, the search string included only English words, and we cannot be sure that our search strategy was adequate and identified all relevant studies.

The limitations of this study were the heterogeneity of the included studies, the low sample size, the sparse clinical data, and insufficient use of the Bland–Altman analysis. We did not perform meta-analyses of the results because of the limited amount of valid data. Furthermore, the results cannot be used to assess the effects of echocardiography-based strategies on outcome. Our aim was restricted to the validation of echocardiography in measurements of CO compared to thermodilution as the clinical standard. However, the overall assessment of hemodynamics using echocardiography is complex and the extensive literature research in this field was beyond the scope of our review.

Ideally new hemodynamic tools should perform with short time responsiveness, accurate amplitude responsiveness, and the ability to track directional changes in CO [64]. For this echocardiography may hold some promise, but it has to be confirmed in larger high-quality studies. Until then we suggest that clinicians do not make treatment decisions based on CO measured by echocardiography alone. Also it may be that VTi should be measured and used clinically rather than SV or CO, because it is less likely affected by the assumptions of laminar flow and constant cross-sectional area as described above. When evaluating echocardiography it is important to bear in mind that it provides valuable information about the cardiac anatomy, cardiac function, valve pathology, and imaging of blood flow and assessment of systolic and diastolic function of the heart [49, 65, 66]. For assessments of these quantities and qualities echocardiography is highly valuable in critically ill patients.

Conclusions

We systematically evaluated the interchangeability of echocardiography with thermodilution in assessing CO and its changes. However, methodologically proper evaluation with estimation of the precision of each technique before comparing them was performed in only a few studies. The small sample size, heterogeneity of the studies, and inadequate statistical assessment do not allow us to make any definitive statement, but current evidence does not support interchangeability between these techniques in measuring cardiac output. Thermodilution and echocardiography may be interchangeable in tracking directional changes in cardiac output, but this has to be confirmed in large high-quality studies.