A pulmonary artery catheter (PAC) is a device utilized in intensive care units (ICU) to measure the pressures in the superior vena cava, right heart, and pulmonary artery. It also enables the invasive assessment of cardiac output (COPAC) or stroke volume (SV) by thermodilution (TD). The use of a PAC is declining1 as significant complications have been associated with the procedure2,3 which have resulted in an increase in mortality4,5 and have raised doubts about its possible benefits.5 In contrast, a recent report concluded that the use of a PAC did not alter the mortality, general ICU or hospital length of stay, or cost for adult patients in intensive care.6 Furthermore, it has been emphasized that inappropriate clinical decisions and/or inaccurate hemodynamic data may well constitute a greater risk to the patient than all other PAC-related complications.7 Thus, for many investigators, measuring cardiac output (CO) using a PAC still represents the clinical reference method of choice8-11 when evaluating the accuracy or trending capability of less invasive techniques for measurement of CO.

Less invasive CO techniques are mostly based on arterial pulse contour analysis (PCA), which has been investigated for more than a century12 as a method for estimating and monitoring the SV on a beat-to-beat basis. In 1904,13 it was pointed out that SV is proportional to pulse pressure (the difference between systolic and diastolic blood pressure). At present, systems based on the pulse contour concept14,15 are far from being generally accepted as a reference method because other factors influence the pulse wave (e.g., underdamping/resonance artifacts frequently affect blood pressure measurement)16 and because of technical problems (e.g., proper calibration).17

For the assessment of CO by arterial pulse contour analysis (COPCA), an arterial catheter is required (usually already in place in critically ill patients). The invasiveness of these systems depends on the different calibration requirements.18 So-called calibrated pulse pressure analysis systems have to be referenced to another accepted (invasive or non-invasive) method. Calibration via transpulmonary (TP) TD (PiCCO/PiCCOplus),11 lithium indicator dilution (LiDCO), or bolus TD (Modelflow) requires central venous access. The Edwards FloTrac/Vigileo needs no invasive calibration but relies on an autocalibration algorithm based on the patient’s demographic data, as detailed in patent applications (Footnotes 1 and 2), with the aim of adjusting for different hemodynamic situations. With the LiDCO system, the new LiDCOrapid also offers the possibility of autocalibration via a patient-specific scaling factor (Footnote 3). In contrast, the PRAM/MostCare system provides a quasi-continuous cardiac output (CCO) readout requiring only a catheter in the radial or femoral artery without any calibration. An overview is presented in Table 1 (see Appendix 1 for further technical details).

Table 1 Competing pulse contour-based technologies in clinical cardiac output assessment
Table 2 Studies included in the pooled weighted analysis comparing different systems for measuring cardiac output with the intermittent bolus TD as reference

In this work, we present an extensive review of five semi-invasive systems tested over a span of 20 years, their underlying technologies, and how they correspond with COPAC. Other recent reviews9,10,18-26 focused on only a single system or excluded at least one of the systems based on arterial pulse contour analysis. This review includes all five of the most popular commercially available systems and also provides technical details (based on their underlying patents) of the individual CO measurement systems. Furthermore, a comprehensive pooled weighted analysis of their precision in various patient groups and clinical settings was performed and compared with that of COPAC. In previously published studies, meta-analyses were performed for only a single system,22 or the data of different pulse contour systems were analysed as a pooled unit.23 Our systematic analysis also explores possible differences between calibrated and non-calibrated systems, software generations, and performance differences during hemodynamically stable and unstable conditions. Nevertheless, because of incomplete data in the studies, not all of the reviewed studies were included in the analysis.

Methods

This systematic review was carried out in accordance with recommended methods as established by the Cochrane Methods Group on Screening and Diagnostic Tests, and this review also fulfils the criteria as set by the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) group (http://www.prisma-statement.org/).

A literature search covering the topic of semi-invasive CO measurement was performed using the keywords “cardiac output, (pulmonary) thermodilution CO, semi-invasive and minimally invasive CO, Vigileo, FloTrac, PiCCO, PRAM, LiDCO, PulseCO, Modelflow, and CO gold standard”. We searched electronic databases up to August 2013, including MEDLINE (from 1990), Web of Knowledge (v.5.11) (from 1990), and Google Scholar. The search strategy included the following free-text and index terms: “arterial pressure-based cardiac output” or “arterial pressure waveform cardiac output” or “cardiac output” or “FloTrac” or “pulmonary artery thermodilution” or “thermodilution” and not “experimental” and not “pediatric” and not “animal”. In review articles, the bibliography was additionally screened for clinical reports and investigations of COPAC vs COPCA.

Two of the authors (T.S. and H.G.) carefully evaluated the search results (n = 416) to select the eligible articles for inclusion (see Appendix 2). First, obviously irrelevant items were excluded by reviewing the title and/or abstract of the records. Next, the full-text articles of the remaining papers (n = 238) were retrieved and checked to determine if they met the following eligibility criteria: 1) the study was published in English or German in a peer-reviewed journal; 2) it was not retracted for any reason (n = 3); 3) it was performed in adults; 4) it described a clinical investigation using one or more semi-invasive CO measurement systems to compare simultaneous measurements of CO or cardiac index with measurements using intermittent bolus right heart TD; and 5) it did not use continuous CO measurements (e.g., Vigilance, Edwards Lifesciences) instead of COPAC as the reference method. After additionally screening the full-text articles as described, 108 clinical studies were selected for the review (see Fig. 1).

Fig. 1 Flow diagram describing the search strategy to identify papers suitable for analysis

As the intention of this work was to focus on CO data based on arterial pulse contour analysis, we did not analyse derived parameters (e.g., systemic vascular resistance) or volumetric parameters (e.g., extravascular lung water) offered by the EV1000/Volume View from Edwards Lifesciences or by the PiCCO systems or LiDCOrapid for perioperative SV optimization and fluid administration. Other methods, like the Fick principle applied to carbon dioxide re-breathing techniques, esophageal Doppler velocimetry, or CO measured by bioimpedance, were excluded as well. The newly introduced Nexfin (BMEYE, The Netherlands), a photoplethysmographic technology which also offers the ability to measure CO noninvasively, was excluded because only two studies27,28 were found that supplied adequate data. In addition, noninvasive blood pressure monitoring with Nexfin did not seem to be sufficiently accurate to replace intra-arterial invasive blood pressure measurements in critically ill patients,29 a result that a priori questions its usefulness for noninvasive CO assessment.

Finally, out of these 108 studies, 80 publications with multiple (93) comparisons were analysed to assess the agreement of any of the five semi-invasive systems with intermittent bolus TD CO. In five publications, two or more systems were simultaneously compared with COPAC, and in five publications, two different software versions/generations were used. The five systems, PiCCO, LiDCO, Modelflow, PRAM and FloTrac, contributed 25, 12, 7, 9, and 40 trials, respectively, to the 93 comparisons. The following data were collected from the 80 publications: number of patients, age range and data points for each study, mean CO (SD), CO range, bias (SD) (semi-invasive system vs intermittent bolus TD), percentage error (PE), correlation coefficient (r), software version, study population, arterial access site, study design (blinded or non-blinded observers), and study limitations reported by the authors of the publications. In addition, we collected our own observations of study limitations. In case certain values (e.g., PE) were not reported, they were calculated from other values where possible. To fulfil the Critchley and Critchley criterion (C&Cc),30 a PE of ≤ 30% between the new CO measurement technique and COPAC had to be achieved. The PE was calculated as twice the SD of the bias divided by the mean CO.30 If the mean CO or the range of CO measurements was not stated explicitly in tables or text, it was estimated from the graphs. In seven studies, only the cardiac index was quoted, and we calculated CO from the body surface area (BSA). If BSA was not provided by the authors, a value of 1.9 m2 was assumed.
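The two derived quantities described above can be sketched in a few lines. This is a minimal illustration only: the function names and the sample values are assumptions made for this example and are not taken from any of the reviewed studies.

```python
def percentage_error(sd_bias, mean_co):
    """Percentage error (%) as twice the SD of the bias divided by the mean CO."""
    if mean_co <= 0:
        raise ValueError("mean CO must be positive")
    return 100.0 * 2.0 * sd_bias / mean_co


def co_from_cardiac_index(cardiac_index, bsa=1.9):
    """Cardiac output (L/min) from cardiac index (L/min/m^2).

    The default body surface area of 1.9 m^2 is the value assumed in this
    review when the authors of a study did not report it.
    """
    return cardiac_index * bsa


def meets_cc_criterion(pe, limit=30.0):
    """True if the percentage error is within the Critchley and Critchley limit."""
    return pe <= limit


# Hypothetical trial: SD of the bias 0.75 L/min, mean CO 5.0 L/min
pe = percentage_error(sd_bias=0.75, mean_co=5.0)
print(f"PE = {pe:.0f}%, criterion met: {meets_cc_criterion(pe)}")
```

With these illustrative inputs the PE lies exactly on the 30% limit, which shows how sensitive the verdict is to the reported mean CO: the same SD of the bias at a lower mean CO would fail the criterion.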

Statistical analysis

For each of the five semi-invasive CO measuring systems, mean CO, bias, SD of the bias, and correlation coefficient (r) were included in a pooled weighted analysis and weighted by the number of measurements in each trial according to equation 1 (ref. 23) and equation 2 (ref. 31) (see Appendix 3).

The pooled weighted PE was calculated as twice the pooled weighted SD of the bias over the mean pooled weighted CO. The pooled weighted analysis was done for all semi-invasive systems and separately for each system. In the FloTrac/Vigileo (COFT) studies, sub-group analysis of the three different software releases – first generation (V1.0-V1.03), second generation (V1.07-V1.14), and third generation (V3.0 and higher) – was performed to investigate whether software modifications are reflected in performance improvements. The PiCCO system is initially calibrated with TP TD. The performance of the PiCCO system strongly depends on the re-calibration interval;32,33 on the one hand, the interval is not always given by the authors, and on the other hand, different intervals have been suggested depending on the investigating group.34-36 Therefore, studies comparing PiCCO with TP TD as the reference method were excluded to avoid false positive distortion of the results relating to precision.
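The weighting by number of measurements can be illustrated with a short sketch. The exact equations are given in Appendix 3 and are not reproduced here; the code below assumes a plain measurement-weighted mean for CO and bias, and assumes that the SDs are pooled as the square root of the measurement-weighted mean variance. Both the pooling rule and the trial values are illustrative assumptions.

```python
import math


def pooled_weighted(trials):
    """Pool trial results, weighting each trial by its number of measurements.

    Each trial is a dict with keys: n, mean_co, bias, sd_bias.
    """
    total_n = sum(t["n"] for t in trials)
    mean_co = sum(t["n"] * t["mean_co"] for t in trials) / total_n
    bias = sum(t["n"] * t["bias"] for t in trials) / total_n
    # Assumption: pooled SD = sqrt of the n-weighted mean of the variances
    sd_bias = math.sqrt(sum(t["n"] * t["sd_bias"] ** 2 for t in trials) / total_n)
    # Pooled PE: twice the pooled SD of the bias over the pooled mean CO
    pe = 100.0 * 2.0 * sd_bias / mean_co
    return {"n": total_n, "mean_co": mean_co, "bias": bias,
            "sd_bias": sd_bias, "pe": pe}


# Two hypothetical trials (values invented for illustration)
trials = [
    {"n": 100, "mean_co": 5.0, "bias": -0.2, "sd_bias": 1.0},
    {"n": 300, "mean_co": 6.0, "bias": -0.4, "sd_bias": 1.2},
]
result = pooled_weighted(trials)
print({k: round(v, 2) for k, v in result.items()})
```

Because the larger trial carries three times the weight, the pooled bias and mean CO sit closer to its values than to those of the smaller trial.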

To verify whether the studies selected for the pooled weighted analysis are a representative selection of all 93 studies, the PE distribution of the studies in the pooled weighted analysis and that of all studies (if reported or at least calculable) were compared with a two-sample Kolmogorov-Smirnov test.
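For readers unfamiliar with the test, the two-sample Kolmogorov-Smirnov statistic is the largest vertical distance between the empirical cumulative distribution functions of the two samples. The sketch below computes only this statistic in plain Python; it does not compute the P value, it is not the SPSS procedure used for this review, and the PE values are invented for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical PE values (%) for a selected subset and for all studies
pe_selected = [20, 30, 40, 50]
pe_all = [25, 35, 45, 55]
print(ks_statistic(pe_selected, pe_all))
```

A small statistic indicates that the two PE distributions are similar, which is the property the representativeness check relies on.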

Additionally, a forest plot was drawn in order to provide further information for 14 studies dealing with hemodynamically unstable conditions. The 14 studies could not be included in the pooled weighted analysis because of incomplete data.

The statistical analysis was performed with SPSS® for Windows Release 20.0.0 (SPSS Inc., Chicago, IL, USA). Data are presented as mean (SD) or bias (SD) with a value of P < 0.05 considered significant.

Results

All 93 trials investigating the agreement of the five semi-invasive CO systems with intermittent bolus TD are listed in Appendix 4. The systems are grouped according to their different calibration methods (auto-calibrated, calibrated, and non-calibrated). Studies examining the same system are sorted by publication date in descending order.

FloTrac/Vigileo system

First-generation software (n = 10)

Nine out of ten studies investigated the performance of the first FloTrac generation (COFTg1) in cardiac surgery patients during fairly stable hemodynamic conditions. Although eight trials (80%) referred to the C&Cc, only four authors stated the mean or range of CO measurements. In five studies, different arterial access sites were used and the data were pooled.

Six studies37-42 classified the performance of the COFTg1 as not satisfactory and demonstrated poor accuracy, with the PE (40-55%) clearly exceeding the 30% limit of acceptability. Only three studies43-45 reported a PE < 40%, and the smallest PE of 33% with a bias of 0.55 (0.98) L·min−1 was reported in a study of 50 postoperative cardiac surgery patients.43 The only study46 using solely femoral arterial access found a bias of −0.15 (0.33) L·min−1 with COFTg1, and neither mean CO nor PE was mentioned. None of the ten studies fulfilled the C&Cc.

Second-generation software (n = 24)

Most of the FloTrac studies (n = 24) used the second-generation software (COFTg2). In 21 (88%) of the studies, PE was presented or calculable. In contrast with the COFTg1 evaluations, the second-generation studies were performed in various patient cohorts. Two authors45,47 consider modifications between the first- and second-generation software to have resulted in better accuracy in the CO measurements. Only six studies (four studies in cardiac surgery, one in liver transplant, and one in septic shock patients)45,48-52 using the second-generation software reported acceptable precision with a PE < 30%. During/after cardiac surgery,53-57 liver transplantation,58 and during septic shock,59 PE was < 50% (32-48%) with correlation coefficients ranging from r = 0.32-0.90. On the other hand, a high PE > 60% during cardiac surgery,60,61 in hyperdynamic cirrhotics,62 and in patients undergoing liver transplantation63 points to the fact that COFTg2 may deviate considerably from COPAC.

Up to now, four studies51,58,62,64 have reported a (logarithmic) relationship between the bias of COFTg2 and systemic vascular resistance (SVR): the lower the SVR, the higher the bias.

Third-generation software (n = 6)

Poor agreement between the FloTrac third-generation software (COFTg3) and COPAC was found in two studies during liver transplantation65,66 and in one study with septic shock patients.67 In contrast, in another study with septic patients51 and in one with cardiac surgery patients,68 COFTg3 and the COPAC reference agreed, with a PE of 29% and 22%, respectively.

When compared with the second generation, the third-generation software seems to be less sensitive to a changing SVR, thus resulting in improved overall precision and trending ability.51,66 Nevertheless, after living-donor liver transplantation, the bias between COFTg3 and COPAC still became apparent when SVR was < 1,000 dyne·sec·cm−5.69

According to the manufacturer (Footnote 4), the site of arterial access55 should not affect FloTrac/Vigileo results. Almost all studies investigated FloTrac performance via radial artery access (see Appendix 4). Five studies compared the radial vs the femoral access site. The results of two studies43,60 point to a modest but not negligible influence of the arterial access site. With a PE difference < 5%,51,55,68 arterial site-independent results were observed with COFTg2 and COFTg3. Two other studies using femoral access70,71 reported only limited agreement with COPAC during cardiac surgery.

PiCCO/PiCCOplus system (n = 25)

Twenty-five studies were identified that supplied adequate data in terms of bias and precision, and 21 of them were in cardiac surgery patients. The PE was revealed by the authors or calculable on the basis of other values in only 14 trials (58%). Range and mean CO were quoted in eight trials (32%). In 21 (88%) trials, the PiCCO catheter was inserted via the femoral artery.

The recalibration interval and the influence of the SVR on PiCCO-derived CO (COPiCCO) remain controversial in the literature. According to two studies,70,72 changes in SVR do not affect the accuracy of COPiCCO if a recalibration is performed every four hours. Another study in hemodynamically stable patients73 emphasizes that recalibration of PiCCO is not necessary more often than every three hours and that COPiCCO is clinically acceptable (PE not stated). Nevertheless, the same authors recommend additional studies with PiCCO in septic shock patients or during the use of vasoactive drugs. Three studies34-36 concluded that recalibration of the PiCCO is necessary at least after marked changes in SVR. The requirement of frequent recalibration, especially in the presence of vasopressors, is also discussed by other authors.74,75 Remarkably, excellent results were found when COPiCCO and COPAC were compared in stable cardiac surgery patients,76 as long as there were no significant changes in SVR36 [bias (SD) of 0.23 (0.50) L·min−1 and PE 20%]. When the whole study period was evaluated, however, the PE of 36% exceeded clinical acceptability. Without any recalibration, a high bias > 1.0 L·min−1 and SD > 2.0 L·min−1 of COPiCCO was observed.77,78 When initial calibration was performed with COPAC instead of TP TD CO (COTPTD), PiCCO results were not comparable with the reference method: COPiCCO was underestimated, low correlation coefficients (r < 0.40) were found, and, if calculable, PE was beyond clinical acceptability.21,46,71,79

In hemodynamically stable cardiac surgery patients, comparable but not interchangeable results (PE 34-43%) were observed. The PiCCO system was acknowledged to be useful to monitor trends, but intermittent bolus TD remained the method of choice for measuring CO.55,80,81 In similar patients,82,83 COPAC and COPiCCO did not agree and showed large discrepancies (PE > 50%). Just a few authors reported a PE < 30%, indicating interchangeable results of COPiCCO and COPAC.45,76,84-86

Several studies35,46,47,87-89 performed only in cardiac surgery patients reported a small bias < 0.5 L·min−1 with a SD > 0.5 L·min−1 and correlation coefficients up to r = 0.93. Although the authors argue that COPiCCO is a reliable alternative to COPAC, it has to be emphasized that important information (PE and mean) is not given.

LiDCO/PulseCO system (n = 12)

Nine of 12 studies comparing LiDCO-derived CO (COLI) with COPAC reported the PE. Eighty-three percent of the investigators used radial artery access to measure the arterial lithium concentration. Up to now, the new LiDCOrapid system has been evaluated only in animal studies or compared with other CO measurement methods but not with bolus COPAC; therefore, these studies were not included in our analysis. COLI showed good agreement with COPAC during hemodynamically stable conditions post cardiac surgery,90-93 after liver transplantation,94 and in patients with severe pre-eclampsia.95 Three studies showed clinical acceptability of LiDCO (PE < 30%), although initial calibration was performed with intermittent bolus TD instead of the manufacturer-recommended lithium dilution technique.21,71,96 Nevertheless, with initial COPAC calibration and without any recalibration, COLI overestimated COPAC during cardiac surgery.97 Two studies (22%) postulated that LiDCO cannot be used interchangeably with COPAC in liver transplant patients63 or in a mixed study population including septic patients;42 in these studies, COLI clearly failed to show acceptable accuracy (PE of 76% and 40%, respectively).

Modelflow system (n = 7)

In six of the studies evaluating CO with the Modelflow system (COMF), the PE was stated or at least calculable, and met the 30% limit. All studies but two98,99 were performed in rather small patient groups (n < 30 patients). After calibration with COPAC, COMF showed high accuracy with pressure signals obtained from a radial or femoral artery and was able to replace intermittent bolus TD during cardiac surgery21,99,100 and in septic shock patients.101 Nevertheless, the C&Cc was not fulfilled during liver transplantation.98 After aortic diameter calibration102 instead of TD calibration, COMF showed clinical acceptability (PE = 12%). Interestingly, even with noninvasive pressure signal monitoring after ultrasound calibration, a small bias and small SD was reported in critically ill ICU patients.103

PRAM/MostCare system (n = 9)

The nine studies suitable for analysis can be divided into studies with excellent and comparable results for CO measured by PRAM (COPRAM) and COPAC and into studies which show only poor agreement between the two methods. The PRAM technique was reliable in patients undergoing left or right heart catheterization.104,105 Pressure in both studies was recorded via an aortic catheter and not from a peripheral arterial line. Excellent performance of COPRAM was also reported during106,107 and after cardiac surgery108 and in patients with an intra-aortic balloon pump.109 Despite these findings, differences between COPAC and COPRAM became evident at extremely high or low CO values.105,106 In septic shock patients,110 there appeared to be no correlation between SVR and bias, and the C&Cc was met (PE = 25%). The results of two post cardiac surgery studies111,112 are in clear contrast with those of other studies.104-110 It should be pointed out that the latter studies were performed either by the same group or by authors cooperating with this group. The reason for the enormous discrepancy between these two groups of studies (PE > 73%) is not clear, especially since study sizes and participants were comparable.

Pooled weighted analysis

Forty-three (46%) of 93 trials listed in Appendix 4 provided adequate data for a pooled weighted analysis of mean CO, bias (SD), and PE: eight (32%) studies on PiCCO, five (42%) studies on LiDCO/PulseCO, seven studies (100%) on Modelflow, five studies (56%) on PRAM, and 18 studies (45%) on FloTrac/Vigileo (n = 4/9/5 trials with the first/second/third-generation software, respectively).

The PE distribution of the 43 studies selected for the pooled analysis (Table 2) and of all studies compiled in Appendix 4 showed no significant differences (P = 0.96) across the percentile ranking (two-sample Kolmogorov-Smirnov test).

The calculated mean weighted pooled data are presented in Table 3. The 43 studies (5,780 measurements in total) resulted in a pooled weighted bias of −0.28 (1.25) L·min−1 and a pooled weighted PE of 40%. Thus, our findings are in concordance with another meta-analysis23 reporting a pooled PE of 42.1% in 21 studies with pulse contour systems. The pooled bias points to underestimation of COPAC in all systems with the exception of PRAM (Fig. 2A). Notably, the widest range in bias was observed with COFTg3. The pooled PE was lowest for COLI (27%) and highest for COFT (52%; 59% in the FTg2 subgroup). Only LiDCO fulfilled the C&Cc; PiCCO and Modelflow exceeded it marginally (PE = 32%), whereas FloTrac/Vigileo (third-generation software) and PRAM grossly exceeded the 30% limit (PE 47% and 44%, respectively), as also shown in Fig. 2B. In the COFT subgroup analysis (see Table 3 and Fig. 3), the lowest bias of 0.06 (1.31) L·min−1 and the lowest PE (45%) in this group were found in the first-generation software.

Table 3 Pooled weighted data showing agreement between the five semi-invasive CO systems and intermittent bolus thermodilution
Fig. 2 Pooled weighted bias (A) and percentage error (B) showing agreement of cardiac output measured by five semi-invasive systems (FTg3: n = 5; LiDCO: n = 5; Modelflow: n = 7; PiCCO: n = 8; PRAM: n = 5) and intermittent bolus thermodilution. º Mean pooled weighted bias and PE (cardiac output [CO]method vs COPAC); bars indicate range of bias and PE, respectively. Broken lines represent zero bias (A) and the 30% Critchley & Critchley criterion (C&Cc) (B). COPAC = cardiac output assessed using a pulmonary artery catheter; PE = percentage error

Fig. 3 Pooled weighted bias (A) and percentage error (B) showing agreement of cardiac output measured by FloTrac, first, second, and third (n = 4/9/5, respectively) software generation and intermittent bolus thermodilution. º Mean pooled weighted bias and percentage error (PE) (COFT vs COPAC); bars indicate range of bias and PE, respectively. Broken lines represent zero bias (A) and the 30% Critchley & Critchley criterion (C&Cc) (B). COFT = cardiac output assessed using the FloTrac system; COPAC = cardiac output assessed using a pulmonary artery catheter

Eight of these 43 studies were performed in liver transplant and septic shock patients and used for a sub-analysis to investigate the differences in performance in hemodynamically unstable situations (Fig. 4). With 1,911 measurements in total, the five semi-invasive systems (PiCCO/ LiDCO/ Modelflow/ PRAM/ FloTrac) contributed with n = 0/1/2/0/5 trials, respectively, to the hemodynamically unstable cohort. This cohort yielded a pooled weighted bias of −0.54 (1.64) L·min−1 (Fig. 4A) and a pooled weighted PE of 45.3% (Fig. 4B) with r = 0.75. Compared with all studies included in the analysis, hemodynamic instability results in a slightly higher PE (5% higher) and bias. The exclusion of the eight studies performed in unstable patients yielded a smaller bias of −0.15 (1.04) L·min−1 and a smaller PE (38%) compared with all studies in the pooled analysis (Table 2).

Fig. 4 Pooled weighted bias (A) and percentage error (B) showing agreement of all studies included in the analysis (n = 43); studies excluding hemodynamically unstable conditions (n = 35); and those studies referring to hemodynamically unstable conditions (n = 8). º Mean pooled weighted bias and percentage error (PE) (cardiac output [CO]method vs COPAC); bars indicate range of bias and PE, respectively. Broken lines represent zero bias (A) and the 30% Critchley & Critchley criterion (C&Cc) (B). COPAC = cardiac output assessed with a pulmonary artery catheter

Thirty-nine studies (Table 4) met the criteria for pooled weighted analysis of the correlation between the five systems and bolus TD. The highest correlation was found for COLI (r = 0.88) and the lowest for COFT (r = 0.54; in the subgroup FTg1 r = 0.50). A correlation coefficient was given in only one study with COFTg3 (r = 0.67). For all semi-invasive studies, the pooled weighted correlation resulted in r = 0.71 and was slightly lower than in a recently published analysis including only 12 pulse contour studies (r = 0.75).23

Table 4 Pooled weighted correlation between the five semi-invasive CO systems and intermittent bolus thermodilution

In order to show the results obtained in hemodynamically unstable patients, we also analysed the bias and confidence intervals in those studies; however, because of incomplete data, the results could not be included in the pooled analysis. These results are compiled in the forest plot (Fig. 5) covering FloTrac (n = 5, second generation and n = 4, third generation), PiCCO (n = 1), LiDCO (n = 2), and Modelflow (n = 2). All but two pulse contour systems underestimated CO compared with COPAC.

Fig. 5 Forest plot showing the agreement of cardiac output measured by five semi-invasive systems with intermittent bolus thermodilution in 14 studies referring to hemodynamically unstable conditions. ■ bias (cardiac output [CO]method vs COPAC); bars indicate the 95% confidence interval. COPAC = cardiac output assessed with a pulmonary artery catheter. *Cardiac index (L·min−1·m−2) converted to cardiac output with a body surface area of 1.9 m2. The 14 selected studies include the eight from Fig. 4 designated as unstable plus those six studies in which neither the mean cardiac output (CO) nor the number of data points was stated. Note that studies with septic patients and with liver transplant patients characterized as “hemodynamically stable” by the author, and studies in which the bias was given in %, are excluded (see Appendix 4)

Discussion

For monitoring in the perioperative period and in the critical care setting, systems based on pulse contour measurement have recently been offered as a more-or-less accurate and safe alternative113 to the highly invasive Swan-Ganz PAC. Despite continued efforts to introduce improved products to the market, the main outcome of our analysis is that a clear recommendation cannot be given for any single system that can accurately monitor hemodynamically unstable patients. This limitation also applies to reliable intraoperative monitoring during surgery accompanied by hemodynamic instability. The informative value of COPCA-based monitoring during hemodynamically stable conditions should be questioned, since CO data provided by these monitors parallel the arterial pressure as long as the compliance and resistance remain unaffected.

From the technical point of view, it is important to be aware of the inherent limitations of the mathematical models/algorithms implemented. Important model parameters might have been derived from patient cohorts that might not always fully match the critical care patients to be monitored. It is therefore necessary to readjust these parameters, especially during hemodynamic instability. We found no explicit evidence that the suggested calibration intervals were strictly followed. Had they been followed, it seems clear that the calibrated systems would provide more accurate CO data than the non-calibrated or auto-calibrated systems.

This is in line with our results showing the calibrated systems (LiDCO, Modelflow, and PiCCO) to be more accurate than the auto-calibrated FloTrac or the non-calibrated PRAM (see Fig. 2). It is noteworthy that almost all systems failed to fulfil the C&Cc in both hemodynamically stable and hemodynamically unstable scenarios (Table 3).

COPAC as reference method of choice

Although COPAC was long the “gold standard” and is still widely accepted as the reference method of choice for CO determination,114,115 the method itself suffers from several limitations. Besides its invasiveness and the concomitant risks, the accuracy of the method also depends on external factors, e.g., overestimates have been reported at low CO levels.116 Other factors that may influence the accuracy of bolus TD are valve insufficiency, fluid discontinuation and shunting,117 ventilation,118 transition from cardiopulmonary bypass,119 and operator experience. Triplicate injections are recommended to achieve acceptable accuracy,117,120 although it has also been shown that four CO measurements in series must be averaged in order to be 95% confident that the result is within 5% of the “true” CO.121 When all these factors are taken into account, the overall accuracy of the TD reference COPAC may be ± 15% at best (in a recent in vitro study, the PE was shown to range from 13-15.3%).122 In light of this basic limitation, the question of clinically acceptable error has to be raised. When C&C analysed 34 studies (23 bioimpedance vs COPAC, 11 Doppler vs either COPAC or Fick CO2 rebreathing),30 they found differences between the methods, i.e., up to 37% in the PE for PAC/Fick and up to 65% higher for Doppler measurements. The authors considered an error of 20% acceptable for clinical practice. When methods with a 20% error are compared, a deviation of up to 28.3% will result. Therefore, C&C30 concluded that a deviation of < 30% would still be acceptable when comparing a new CO measurement system with COPAC. This position has also been challenged123 because quoting the PE as an adequate criterion without reporting the precision of the reference technique124 or the confidence intervals125 could lead to erroneous conclusions. 
It has been proposed to enlarge the acceptable PE to 45%,123 which would mean that the tested method would show a precision of only 42.4% and 40.3%, respectively, when assuming a precision of 15% or 20% for the reference method.
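The figures quoted above follow from combining the errors of the two methods in quadrature, which assumes that the errors of the reference and the tested method are independent. A minimal sketch of this arithmetic (the function names are illustrative only):

```python
import math


def combined_error(ref_pct, test_pct):
    """Expected PE (%) when comparing two methods with independent errors."""
    return math.hypot(ref_pct, test_pct)


def implied_test_precision(limit_pct, ref_pct):
    """Precision (%) of the tested method implied by a PE limit and a reference precision."""
    if ref_pct > limit_pct:
        raise ValueError("reference precision exceeds the PE limit")
    return math.sqrt(limit_pct ** 2 - ref_pct ** 2)


print(round(combined_error(20, 20), 1))          # 28.3
print(round(implied_test_precision(45, 15), 1))  # 42.4
print(round(implied_test_precision(45, 20), 1))  # 40.3
```

The same relation shows why the precision of the reference method matters: the less precise the reference is assumed to be, the smaller the share of a given PE that is attributed to the tested method.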

Limitations with respect to the accuracy of the chosen reference method

When aiming at a sufficiently close estimate of the “true” precision of the tested method, it is important to be clear about the accuracy of the reference method. We were not able to define the averaged precision of the reference method for the pooled 43 studies, as the relevant data on the reference were only sparsely described or not reported. If the reference technique had been performed with less precision than the generally accepted 20%, then this would have resulted in a smaller PE for the tested semi-invasive method124 and in the acceptance of the studied technique based on a questionable level of precision. None of the investigators stated the predicted level of precision for the tested technique at the start of their study.

Limitations in our analysis with regard to available data

First, with respect to our analysis, we appreciate that the number of studies varied considerably for the different systems (from seven Modelflow up to 40 FloTrac/Vigileo studies). No more than 43 reports (46%) out of 93 trials in our extensive literature search provided adequate data for a pooled weighted analysis, a fact which considerably reduced the available data pool for a thorough evaluation and thus weakened the statistical power. Furthermore, due to shortage of data we could not perform a detailed sub-analysis regarding the influence of vasoactive drugs, reasons for hemodynamic instability, or differences with respect to peri-, intra-, and postoperative CO conditions.

Second, the significant heterogeneity in the number of data pairs evaluating the different CO devices impairs the strength of our analysis.

Third, in seven papers cardiac index but not CO data were reported. Assuming a body surface area of 1.9 m² could possibly have modified our overall results; however, we consider such modification to be insignificant.
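As a minimal sketch of the conversion implied here, CO can be obtained from cardiac index (CI) by multiplying by body surface area (BSA). The function names and the example values below are hypothetical; only the assumed BSA of 1.9 m² comes from the text.

```python
ASSUMED_BSA_M2 = 1.9  # body surface area assumed in the analysis (m^2)


def cardiac_output_from_index(ci_l_min_m2, bsa_m2=ASSUMED_BSA_M2):
    """Convert cardiac index (L/min/m^2) to cardiac output (L/min)."""
    return ci_l_min_m2 * bsa_m2


def relative_error_pct(true_bsa_m2, assumed_bsa_m2=ASSUMED_BSA_M2):
    """Relative error in CO (percent) when the true BSA differs from the assumed one."""
    return 100.0 * (assumed_bsa_m2 / true_bsa_m2 - 1.0)


# Hypothetical example: a reported CI of 3.0 L/min/m^2.
print(round(cardiac_output_from_index(3.0), 2))  # 5.7

# Hypothetical patients whose true BSA is 1.7 or 2.1 m^2.
print(round(relative_error_pct(1.7), 1))  # 11.8
print(round(relative_error_pct(2.1), 1))  # -9.5
```

Because the same factor scales both the tested and the reference values within a study, the assumption mainly affects absolute quantities such as bias in L·min−1.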

Fourth, studies that compare these systems with other reference methods were explicitly excluded (as outlined in our Methods section), reducing the available body of knowledge on the performance of COPCA methods. For example, we excluded several studies comparing the FloTrac/Vigileo with CCO69,123,126-130 as well as with TP TD131-133 or esophageal Doppler.134 We also excluded the few available studies comparing LiDCO with TP TD135 or CCO136 as well as an evaluation of the PRAM system vs CCO.109 A single study evaluated the Modelflow device using graded lower body negative pressure.137

Comparison of systems

For the FloTrac/Vigileo system, 18 applicable studies using different software versions were selected, and only two studies48,65 met the C&Cc. If the software version was not stated, we inferred the version from another study.22 Remarkably, the smallest PE (45%) in the pooled analysis of FloTrac data was found in the studies using devices with first-generation software but in hemodynamically stable conditions (see Fig. 3B). The highest pooled PE (59%) was found in studies using the second-generation software, but these investigations were performed in patients in hemodynamically less stable conditions. When the manufacturer introduced the third-generation software, it was claimed to take enhanced account of changing hemodynamic conditions.Footnote 5 Although the bias is modestly smaller with the third-generation software than with the second (see Figs. 3 and 5), it is important to be aware that COFTg3 may grossly deviate from COPAC or CCO during hemodynamic instability138 and particularly in extreme conditions of vasoconstriction or vasodilation.123 As yet, the FloTrac/Vigileo algorithm for autocalibration apparently adjusts insufficiently for gross hemodynamic changes.

For the PiCCO system, only eight of 25 studies provided sufficient data for inclusion in the pooled weighted analysis. The lowest reported PE was 20%;36 however, this was measured in the pre-induction phase of anesthesia. In the pooled analysis, PiCCO exceeded the PE criterion only marginally (PE = 32%). Since almost all data were obtained in hemodynamically stable conditions, the available data do not allow the reliability of PiCCO to be judged under hemodynamically unstable conditions.

Many studies assessing the three other CO measurement systems (LiDCO, PRAM, and Modelflow) show a PE of 30%; however, one should note that most of these studies were performed in only three centres (Modelflow as well as PRAM). For the PRAM system, two studies from external centres report high PEs of 87%111 and 73%,112 respectively, yielding a pooled weighted PE of 44%. The PRAM device was the only system showing a pooled bias overestimation (0.14 L·min−1), while all other devices underestimated COPAC. Remarkably, just one of the five semi-invasive systems (LiDCO, with a pooled PE of 27%) fulfilled the C&Cc, and the highest pooled correlation coefficient was also found with LiDCO (r = 0.88). On the other hand, a very recent LiDCO study performed in animals139 highlights a large bias between COLI and COPAC, and an in vitro study identifies a number of drugs used in perioperative medicine that influence the accuracy of the LiDCO sensor.140 As we found no comparisons with COPAC in humans, LiDCOrapid studies were not included in our analysis. This auto-calibrated system (footnote C) was validated against the commonly used LiDCO indicator dilution-based calibration, and a correlation of r = 0.88 was reported. According to the manufacturer, the scaling factor estimate may not be as precise as an independent calibration with a well-performed indicator dilution method. It therefore remains highly questionable whether the auto-calibrated LiDCOrapid system could successfully replace the lithium indicator calibrated measurement. Special care should be taken when using LiDCOrapid, especially in patients with severe peripheral vasoconstriction, given the particular requirement for high-fidelity pressure recording (footnote C).

Tracking changes

With respect to measuring trends in CO, the capabilities of various CO measurement devices (Vigileo, PiCCO, bioimpedance, Doppler sound, and pulse contour) were carefully analysed in a recent review.141 If these devices are used to track changes in CO, induced for instance by preload changes, care must be taken to ensure there are no additional influences from altered vascular tone.24 A very recent study142 emphasizes the rather poor performance of the Vigileo system in tracking changes in CO induced by increased vasomotor tone: the concordance rates between COPAC- and COPCO-changes were 67.5%, 28.8%, and 7.7% in the low, normal, and high SVRI states, respectively.

A recent report143 emphasizes that, in clinical practice, the dynamic response (trending) to interventions is more important and critical than absolute values of CO. More serious consideration should be given to the ability to track (induced) CO changes144 as well as the impacts of time and repetitive measurements over time.145 Accordingly, future studies should include the analysis of trending ability using three different statistical techniques:66 by correlation coefficients between the system under evaluation and the particular reference method, by a modified Bland and Altman analysis using ΔCO data (ΔCO representing the change between sequential readings), and by plotting Δsemi-invasive CO against ΔCOPAC on a four-quadrant plot.146
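To illustrate the four-quadrant approach, the sketch below computes ΔCO between sequential readings and a concordance rate, i.e., the percentage of data pairs in which the tested and reference methods change in the same direction. The readings are hypothetical, and the exclusion zone of 0.5 L·min−1 applied to the reference changes is an assumption made for this example; the studies cited here may use different thresholds or exclusion rules.

```python
def sequential_changes(readings):
    """Return the change between each pair of sequential CO readings (L/min)."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


def concordance_rate(delta_ref, delta_test, exclusion_zone=0.5):
    """Percentage of pairs in which both methods change in the same direction.

    Pairs with a reference change smaller than `exclusion_zone` (L/min)
    are dropped, since small changes are dominated by measurement noise.
    The default of 0.5 L/min is an assumption for this sketch.
    """
    kept = [(r, t) for r, t in zip(delta_ref, delta_test) if abs(r) >= exclusion_zone]
    if not kept:
        raise ValueError("no data pairs outside the exclusion zone")
    agreeing = sum(1 for r, t in kept if r * t > 0)
    return 100.0 * agreeing / len(kept)


# Hypothetical paired readings (L/min): reference (PAC) and pulse contour.
co_pac = [4.0, 5.0, 4.2, 5.4, 5.5, 4.5]
co_pca = [3.8, 4.6, 4.8, 5.0, 5.2, 4.6]

d_pac = sequential_changes(co_pac)
d_pca = sequential_changes(co_pca)
print(concordance_rate(d_pac, d_pca))  # 75.0
```

In this example one reference change falls inside the exclusion zone and is dropped; of the remaining four pairs, three lie in the concordant quadrants of the plot.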

When to use semi-invasive PCA systems?

Unstable hemodynamics appears to be a general problem for pulse contour analysis.38 In unstable conditions, intraoperatively, and in the ICU, our results show a 7% higher PE and a larger bias (−0.54 vs −0.15 L·min−1) than in the hemodynamically stable cohort (Fig. 4). In such situations, a more reliable, albeit more invasive, technology (COPAC)143 or CCO123 should be considered.

The pulse contour measurement of CO is strongly influenced by factors independent of true changes in CO such as those affecting the arterial pressure (e.g., vascular tone, compliance, and the arterial site).24 Further validation studies, particularly covering a wide CO range, are required147 to assess the reliability of the currently implemented algorithms which tend to either under- or overcompensate for prominent increases (or decreases) in vascular tone and compliance. The algorithms implemented in these devices are primarily based on the model described by Wesseling.100 Besides age, sex, and body mass index, this model is based on a strict mathematical relationship between (aortic) compliance and pressure and can hardly take into account real changes in vessel compliance due to vasoactive drugs or mediators. This rather inflexible model will fail during hemodynamic instability. The deficiency in the model can be compensated by repeated calibration. To date, studies are lacking that explicitly provide the calibration intervals needed to maintain the accuracy of the COPCA measurements. This information would be helpful for proper analysis, particularly since the producers of semi-invasive monitoring systems market them as having signal stability over time.

Physicians should keep in mind the limitations of these technologies, especially in unstable critically ill patients. Although a recent study concluded that only 39% of patients undergoing surgical procedures met the criteria for semi-invasive hemodynamic monitoring,148 COPCA systems may have their place in postoperative intensive care medicine when the administration of fluids and vasopressors is guided to specific therapeutic endpoints (“goal-directed therapy”). Nevertheless, only a few studies showed reduced mortality and morbidity149,150 or reduced length of hospital stay151,152 (but not reduced ICU stay)152 when hemodynamic monitoring and therapy were coordinated.

Positive reports on the clinical suitability of presently available semi-invasive pulse contour systems for continuous CO measurement are increasingly found in the literature. These systems are gaining in popularity despite the fact that the measured CO in various clinical situations shows only limited agreement with intermittent bolus TD. Further improvements and validation studies are required. There is also a need to show whether there is a resulting healthcare benefit if these monitors are used in regular clinical practice. In the interim, the physician should be aware of the inaccuracy of currently available CO monitoring devices based on PCA and should not be guided solely by CO data. The physician providing care must also adhere to a hemodynamic optimization strategy that includes all relevant clinical parameters for secure therapeutic decision-making.