Accuracy and precision of non-invasive cardiac output monitoring by electrical cardiometry: a systematic review and meta-analysis

Cardiac output monitoring is used in critically ill and high-risk surgical patients. Intermittent pulmonary artery thermodilution and transpulmonary thermodilution, considered the gold standard, are invasive and linked to complications. Therefore, many non-invasive cardiac output devices have been developed and studied. One of those is electrical cardiometry. The results of validation studies are conflicting, which emphasizes the need for definitive validation of accuracy and precision. We performed a database search of PubMed, Embase, Web of Science and the Cochrane Library of Clinical Trials to identify studies comparing cardiac output measurement by electrical cardiometry and a reference method. Pooled bias, limits of agreement (LoA) and mean percentage error (MPE) were calculated using a random-effects model. A pooled MPE of less than 30% was considered clinically acceptable. A total of 13 studies in adults (620 patients) and 11 studies in pediatrics (603 patients) were included. For adults, pooled bias was 0.03 L min−1 [95% CI − 0.23; 0.29], LoA − 2.78 to 2.84 L min−1 and MPE 48.0%. For pediatrics, pooled bias was − 0.02 L min−1 [95% CI − 0.09; 0.05], LoA − 1.22 to 1.18 L min−1 and MPE 42.0%. Inter-study heterogeneity was high for both adults (I² = 93%, p < 0.0001) and pediatrics (I² = 86%, p < 0.0001). Despite the low bias for both adults and pediatrics, the MPE was not clinically acceptable. Electrical cardiometry cannot replace thermodilution and transthoracic echocardiography for the measurement of absolute cardiac output values. Future research should explore its clinical use and indications.


Rationale
Information about the hemodynamic status of patients plays an important role in daily clinical practice in the emergency department, the intensive care unit (ICU) and the operating room (OR). Heart rate, blood pressure and pulse-oximetry monitoring is generally applied. Advanced hemodynamic monitoring is used in critically ill and high-risk surgical patients. Many studies, including meta-analyses [1-5], have shown that optimization of hemodynamic parameters reduces mortality, morbidity, post-operative complication rates and duration of hospital stay, and improves functional recovery in high-risk surgical patients.
In adults, intermittent pulmonary artery thermodilution (intermittent PAC) and transpulmonary thermodilution (TPTD) are considered the gold standard for the measurement of cardiac output (CO). However, these methods are invasive and linked to complications [6-9]. In neonates and pediatric patients, transthoracic echocardiography (TTE) is the most commonly used technique. This technique has several limitations: it requires an experienced operator, is technically demanding and provides only intermittent measurements.

method is based on changes in thoracic resistance as a result of changes in blood velocity during the cardiac cycle and uses an algorithm to calculate the CO. Sramek and Bernstein (1986) modified the algorithm [14]. The most recent modification is the Bernstein-Osypka equation (2003), also called electrical velocimetry or electrical cardiometry (EC) [15,16]. The latter name will be used in this manuscript.
EC measures changes in thoracic resistance (impedance) using four skin electrodes. EC is able to isolate the changes in impedance created by the circulation, partly caused by the change in orientation of the erythrocytes during the cardiac cycle (Fig. 1). Impedance cardiography can be affected by the remaining thoracic tissue or fluid [17]. Two electrodes are placed on the left base of the neck and two on the left inferior side of the thorax at the level of the xiphoid process (Fig. 1). Exact placement of the electrodes is important because measurements can vary when placement is incorrect. The inter-electrode gap of the lower electrodes should be 15 cm in adults [18]. The electrodes are connected to either the Aesculon® monitor (Osypka Medical GmbH, Berlin, Germany) or the ICON® monitor (Osypka Medical GmbH, Berlin, Germany), which is smaller and portable. Both devices derive stroke volume, heart rate and CO from the impedance values. Further details of the devices are described elsewhere [15,16,19].
This safe and easily applicable method could be a suitable candidate to complement or replace invasive CO monitoring. Several studies have tried to validate EC using different reference methods, leading to conflicting results. EC was part of three meta-analyses that included only a limited number of EC studies [10-12]. Its place among existing hemodynamic monitoring devices has therefore yet to be determined. Our meta-analysis focuses exclusively on EC, for definitive validation of accuracy and precision in both adults and pediatrics.

Objective
We conducted a systematic review to assess the accuracy and precision of CO measurement by EC compared to a reference method, in both adults and pediatrics. The primary outcome measures were (i) accuracy, defined as the bias between the CO measured by EC and the reference methods, (ii) precision, defined as the standard deviation (SD) of the bias, (iii) the limits of agreement (LoA) defined as [bias ± 1.96*SD], and (iv) the mean percentage error (MPE) derived from the SD and mean CO. A pooled MPE of less than 30% was considered clinically acceptable, as described by Critchley and Critchley [20].
... and systole (right) explaining the difference in thoracic impedance. Figure reproduced from Osypka Medical GmbH, an introduction to Electrical Cardiometry [19]

Methods
This systematic review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) approach (see Table 5 in Appendix 1) [21].

Eligibility criteria
Eligibility criteria were (1) studies comparing CO measurement by EC and a reference method, (2) studies using Bland-Altman analysis to report bias, SD of the bias and MPE or for which those data could be extracted [22], (3) studies performed in humans and (4) studies published as a full paper in English. Studies involving participants of any age and under any clinical circumstances were included. No restriction in publication date was applied.

Information sources and search
Two independent investigators (MS and SS) performed an electronic database search of PubMed, Embase, Web of Science and the Cochrane Library of Clinical Trials. The last date of search was January 4, 2019. Studies that were not published as full journal articles (e.g. letters, editorials, conference papers) and retracted publications were excluded. The search strategy conducted in PubMed is shown in Appendix 2. The search strategies for the other databases were comparable and are available on request. The manufacturer of ICON®/Aesculon® (Osypka Medical GmbH, Berlin, Germany) and the manufacturer's website were consulted to identify additional studies. The reference lists of all included studies were screened for additional studies. EndNote® software, version X8.1 (Thomson Reuters, New York, USA) was used to organize all articles and to filter out duplicates between databases.

Study selection
Two independent investigators (MS and SS) identified the potentially relevant studies. The first selection was based on title and abstract. The remaining full-text articles were reviewed for eligibility. Each included article was then categorized as concerning adult or pediatric patients. Conflicts were resolved by consensus or after consultation with the third investigator (CS). The flow diagram of this study selection process is shown in Fig. 2.

Data collection process
A customized data form was developed by three investigators (MS, SS and CS), using Microsoft Excel (Microsoft Office, Washington, USA). The data extraction form was pilot-tested on five randomly-selected included studies and refined. Data were extracted independently by two investigators (MS and SS). Patient characteristics, clinical setting, age, reference method and device, number of patients, total number of measurements, and financial support were considered relevant (Tables 1, 2). For the statistical analyses we extracted mean CO, CO range, bias, SD of the bias, LoA and MPE (See Tables 6, 7 in Appendix 3, 4). Precision of the reference and tested method and assessment of trending ability were added to the data extraction form after the pilot-test. Disagreements in data extraction were resolved by consensus or by consultation of CS.
Mean CO, bias, LoA, SD, MPE and precision of the reference or tested method were defined according to the following equations: Missing information was calculated using the equations above. If the data could not be calculated, they were extracted from the Bland-Altman plot. If neither option could be applied, the authors were contacted. Duplicate publication of data was assessed by comparing author names, reference methods, sample sizes, outcome measures (mean CO, bias, MPE) and data points in Bland-Altman plots.
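The recalculation of missing values described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' own code: it assumes only the definitions stated in this review (LoA = bias ± 1.96*SD, and MPE derived from the SD and mean CO in the form 1.96*SD / mean CO, as proposed by Critchley and Critchley [20]). The function names are hypothetical, and the mean CO of 5.0 L min−1 in the last line is an invented example value.

```python
Z = 1.96  # multiplier for 95% limits of agreement


def limits_of_agreement(bias, sd):
    """Return (lower, upper) limits of agreement for a given bias and SD."""
    return bias - Z * sd, bias + Z * sd


def bias_and_sd_from_loa(loa_lower, loa_upper):
    """Recover bias and SD when a study reports only the limits of agreement."""
    bias = (loa_lower + loa_upper) / 2.0
    sd = (loa_upper - loa_lower) / (2.0 * Z)
    return bias, sd


def mean_percentage_error(sd, mean_co):
    """MPE in percent, assuming MPE = 1.96 * SD / mean CO * 100."""
    if mean_co <= 0:
        raise ValueError("mean CO must be positive")
    return 100.0 * Z * sd / mean_co


# Pooled adult LoA reported in this review: -2.78 to 2.84 L/min
bias, sd = bias_and_sd_from_loa(-2.78, 2.84)

# Hypothetical study: SD of the bias 1.0 L/min, mean CO 5.0 L/min
example_mpe = mean_percentage_error(sd=1.0, mean_co=5.0)
```

Applied to the pooled adult LoA, the midpoint reproduces the reported pooled bias of 0.03 L min−1. The hypothetical study would have an MPE above the 30% threshold.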

Risk of bias assessment in individual studies
To assess the risk of bias for individual studies we used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) guidelines [23]. The original QUADAS-2 tool consists of four domains: patient selection, index test, reference test, and flow and timing. Signalling questions are used to assess the risk of bias in each domain. The first three domains are also assessed in terms of concerns about applicability. Kim et al. modified these guidelines to make them more suitable for method-comparison studies [24]. We modified Kim's QUADAS-2 tool, pilot-tested it on five randomly selected included studies and refined it accordingly. After the pilot test, we developed a fifth domain to assess the statistical analysis and implemented the recommendations of Cecconi [25]. The modified QUADAS-2 tool is available in Table 8 in Appendix 5. MS and SS independently assessed the risk of bias. Conflicts were resolved by consensus or by consultation of CS.

Summary measures
The primary outcome measures were (i) accuracy, defined as the bias between the CO measured by EC and the reference methods, (ii) precision, defined as the SD of the bias, (iii) the LoA and (iv) the MPE. A pooled MPE of less than 30% was considered clinically acceptable, as described by Critchley and Critchley [20].

Synthesis of results
Pooled bias, LoA and MPE for both adults and pediatrics were calculated using a random-effects model, as heterogeneity could be present, and forest plots were created. The weight given to the results of the independent studies was determined according to the inverse variance method. Inter-study heterogeneity was calculated using a Q test and described as an I² index (0% no heterogeneity, 25% low heterogeneity, 50% moderate heterogeneity, 75% high heterogeneity) [26]. If an individual study led to multiple outcome measures for bias, LoA and MPE, the outcomes of those studies were presented in different rows in the forest plot.
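The pooling approach described above can be illustrated with a short Python sketch. The review states only that a random-effects model with inverse-variance weights, a Q test and the I² index were used. The DerSimonian-Laird estimator for the between-study variance used below is an assumption, as are the function name and the input values. Each study contributes an effect (for example its bias) and the variance of that effect (for example SD²/n).

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects with inverse-variance weights (DerSimonian-Laird).

    effects   : per-study estimates, e.g. the bias in L/min
    variances : per-study variances of those estimates
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the I^2 index
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance tau^2 (DerSimonian-Laird, an assumption here)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate and 95% confidence interval
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Two hypothetical studies that disagree strongly
result = random_effects_pool(effects=[0.0, 1.0], variances=[0.01, 0.01])
```

In this invented example the two studies disagree far more than their within-study variances would explain, so I² is close to 100% and the confidence interval of the pooled estimate is much wider than either study alone would suggest.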

Subgroup analyses
Subgroup analyses of the gold standard thermodilution (TD) in adults and the most commonly used method TTE in pediatrics were pre-specified for definitive validation of EC. For adults, we distinguished between intermittent TD and continuous TD, as continuous TD averages CO over a longer time period. This led to the subgroups intermittent TD, continuous TD and other reference method. For pediatrics we distinguished between children and neonates, which led to the subgroups TTE children, TTE neonates and other reference method children. A test for subgroup differences was applied. A subgroup analysis for clinical setting in adults was conducted post hoc.

Risk of publication bias across studies
Risk of publication bias across studies was assessed for both adults and pediatrics using funnel plots, showing the bias versus its standard error. The symmetry of the funnel plots was assessed visually and by Egger's regression test using a significance level of 0.1 [27]. The statistical analyses were conducted using R, version 3.4.2 (R Foundation for Statistical Computing, Vienna, Austria), RStudio (RStudio, Inc., Boston, USA) and SPSS Statistics, version 25.0 (IBM Business Analytics, New York, USA). The layout of the forest and funnel plots was customized using Adobe Photoshop CS4 (Adobe Systems, California, USA).
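As an illustration of the funnel-plot asymmetry test mentioned above, the following Python sketch implements the classical form of Egger's regression: the standardized effect (effect divided by its standard error) is regressed on the precision (1 divided by the standard error), and an intercept that differs from zero suggests asymmetry. This is a simplified sketch using ordinary least squares and not the routine used in the review. The function name and the input data are hypothetical.

```python
import math


def egger_regression(effects, standard_errors):
    """Egger's regression: standardized effect regressed on precision.

    Returns the intercept, its standard error, the t statistic and the
    degrees of freedom. A p-value would follow from a t distribution with
    n - 2 degrees of freedom (not computed here).
    """
    n = len(effects)
    if n < 3 or n != len(standard_errors):
        raise ValueError("need at least three studies with matching SEs")

    x = [1.0 / se for se in standard_errors]                 # precision
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardized effect

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance and standard error of the intercept
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": n - 2}


# Hypothetical studies in which small studies report larger effects
egger = egger_regression(
    effects=[2.6, 1.45, 0.975, 0.92],
    standard_errors=[1.0, 0.5, 0.25, 0.2],
)
```

In the invented data, the least precise study reports the largest effect, which yields a clearly positive intercept. With real data, the t statistic would be compared against the chosen significance level (0.1 in this review).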

Contacting authors
We contacted three authors concerning the direction of the bias (reference-tested method or tested-reference method) [28,40,47]. One of them responded [47] and for the other two studies we interpreted the direction of the bias ourselves [28,40]. We contacted one author concerning the mean CO and MPE [67]. As the mean CO was not described in the manuscript and could not be extracted from the Bland-Altman plot, the MPE could not be calculated. The author did not respond and therefore the study was excluded.

Risk of bias in individual studies
The assessment of the risk of bias for adult studies is provided in Table 3 and for pediatrics in Table 4. The majority of the included studies were judged to be at low risk of bias with respect to patient selection, tested method, reference method, and flow and timing. For six studies, potential for bias existed in more than one of those four domains, but these studies were still considered low risk [30,33,34,37,38,47]. Concerning the statistical analysis domain, all studies were judged high risk, except for two [46,49]. Concerns about applicability were judged low for all studies; these are not shown in Tables 3 and 4.

Synthesis of results, adults
The pooled results for the adult studies are shown in Fig. 3. The overall random-effects pooled bias was 0.03 L min−1 [95% CI − 0.23; 0.29], LoA − 2.78 to 2.84 L min−1 and MPE 48.0%. Inter-study heterogeneity was high (I² = 93%, p < 0.0001). For two studies, data from the same patients are presented in two or three different rows in the forest plot, as those studies presented multiple outcome measures for different clinical circumstances [30,34]. Therefore, the number of patients in the forest plot for adults (N = 667) differs from the actual number of adult patients (N = 620). Heterogeneity was high (I² = 96%, p < 0.0001). There was no statistically significant difference in subgroup effects (p = 0.82). The study by Mekis et al. was conducted during cardiac surgery and in the ICU [33]. Therefore, we divided the data of this study into three rows, namely before cardiac surgery, immediately after cardiac surgery and in the ICU. As three rows in the subgroup analysis for clinical setting replace one row in the forest plot for adults (Fig. 3), the number of patients and pooled data presented in the subgroup analysis for clinical setting differ slightly from the actual pooled data presented in the forest plot for adults.

Risk of publication bias across studies
Funnel plots for detecting the risk of bias across studies are shown in Figs. 5 and 6. Egger's regression test was not significant for either adults (p = 0.4147) or pediatrics (p = 0.6572), indicating a low risk of publication bias [27]. However, asymmetry could be detected for both groups, which could be caused by publication bias or high heterogeneity. The latter is the most likely explanation, but publication bias cannot be excluded.

Trending ability
Seven of the thirteen studies in adults assessed trending ability, applying several statistical analyses [28-31, 33, 34, 39]. Magliocca et al. and Wang et al. analysed trending ability using a 4-quadrant plot, showing concordance rates of 100% and 56.5%, respectively [31,39]. Other statistical methods were a time plot [29], a receiver operating characteristic curve [28], and descriptive analyses of changes in CO for the whole study population [33,34] or for individuals [30].
None of the studies in pediatrics evaluated trending. Due to a lack of agreement on the statistical methodology, no pooled results can be calculated. Although the pooled bias in both adult and pediatric studies was close to zero, high accuracy cannot be assumed, as the range of the bias in the studies was wide. The direction of the bias (positive or negative) is inconsistent and cannot be predicted in the clinical setting, which corresponds with the high inter-study heterogeneity. The pooled MPE in all subgroups was above the recommended 30% [20]. Therefore, EC cannot replace TD and TTE for the measurement of absolute CO values.

Summary of evidence
The ICON® and Aesculon® monitors were included in three other meta-analyses [10-12]. Importantly, the data of the three other meta-analyses are the result of subgroup analyses for TEB, including EC but also other devices based on other algorithms. Therefore, no conclusions may be drawn for EC only. Peyton and Chong found a bias of − 0.10 L   [11]. We found a similar bias, but could not confirm the low MPE. In contrast to the above-mentioned reviews, our results are derived from EC studies only. Furthermore, subgroup analyses of the gold standard in adults (TD) and the most commonly used technique in pediatrics (TTE) were applied in our meta-analysis. This leads to definitive validation of EC compared to these methods. In addition, our meta-analysis includes more studies, and therefore more patients and more clinical settings, than previous meta-analyses. In both numbers and diversity, our study therefore adds to the existing evidence on this topic.
When compared to other minimally or non-invasive techniques used in clinical practice, most devices show an MPE of more than 30% [10,12,68-76]. Therefore, Peyton and Chong have suggested changing the acceptable MPE to 45%, ensuring a higher rate of agreement in new methods [12]. MPE is determined by the reference and tested method and is highly influenced by the clinical condition. The lowest bias and MPE are found in validation studies during cardiac surgery [68,77,78]. The worst results are found during sepsis and septic shock, as the bias of most non-invasive devices is negatively influenced by a low systemic vascular resistance (SVR) [68,74,75,79-81]. Which device should be the reference method, and under which clinical condition the validation needs to be performed, remains a subject of discussion.
The subgroup analysis for reference method in adults (Fig. 7 in Appendix 7) showed a relatively high MPE (53.5%) for intermittent TD and a relatively low MPE (31.1%) for continuous TD. The high MPE for intermittent TD can be explained by the high MPE of the included studies. As the subgroup continuous TD consists of only two studies, the low MPE can be explained by the extremely low MPE (4.7%) of one included study [29].
The subgroup analysis for clinical setting in adults (Fig. 8 in Appendix 8) showed a low bias (0.01 L min−1) and a relatively low MPE (33.3%) during cardiac surgery, probably due to the hypodynamic status with low CO and high SVR. The studies in this subgroup showed a mean CO of 4.1 ± 0.2 L min−1. The other included adult studies showed a statistically significantly higher (p < 0.05) mean CO of 6.3 ± 1.7 L min−1. The OR subgroup, consisting of two studies during liver transplantation [31,39], showed a relatively high bias (1.00 L min−1) and a high MPE (67.7%). This could be explained by the hyperdynamic status (high CO and low SVR) that is often seen during these procedures [31,68]. The patient characteristics in the ICU subgroup differed too much to draw conclusions for this subgroup, as it concerned post cardiac surgery patients [30,33], patients suffering from systemic inflammatory response syndrome or sepsis post-surgery [35] or critically ill patients post-surgery [40] (Table 1). The same applies to the studies included in the other clinical setting subgroup, which concerned pregnant women [32], hemodynamically stable cardiac patients [37,38] or partly took place during exercise or NO inhalation [34] (Table 1).
The results for the subgroup TTE children were comparable to the pooled results for pediatric studies. The subgroup TTE neonates showed a relatively low MPE (35.1%) (Fig. 9 in Appendix 9).
Although a subgroup analysis for clinical setting in adults was performed post hoc, we decided not to perform the same subgroup analysis in pediatric studies, as the clinical settings differed too much (Table 2), which would lead to very small subgroups. No subgroup analyses for age were performed, as the age range was too wide in the individual adult and pediatric studies (Tables 1, 2).

Recommendations for clinicians
EC cannot replace TD and TTE for the measurement of absolute CO values. However, as the MPE is comparable to that of clinically used minimally or non-invasive hemodynamic monitors, EC could complement monitoring in the ICU and NICU, providing continuous monitoring that is relevant for goal-directed therapy and clinical decision-making. This should be further investigated. In the OR, monopolar electrocauterization interferes with the EC measurement [82]. Bipolar electrocauterization does not.

Limitations
This study has multiple limitations. Firstly, population selection bias could be present. Most studies took place in a cardiac surgical setting [28-30, 33, 36, 44]. Although hemodynamic instability can be present, cardiac surgery is characterized by low CO and high SVR [68,77,78], which could explain the low bias and relatively low MPE in the cardiac surgery subgroup. The low bias and MPE influence the pooled data in adults.
Another limitation is the use of LoA and MPE as outcome measures. Both are influenced by the error of the reference method. All reference methods have their own inherent error and do not provide an accurate and precise measurement of CO. For example, Stetz et al. reported a precision of 13% for different TD devices [83]. Slagt et al. showed a precision of 6.7% for TPTD [81]. For intermittent PAC, precisions of 6.4% [84] Table 6 in Appendix 3) [38]. Critchley and Critchley showed that the MPE depends on the precision of both the reference and the tested method, according to the following equation [20]:
$$\mathrm{MPE} = \sqrt{\mathrm{precision}_{\mathrm{reference}}^{2} + \mathrm{precision}_{\mathrm{tested}}^{2}} \qquad (7)$$
To draw conclusions from the MPE concerning the precision of the tested method, Cecconi recommends measuring the precision of the reference method within the study using repeated measurements and according to the following equation: The precision of the tested method can then be calculated, according to Eq. (7) [25]. Hapfelmeier showed that Eq. (7) is not completely true, as the overall precision and MPE depend on the method's variability about the true values as well [87]. In spite of its inaccuracy, Eq. (7) indicates that the precision of both the reference and the tested method influences the MPE, and both should therefore be calculated for proper interpretation of the LoA and MPE. Only a few studies measured both (Tables 3, 4) [28,38,40,49].
In addition to the latter described limitation, the different reference methods should be described as another limitation. It is questionable whether the included studies, based on different reference methods, are comparable. This could be an explanation for the high inter-study heterogeneity found in our review. Therefore, we applied subgroup analyses of the gold standard TD in adults and most commonly used technique TTE in pediatrics. The results of the subgroup analyses are discussed earlier. Inter-study heterogeneity decreased, but remained high. The subgroup TTE in neonates showed no heterogeneity (I 2 = 0%), as the two included studies showed comparable results.
To assess the statistical analysis in the included studies, we developed an additional domain for the modified QUADAS-2 tool. This has not been done previously. The risk of bias in individual studies was high in the statistical analysis domain (Tables 3, 4), which is a limitation of this review too. First, in some studies, the direction of the bias was unclear [28,29,40,44,47]. Second, the SD described in the manuscript did not correspond with the LoA in the figure [28,29,43]. Third, the recalculated MPE differed from the value presented in five studies [29,37,43,44,50]. For those studies, the differences in MPE (defined as recalculated MPE − presented MPE) were 1.1% [29], 2.9% [37], 26.8% [43], 58.4% [44] and − 5.1% [50] (see Tables 6, 7 in Appendix 3, 4). In many cases, the MPE could not be recalculated [30, 31, 33-35, 38, 40, 41, 45, 47-49]. Fourth, the Bland-Altman analysis may only be applied to independent observations. In case of multiple observations per individual and in the absence of major hemodynamic changes, a modification of the Bland-Altman analysis for repeated measurements should be applied [88-90]. Many of the included studies used multiple observations per individual, but did not apply the modified Bland-Altman analysis [28, 32-34, 37-39, 43, 45, 50, 51]. This can lead to narrower LoA and a lower MPE in the individual studies [88,89]. Lastly, only a few studies assessed the precision of both the reference and the tested method [28,38,40,49], as discussed earlier. Overall, the high risk of bias in the statistical domain makes the pooled data in this review less reliable. In addition, for two adult studies, data from the same patients are presented in two or three different rows in the forest plot, as those studies presented multiple outcome measures for different clinical circumstances [30,34]. As the clinical conditions at the measurement points differ, the data can be considered independent. Therefore, it is statistically justified to assess these data separately.
Furthermore, some studies were excluded from our meta-analysis because they assessed cardiac index or stroke volume, or presented CO as mL kg−1 min−1 instead of L min−1 [52-56, 58-60, 62-66]. These studies could have contributed to our results.

Trending ability
Monitoring changes in CO is relevant in clinical practice to measure the effect of an intervention. Despite its inability to measure absolute CO values, which is assessed by the Bland-Altman analysis, EC could still be applicable as a trend monitor. To achieve acceptable trending ability, good precision is required, independent of the accuracy [91]. Different methods for the assessment of trending ability have been described, of which the four-quadrant plot and the polar plot are recommended [92-94]. Seven of the thirteen studies in adults assessed trending ability, applying several statistical analyses [28-31, 33, 34, 39]. None of the studies in pediatrics evaluated trending. Due to a lack of agreement on the statistical methodology, it is difficult to compare results and draw conclusions, which is a limitation of this review.
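To show what a four-quadrant analysis computes, the following Python sketch derives a concordance rate from paired changes in CO. It is a simplified illustration under stated assumptions: pairs are excluded when the change measured by the reference method is smaller than an exclusion zone (0.5 L min−1 here, chosen for illustration only), and a pair counts as concordant when both methods change in the same direction. The function name and the data are hypothetical, and the included studies may have used other exclusion rules.

```python
def concordance_rate(delta_reference, delta_tested, exclusion_zone=0.5):
    """Percentage of paired CO changes in which both methods agree on direction.

    delta_reference, delta_tested : changes in CO (L/min) between two
        consecutive measurements, per method
    exclusion_zone : pairs with an absolute reference change below this
        value are ignored (illustrative assumption)
    """
    if len(delta_reference) != len(delta_tested):
        raise ValueError("both methods need the same number of changes")

    kept = [
        (ref, test)
        for ref, test in zip(delta_reference, delta_tested)
        if abs(ref) >= exclusion_zone
    ]
    if not kept:
        raise ValueError("no pairs left outside the exclusion zone")

    # Same sign for both changes means the pair lies in a concordant quadrant
    concordant = sum(1 for ref, test in kept if ref * test > 0)
    return 100.0 * concordant / len(kept)


# Hypothetical paired changes in CO (L/min)
rate = concordance_rate(
    delta_reference=[1.0, -1.2, 0.8, -0.9, 0.1],
    delta_tested=[0.7, -0.5, -0.3, -1.1, -0.4],
)
```

In this invented example the last pair falls inside the exclusion zone, and three of the remaining four pairs change in the same direction.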

Future research
Our study focuses on the ICON®/Aesculon® monitor for evaluating EC. The ICON®/Aesculon® monitor is a device in development, and future research should clarify its place among existing hemodynamic monitoring devices. The high risk of bias in the statistical analysis domain of the modified QUADAS-2 tool emphasizes the lack of consensus on how to present data in validation studies, despite the fact that good proposals have been published [20,25,87,91]. Consensus is required to interpret the results of different studies and draw conclusions. Future validation studies of EC should also focus on trending ability [92-94]. Combined with studies on the applicability of EC for continuous CO monitoring and goal-directed therapy, this will provide useful clinical advice.

Conclusion
This meta-analysis of 24 studies, which assesses the accuracy and precision of non-invasive CO measurement by EC compared to a reference method, shows a pooled bias of 0.03 L min−1 [95% CI − 0.23; 0.29], LoA of − 2.78 to 2.84 L min−1 and MPE of 48.0% in adult studies. In pediatric studies the pooled bias was − 0.02 L min−1 [95% CI − 0.09; 0.05], LoA − 1.22 to 1.18 L min−1 and MPE 42.0%. Inter-study heterogeneity was high for both adults (I² = 93%, p < 0.0001) and pediatrics (I² = 86%, p < 0.0001). Despite the low bias in both adults and pediatrics, the pooled MPE in both groups was above the recommended 30%. Therefore, EC cannot replace TD and TTE for the measurement of absolute CO values. The trending ability of EC could not be assessed in this meta-analysis, due to a lack of agreement on the statistical methodology in the included studies. EC might therefore still be applicable as a trend monitor to measure acute changes in CO, which is relevant for clinical decision-making. This should be an important part of future research, especially as EC is safe and easy to apply.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix 1
See Table 5.
Specify study characteristics (e.g., PICOS, length of follow-up) and report characteristics (e.g., years considered, language, publication status) used as criteria for eligibility, giving rationale
Information sources 7. Describe all information sources (e.g., databases with dates of coverage, contact with study authors to identify additional studies) in the search and date last searched
Search 8. Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated
Study selection 9. State the process for selecting studies (i.e., screening, eligibility, included in systematic review, and, if applicable, included in the meta-analysis)
Data collection process 10. Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators