The International Journal of Cardiovascular Imaging, Volume 27, Issue 4, pp 563–569

A comparison of visual and quantitative assessment of left ventricular ejection fraction by cardiac magnetic resonance

Authors

  • C. J. Holloway
    • Department of Physiology, Anatomy and Genetics, University of Oxford
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
  • Lindsay M. Edwards
    • School of Medicine, University of Tasmania
  • Oliver J. Rider
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
  • Angela Fast
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
  • Kieran Clarke
    • Department of Physiology, Anatomy and Genetics, University of Oxford
  • Jane M. Francis
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
  • Saul G. Myerson
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
  • Stefan Neubauer
    • University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR), Level 0, John Radcliffe Hospital
Original Paper

DOI: 10.1007/s10554-010-9706-0

Cite this article as:
Holloway, C.J., Edwards, L.M., Rider, O.J. et al. Int J Cardiovasc Imaging (2011) 27: 563. doi:10.1007/s10554-010-9706-0

Abstract

We aimed to determine the accuracy of visual analysis of left ventricular (LV) function in comparison with the accepted quantitative gold standard method, cardiac magnetic resonance (CMR). Cine CMR imaging was performed at 1.5 T on 44 patients with a range of ejection fractions (EF, 5–80%). Clinicians (n = 18) were asked to visually assess EF after sequentially being shown cine images of a four chamber (horizontal long axis; HLA), two chamber (vertical long axis; VLA) and a short axis stack (SAS), and results were compared to a commercially available analysis package. There were strong correlations between visual and quantitative assessment. However, the EF was underestimated in all categories (by 8.4% for HLA, 8.4% for HLA + VLA and 7.9% for HLA + VLA + SAS, all P < 0.01) and particularly underestimated in mild LV impairment (17.4%, P < 0.01), less so for moderate (4.9%) and not for severe impairment (1%). Assessing more than one view of the heart improved visual assessment of LV EF; however, clinicians underestimated EF by 8.4% on average, with particular inaccuracy in those with mild dysfunction. Given the important clinical information provided by LV assessment, quantitative analysis is recommended for accurate assessment.

Keywords

Cardiac magnetic resonance, Left ventricular function, Left ventricular analyses

Abbreviations

EF: Ejection fraction
HLA: Horizontal long axis
SA: Short axis
LV: Left ventricle
VLA: Vertical long axis
CMR: Cardiac magnetic resonance

Introduction

Assessment of left ventricular function is one of the most common measurements in cardiology and critically important for the evaluation and management of patients with heart disease. Left ventricular ejection fraction (LVEF) is the clinically accepted standard for measuring left ventricular function. Impaired LVEF is not only associated with symptoms and prognosis, but also guides therapies, including implantation of cardiac devices, such as biventricular pacing systems and implantable cardioverter defibrillators [1–3].

Clinical methods to formally quantify LVEF have included echocardiography, nuclear imaging, cardiac computed tomography, angiography and cardiac magnetic resonance (CMR), with the latter now being considered the gold standard [4–6]. Traditional 2D echocardiographic methods for the determination of LVEF can be limited by technical factors, including poor image resolution and associated difficulties of endocardial contouring, limiting the reproducibility of the technique. CMR allows 3D image acquisition, which obviates the need for geometric assumptions necessary for the estimation of ejection fraction with non-contrast 2D echocardiography. Although providing what is now considered to be the most reliable and reproducible method of determining ejection fraction, CMR analysis requires time-consuming manual post-processing. Visual assessment of cardiac images is a quick method of determining LVEF and is widely used with all imaging modalities in cardiology, though the accuracy of this subjective method has been questioned [7–9]. Therefore, we aimed to determine the reliability of visual (eyeball) assessment of LVEF, compared to formal quantitative analysis. Furthermore, we aimed to determine whether multiple views of the left ventricle or varying clinical experience affect the accuracy of “eyeball” assessment of LVEF.

Methods

Study design

Cine CMR images were obtained from 44 clinical subjects with a range of ejection fractions, from severely impaired to supra-normal (5–80%). All clinicians (n = 18) had been clinically active in cardiology practice for at least 5 years, with varying degrees of CMR experience (from 1 to 144 months). Clinicians were further categorised into those who had greater than 2 years of CMR experience (n = 7) and those with less than 2 years of experience (n = 11). Subjects with normal hearts (n = 16), ischaemic cardiomyopathy (n = 11) and non-ischaemic cardiomyopathy (n = 17) were presented in random order. For each subject, a three-stage series of cine images was presented. Firstly, ten cardiac cycles of a four chamber plane (horizontal long axis; HLA) were presented and LVEF was estimated from visual assessment. Subsequently, ten cycles of a two chamber plane (vertical long axis; VLA) were presented from the same subject and LVEF was estimated using knowledge of both images. Finally, short axis stack (SAS) images were presented, with two cardiac cycles per slice. Clinicians were asked to visually assess the LVEF after each set of images, using previous left ventricular views to determine the incremental value of the additional views.

CMR image acquisition

Left ventricular volumes and function were assessed using CMR on a 1.5 T Siemens Sonata clinical scanner (Erlangen, Germany). With the subjects in a supine position, pilot images were acquired, followed by horizontal and vertical long axis cine images. A stack of steady-state free precession (SSFP) short axis cine images was subsequently obtained using breath hold and cardiac gating, as previously described [10]. The short axis images were obtained from the base to the apex, at 1 cm intervals (7 mm slice with a 3 mm gap).

Image analyses

For planimetry, left ventricular endocardial surfaces of the short axis cine images were manually contoured in end-diastole and end-systole using Argus® processing (Siemens Medical Solutions, Erlangen, Germany). Left ventricular volumes and ejection fractions were measured by two CMR clinicians with over 6 years of experience, and the average of the two calculated LVEF values was taken as the reference value. There was excellent agreement using this quantitative planimetry processing, with a correlation of 0.99 (P < 0.001) between clinician assessments and an average difference of ejection fraction per subject of 1 ± 0.4%.
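As a rough illustration of how planimetry leads to an ejection fraction, the sketch below sums contoured endocardial areas across the short axis stack, treating each slice as representing 1 cm of ventricle (7 mm slice plus 3 mm gap, as described above), and then applies the standard definition EF = (EDV − ESV) / EDV × 100. The function names and the area values are hypothetical, and this is only a sketch under those assumptions, not a description of how the Argus package works internally.

```python
def stack_volume_ml(areas_cm2, slice_spacing_cm=1.0):
    """Approximate a cavity volume by summing slice area x slice spacing.

    1 cm^3 equals 1 ml, so areas in cm^2 and spacing in cm give ml.
    """
    return sum(area * slice_spacing_cm for area in areas_cm2)


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction (%) from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Hypothetical endocardial areas (cm^2), base to apex, for one subject.
ed_areas = [20.0, 22.0, 21.0, 18.0, 14.0, 10.0, 5.0]  # end-diastole
es_areas = [10.0, 10.0, 9.0, 7.0, 5.0, 3.0, 1.0]      # end-systole

edv = stack_volume_ml(ed_areas)  # 110.0 ml
esv = stack_volume_ml(es_areas)  # 45.0 ml
lvef = ejection_fraction(edv, esv)
print(f"EDV = {edv} ml, ESV = {esv} ml, LVEF = {lvef:.1f}%")
```

In the study, the reference value for each subject was the average of the LVEF values obtained by the two clinicians; averaging two such results would be a one-line extension of this sketch.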

Statistics

Agreement was assessed in two ways. Where the range of data was the same across a comparison (for example, comparing HLA to HLA + VLA), a Pearson’s correlation coefficient was used as an index of each clinician’s performance. In addition, and in all cases where range would be a confounding factor (for example, comparing disease states), the analysis suggested by Bland and Altman was used [11]. Thus, for each clinician, the mean standard deviation of the differences between eyeball estimates and planimetry values was taken as a measure of error. The mean of the differences between eyeball estimates and planimetry values was taken as a measure of bias.
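The three per-clinician indices described above (correlation, bias and error) can be sketched as follows. The data are hypothetical values for a single clinician, the function names are illustrative, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bias_and_error(eyeball, planimetry):
    """Bland-Altman style summary of the paired differences.

    bias  = mean of (eyeball - planimetry); negative means underestimation
    error = sample standard deviation of those differences
    """
    diffs = [e - p for e, p in zip(eyeball, planimetry)]
    return mean(diffs), stdev(diffs)


# Hypothetical LVEF values (%) for six subjects read by one clinician.
planimetry = [70.0, 60.0, 50.0, 40.0, 30.0, 20.0]
eyeball = [60.0, 50.0, 35.0, 35.0, 30.0, 20.0]

r = pearson_r(eyeball, planimetry)
bias, error = bias_and_error(eyeball, planimetry)
print(f"r = {r:.2f}, bias = {bias:.1f}%, error = {error:.1f}%")
```

Averaging these per-clinician values across all clinicians would then give group-level summaries of correlation, error and bias.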

The distributions of all these measures of agreement, error and bias were checked for normality using a Shapiro–Wilk test and all were normally distributed. Following this, repeated measures analysis of variance (ANOVA) was used to test for significant differences between multiple data sets; Bonferroni-adjusted t-tests were used to assess the significance of differences between means post hoc.
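One common form of the Bonferroni adjustment mentioned above multiplies each unadjusted p-value by the number of comparisons and caps the result at 1. The sketch below assumes that form; the p-values are hypothetical and are not taken from the study.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for the three pairwise comparisons
# between image sets.
raw_p = {
    "HLA vs HLA+VLA": 0.004,
    "HLA vs HLA+VLA+SAS": 0.001,
    "HLA+VLA vs HLA+VLA+SAS": 0.03,
}

adjusted = dict(zip(raw_p, bonferroni_adjust(list(raw_p.values()))))
significant = {name: p < 0.05 for name, p in adjusted.items()}
```

With three comparisons, an unadjusted p-value of 0.03 becomes 0.09 after adjustment and would no longer meet the P < 0.05 threshold.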

In addition to more conventional analysis, categorical analysis (as described by Altman in [12]) was used to examine clinicians’ performance when categorising the severity of LV systolic dysfunction (or when deciding whether LVEF was impaired at all). Categories of LV systolic dysfunction were defined as follows: normal = EF > 55%, mildly impaired = EF 45–54%, moderately impaired = EF 31–44% and severely impaired = EF < 31%, based on recognised guidelines [13].
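The category definitions above can be written as a small lookup. The text leaves an EF of exactly 55% (and non-integer values between the stated bands) unassigned, so the half-open boundaries used here are an assumption made for illustration, as is the function name.

```python
def lv_category(ef_percent):
    """Map an ejection fraction (%) to a severity category.

    Assumed boundaries: >= 55 normal, 45 to < 55 mild,
    31 to < 45 moderate, < 31 severe.
    """
    if ef_percent >= 55:
        return "normal"
    if ef_percent >= 45:
        return "mild"
    if ef_percent >= 31:
        return "moderate"
    return "severe"


# Hypothetical ejection fractions, one per category.
examples = [72.0, 50.0, 40.0, 25.0]
categories = [lv_category(ef) for ef in examples]
# categories == ["normal", "mild", "moderate", "severe"]
```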

For the category analysis, results describe clinicians’ performance using the maximum available number of images (HLA + VLA + SAS). The following definitions were used: positive = LV systolic dysfunction (EF < 55%); sensitivity = the proportion of true positives (i.e. subjects with LV systolic dysfunction) that were correctly identified by the test; specificity = the proportion of true negatives (i.e. subjects with normal LVEF) that were correctly identified by the test; positive predictive value (PV+) = the proportion of patients with positive results (i.e. patients diagnosed as having impaired LVEF) who were correctly diagnosed; negative predictive value (PV−) = the proportion of patients with negative results (i.e. subjects diagnosed as having normal LVEF) who were correctly diagnosed.
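The definitions above translate directly into counts of true and false positives and negatives. The following sketch treats the eyeball estimate as the test and planimetry as the reference, with "positive" meaning EF < 55%. The function names and the paired values are hypothetical and are not the study data.

```python
def confusion_counts(eyeball, planimetry, threshold=55.0):
    """Count TP, FP, TN, FN, where positive means EF below the threshold."""
    tp = fp = tn = fn = 0
    for test_ef, ref_ef in zip(eyeball, planimetry):
        test_pos = test_ef < threshold
        ref_pos = ref_ef < threshold
        if test_pos and ref_pos:
            tp += 1
        elif test_pos and not ref_pos:
            fp += 1
        elif not test_pos and not ref_pos:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and predictive values as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical paired LVEF values (%).
planimetry = [70.0, 60.0, 58.0, 50.0, 40.0, 20.0]
eyeball = [60.0, 50.0, 56.0, 35.0, 35.0, 20.0]

counts = confusion_counts(eyeball, planimetry)  # (3, 1, 2, 0)
metrics = diagnostic_metrics(*counts)
```

In this toy example, one subject with a normal reference LVEF (60%) is read as impaired (50%), which lowers specificity and positive predictive value while sensitivity and negative predictive value stay at 1.0.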

Data were processed using Microsoft Excel and SPSS software version 16.0 for Mac (SPSS Inc., Chicago, IL, USA). Confidence intervals for categories were calculated using CIA (www.som.soton.ac.uk/cia). Data are presented as mean ± standard error. Significance was accepted whenever P < 0.05.

Results

Comparison of eyeball assessment with planimetry

Visual assessment of LVEF and planimetry-derived values agreed well (Table 1; Fig. 1), showing strong correlations (HLA alone: 0.89, HLA + VLA: 0.91 and HLA + VLA + SAS: 0.93, all P < 0.01, Fig. 2). Correlation coefficients became progressively greater as more images were included in the visual analysis (P < 0.01).
Table 1

Left ventricular ejection fraction (%) calculated by planimetry and visual assessment

                               Planimetry    HLA            HLA + VLA      HLA + VLA + SA
All subjects (n = 44)          52.3 ± 3.5    43.9 ± 2.7*    43.9 ± 2.9*    44.4 ± 2.9*
Categorised by EF
 Normal (EF > 55%, n = 23)     72.3 ± 1.5    59.0 ± 2.0     60.0 ± 2.0     61.0 ± 1.9
 Mild (EF 45–54%, n = 2)       51.0 ± 2.3    33.5 ± 3.4     34.3 ± 2.75    33.6 ± 3.3
 Moderate (EF 31–44%, n = 9)   36.9 ± 1.0    33.8 ± 0.8     31.8 ± 1.2     32.0 ± 1.0
 Severe (EF < 31%, n = 10)     20.8 ± 2.2    21.2 ± 2.3     19.8 ± 2.3     19.8 ± 2.1

* Different from planimetry at P < 0.01. See text for details of statistical methods used

Fig. 1

Mean correlation coefficients between visual (“eyeball”) assessment of left ventricular ejection fraction and planimetry-derived values. Bars show standard errors. HLA horizontal long axis, VLA vertical long axis and SAS short axis stack

Fig. 2

Scatter plots showing the means of physicians’ estimates of left ventricular ejection fraction (y-axis) versus planimetry-derived values on a patient-by-patient basis. The lines in the graphs are lines of unity. For details of abbreviations, see main text

These results were supported by the Bland–Altman analysis. Our index of clinical error (the mean standard deviation of the differences between eyeball estimates and planimetry values) progressively declined from HLA (11.0%) to HLA + VLA (9.6%) to HLA + VLA + SAS (9.0%), signifying less error with more images seen (P < 0.01). However, the Bland–Altman analysis revealed that overall there was significant bias, with clinicians consistently underestimating LVEF. Overall, the LVEF was underestimated by 8.4% for HLA, 8.4% for HLA + VLA and 7.9% for HLA + VLA + SAS, all P < 0.01 (Table 2). The corresponding Bland–Altman plots for HLA, HLA + VLA and HLA + VLA + SAS are displayed in Fig. 3.
Table 2

Correlation, error and bias between eyeball estimates of LVEF and planimetry-derived values

                                     Planimetry vs. HLA   Planimetry vs. HLA + VLA   Planimetry vs. HLA + VLA + SA
Mean correlation coefficient         0.89                 0.91                       0.93*
Mean SD of the differences (error)   11.0%                9.6%                       9.0%*
Mean difference (bias)               −8.4%**              −8.4%**                    −7.9%**

* Difference across three values significant at P < 0.01 when tested using a repeated-measures ANOVA

** Different from zero at P < 0.01

Fig. 3

Bland–Altman plots of the differences between eyeball assessment of left ventricular ejection fraction and planimetry-derived values. The plotted points are the differences between the means of all physicians’ estimates and the planimetry-derived values on a patient-by-patient basis. For details of abbreviations (HLA etc.) see main text. The solid lines are zero difference (heavy) and mean difference (light). The dashed lines are the lower and upper 95% confidence intervals of the differences between the means

There was no correlation between errors and mean values, nor did log transformation have any effect on the strength of the correlations between eyeball estimates and planimetry derived values. Thus, there was no evidence of heteroscedasticity in the data.

Bias by degree of LVEF impairment

Although LV ejection fraction was underestimated overall, this bias was reduced at lower ejection fractions, but not for normal hearts and mild degrees of dysfunction. Thus, there was a negative correlation between mean bias (the mean difference between eyeball and planimetry on a patient-by-patient basis) and planimetry-derived LVEF (HLA: r = 0.71, P < 0.01; HLA + VLA: r = 0.67, P < 0.01; HLA + VLA + SAS: r = 0.67, P < 0.01). Furthermore, and importantly, eyeball assessment was able to distinguish between normal, moderate and severe dysfunction, but not between mild and moderate dysfunction (with mean values between 32 and 34% for both categories on eyeball assessment).
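The size of the bias in each category can be recovered by simple subtraction from the group means reported in Table 1 (planimetry versus visual assessment with all three image sets). The short sketch below does this arithmetic; the variable names are illustrative, and the values are the Table 1 means.

```python
# Group means from Table 1: (planimetry, HLA + VLA + SA visual estimate), in %.
table1_means = {
    "all subjects": (52.3, 44.4),
    "normal": (72.3, 61.0),
    "mild": (51.0, 33.6),
    "moderate": (36.9, 32.0),
    "severe": (20.8, 19.8),
}

# Bias = visual estimate minus planimetry; negative means underestimation.
bias_by_group = {
    group: round(visual - planimetry, 1)
    for group, (planimetry, visual) in table1_means.items()
}
print(bias_by_group)
# {'all subjects': -7.9, 'normal': -11.3, 'mild': -17.4, 'moderate': -4.9, 'severe': -1.0}
```

These differences match the figures quoted in the abstract for mild (17.4%), moderate (4.9%) and severe (1%) impairment, and show that the mild and moderate groups both received mean visual estimates of 32–34%.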

The effect of experience

There were no significant correlations between either months of clinical cardiology (r = −0.18) or CMR experience (r = 0.04) and error or bias. Additionally, there was no difference in accuracy when physicians were categorised into those with less than, or greater than, 2 years of clinical CMR experience, with overall errors in visual EF estimation of 7.7 ± 1.3 and 8.4 ± 1.2%, respectively (P = 0.71).

Category analysis

Using the predetermined EF categories for mild/moderate/severe LV systolic dysfunction, the majority of those who were classified as having LV dysfunction on eyeball assessment were correctly diagnosed (positive predictive value = 72.7%, Table 3). Nearly all those diagnosed as having normal LVEF by clinicians using “eyeball” assessment were correctly classified (negative predictive value 97.1%). Thus, eyeball assessment was effective at ruling out LV systolic dysfunction, but a significant number of subjects with normal hearts were incorrectly diagnosed as having at least mild LV systolic dysfunction (specificity = 55.6%).
Table 3

Categorical analysis of physicians’ ability to class patients as having impaired cardiac function (EF < 55%)

                            %      95% CI
Sensitivity                 98.6   97.0–99.4
Specificity                 55.6   50.4–60.0
Positive predictive value   72.7   68.9–76.1
Negative predictive value   97.1   93.8–98.7

95% CI 95% confidence intervals, PV+ positive predictive value, PV− negative predictive value

Discussion

Our data show strong correlations between visual and quantitative assessment of left ventricular function. However, eyeball assessment consistently underestimated ejection fraction, with the bias most pronounced in those with normal or mildly impaired LV function. More importantly, we also highlighted the difficulty of distinguishing between mild and moderate dysfunction using visual analysis. These results have important clinical implications when evaluating left ventricular function.

Accurate and reproducible assessment of left ventricular function remains fundamentally important in clinical decision making. CMR has advantages over 2D and 3D echocardiography, providing clear endocardial visualisation for analyses of ventricular volumes, mass and function. However, in our study, clinicians consistently underestimated left ventricular function in patients with mild to moderately impaired LVEF, even after assessing multiple CMR image planes. Therefore, if “eyeball” assessment is solely used to determine LV function, it can be misleading. Previous visual assessment studies using 2D echocardiography have shown high correlation between visual and quantitative analysis, but further measures of agreement were lacking [9, 14]. We found Bland–Altman plots demonstrated good agreement and identified that clinicians underestimated EF in many cases.

The assessment of left ventricular function is not only important for the assessment of heart failure, but is an important parameter when assessing the success of therapy. If LVEF was only used to determine the presence or absence of cardiac dysfunction, visual assessment in this study had a high sensitivity and negative predictive value. However, clinicians also rely on LVEF accuracy to determine invasive and costly therapies, including implantable cardioverter-defibrillators (ICDs) [2], which require precise measurements without large error or bias. It therefore seems inappropriate to make therapeutic decisions in patients with mild to moderate reductions in LVEF based on eyeball assessment alone.

It is not surprising that visual estimation of LVEF improved after multiple planes of the ventricle were viewed, given the regional dysfunction observed in a large proportion of impaired ventricles, as occurs post myocardial infarction.

We demonstrated no correlation between the level of experience in clinical cardiology or CMR experience and the accuracy of “eyeball” assessment of LVEF. Although some studies have suggested there is a learning effect in assessing visual analysis of cardiac function [15], others have suggested that experience beyond a basic level does not improve accuracy [16]. It is interesting to note that a study comparing clinical assessment of jugular venous pressure with central venous pressure measurements, as determined by medical students and doctors with a range of clinical experience, showed no significant difference between experience groups [17]. Perhaps a large initial learning curve and individual abilities in 3D geometrical assumptions outweigh the effect of experience in this situation.

Our results are consistent with previous echocardiography studies, which demonstrated improved visual estimation of cardiac function with incremental views [8, 18], and reinforce the need to use 3D measurement tools during LVEF assessment. It may be possible to extrapolate these results to echocardiography, where endocardial definition, and hence visual assessment of function, is more challenging. From our experience, quantitative assessment of cardiac function is not routinely performed by all imaging centres, and although there may be a role for visual assessment to confirm the presence of LV systolic dysfunction, it is insufficient as the primary assessment for accurate determination of left ventricular function.

Conclusion

“Eyeball” assessment of left ventricular function correlates strongly with planimetry assessment of cardiac function; however, it consistently underestimates LVEF, particularly in normal, mild and moderately impaired LVEF categories. Given the important clinical consequences associated with an accurate assessment of LVEF, quantitative assessment of LVEF should remain the clinical standard for clinicians with all levels of experience.

Acknowledgments

The British Heart Foundation supported this work.

Conflict of interest

There are no conflicts of interest to declare.

Copyright information

© Springer Science+Business Media, B.V. 2010