This study is a proof of concept that accurate and important diagnostic information can be derived from the axial anatomy images obtained at the start of a CMR scan. These results are consistent across two different scanner manufacturers. This system could allow technicians performing the scan to be signposted to unexpected pathology, help direct optimized image acquisition for the remainder of the scan, prioritize supervision of scans by reporters, and prioritize scans for urgent reporting.
Earlier & automated diagnosis – a comparison with previous approaches.
Neural networks are now rivalling and surpassing humans for cardiac chamber segmentation and quantification [3, 4], a setting in which images are acquired in a dedicated, conventional manner for all patients. However, the aim of our study differed in two ways.
First, the network in this study has the potential to identify extra-cardiac diagnoses such as aortic dilatation and pleural effusions. Identifying such findings early may allow an adaptive approach to scanning protocols, which avoids recalling patients for additional images and can also reassure technicians performing the scan that additional images are not required.
Second, previous studies have aimed to improve absolute quantification of chamber size and function and have therefore been trained to process the high-quality cine imaging which reporting clinicians currently use. Such sequences differ from the anatomical stack generated and processed in this paper in several important ways.
First, the dedicated sequences’ scan planes are ideally orientated with respect to the patient’s heart, which ensures that radial function lies within rather than through the plane of imaging and minimizes partial voluming. Second, the dedicated sequences are acquired with less spacing, allowing more voxels per ventricle. Third, the dedicated sequences take considerably longer to acquire: each slice takes 8–15 s, and there may be up to 10 slices per sequence.
In contrast, the entire axial anatomical sequences can be acquired in 3–4 breath holds. However, they are of relatively large slice thickness, and are orientated axially with no reference to the patient’s heart. Given this, the ability to acquire accurate diagnostic information early on in the scan has the potential to triage the application of further sequences during the scan. For example, patients found to have LVH and pleural effusions on axial anatomy images may have a diagnosis of cardiac amyloid, in which pre-contrast T1 maps would be useful. Dedicated aortic imaging may be too laborious for routine use on each patient, but a system such as ours could reliably target it to those who need it.
Implications for reporting prioritization, supervision and patient safety
A system providing automated diagnosis during the earliest stages of a cardiac MRI scan would not only be useful for ensuring scans are correctly protocolled but would also allow physicians to prioritize the supervision and reporting of those scans most likely to be abnormal.
Patients with pleural effusions, for example, may have limited tolerance for lying flat in the scanner. Their identification at the earliest stages of the scan could increase vigilance for these more vulnerable patients and prompt alteration of sequences to allow shorter breath holds, motion-corrected free-breathing sequences and accelerated protocols to minimize scan time.
Scans shown to contain unexpected pathologies may also be flagged for expedited review and reporting by physicians. For example, an outpatient screening CMR scan in a low-risk patient might be identified as demonstrating unexpected marked LV dilatation with pleural effusions. Such patients are at risk of decompensation and could be identified for early reporting and follow-up.
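As an illustration only, the triage examples given in this discussion could be written as a small rule table. This is a hedged sketch, not part of the released network: the `Findings` fields, the `suggest_actions` function and the action strings are hypothetical names, and the rules simply restate the examples in the text (LVH with pleural effusions, aortic dilatation, pleural effusions alone, LV dilatation with pleural effusions).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Findings:
    """Hypothetical binary flags derived from the axial anatomy images."""
    lvh: bool = False
    pleural_effusion: bool = False
    aortic_dilatation: bool = False
    lv_dilatation: bool = False


def suggest_actions(f: Findings) -> List[str]:
    """Map early findings to suggestions for the rest of the scan.

    The rules mirror the examples in the text; they are illustrative,
    not validated clinical decision rules.
    """
    actions: List[str] = []
    if f.lvh and f.pleural_effusion:
        actions.append("consider pre-contrast T1 maps (possible cardiac amyloid)")
    if f.aortic_dilatation:
        actions.append("consider dedicated aortic imaging")
    if f.pleural_effusion:
        actions.append("consider shorter breath holds or free-breathing sequences")
    if f.lv_dilatation and f.pleural_effusion:
        actions.append("flag for expedited reporting")
    return actions


if __name__ == "__main__":
    for action in suggest_actions(Findings(lvh=True, pleural_effusion=True)):
        print(action)
```

In practice any such rules would need to be set and validated by the supervising clinicians; the sketch only shows how early findings could feed into protocol and reporting decisions.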
The neural network described in this study is not 100% accurate. Even the most accurate measurement (ascending aortic diameter) is only 94% accurate on our dataset. However, the correlation between the neural network’s predictions and the true measurements for the measures examined means that the extreme biological values associated with disease are more consistently identified by the network (100% of hypertrophic cardiomyopathy cases, 90% of dilated cardiomyopathy cases).
It is difficult to ascertain to what extent the errors in the neural network’s predictions are due to inaccurate segmentation by the network, versus limitations inherent to estimating volumes using anatomy sequence slices. The latter could be estimated by calculating volumes using expert labels across the testing dataset, although this would require every myocardial slice in these data to be labelled. Whilst such a labelled dataset would be many times larger than the dataset currently used to train the network, we hope to address this question in the future.
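To make the second source of error concrete, the sketch below shows one common way of estimating a volume from a stack of labelled slices: summing the labelled area on each slice and multiplying by the spacing between slices. It is a minimal illustration under stated assumptions and is not necessarily the calculation used in this study; the function name `volume_ml` and its parameters are hypothetical.

```python
from typing import Sequence


def volume_ml(
    masks: Sequence[Sequence[Sequence[int]]],
    pixel_area_mm2: float,
    slice_spacing_mm: float,
) -> float:
    """Estimate a volume by summing labelled pixels across slices.

    masks: one 2D binary mask per slice (1 = inside the structure).
    pixel_area_mm2: in-plane area of one pixel (assumed constant).
    slice_spacing_mm: centre-to-centre distance between slices
        (slice thickness plus any gap; assumed constant).
    """
    labelled_pixels = sum(
        1 for mask in masks for row in mask for value in row if value
    )
    volume_mm3 = labelled_pixels * pixel_area_mm2 * slice_spacing_mm
    return volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


if __name__ == "__main__":
    # Two toy 2x2 slices with 7 labelled pixels in total.
    toy_masks = [[[0, 1], [1, 1]], [[1, 1], [1, 1]]]
    print(volume_ml(toy_masks, pixel_area_mm2=2.0, slice_spacing_mm=10.0))
```

Because each labelled pixel is extruded through the full slice spacing, thicker slices make the estimate more sensitive to partial voluming at the edges of the structure, independently of how good the segmentation is.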
Furthermore, previous studies have shown the coefficient of variation is over 10% for estimating left ventricular volumes, even when assessed by the same doctor in the same patient using dedicated left ventricular sequences. In this study, we have compared the performance of this network against human observers behaving clinically, and therefore this 10% variation inherently sets a ceiling on the correlation coefficient obtainable by the network: it cannot correlate with the human observers better than the human observers correlate with themselves.
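The effect of observer variability on the achievable correlation can be illustrated with a small simulation. This is a sketch under assumed numbers, not a re-analysis of the study data: the population mean (150 mL) and spread (40 mL) are arbitrary illustrative values, the 10% coefficient of variation is modelled as multiplicative Gaussian noise, and the "network" is a hypothetical predictor that returns the true volume exactly.

```python
import math
import random
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_ceiling(
    n: int = 20000,
    mean_ml: float = 150.0,   # assumed population mean, illustrative only
    sd_ml: float = 40.0,      # assumed population spread, illustrative only
    cov: float = 0.10,        # observer coefficient of variation
    seed: int = 0,
) -> float:
    """Correlation between a perfect predictor and noisy human readings."""
    rng = random.Random(seed)
    true_volumes = [rng.gauss(mean_ml, sd_ml) for _ in range(n)]
    # Each human reading is the true value perturbed by ~10% relative noise.
    human_readings = [t * (1.0 + rng.gauss(0.0, cov)) for t in true_volumes]
    # The hypothetical perfect network outputs the true volume.
    return pearson_r(true_volumes, human_readings)


if __name__ == "__main__":
    print(f"r for a perfect predictor vs noisy reference: {simulate_ceiling():.3f}")
```

Even with a predictor that makes no error at all, the correlation against the human reference stays below 1, because the noise lives in the reference measurements rather than in the predictions.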
As with all deep learning studies, there is concern that the findings in this study may not generalize to a wider population. This can be due to a phenomenon of “overfitting”, where the neural network is highly accurate at processing images on which it was trained but performs much less well on unseen “real world” examples. To mitigate these concerns, we took several approaches. First, the performance is reported on a test set which was only assessed after training the neural network. Second, the dataset we assembled was from two different hospitals across multiple reporting physicians. Third, the dataset comprises scans across both Siemens and Philips scanners, with the correlation plots showing similar accuracies for both manufacturers. Finally, we are releasing the neural network with this manuscript for use online, so that its performance can be assessed by any interested academic or clinician.