Introduction

Aortic valve regurgitation, or aortic regurgitation (AR), is a common type of valvular heart disease where the aortic valve does not close properly, causing reflux of blood from the aorta into the left ventricle [1]. This reflux of blood is known as the regurgitant jet. Diagnosis and severity of AR is determined by evaluation of flow metrics, for example peak velocity, pressure drop, and regurgitant volume.

Cardiovascular four-dimensional (4D) flow magnetic resonance imaging (MRI) is a novel imaging technique to quantify full-field blood flow velocities, providing a three-dimensional (3D) velocity field across a region of interest throughout the cardiac cycle. Currently, due to the small width of the regurgitant jet and the limited spatiotemporal resolution of 4D flow MRI, it fails to accurately capture the complex hemodynamics of AR. The combination of computational fluid dynamics (CFD) and deep learning with 4D flow MRI will help to generate higher resolution images and recover hemodynamic parameters lost in current MRI images, with a target upsample factor of 4. This work will extend what has already been completed in 4DFlowNet [2] by using a wider range of flow characteristics to mimic AR, improving the data augmentation steps, and enhancing the artificial neural network with newer architecture structures.

AR occurs when the aortic valve does not close properly, causing blood to flow back into the left ventricle from the aorta. This forces the heart to work harder and pump more blood to the aorta, which can cause further heart problems in the future. AR also has varying levels of intensity, from trace or mild through moderate to severe [1]. Acute AR is considered a medical emergency as it can cause severe pulmonary edema and hypotension, that is, excess fluid in the lungs and low blood pressure, respectively. Patients with AR are monitored yearly with echocardiography to decide whether replacement of the aortic valve is necessary [3]. Flow metrics such as peak velocity and pressure drop are typically calculated non-invasively using 2D velocities from 2D Doppler echocardiography [1]. Due to limited information available in 2D, this method is known to overestimate pressure drops [4].

4D flow MRI is an established imaging technique that captures the temporal changes of 3D blood flow patterns within individual vascular structures [5, 6] and has proven promising for quantifying AR in clinical practice [7,8,9,10]. Velocities of blood particles are encoded in the phase of the MRI signal while the anatomy is visualised from the signal’s magnitude [11]. However, 4D flow has several limitations, such as low spatiotemporal resolution, long scan time, and low signal-to-noise (SNR) ratio [12], which makes its clinical application to AR difficult. With a spatial resolution between 1.0 and 3.5 mm [6], details on the narrowest part of the jet cannot be captured as its width is typically much smaller than 3 mm [13] for mild AR. Therefore, spatial resolution is the biggest limitation in 4D flow MRI.

Deep learning [14] has had a significant impact in many scientific sectors, and is highly relevant in the field of medical imaging [15]. Advances in super-resolution image reconstruction [16] to obtain high-resolution (HR) images from low-resolution (LR) observations are increasingly being adopted for MRI with a deep learning-based approach [17, 18]. This approach is preferred as it not only has an advantage in spatial resolution quality over conventional super-resolution techniques [19], but also successfully denoises flow images [20].

The combination of 4D flow MRI with deep learning has been explored in multiple ways to increase resolution and provide more accurate estimates of physical quantities [2, 21, 22]. However, there are several limitations involved in this approach. Primarily, these have been related to insufficient data [2, 22], which is due to the requirement of paired LR and HR MR images. This can be difficult to obtain as HR MRI takes long scanning times and is subject to motion artifacts [2, 22, 23]. As an alternative, CFD models have been used to simulate 4D flow MRI as ground truth HR images, which are then downsampled to LR images [2, 22]. Other limitations include unstable and non-robust network architectures [21], which describe the organisational structure of the network’s layers. The architectures play a significant role in the performance of the deep learning algorithm, as well as ignoring phase/velocity aliasing error [2, 20]. The aliasing error here refers to aliasing from having a velocity encoding, VENC [24] that is too low [25] rather than other types of MRI spatial aliasing which have been explored previously and reduced [26,27,28]. Note that VENC is an MR parameter to adjust the maximum velocity corresponding to a 360\(^{\circ }\) phase shift in the data.

Recent development in object detection has proven significant in advancing ANN architecture [25], and appears to be widely used in many medical imaging applications [15]. Examples include residual blocks [29], dense blocks [30], and cross stage partial blocks [31], which show promise to increase network capacity mitigating degradation and memory utilisation issues.

Methods

Data generation

Modelling of the aortic valve was done using ANSYS 2021 R1, with CFX chosen for CFD simulations, and was an iterative process to determine the best parameters and options that would increase efficiency and accuracy.

Geometries

The design of the problem geometry focused on simplicity, over replicating a real-life aortic valve during regurgitation, to capture the flow features of the regurgitant jet, which are not captured in the 4D flow data at typical resolution, and not handled well by 4DFlowNet. The jet characteristics are modelled using a variety of eccentricities and angulations to capture the variation in jet properties, giving the additional flow information required for training of the flow super-resolution network. The basic geometry is radially symmetrical and shaped like a 3D cylinder with a constricted section part way along the length, resembling a Venturi tube. The constricted section represents the gap in the aortic valve present for AR to occur. Figure 1 shows a sketch of the basic geometry.

Fig. 1
figure 1

Sketches of the basic (top), angled (middle), and offset (bottom) constricted sections, respectively

In total, 20 different simulations were generated using 20 different geometries with a time step size of 0.001 s. The first set of 10 geometries (see Table 1) had variations in inlet velocity, inlet radius, and constricted section radius, while the other set of 10 geometries (see Table 2) had different shapes. These geometries captured maximum jet velocities between 2.0 and 5.0 m s\(^{-1}\) [13, 32]. The second set of geometries were based on the third geometry, as this geometry was reasonably small with minimal computational time. This set had diagonal and off-centre constricted sections, as well combinations of both, to model eccentric jets. Figure 1 demonstrates these differences in shape.

Table 1 Input parameters for basic geometries
Table 2 Input parameters for angled/offset geometries

Boundary conditions

The relevant boundary conditions are those relating to the inlet, outlet, and inner wall. The inlet boundary conditions take the most work to define as the velocity varies both in space and time. For blood flowing through the aortic valve, the flow is expected to be fully developed, that is, the velocity is zero at the inner walls due to friction and at its highest in the centre. This can be achieved by extending the length that the blood needs to travel from the inlet to the constricted section, allowing the flow to fully develop before reaching the aortic valve. However, extending the length means the geometry becomes larger, hence increasing the computational time. To compensate for this, the velocity profile at the inlet was defined with a parabolic shape and the upstream length was set to 20 mm [33]. This means the flow will start out more developed than with a uniform profile, and become fully developed before reaching the aortic valve. The velocity was also time-dependent and represented the diastole (where regurgitation occurs), with a rapid initial increase in magnitude before slowly decreasing [13].

The parabolic velocity profile at the inlet in Cartesian coordinates was defined by

$$\begin{aligned} v = v_{I} - \frac{x^2+y^2}{R_I^2v_{I}^{-1}}, \end{aligned}$$
(1)

where \(v_{I}\) is the maximum velocity at the centre of the inlet and R\(_I\) is the inlet radius. The remaining boundary conditions were for the outlet and walls. The outlet was defined as an opening with zero pressure difference and the walls were defined as non-permeable with no-slip boundary conditions.

Data preparation

To start, the raw CFD simulations and geometries, which each had 71 timeframes, were sampled onto a uniform Cartesian grid (0.1 mm spacing projected onto multiple planes) to be used as HR images (\(392\times 48\times 48\) mm). To separate the fluid and non-fluid regions for better data processing and result quantification, binary masks were generated using k-nearest-neighbours [34]. The main difference between these synthetic HR images and MR images relates to the voxel size, VENC, and the amount of noise—HR images are noise-free whereas MR contain phase noise. The LR MR images were obtained from the HR images using the same method as in 4DFlowNet [2] to simulate \(4\times\) downsampled MR images with appropriate noise and VENC. This gives the paired LR and HR synthetic images used in network training.

Data augmentation

To augment the data set, similar techniques were used as in 4DFlowNet [2] for each time frame. VENC values were randomly chosen from a set of velocities between 0.3 and 6.0 m s\(^{-1}\), spaced by 0.3 m s\(^{-1}\), for each velocity component. Aliasing was mostly avoided by choosing a VENC larger than the peak velocity. However, since velocity jets cannot be estimated beforehand for actual AR cases (which may cause phase aliasing), a VENC lower than the maximum velocity was chosen with a 10% probability, randomly selecting between 0.3 and 0.6 m s\(^{-1}\) lower. Constant intensity values between 60 and 240 were randomly chosen for the magnitude image, and noise levels were added depending on the SNR, which were randomly and uniformly chosen between 14 and 17 dB.

Since there were a limited number of geometries, further augmentation came in patch generation. From each time frame, 10 patches of \(12\times 12\times 12\)-voxel cubes from the LR image were selected randomly with a minimum fluid region of 20%. These patches acted as random translations, so no extra translation steps were taken. On top of this, for each patch generated another randomly rotated version of the patch was also created. This resulted in 20 patches generated from each time frame and thus 1420 patches per geometry.

Training and validation

To investigate the effect of additional geometries regarding SR image quality, a subset of the data consisting of patches from only five geometries was compared against a the entire data set consisting of patches from all geometries. The validation set in both cases was the same, consisting only of patches from a single geometry.

To investigate the effect of aliasing, duplicates of the two training and validation sets were generated, but with a 10% probability of having a VENC lower than the maximum velocity in any time frame. Networks trained using these aliased data sets were validated against the previously generated validation set without any aliasing, as well as a newly generated validation set with full aliasing, that is, with each time frame having a VENC lower than the maximum velocity.

Network architecture

The simulated pairs of LR and HR synthetic images were used to train a similar deep residual network structure to the one in 4DFlowNet. This consisted of several residual blocks surrounding a central upsampling layer, with the preceding blocks in the LR space pre-processing and acting as denoisers for the input while the following blocks in the HR space refine the output. In 4DFlowNet, LR patches of 16-voxel cubes were used as input and SR patches of 32-voxel cubes were generated as output, with an upsample factor of 2.

Several changes were made to the above 4DFlowNet architecture to provide higher resolution images with improved accuracy. Firstly, the upsample factor was increased to 4 and the sizes of the input and output patches were changed to 12-voxel and 48-voxel cubes, respectively. The smaller patches account for smaller vessel sizes in the cardiovascular space around the aortic valve [22]. Secondly, the dense and cross stage partial blocks in DenseNet and CSPNet, respectively, were experimented with by using them in place of the 12 residuals blocks in the original 4DFlowNet architecture. The growth rate [30], defined as the number of feature maps in each convolutional layer, of the dense and CSP blocks was set to 16, a quarter of the number of channels in each convolutional layer from the original residual blocks. The adapted 4DFlowNet architecture with residuals blocks (4DFlowNet-Res), with dense blocks (4DFlowNet-Dense), and with cross stage partial blocks (4DFlowNet-CSP) had 3.34, 2.55, and 2.08 million parameters, respectively. These modified networks were implemented with TensorFlow 2.0 [35] and trained using an Adam optimiser [36], with an initial learning rate of \(10^{-4}\) and decay rate of \(\sqrt{2}\) after every 14 epochs. Batch sizes of 16 were used, with training completed in 200 epochs.

Loss function

The network was optimised by minimising the mean squared error (MSE) between the paired HR images and the SR images generated from the corresponding input LR ones. The voxel-wise loss was calculated as the mean sum of squared differences between each velocity component:

$$\begin{aligned} L_{MSE} = \frac{1}{N} \sum ^N_{i = 1}{(v'_{x_i} - v_{x_i})^2 + (v'_{y_i} - v_{y_i})^2 + (v'_{z_i} - v_{z_i})^2}, \end{aligned}$$
(2)

where N is the total number of voxels in the geometry, \(v'_j\) is the predicted SR velocity, and \(v_j\) is the actual HR velocity, for \(j \in \{x, y, z\}\).

The MSE of fluid and non-fluid regions were calculated as separate terms due to the imbalance and irregularity of these regions within a specific patch. This gives the total loss to be:

$$\begin{aligned} L_{total} = L_{MSE_{F}} + L_{MSE_{N}}, \end{aligned}$$
(3)

where \(L_{MSE_{F}}\) and \(L_{MSE_{N}}\) are the voxel-wise loss for the fluid and non-fluid regions, respectively.

The original loss function in 4DFlowNet contained a weighted velocity gradient term to smoothen the gradient between neighbouring vectors [2]. This was omitted from the above loss function as improvements were observed in near-wall velocity estimates with its removal [22].

Evaluation metric

The relative speed error (RE), the relative difference between the SR velocity magnitude (speed) compared to the actual HR speed on the validation set, was used to measure network performance and save model checkpoints. This was only calculated in fluid regions to avoid zero division error, as well as adding a small number (\(\epsilon = 10^{-4}\)) to the denominator. Furthermore, since many speed values in the HR images were quite small, this could risk significantly over-penalising the model. Thus, an arctangent approach [37] was adopted, giving the following equation for relative speed error:

$$\begin{aligned} RE = \frac{1}{N} \sum ^N_{i = 1}{\text {arctan}}\left( \frac{\sqrt{(v'_{x_i} - v_{x_i})^2 + (v'_{y_i} - v_{y_i})^2 + (v'_{z_i} - v_{z_i})^2}}{\sqrt{v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2} + \epsilon }\right) , \end{aligned}$$
(4)

where N is the total number of voxels in the fluid domain, \(v'_j\) and \(v_j\) are the predicted SR and actual HR velocities, respectively, for all \(j \in \{x, y, z\}\), and ‘arctan’ is the arctangent function, defined for all real values from negative infinity to infinity with \({\text {lim}}_{x\rightarrow {\infty }}{\text {tan}}^{-1}x=\frac{\pi }{2}\) for arctan x.

In addition to the RE, network performance was also evaluated using the root MSE (RMSE) and the structural similarity (SSIM) metric [38] in all three Cartesian velocity components. These were compared against the baseline 4DFlowNet model that had been trained with an upsample factor of 2.

Results

Training was performed using a Tesla V100 GPU with 32 GB memory with networks being trained for 200 epochs. Improvements in relative speed error (RE) plateaued around the 100 epoch mark for 4DFlowNet-Res while still improving for 4DFlowNet-CSP and 4DFlowNet-Dense up till the very last epoch. This can be seen in Fig. 2. The time taken was dependent on the type of network; 4DFlowNet-Res, 4DFlowNet-CSP, and 4DFlowNet-Dense took approximately 163, 168, and 255 h, respectively. Note that these times were for the networks trained using all geometries. For the networks trained using only five geometries, denoted as 4DFlowNet-Res5, 4DFlowNet-CSP5, and 4DFlowNet-Dense5, the times taken were approximately 38, 40, and 63 h, respectively. There was no significant difference in training time between networks trained with and without a portion of aliased data, denoted by ‘-A’.

Fig. 2
figure 2

Relative error across all 200 epochs during training for each network. Networks trained with five geometries were not included as the general trend seen was the same

Networks were tested on one complete geometry consisting of 71 timeframes with no (phase/velocity) aliasing, and the same complete geometry with full aliasing. These predictions were required to be patch-based since patches were used as the input and output for each network. The complete geometry was reconstructed by stitching together multiple SR velocity field patches, which was done with a stride of (\(n-4\)) in each Cartesian direction where n is the arbitrary patch size. To avoid patch artifacts at the boundary, four voxels were stripped from each patch side.

The full implementation is accessible on GitHub under an open-source (MIT) license at github.com/dlon450/4DFlowNetv2, with training data available on request.

Synthetic MR images

Fig. 3
figure 3

Predictions on an LR patch from the synthetic 4D flow MRI phase image for different networks with and without aliasing error, focused on the constricted section at two different views along the width (top) and length (bottom) of the geometry. A 2D slice, along the length of the geometry, of the velocity magnitude is shown from the 3D patch for visualisation purposes. Scale is in m s\(^{-1}\)

SR images were analysed visually and quantitatively to better understand how each model was performing. Figure 3 is a visual example of the prediction for the different networks in the constricted section at the peak flow. These display the effectiveness of each network in reducing noise, with the predictions looking quite similar to the ground truth. Furthermore, the networks trained with a proportion of aliased data seem to be performing better than networks without, especially for data with aliasing error.

The values for each evaluation metric were collected and compared in “Appendices 1 and 2” to quantify the performance of every model. Again, these values were taken from the time frame with peak flow, with peak velocities of 2.186, 0.355, and 0.349 m s\(^{-1}\) for velocity components \(v_x\), \(v_y\), and \(v_z\), respectively.

The metrics were plotted in “Appendix 3”. Briefly, 4DFlowNet-Dense-A and 4DFlowNet-CSP-A seem to perform the best with the lowest RMSE error and largest SSIM in the principle flow direction (\(v_x\)), respectively, on both aliased and non-aliased data. However, there does appear to be considerable variation in these results between different networks, depending heavily on the size of the data set and whether aliasing is present.

Regarding the RMSE in the principle flow direction (RMSE\(_x\)), all networks perform better, on average, than the base 4DFlowNet with less variation in RMSE\(_x\). For the smaller data set with five geometries, the residual-based (Res) versions seem to have the smallest RMSE\(_x\). There is also noticeable improvement in error between networks trained on the smaller and larger datasets when predicting on non-aliased data. However, when predicting on aliased data, the improvement is significantly more apparent. The increase in dataset size improves the RMSE in the other two flow directions for all networks too, with the CSP versions somewhat better than other versions.

Regarding the SSIM in the principle flow direction (SSIM\(_x\)), all networks also perform better, on average, than the base 4DFlowNet. However, the SSIM\(_x\) seems to worsen for a larger dataset, in general, when predicting on non-aliased data. On the other hand, when predicting on aliased data, there does appear to be slight improvement in SSIM\(_x\) across all networks. For the SSIM in the other two flow directions, the values are considerably more varied and worse than in the principle direction. Finally, all networks have a substantially lower RE than the base, with the Dense versions having the lowest RE.

Fig. 4
figure 4

Regression and Bland–Altman plots for prediction on non-aliased (left) and aliased (right) data with 4DFlowNet-CSP-A (top) and 4DFlowNet-CSP (bottom). These plots are for each of the velocity components and magnitude (\(v_x\), \(v_y\), \(v_z\), and \(\Vert v\Vert\), respectively, from top to bottom) between SR and synthetic HR images

The regression plots in Fig. 4 show that there is exceptional correlation between the SR and synthetic HR images. The regression slopes are very close to one, in the principle flow direction and velocity magnitude plots, for the two CSP networks. Moreover, offset values are also essentially zero in all examples shown. Training with all non-aliased images appears to have an effect when predicting the higher velocity values in aliased images, with these values being slightly underestimated, as seen on the right in Fig. 4 and confirmed by the Bland–Altman plots too. These plots also indicate minimal bias as the deviations appear constant, uniform, and are all less than 0.08 m s\(^{-1}\).

Fig. 5
figure 5

Regression and Bland–Altman plots for \(\Vert v\Vert\) prediction within the constricted section, comparing the baseline 4DFlowNet (left) against the adapted 4DFlowNet-CSP (right)

For velocities within the constricted section, Fig. 5 shows the correlation between SR and synthetic HR images. Although a slight underestimation bias seem to prevail, 4DFlowNet-CSP shows noticeable improvement from the baseline 4DFlowNet improving on the otherwise observed deviations at higher velocities. Note that the general trends and comments regarding these regression and Bland–Altman plots were present in all other evaluated networks too.

Fig. 6
figure 6

Predicted SR images on actual 4D flow MRI data. Two sets of 4D flow MRI data were used (top and bottom), with the original LR images on the left and the SR images in the other three columns, titled by the network used for prediction. These are 2D slices of the 3D image, showing the velocity magnitude taken at peak flow. Scale is in m s\(^{-1}\)

In vivo 4D flow MRI data

Ethical approval for this study was granted by the Health and Disability Ethics Committee of New Zealand (17/CEN/226), and written informed consent was obtained from each participant. Two sets of in vivo 4D flow MRI data (CH34 and CH37) were acquired and used exclusively to show how the network would perform, seen in Fig. 6. CH34 (resp. CH37) had an AR VTi of 264.8 cm (resp. 201.2 cm), VC width of 3 mm (resp. 3 mm), peak velocity of 4.5 m s\(^{-1}\) (resp. 4.2 m s\(^{-1}\)), and pressure drop of 81 mmHg (resp. 70.5 mmHg).

These datasets were LR images only, with no HR images available to compare the predicted SR images against. However, the SR image produced was compared against the original 4D flow MRI image to help better understand the prediction. The SR image seems to have effectively removed noise from the LR image and strengthened the AR signal. Other networks performed similarly, with predicted SR images almost identical to the one shown. The top set of data (CH34) was also segmented and visualised with ParaView [39], shown in Fig. 7. Image stitching appears to have been performed correctly, with velocities within the fastest section preserved well.

Fig. 7
figure 7

Original 4D flow MRI image against the SR image produced by 4DFlowNet-CSP at systole and diastole, visualised in ParaView. The regurgitant jet can be clearly seen during diastole in the SR image produced. Scale is in m s\(^{-1}\)

Discussion

A noteworthy consideration when analysing the results is that the focus should be more towards metrics and values obtained in the principle flow direction x. Since the peak velocities in the other flow directions, y and z, were almost 10 times smaller than that in the principle direction, networks would have difficulty differentiating between velocity and noise. This is evident in the y and z RMSE values, which were over half of their corresponding peak velocities. The y and z SSIM values were also much worse, potentially exhibiting strange behaviour in these low velocity fields too [40].

Synthetic 4D flow MRI data

The main limitation of previous work has been related to insufficient data and flow characteristics [2, 20, 22]. To understand the effect of additional flow characteristics in the dataset, a wider range was used in this study. There was definitely noticeable improvement in the RMSE across all networks tested, particularly when predicting on aliased data. This indicates that with more geometries and hence flow characteristics, the network seems to generalise better and is more robust against data it has not seen.

In terms of the synthetic LR and HR image pairs, the downsampling process and patch-based approach seemed to work effectively. The noisy LR patches enabled the network to learn noise removal while also enabling greater generalisation to unknown flow characteristics or geometries [2]. However, a small portion of these incorrect non-fluid velocity areas around the fluid domain appear to have been included in network training, seen in Fig. 3. These were generated during the linear interpolation process when obtaining the HR images from the CFD data. This suggests that the binary mask for separating between the fluid and non-fluid regions, created using k-nearest-neighbours, needs improvement. An obvious consequence is that the network may learn incorrect flow characteristics and predict less accurately.

Incorporating aliased patches into the training data seemed to work effectively as well, improving the RMSE across networks trained with all geometries. This was done by choosing VENC values lower than the maximum velocity, within a particular time frame, with a 10% probability. Note that this probability was chosen arbitrarily. For the VENC, if it is set too high, visualization of the jet may not be obtained and be inaccurate, as well as having poorer SNR. On the other hand, if it is set too low, flow characteristics may be lost and a mosaic pattern will be shown [41]. This means that for the time frames with aliasing, the velocities lower than the VENC would have been captured significantly more clearly, with only a small portion of high velocity characteristics being lost. For these time frames, the purpose would be to help the network better learn the flow characteristics in lower velocity fields, improving robustness and generalisability. However, this hypothesis has not been proven yet and more experiments on aliasing will be required to fully understand its effect on network performance. This would involve varying its probability of occurring as well as choosing VENC values considerably lower than the maximum velocity. Future work may also include a thorough analysis of patch size to quantify its effect on network accuracy.

Network architecture

A major drawback of residual blocks is that they suffer from limited learning ability [30]. This was seen in the results, in which the residual networks did not show much improvement even after adding many more geometries or introducing aliased images into the training data. Furthermore, the RE for residual networks seemed to plateau at around the 100th epoch, whereas the RE for the other networks seemed to continue improving, although at a decreasing rate, even up till the last epoch. These observations can be seen in Fig. 2. Despite this, the residual networks still performed well, with RMSE, SSIM, and RE values similar to the other networks, as well as having the fastest training and prediction times. This suggests that, when data is insufficient or very limited, residual networks may work best.

With sufficient or abundant data, cross stage partial or dense networks may be preferred. The learning ability in these types of networks have a significantly higher ceiling than residual networks [30, 31], with the only difference between these two network structures being the training and prediction times. Cross stage partial networks had training and prediction times almost as fast as residual networks, whereas dense networks were almost 1.6 times slower. Although training times may not be a crucial problem, clinicians and patients alike may require and desire fast prediction times. This leads to cross stage partial networks being preferred. Otherwise, if training and prediction times are not significant constraints, then dense networks may be the optimal choice. Furthermore, the growth rate for the dense and cross stage partial networks was set quite low, at a quarter of the number of feature maps in each convolutional layer within the residual blocks. This limits the learning ability of these networks, as there are significantly less parameters, so testing larger values of this hyperparameter will be beneficial and likely improve network performance. Similarly, only a quarter of feature maps were taken from the base input layer within each partial dense block within the cross stage partial networks, so larger values of this hyperparameter will likely improve performance as well.

Despite the network architecture seeming to work quite effectively, there are still improvements that could be made on top of modifying the residual blocks. Presently, the network is not taking full advantage of the temporal aspect in 4D flow MRI. Modifying it to incorporate characteristics of recurrent neural networks [42] may help the network understand this temporal aspect better. This could be done by using predictions of the same patches from one or two time frames prior. Additionally, including physical properties of fluids may also help the network in learning flow characteristics, as seen in [21, 22]. However, due to the bulky nature of the velocity data, which were 3D volumes for each velocity component, these ideas were not considered further as it would have been too costly to process given the available resources.

Clinical application

The clinical motivation for the current study was the assessment of regurgitant or highly stenotic valvular flow; instances where quantification of regional velocities and changes in pressure act as effective biomarkers to describe the severity of disease. In these instances, and in clinical practice, assessment of peak velocities through the vena contracta are used to symbolize disease severity. In fact, derivation of regional pressure drops are routinely derived from such peak velocity measures using the so called simplified Bernoulli equation [43] (coupling peak velocities to effective pressure changes). In this light, our results bear possible clinical implications in that SR velocities effectively recovers HR reference measures. This holds true throughout the evaluation domain, and in the clinically important constricted section, underestimation biases associated with the original 4DFlowNet formulation were effectively suppressed as seen in Fig. 5. The effect of the remaining minor deviation from a true 1:1 correlation between HR and SR data remains to be explored in a larger clinical setting, and in more complex flow scenarios—such as in the instance of regurgitant flow—higher-order methods might be required to derive pressure drops from measured velocity data [4, 44]. Nevertheless, just as we have shown the potential of recovering functional hemodynamic behavior through the spatially challenging cerebrovascular space using SR 4D flow MRI [22], the results of the current study bear similar potential in recovering clinically relevant hemodynamic metrics in aortic regurgitation.

Limitations

Although the number of geometries and flow patterns have significantly increased from previous studies, a wider range of characteristics can still be included. Additional data will diversify the data set further and improve the model’s robustness and generalisability even more. On top of this, more testing on real 4D flow MRI data will be required to validate model performance. Currently, this validation is only done by visually analysing model predictions to see if they look reasonable and sensible. The preferred approach would to be validate quantitatively with a pair of corresponding LR and HR 4D flow MRI images, calculating metrics such as RMSE and SSIM to properly understand model performance. Additional metrics may also be required to properly diagnose aortic regurgitation severity such as the effective regurgitant orifice area, which would be measured as part of a clinical workflow. Furthermore, blood flowing through the aortic valve was expected to be fully developed, which may not be the case in reality.

Lastly, several practical limitations can also be noted. Due to the limited GPU resources, training took approximately 60 days for all networks. With either more time or increased GPU resources, more networks could be trained by using additional optimizers, adding a weighted term to the loss function, as well as different values for hyperparameters such as the growth rate, the probability that aliasing occurs, and the upsample factor; a sensitivity test for this may be worthwhile. Moreover, networks could also be trained for more epochs, or until the error plateaus, to better gauge the learning ability of different networks. Finally, memory constraints were a factor too, as an upsample factor of 4 led to 64 times more usage in disk space. This would not be feasible for much larger upsample factors, so a different data representation may be required in future work.

Conclusion

In this study, 4DFlowNet was enhanced and adapted to effectively quantify hemodynamic metrics for AR by producing super-resolution 4D flow MRI images with an upsample factor of 4. The results show that by adding more geometries and hence flow characteristics into the data set, the accuracy of 4DFlowNet predictions are improved. Moreover, the comparison of different network architecture suggested that the original residual network structure limits learning ability and can be further refined.