Advanced acquisition strategies and image reconstruction have become major research topics in cardiovascular magnetic resonance (CMR). New techniques are commonly validated using numerical phantoms. However, computer simulations are often based on oversimplified and unrealistic models. Furthermore, the lack of standardized reference models has resulted in a large diversity in simulation methods hampering appropriate cross-validation. Comparability is, however, of particular relevance in CMR, where cardiac and respiratory motions add to simulation complexity.

Image reconstruction has evolved into a significant branch of MRI research. Considerable efforts have been invested into the development of non-Cartesian and accelerated data acquisition strategies, as they permit reduced scan time, increased spatial and/or temporal resolution or a combination of these benefits [1]-[4]. Despite the growth of the research field, it remains difficult to benchmark the various image reconstruction approaches. Among others, one reason for this is the lack of comparability between simulation setups.

Simulation models can be divided in two groups. Analytical numerical phantoms mathematically describe the simulated anatomy and the imaging experiment. The first analytical phantom to be used widely for imaging simulations was the Shepp-Logan head phantom [5]. Further development led to computationally more complex cerebral phantoms specifically designed for MRI, using elliptical shapes [6]-[9] or combinations of different geometrical contours [10]. Voxel based phantoms, on the other hand, are usually derived from acquired images of a single volunteer or patient, with a few segmented tissue types only and adapted to the purpose of the study [11]. This has led to a plethora of numerical phantoms used by individual research groups. Voxel based phantoms are realistic and easily accessible, but limited to the acquisition parameters they are based on, such as spatial and temporal resolution. Furthermore, for voxel based phantoms the image to k- space relation is approximated by the discrete Fourier transform, which ignores k- space truncation errors. These effects are omitted with analytical phantoms, where the continuous Fourier transform is well-defined. In contrast, analytical phantoms usually do not incorporate motion as required in CMR.

A practical approach to numerical phantoms is the integration of advantages of both analytical and voxel-based methods into hybrid phantoms. Hybrid phantoms address the key limitations of analytical and voxel-based approaches, leading to a versatile, realistic and reproducible simulation framework. The Mathematical Cardiac-Torso (MCAT) and the extended Cardiac-Torso (XCAT) phantoms present such approaches by combining voxel-based in-vivo data with non-uniform rational B-splines (NURBS) or polygon meshes to define anatomical objects [12]-[14]. The anatomical information in XCAT is based on the Visible Human Project of the National Library of Medicine [15].

Several numerical phantoms for CMR applications have been proposed so far, including analytical [16], voxel based [17] and hybrid models [18]-[25]. These in part allow for reproducible figures of merit based on known simulation parameters such as spatial resolution. Despite the attempt to generate versatile and realistic phantoms, none of the above-mentioned studies have made their phantoms available to the community. Therefore, reproducibility is limited and reimplementation of the phantom is cumbersome.

In this paper, a hybrid numerical CMR phantom referred to as MRXCAT is presented. MRXCAT is a simulation framework designed for research on CMR sampling and reconstruction strategies in the presence of motion. In order to promote comparability and reproducibility in CMR simulation research, MRXCAT is made available online to the community [26]. Cine and myocardial perfusion MRXCAT implementations are presented alongside with reconstruction and post-processing examples as showcase applications.


The XCAT software [13] allows generation of realistic human anatomical masks including cardiac contraction and respiratory motion. Spatial resolution, matrix size, temporal resolution, coverage, geometry and non-rigid object motion are adjustable. For the MRXCAT implementation, each tissue mask is assigned MR properties as illustrated in Figures 1 and 2.

Figure 1
figure 1

MRXCAT workflow overview. Tissue masks in desired orientation with optional cardiac and respiratory motion are generated using XCAT. MR contrast, including dynamic contrast enhancement, is mapped onto these masks. After coil sensitivity multiplication, noise is added to attain desired image degradation. Optional gridding to non-Cartesian trajectories and resampling, followed by image reconstruction and post-processing complete the simulation procedure.

Figure 2
figure 2

MRXCAT software structure. Geometry, timing and physiology parameters are set at the XCAT input. The tissue and sequence operators depend on tissue properties, sequence parameters and the signal model m. The coil operator and noise object use coil geometry and SNR as input. The trajectory governs the effect of the gridding and sampling operators.

Mathematically, the MRXCAT phantom D k , t in k-t space is expressed as

D k , t = i = 1 NS R i G F S N c , r c , x c C m T E , T R , , N R T T 1 , T 2 , , c t O i x , t + n i x , t ,

where O i x , t is the XCAT object, which is a function of space and time, and n i x , t denotes object noise. The circle denotes application of the operator on the left hand side of ○ onto the object on the right hand side. Consecutive operations are applied from right to left. T is the tissue operator, which depends on relaxation times T 1 , T 2 , the proton density ρ and the contrast agent concentration c(t). The sequence operator C m implements the signal model m with various parameters including echo time T E , repetition time T R , flip angle α and number of repetitions N R . Coil operator S applies coil sensitivities of N c receive coils with radius r c and location x C as input. After Fourier transform F, the gridding operator G and the sampling operator R i convert the raw k- space phantom into simulated MRI data, as detailed below. The index i refers to k- space segmentation. In segmented acquisition, the full XCAT and noise objects are created for each segment i and the sampling operator R i extracts the segment i from the full k- space data.

XCAT object module

A number of physiological parameters such as heart beat and respiratory cycle duration can be adjusted in XCAT. Several parameters are defined at the XCAT input including spatial and temporal resolution, matrix size, slice offcentres and angulation and the number of simulated heart beats. The timing information allows setting the temporal resolution during the heart phase and the number of segments needed to acquire the full k- space. XCAT produces binary masks of different organs and tissue types. The blood pool and myocardial tissue of the atria and ventricles in the left and right heart are divided into eight individual partitions. Arteries, veins and pericardium form separate compartments.

Tissue and sequence operators

Using the tissue and sequence operators T and C m the organ masks (XCAT object O i ) are assigned realistic MR signal intensities based on tissue properties T 1 , T 2 , proton density ρ and contrast agent concentration c(t). The concentration is used to simulate dynamic contrast enhancement by shortening T 1 . Sequence timing parameters such as repetition time T R , echo time T E , number of repetitions N R and the flip angle α are at hand. Finally, signal model m converts these parameters to signal intensities. Examples of available signal models are balanced SSFP [27],[28] and saturation-recovery gradient echo sequences [29],[30].

Dynamic contrast enhancement during first-pass myocardial perfusion is simulated based on a population average arterial input function (AIF) from six healthy volunteers. A contrast dose of 0.025 mmol/kg b.w. and a short saturation delay (30 ms) were used to ensure linearity between signal intensity and contrast agent concentration [31]. Mid-ventricular signal intensity vs. time curves were extracted from the left ventricle and manually shifted in time to the same bolus arrival time before averaging, which mitigates temporal broadening of the AIF peak by averaging. The population average AIF was fitted by a 3-parameter gamma-variate function to obtain a first-pass AIF of the contrast bolus [32]. The myocardial concentration-time curve is obtained by convolution with a Fermi function [30]. The blood pool and myocardial concentrations are then converted to signal intensities using the relationship between T 1 and concentration c(t),

R 1 = 1 T 1 , 0 +c t R,

where T 1,0 is the baseline tissue T 1 in the absence of contrast agent and R the relaxivity of the contrast agent.

Coil operator and noise object

Three-dimensional coil sensitivities are simulated using the Biot-Savart law,

B r = μ 0 4 π coil i dl r × r ' r r ' 3 ,

by discretising the integral and calculating the effective magnetic field at the voxel positions of the XCAT masks. Normalized sensitivity masks are multiplied with the phantom yielding one dataset per coil. Adjustable parameters in this module are the number of receive coils N C , the radius r C and position x C of the individual coil elements. The coil elements are arranged on one or more circles around the MRXCAT phantom to mimic typical cardiac coil arrays around the chest. Note that the Biot-Savart law is only valid in the near field approximation, i.e. for field strengths < 4 T. Since most CMR protocols are designed for 3 T or lower field strength the approximation is considered a valid assumption.

A spatial low-pass filter option is implemented at this point with variable filter strength to obtain realistic signal intensities at tissue interfaces. White Gaussian noise is added to obtain the desired signal-to-noise ratio (SNR). In case of dynamic contrast enhancement, the noise level is defined by the contrast-to-noise ratio (CNR). After this step, the fully sampled Cartesian MRXCAT phantom is available.

Gridding and sampling operators

The image-domain phantom is transformed into k- space via discrete Fourier transform F. The gridding operator G allows interpolation of the data onto non-Cartesian trajectories. To apply these trajectories, the non-uniform fast Fourier transform (NUFFT) implementation by J. Fessler [33] and the NUFFT wrapper by M. Lustig [2] are used, which are available online [34],[35]. The sampling operator R defines the sampled k- space profiles and enables application of undersampling schemes by omitting certain k- space lines. If the simulated acquisition consists of multiple k- space segments, the required segment is extracted from the full k- space. The MRXCAT phantom production is repeated for each segment to yield the final MRXCAT phantom, which is saved as MATLAB output. Other output formats such as the ISMRMRD data structure [36] are possible.

MRXCAT examples and in-vivo comparison

To allow comparison of the numerical phantoms with in-vivo CMR, cine and first-pass myocardial perfusion in-vivo data were acquired in two healthy volunteers. CMR was performed on two Philips Achieva scanners (Philips Healthcare, Best, The Netherlands) at 1.5 T and 3 T field strengths. All volunteers gave informed consent according to institutional policy. Cine in-vivo parameters were: spatial resolution: 1.56×1.56 mm2, 10 mm slice thickness, field-of-view: 400×400 mm2, T R = 3.4 ms, T E = 1.7 ms, flip angle = 50°, 20 heart phases. Parameters for in-vivo myocardial perfusion imaging were: spatial resolution: 2.14×2.25×5 mm3, field-of-view: 360×360×80 mm3, T R = 1.97 ms, T E = 0.78 ms, flip angle = 15°. Perfusion image acquisition was performed using a 10-fold accelerated saturation-recovery gradient echo sequence with a saturation delay of 150 ms followed by k-t PCA reconstruction [37]. A gadolinium bolus (Gadovist, Bayer Healthcare, Zurich, Switzerland) with a contrast agent dose of 0.025 mmol/kg b.w. was applied during a breathhold acquisition of 30 time frames.

Example cine and myocardial perfusion MRXCAT phantoms were generated with T 1 and T 2 values according to textbook literature [38]. Relative proton densities were extracted from a proton-density weighted in-vivo measurement. Two-dimensional (2D) cine MRXCAT parameters were equal to the in-vivo measurement, except for the number of heart phases, which was set to 24 to enable 8-fold undersampling and k-t reconstruction without zero-filling (cf. below). An SNR of 20 and 15 segments were simulated. Parameters for simulated first-pass myocardial perfusion imaging were as in-vivo except for the in-plane resolution, which needed to be isotropic in XCAT and was set to 2.14×2.14 mm2. Further parameters were: CNR of 30, 40 profiles from acquisition start to k-space centre, contrast agent dose: 0.075 mmol/kg b.w., contrast agent relaxivity: R = 5.6 l/mmol·s, 32 acquired time frames with one image per heartbeat. Rest and stress myocardial blood flow were set to 1.0 and 3.5 ml/g/min, respectively.

Image reconstruction and quantification from undersampled data

Both the 2D cine and the 3D myocardial perfusion fully sampled Cartesian phantoms were retrospectively undersampled and reconstructed using k-t PCA [39] and k-t SPARSE [40]. For both cases, an acceleration factor of R = 8 was chosen. To visualise local variations in reconstruction accuracy, error maps were calculated by subtracting the reconstructed images from the fully sampled reference phantom reconstruction. All error maps were normalized by the average signal intensity over all time frames of the reference image to yield percentage errors. Root mean square error (RMSE) values were calculated for each case to estimate the total error. Quantification of myocardial blood flow (MBF) was performed on the myocardial perfusion reconstruction results. Using the XCAT masks, signal intensity-time curves were extracted from left ventricle and myocardium and converted back to contrast agent concentration using equation (2) and the signal model. The signal intensity-time curves allow display of temporal filtering effects introduced by image reconstruction from undersampled data. The concentration vs. time curves were quantified using Fermi model deconvolution [30]. Analysis of the MBF from fully sampled versus undersampled data enabled estimation of the additional uncertainty introduced by undersampling compared to measurement noise.

The MRXCAT extension for XCAT was implemented in MATLAB (MathWorks, Natick, MA, USA). All computations were run on a laptop PC equipped with an Intel i7 m620 2.67 GHz CPU with 2 cores and 8 GB memory. Computation performance numbers were assessed for all computational tasks.

The MRXCAT extension to XCAT is available online:[26].


Figure 3 gives an overview of the MRXCAT cine phantom. Examples at systole and diastole are shown in inspiration and expiration states. Temporal profile plots of breathhold MRXCAT cine, free-breathing MRXCAT cine and in-vivo breathhold cine data are demonstrated (Figure 3c). Individual coil maps of a simulated 8-element cardiac coil array are displayed in Figure 3f.

Figure 3
figure 3

MRXCAT cine phantom overview. Full field-of-view at systole and expiration (a) and at diastole and inspiration breathhold (b). (c) Profiles along dashed line in (b) for 24 heart phases for breathhold and free-breathing MRXCAT with 15 segments and the breathhold in-vivo case for comparison. (d-e) Cardiac region-of-interest at different cardiac phases for breathhold MRXCAT (d) and in-vivo scan (e). (f) Single coil images from 8 individual coils.

In Figure 4 the whole-heart myocardial perfusion MRXCAT phantom is displayed. Apical, mid-ventricular and basal slices at the time points of bolus arrival in the right and left ventricle as well as in the myocardium are shown in Figure 4a. In-plane profiles as a function of time extracted from the breathhold, free-breathing phantom and in-vivo acquisition are demonstrated in Figure 4b. A cardiac region-of-interest (ROI) is shown in Figure 4c. In addition, the effect of timing of the saturation pulse with respect to the image acquisition is shown for a long saturation to acquisition delay of 150 ms relative to a short saturation delay of 10 ms to demonstrate the non-linearity between MR signal and contrast agent concentration [41].

Figure 4
figure 4

MRXCAT cardiac perfusion phantom overview. (a) Three slices from apex to base at maximum contrast agent enhancement in the right ventricle, left ventricle and myocardium. (b) Anterior-posterior profiles along dashed line in (a) for all heart beats for breathhold MRXCAT, free-breathing MRXCAT and in-vivo breathhold myocardial perfusion images (left-right). (c) 16 slices of the MRXCAT perfusion phantom, covering the whole left ventricle at a time frame during contrast enhancement. (d-e) Signal intensity curves extracted from the left ventricle (AIF) and the myocardium of stress and rest MRXCAT perfusion phantoms generated with saturation delays of 150 ms (d) and 10 ms (e).

Figure 5 shows the results of image reconstruction of the breathhold 2D cine MRXCAT phantom with 8-fold data undersampling. A cardiac ROI and a profile through the slice as a function of heart phase are shown. Absolute error maps reveal that both k-t PCA and k-t SPARSE yield reconstructed images of good quality. Reconstruction errors are found at the borders of the myocardium, where cardiac contraction causes most motion. The RMSE in the cardiac ROI was 10.0% and 10.2% for k-t PCA and k-t SPARSE, respectively.

Figure 5
figure 5

MRXCAT cine reconstruction example. Diastolic and systolic heart phases and profile along the dashed line in (a) for all heart phases. Reference images (a), reconstruction results and difference maps using 8x k-t PCA (b) and 8x k-t SPARSE (c). The gray values for all error maps were scaled from 0 to 30%.

Figure 6 illustrates reconstruction results for the 8x undersampled myocardial perfusion phantom at time points of bolus enhancement in the left ventricle and myocardium. Error maps show pronounced edge effects for k-t SPARSE, whereas the error distribution in k-t PCA is more homogeneous. Signal intensity-time curves for four voxels in the heart are plotted. The RMSE was 16.1% and 14.4% for k-t PCA and k-t SPARSE, respectively. Fermi model based MBF quantification in eight mid-ventricular slices of the simulated stress perfusion image is shown for reference, k-t PCA and k-t SPARSE reconstructions in six angular sectors per slice. Mean and standard deviation of MBF were 3.48 ± 0.13 ml/g/min in the reference image, 3.48 ± 0.36 ml/g/min for 8× k-t PCA and 3.55 ± 0.33 ml/g/min for 8x k-t SPARSE.

Figure 6
figure 6

MRXCAT myocardial perfusion reconstruction and quantification example. Fully sampled reference (a), 8x k-t PCA (b) and 8x k-t SPARSE (c) reconstruction results and error maps. A mid-ventricular slice at peak left-ventricular (LV) and myocardial enhancement and profile along the dashed red line for 32 time frames are shown. (d-f) Signal intensity-time curves for reference and reconstructed data from 4 voxels (inset). (g-i) Fermi MBF quantification of stress perfusion simulation in 8 slices with 6 angular sectors per slice (inset). Ground truth MBF was 3.5 ml/g/min. The gray values for all error maps were scaled from 0 to 80%.

Total computation times for MRXCAT generation were 38 and 124 minutes for the breathhold and free-breathing cine phantom, respectively. For the perfusion phantom, calculations took 13.6 and 10.7 minutes for breathhold and free-breathing simulations. In all cases, the vast majority of the time was spent generating the XCAT anatomy. The large difference between cine and perfusion computation times mainly stems from the fact that 360 XCAT time frames were created for cine (24 heart phases, 15 segments), while only 32 time frames were needed for perfusion. By executing only the MATLAB part of MRXCAT, calculation times spanned from 54 s (breathhold cine) to 166 s (free-breathing perfusion).


In this work a CMR extension to the XCAT phantom has been proposed to enable realistic simulation to aid and benchmark image reconstruction and quantitative post-processing approaches.

The MRXCAT framework is composed of several sequential software modules, which enable user interaction via parameters at each stage of the simulation. The XCAT hybrid phantom, which is used to simulate anatomy and motion, delivers realistic and generalized tissue masks. Although the XCAT object is rasterised, the drawbacks of voxel based phantoms can be addressed by increasing the spatial resolution in XCAT. The spatial resolution is typically on the order of millimetres in human CMR. Thus, the 0.33×0.33×1 mm3 voxel size of the Visible Human segmentation [13] is sufficient for most CMR applications. All subsequent software parts act on that rasterised XCAT object, which simplifies implementation and data handling in array-based numerical software. The individual operators or software modules allow setting parameters specific for tissue and sequence, receiver coils, measurement noise and k- space characteristics. The independency between modules potentially enables parallel computing and division of large data portions into chunks on computers with less available memory.

The use of MRXCAT phantoms for research on accelerated imaging was exemplified using k-t PCA and k-t SPARSE. The effect of cardiac contraction in conjunction with image acceleration was showcased with the cine phantom, where the biggest residual errors after reconstruction remain at the inner and outer border of the myocardium. Using the myocardial perfusion showcase, the effect of signal changes due to contrast agent administration was demonstrated. In this example, residual reconstruction errors mainly stem from temporal filtering of the signal-time curves. Moreover, MRXCAT phantoms can be used for the development of post-processing methods such as myocardial blood flow quantification. A key advantage of quantification using MRXCAT is that noiseless ground truth is always at hand and available for assessment of errors.

The XCAT phantom produces masks with well-defined, sharp tissue boundaries. There is no partial volume effect. This issue was accounted for by adding a low-pass filter module before noise addition. A similar problem is the missing organ texture in XCAT, which has recently been addressed [42]. Organ heterogeneity, however, is currently not incorporated in MRXCAT.


MRXCAT allows simulating cine and dynamic single-shot and segmented data acquisition. However, MRXCAT assumes instantaneous acquisition of the individual segments. Extending the framework to enable individual timing of single k- space lines would require full XCAT masks for each k- space line. Since the XCAT part of MRXCAT is the computationally most time-consuming step, this would yield unacceptably long simulation times.

MRXCAT is based on signal equations rather than Bloch equations. As such, it is suitable for the simulation of sampling strategies, trajectory optimization and post-processing methods. However, a number of error sources are excluded, such as magnetic field inhomogeneity, magnetization evolution across time frames or signal alteration due to motion during sampling. For those applications, other simulation frameworks based on solving Bloch equations may be more suitable.


The MRXCAT framework is a versatile tool for simulation in CMR research. Applications to cine and perfusion CMR have shown example usage of the framework. Extensions to other CMR modalities, such as diffusion or flow encoded MRI, are possible.

Authors’ contributions

LW, CS, SK devised MRXCAT in Matlab, created phantom examples, performed image reconstruction and post-processing. WPS implemented off-centres and angulation options in XCAT for this study. LW, SK acquired in-vivo data for reference and prepared the manuscript. All authors read and approved the final manuscript.