Myocardial tissue characterisation by T1 mapping and estimation of extracellular volume (ECV) by cardiovascular magnetic resonance (CMR) is playing an increasingly important role in the diagnosis and management of patients and clinical trials [1]. T1 mapping is available as three broad classes of sequences, on multiple platforms, at two field strengths. Factors influencing T1 mapping stability and inter-sequence comparisons are well understood [14] but little is known about T1 mapping delivery at a larger scale over many sites and there is no global quality assurance (QA) system.

The goal of the T1MES program (T1 Mapping and Extracellular volume Standardisation) was to construct an optimised phantom for QA of myocardial T1 mapping, covering a relevant range of T1 values with suitable T2 values for the tissues modelled. The proposed QA consists of regular scans using fixed T1-mapping protocols identical to whatever fixed protocols are used in vivo at each participating site. We therefore aimed for a phantom design that would have stable T1 values for as long as possible. We also aimed for a phantom design avoiding temperature sensitivity of its T1 values as explained later in Methods.

Such a QA system would form part of a system for optimal mapping precision and accuracy [2] within the increasingly known fundamental limitations of the T1 mapping methods [5, 6].

The T1 Mapping and ECV Standardization (T1MES) program therefore aimed to:

  1. Create a partnership of physicists, clinicians and national metrology institutes
  2. Design phantom systems for 1.5 T and 3 T for any manufacturer/sequence reflecting T1 values in myocardium and blood, pre- and post-Gadolinium-based contrast agents (GBCA)
  3. Reproducibly specify and mass produce phantoms with a rigorously repeatable process and to regulatory standards
  4. Distribute them to global CMR sites with detailed instructions for fortnightly scanning
  5. Publish full details of the formulation to encourage additional applications
  6. Measure confounders (e.g. temperature dependency)
  7. Analyse scans over 1 year to study the stability of T1 measurements over time at each scanner, including a temperature correction model for T1
  8. Curate phantom data long-term in an open access repository available for reuse/analysis
  9. Analyse the inter-site differences in T1 values and explore the deliverability of a technique-independent ‘T1/ECV Standard’ through local calibration

To date we have achieved steps 1 to 6 of this process, namely the development, testing, certification, QA protocol and preliminary results of T1MES. This paper summarises these first 6 milestones.



The term “phantom” refers to the complete test object (Fig. 1).

Fig. 1

Internal and external phantom structure. Internal (a; 3 T, looking at the front) and external (b; 1.5 T, front and back) T1MES phantom structure. The nine tubes are supported on a translucent resin base composed of unsaturated polyester/styrene. A careful hardening and curing process ensured a smooth surface finish for the resin base. The front of the phantom (b left) contains an isocenter cross label to aid positioning as well as an LCD thermometer. Careful positioning of the bottle on the scanner table (c) with the cap towards its head end is needed to ensure it is scanned at isocenter each time. HDPE = high-density polyethylene; LCD = liquid crystal display; NiCl2 = Nickel Chloride; PE = polyethylene; PVC = polyvinyl chloride

The term “tube” refers to each of the small bottles embedded within the phantom.

The “gel matrix” is the gel and bead mixture filling the phantom that surrounds all of the tubes.

Collaboration process

A design collaboration for developing and testing the T1MES phantom and its prototypes was established, consisting of clinicians, physicists, national metrology institutes (the US National Institute of Standards and Technology [NIST] and the German Physikalisch-Technische Bundesanstalt [PTB]) and a small-to-medium enterprise familiar with phantom production (Resonance Health [RH], Perth, Australia). Funding was secured, including a grant from the European Association of Cardiovascular Imaging. Time and expertise were provided free of charge by the partnership. To engage a global community with constrained funding, the phantoms were gifted (first come, first served) to centers with the proviso that they: a) scan them fortnightly for 1 year and upload the results; b) engage with the partnership to explore any unexpected results; c) do not do anything that could potentially compromise (a) or (b) (e.g. deconstruct the phantom object); and d) give proper reference to the T1MES project if they use the phantoms for other purposes.

Phantom design

The design process involved several prototype iterations (models A to D) before the final mass production of the E-models. Some aspects of the prototype A to D models that guided the final E-model design, such as artefacts, are described in Methods and in Fig. 2, with a timeline in Fig. 3. At the very least, the initial A to D models were needed to achieve reasonable T1 and T2 values without deleterious imaging artefacts, especially as imaging was conducted remotely from the manufacturer.

Fig. 2

Artifact examples in earlier prototypes (a-g) and final T1MES phantom (i, j). Four earlier prototypes (models A to D) were rejected before the final model. a Coronal image of the earlier A-model (aqueous fill) showing bright artifacts around the tubes resulting from bSSFP going off-resonance that would have led to variations in T1 values by MOLLI and similar sequences. b Transverse image of A-model showing the characteristic ‘cat’s head’ artifact of air-bubbles trapped in the paramagnetically doped aqueous tubes. Significant off-resonance artifact is also noticeable in the central tubes. c Another coronal image through A-model but with larger gaps between tubes showing the combined effect of motion artifact (due to the aqueous fill) and B0 distortion. d Transverse image of C-model attempting to use narrower tubes to pack 12 instead of 9, but significant Gibbs artifact can be seen in each tube. e Transverse image of C-model showing three small dark circular artifacts (12, 3 and 9 o’clock positions) caused by glue used to stabilize the tube arrangement. We subsequently switched to silicone-based glues that were less likely to trap air bubbles and were artifact-free. f Severe stabilisation artifact appearing as a thick dark band around the border of a D-model; here the phantom was scanned immediately after being received from the courier company and the bottle was still very cold from the transportation. Additionally susceptibility artifacts can be seen as thin linear bands spoiling some of the tubes (9 and 3 o’clock). g Significant image intensity inhomogeneity during a D-model test session on a GE scanner caused by accidental omission of the folded blanket, intended to separate the phantom bottle from the anterior chest coil. h Curved tube artifact and dark rings arising from ink printed onto the sides of digestive tubes (images courtesy of K. E. Keenan and NIST). i Coronal bSSFP localiser image and (j) typical T1 map of a final 3 T T1MES phantom obtained by MOLLI using a bSSFP readout on a Siemens 3 T Skyra scanner. bSSFP = balanced steady-state free precession; MOLLI = modified Look-Locker inversion recovery. Other abbreviation as in Fig. 1

Fig. 3

Prototype models and T1MES project timeline. CE = Conformité Européene; FDA = Food and Drug Administration; GE = General Electric; NIST = US National Institute of Standards and Technology; PTB = German Physikalisch-Technische Bundesanstalt; QA = quality assurance; RH = Resonance Health

The range of T1 and T2 values in the phantom aims to cover typical native and post-GBCA values in both myocardium and blood. The especially wide range of T1 post-GBCA (due to variable practice regarding dose, wash-out delays etc. and of course also disease) requires several tubes to cover it. From a review of published values and our own experience, we selected the values listed. Whatever rationale is adopted, with a limited number of tubes there will inevitably be gaps.

T1 is generally longer at 3 T compared to 1.5 T. Initially we aimed to design a single phantom for both 1.5 T and 3 T, containing a sufficient number of tubes to cover the needed T1 ranges in blood and myocardium, with suitable T2 values, pre and post-GBCA at both field strengths. However, the frequency dispersion (i.e. B0 field dependence) of relaxation times in the phantoms differed strongly from that of myocardium and blood, particularly for the long pre-GBCA tubes, requiring a total of 13 different tubes for 1.5 T and 3 T. Fitting 13 tubes into a single phantom would either have made the object ‘large’ (in relation to the B1 distortion at 3 T discussed below) or would have required the use of smaller calibre tubes. The following considerations therefore justify our construction of ‘field-specific’ phantoms:

  • Tubes had to be a minimum of 20 mm diameter so regions of interest (arbitrarily set to 13 mm) would exclude in-plane imaging artifacts at the boundaries between tubes related to the use of clinical T1 mapping protocols with coarse image resolution, mostly based on single-shot imaging (e.g. Gibbs artifact at the edge of tubes [Fig. 2d] or the potential impact of filtering against it applied differently by various protocol parameters). Altering protocols to optimise phantom scanning would be inconsistent with the aim of the project. The true resolution achieved is further convoluted by the use of asymmetric frequency-encoded readouts for faster repetition time (TR) in balanced steady-state free precession (bSSFP) imaging or partial-phase-encode sampling for shorter total shot duration, and to some extent also by signal variation during the shot.

  • Embedding tubes into a gel-filled phantom is important for three reasons: 1) to permit sufficient signal for scanner calibrations; 2) to minimise B0 and B1 field distortions local to each tube; and 3) for greater thermal stability. However, embedding all the 13 tubes (to cover 1.5 T and 3 T values) into a single phantom (whether water or water-based gel-filled) would have increased its overall dimensions making it harder to make (our tests and others [7, 8] show that B1 homogeneity across large ROIs could not be achieved especially at 3 T). Alternative oil-based phantoms have a smaller dielectric permittivity, useful for weaker radiofrequency (RF) displacement current distortion of B1, but the chemical shift of the matrix fill would require embedded tubes also to use oil-based chemistry (as in diffusion phantoms). Alkanes or similar [9] could not deliver the required range of T1 and T2 (written as T1|T2) and a predominantly single-peak nuclear magnetic resonance (NMR) spectrum, with the required temperature stability. By using separate water-based gel-filled phantoms for 1.5 T and 3 T with the known high permittivity of water, at a size large enough to fit the needed tubes there was still significant B1 distortion (range of different flip angles achieved for a prescribed protocol nominal flip-angle) but we were able to counteract it using a method described later.

  • This project aims to provide quality assurance for clinically used T1 protocols without adapting to the phantom (e.g. no switching to spoiled-gradient echo, or using shorter-TR, no alterations of resolution or field of view etc.; see Additional file 1). Clinical T1 mapping protocols are sensitive to off-resonance effects for various well-known reasons. Therefore, B0 distortion near any of the tubes needed to be minimised (tests showed that aligning the tubes with the B0 direction was best; data not shown).

Phantom materials

All materials proposed for phantoms to date suffer different deficiencies. We adopted the most suitable formulation known, namely paramagnetically doped agarose or carrageenan gels [10, 11]. Some of the main design aspects are listed in Table 1.

Table 1 Design factors when developing a T1 mapping phantom

Agarose or similar gel phantoms are widely used in MR research but less often in commercial phantoms, probably because of long-term stability issues discussed later. Gels permit independent variation of T1|T2 and they avoid fluid movement within the image slice during long inversion recovery (IR) times that could potentially introduce uncertainty in the T1* to T1 conversion [12]. A more concentrated gelling agent mainly shortens T2; a higher paramagnetic ion concentration mainly shortens T1 [11, 13]. The two effects are not independent but can be modelled [14], enabling design of mixtures with any required T1|T2 combination. We did not include sodium chloride (NaCl) (see B1 uniformity section below). Gel choices include carrageenan, gelatin, agar-agar, polyvinyl alcohol, silicone and polyacrylamide. Some have undesirable NMR spectral properties. The paramagnetic ion choice [15] includes copper, cobalt, iron, manganese (Mn2+), gadolinium and nickel (Ni2+). Due to the individual T1|T2 relaxivities of the various ions, no currently known ionic mixture in water can deliver the native myocardial T1|T2 combination (which requires a relatively high T1 with a short T2). Ni2+ was our first choice as the paramagnetic relaxation modifier as it is less temperature and frequency dependent than other ions [13, 16] and because nickel chloride (NiCl2) agarose gel phantoms have been shown to be stable over a 1 year period [17].

Characterization of T1 and T2 dependence on agarose and nickel

To achieve the required T1|T2 tube values we characterised the relation between T1|T2, agarose and NiCl2 concentrations. We made a wide variety of test mixtures as follows: we dissolved at 95 °C for 2.5 h, 135 different concentrations of NiCl2, water and agarose, each in a separate 50 ml digestive tube. Using a preheated serological pipette, samples were transferred into preheated NMR tubes (to prevent instant setting of the gel while flowing down the tube), allowed to set and analysed at a measuring temperature of 22 °C with a 1.4 T Bruker Minispec mq60 (60 MHz) relaxometer (Perth, Western Australia). Exponential fitting was done and T1 and T2 recorded. Based on these results we calibrated the equations [14] modelling the relationship between ingredients and T1|T2 relaxation times (omitting saline). The model assumes a linear relation between the ingredients and the relaxation rates (R1,R2) = (1/T1,1/T2). Using this the ingredients for any required T1|T2 tube could be calculated. The model was tested for the set of 13 unique T1|T2 combinations desired for the 1.5 T and 3 T phantoms. Some iterations (models A through D, Fig. 3) were required to derive from the model (based on a non-imaging 60 MHz relaxometer) tube values applicable to clinical 1.5 T and 3 T MR systems described later.
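As a rough illustration of how a linear relaxation-rate model can be inverted to obtain a recipe, the sketch below solves a two-by-two linear system for the NiCl2 and agarose concentrations that give a target T1|T2. The functional form (an intercept plus one term per ingredient), the units, and every coefficient value are assumptions made for illustration only; they are not the calibrated values from the relaxometry described above, and the published model [14] may differ in detail.

```python
# Minimal sketch: invert an assumed linear model
#   R1 = r1_0 + r1_ni * [Ni] + r1_ag * [agarose]
#   R2 = r2_0 + r2_ni * [Ni] + r2_ag * [agarose]
# All coefficients below are HYPOTHETICAL placeholders, not calibrated values.

HYPOTHETICAL_COEF = {
    "r1_0": 0.0003, "r1_ni": 0.0006, "r1_ag": 0.00005,  # per ms, per ms per mM, per ms per %
    "r2_0": 0.0005, "r2_ni": 0.0007, "r2_ag": 0.01,
}


def required_concentrations(t1_ms, t2_ms, coef):
    """Return (NiCl2 in mM, agarose in % w/v) for a target T1|T2 under the assumed model."""
    b1 = 1.0 / t1_ms - coef["r1_0"]
    b2 = 1.0 / t2_ms - coef["r2_0"]
    a11, a12 = coef["r1_ni"], coef["r1_ag"]
    a21, a22 = coef["r2_ni"], coef["r2_ag"]
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("model is degenerate: ingredients are not separable")
    # Cramer's rule for the 2x2 system
    ni = (b1 * a22 - a12 * b2) / det
    agarose = (a11 * b2 - b1 * a21) / det
    return ni, agarose


def predicted_times(ni, agarose, coef):
    """Forward model: return (T1, T2) in ms for given concentrations."""
    r1 = coef["r1_0"] + coef["r1_ni"] * ni + coef["r1_ag"] * agarose
    r2 = coef["r2_0"] + coef["r2_ni"] * ni + coef["r2_ag"] * agarose
    return 1.0 / r1, 1.0 / r2


# Example: the outer matrix gel target of 846 ms | 141 ms
ni_mm, agarose_pct = required_concentrations(846.0, 141.0, HYPOTHETICAL_COEF)
```

A negative concentration returned by such an inversion would indicate a T1|T2 combination that the assumed model cannot reach with these two ingredients alone.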

B0 uniformity

The approximately cuboid, outer body of the T1MES NiCl2-agarose gel phantom (Fig. 1a) consisted of a short, hollow, wide necked and leakproof brown-transparent polyvinyl chloride bottle with a melting temperature of 140 °C (Series #310-73353, Kautex Textron GmbH & Co. KG, Bonn, Germany). The adopted shape is more ellipsoidal than many of the shapes rejected in our tests, consistent with basic magnetostatics (sphere of Lorentz) at 1.5 T and 3 T. The B0 distortion by the phantom arises from electronic diamagnetism and is not significantly affected by the paramagnetic ion concentrations used. Adding sufficient paramagnetic material to cancel the diamagnetism and flatten B0 would excessively shorten the relaxation times.

The final body shape gave sufficient B0 uniformity for T1 mapping over only a small region approximately halfway along its length when aligned coaxially with B0. Regions towards the cap and base of this object were subject to off-resonance errors [18]. The tubes inside the phantom were therefore not fixed directly down to the base of the main bottle. A 20 mm layer of non-coloured (non-saturated) polystyrene resin (Diggers Casting and Embedding Resin 500GM, #FIE00506-9311052000759, Recochem Inc. Perth, Western Australia) was first set hard in the base of the main bottle, and the tubes were adhered to the top of this layer, so that the tubes occupied the middle of the phantom in the cap-to-base direction, where the B0 field is optimally uniform. B0 uniformity was mapped to evaluate this cause of distorted T1 estimates, using a multi-echo gradient echo sequence based on the phase difference between known echo times [19]. A frequency range of +/−50 Hz across the phantom was regarded as acceptable based on published T1-mapping sensitivity to off-resonance [18].
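The phase-difference approach to B0 mapping can be sketched as follows: for two echoes separated by ΔTE, the off-resonance frequency at a pixel is Δφ / (2π ΔTE), where Δφ is the phase accrued between the echoes. The code below is a minimal per-pixel illustration with complex signals held in nested lists; the function names, the sign convention, and the synthetic echo times are assumptions, and it does not reproduce the specific sequence or processing of [19].

```python
import cmath
import math


def off_resonance_hz(signal_te1, signal_te2, delta_te_s):
    """Off-resonance (Hz) from the phase difference between two complex echoes.

    Unambiguous only within +/- 1 / (2 * delta_te_s); no phase unwrapping is done.
    """
    delta_phi = cmath.phase(signal_te2 * signal_te1.conjugate())
    return delta_phi / (2.0 * math.pi * delta_te_s)


def frequency_map_hz(echo1, echo2, delta_te_s):
    """Apply the two-echo estimate pixel by pixel to 2D lists of complex values."""
    return [
        [off_resonance_hz(s1, s2, delta_te_s) for s1, s2 in zip(row1, row2)]
        for row1, row2 in zip(echo1, echo2)
    ]


def within_tolerance(freq_map, limit_hz=50.0):
    """True if every pixel lies inside the +/- limit_hz acceptance range."""
    return all(abs(f) <= limit_hz for row in freq_map for f in row)


def synthetic_echo(freq_hz, te_s):
    """Unit-magnitude signal with phase 2*pi*f*TE (assumed sign convention)."""
    return cmath.exp(2j * math.pi * freq_hz * te_s)


# Synthetic example: a 2x2 map of known offsets, echoes at 2 ms and 4 ms
true_offsets = [[-30.0, 10.0], [0.0, 45.0]]
echo1 = [[synthetic_echo(f, 0.002) for f in row] for row in true_offsets]
echo2 = [[synthetic_echo(f, 0.004) for f in row] for row in true_offsets]
estimated = frequency_map_hz(echo1, echo2, 0.002)
```

With a 2 ms echo spacing the estimate wraps beyond ±250 Hz, which comfortably covers the ±50 Hz acceptance range quoted above.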

B1 uniformity

B1 uniformity in large water-based phantoms [20, 21] is complex but fundamentally the electric dipole moment of the water molecule rotates in the oscillating electric field associated with the RF B1 field, giving rise to displacement current. Sucrose or other large nonionic molecules can reduce water permittivity, by in effect diluting the problematic water molecules. However, the spectral contribution of such molecules at the high concentrations required is a severe complication. An alternative approach often described in phantom literature is the addition of sodium chloride or similar simple ionic solutes (n.b. not to be confused with high permittivity of powdered titanates, suspended in deuterated water). This tackles the problem from a different direction as it leaves the permittivity unchanged but increases the conductivity (σ) instead, to reduce ωε/σ, i.e. the ratio of displacement current to conduction current. Adding NaCl to the T1MES phantom acted on B1 distortion at a shallower depth in the phantom and did not cancel the overall B1 curvature at any NaCl concentration tested.
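To make the ratio ωε/σ concrete, the sketch below evaluates it at the proton Larmor frequency for a given field strength. The relative permittivity of about 80 for water and the conductivity values used in the example are illustrative assumptions, not measurements of the T1MES gel.

```python
import math

EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, F/m
GAMMA_MHZ_PER_TESLA = 42.577478518  # proton gyromagnetic ratio / (2*pi)


def displacement_to_conduction_ratio(b0_tesla, relative_permittivity, conductivity_s_per_m):
    """Return omega * epsilon / sigma at the proton Larmor frequency."""
    omega = 2.0 * math.pi * GAMMA_MHZ_PER_TESLA * 1e6 * b0_tesla
    epsilon = relative_permittivity * EPSILON_0
    return omega * epsilon / conductivity_s_per_m


# Illustrative values only: water-like permittivity, hypothetical conductivities
ratio_low_salt = displacement_to_conduction_ratio(3.0, 80.0, 0.1)
ratio_high_salt = displacement_to_conduction_ratio(3.0, 80.0, 1.0)
```

Raising the conductivity lowers the ratio, while lowering the permittivity (the sucrose or bead approach) lowers it from the other side; the ratio also scales linearly with field strength.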

In this work, deriving from the sucrose approach, we hypothesised that mixing plastic beads into the matrix gel might also effectively dilute the dielectric permittivity of water and lead to improved B1 uniformity without directly altering the outer matrix gel T1|T2 values (see Table 2, 846 ms | 141 ms). Our choice of outer matrix gel T1|T2 values was informed by tests looking at different outer matrix gel T1|T2 combinations (data not shown) and their impact on bSSFP-stabilisation artifacts at both field strengths. For the beads, two different kinds of plastic bead were evaluated: highly monosized microbeads composed of crosslinked poly methyl-methacrylate (PMMA) polymer (6 μm, Spheromers, Microbeads AS, Norway) and high-density polyethylene (HDPE) beads of oblate spheroidal form (3 mm polar axis by 4.2 mm equatorial diameter) consisting of smooth, semi-translucent, colourless HDPE with a melt index >60 °C (HDPE Marlex HHM 5502 BN, Chevron Phillips Chemical Company LP, Texas, USA). It is important to control the supply of HDPE pellets to ensure that they have not been reground, reblended or otherwise modified. The two different plastic bead versions of T1MES matrix gel were compared to the use of sucrose or sodium chloride (formulations tested: (1) added to 1050 ml of Ni2+-doped gelling solution, separately and in combination = 800 g sucrose, 50 g NaCl; (2) added to 1000 ml of distilled water containing NiCl2 and MnCl2 with T1 ~ 600 ms, T2 ~ 170 ms: 5 g NaCl; (3) added to 2534 ml of distilled water: 1 g, 4 g, 6.5 g, 11.5 g, 14 g, 19 g, 21.5 g NaCl). B1 homogeneity was evaluated by flip angle (FA) maps derived by the double angle method using FA 60° and 120° (θ1, 2*θ1) by long TR (8 s) scanning using a 4 ms duration sinc (−3π to +3π) slice excitation width to minimise error due to FA variation through the slice.
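The double angle method mentioned above can be sketched in a few lines. With fully relaxed magnetisation (long TR), the signal at flip angle θ scales with sin θ and the signal at 2θ with sin 2θ = 2 sin θ cos θ, so the achieved angle is θ = arccos(S(2θ) / (2 S(θ))). The function names below are hypothetical, and the sketch assumes signed or unambiguous magnitude signals and ignores slice-profile effects.

```python
import math


def achieved_flip_angle_deg(signal_alpha, signal_2alpha):
    """Achieved flip angle (degrees) from signals at nominal alpha and 2*alpha."""
    ratio = signal_2alpha / (2.0 * signal_alpha)
    # Clamp against noise pushing the ratio just outside the acos domain
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))


def b1_scale(signal_alpha, signal_2alpha, nominal_deg=60.0):
    """Ratio of achieved to nominal flip angle (1.0 means perfect B1)."""
    return achieved_flip_angle_deg(signal_alpha, signal_2alpha) / nominal_deg


# Synthetic pixel where only 90 % of the nominal 60 degree flip angle is achieved
actual_deg = 0.9 * 60.0
s1 = math.sin(math.radians(actual_deg))        # acquisition at nominal 60 degrees
s2 = math.sin(math.radians(2.0 * actual_deg))  # acquisition at nominal 120 degrees
scale = b1_scale(s1, s2, nominal_deg=60.0)
```

Applying `b1_scale` pixel by pixel gives a relative B1 map whose spread across the phantom can be compared between matrix fills.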

Table 2 List of T1|T2 values for the target 13 tubes and outer matrix gel and the required agarose/NiCl2 concentrations for the final phantom

Temperature dependence of T1 and T2

Temperature dependency experiments on T1|T2 values [15] were carried out at various stages:

  • Test 1: Performed at the PTB laboratory in June 2015 on a 3 T prototype-D (whole phantom with 9 tubes) across 17 temperatures between 14.9 °C and 32.0 °C for T1 and across 6 temperatures between 15.6 °C and 31.1 °C for T2. Each measurement was repeated twice (with a 2 day gap) and made using a 3 T Siemens Magnetom Verio system (VB17) and a 12-channel head coil.

  • Test 2: Performed at the NIST laboratory in November 2015 on six loose tubes from the final production run of E-model phantoms. T1|T2 were measured at 9.9, 17.1, 20.1, 23.1 and 30.1 °C on an Agilent 1.5 T small bore scanner in a temperature-controlled environment. Temperatures were measured using a fiber optic probe. T1 was measured by inversion-recovery spin echo (IRSE) (TR [s] = 10, inversion time [TI, ms] = 50, 75, 100, 125, 150, 250, 500, 1000, 1500, 2000, 3000) and T2 by SE (TR [s] = 10, TE [echo time, ms] = 15, 30, 60, 120, 240, 480, 960). Note that some of the data acquired under short-term reproducibility was obtained in support of temperature Test 2.
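For readers unfamiliar with IRSE relaxometry, the following sketch estimates T1 from signed inversion-recovery data using the inversion times listed for Test 2. It assumes a mono-exponential model S(TI) = A(1 − 2 exp(−TI/T1)), perfect inversion, and TR long enough for full recovery; it uses a simple grid search with a closed-form amplitude, which is a substitute chosen for brevity and not necessarily the fitting procedure used at NIST or PTB.

```python
import math

# Inversion times (ms) as listed for Test 2
TI_MS = [50, 75, 100, 125, 150, 250, 500, 1000, 1500, 2000, 3000]


def ir_model(ti_ms, t1_ms):
    """Normalised inversion-recovery curve, assuming perfect inversion and TR >> T1."""
    return 1.0 - 2.0 * math.exp(-ti_ms / t1_ms)


def fit_t1_grid(ti_list_ms, signals, t1_grid_ms):
    """Grid search over T1; the amplitude is solved by linear least squares per candidate."""
    best = None
    for t1 in t1_grid_ms:
        model = [ir_model(ti, t1) for ti in ti_list_ms]
        amplitude = sum(s * m for s, m in zip(signals, model)) / sum(m * m for m in model)
        rss = sum((s - amplitude * m) ** 2 for s, m in zip(signals, model))
        if best is None or rss < best[0]:
            best = (rss, t1, amplitude)
    return best[1], best[2]


# Synthetic, noise-free example with a hypothetical tube of T1 = 1000 ms
true_t1_ms, true_amplitude = 1000, 100.0
signals = [true_amplitude * ir_model(ti, true_t1_ms) for ti in TI_MS]
t1_fit, amplitude_fit = fit_t1_grid(TI_MS, signals, range(100, 3001, 5))
```

Magnitude-only data would first need polarity restoration, and repeating the fit at each measured temperature would give the T1 versus temperature curve for a tube.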

Short-term reproducibility

Short-term reproducibility (single site, single manufacturer, single sequence) aided temperature sensitivity work and assessed baseline variability between fortnightly scans with all other parameters constant (not least, temperature). For the final T1MES phantom (E-model) two short-term reproducibility experiments were performed:

  • Test 1: Six loose tubes from the final production run of E-model phantoms were tested for short-term reproducibility of T1|T2 values at the NIST laboratory in November 2015, at 20.1 °C on an Agilent 1.5 T small bore scanner. T1 was measured by IRSE (TR [s] = 10, TE [ms] = 14.75, TI [ms] = 50, 75, 100, 125, 150, 250, 500, 1000, 1500, 2000, 3000) and T2 by SE (TR [s] =10, TE [ms] =14.75, 20, 40, 80, 160).

  • Test 2: One of the final E-model phantoms for 3 T was tested for short-term repeatability of T1|T2 values using a Siemens 3 T Skyra at Royal Brompton Hospital in November 2015. This test was performed by removing and repositioning the receiver coil, phantom and its supports on each of ten runs, incurring full readjustment of all scanner setup procedures before each run. The acquired data comprised ten runs, each containing two repeated T1 maps, performed at 20.3 ± 0.5 °C. An extension of this work showed that the temperature increase of the T1MES phantom caused by specific absorption rate (SAR) deposition during imaging for repeated T1 maps was negligible.
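One common way to summarise repeated measurements of this kind is the coefficient of variation (CoV) of the per-run mean T1 in each tube. The sketch below shows that calculation; it is offered only as an illustration, since the text above does not state which repeatability statistic was used, and the T1 values in the example are invented.

```python
import statistics


def coefficient_of_variation_percent(values):
    """Sample standard deviation divided by the mean, expressed in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical per-run mean T1 values (ms) for one tube over ten repositioned runs
example_runs_ms = [1001.2, 999.5, 1000.8, 998.9, 1000.1,
                   1001.0, 999.7, 1000.4, 999.2, 1000.6]
cov_percent = coefficient_of_variation_percent(example_runs_ms)
```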

Detailed construction of phantoms

Some of the detailed construction topics and constraints are listed in Table 1.

Each phantom (1.5 T or 3 T) contains nine tightly capped digestive tubes (#SC475, 50 ml from Environmental Express, South Carolina, USA) embedded in a gel matrix containing Nickel (II) Chloride hexahydrate (99.9999 % purity grade, Acros Organics, New Jersey USA, n.b. highly hygroscopic), high purity deionized water (Ibis Technology) and polysaccharide agarose powder with low endosmotic flow for electrophoresis (molar ratio ≤0.07, Acros Organics).

Mass production was from large batches of 14 solutions (13 tubes + outer matrix gel, Table 2) from which all the tubes and outer containers were filled accordingly. The mass production required some caution against deterioration of the agarose/NiCl2 mixtures if kept at high temperatures for periods exceeding around 8 h. The production of all copies of each tube therefore had to be completed within a single working day and as rapidly as possible. Deterioration was noted as a change of agarose gel colour from colourless to faint yellow. Microwave oven heating for initial agarose dissolution was followed by further magnetically-stirred heating and adjustments (based on relaxometry of samples from the mixture). Stirring was essential for uniform gel production into all copies of each tube. Each of the nine tubes is filled with a differently doped agarose gel and contains minimal air gaps. Agarose gel contracts as it sets solid, contracting more in stronger agarose mixtures. By “topping up” with more gel the space left by contraction after the initial fill had set in each tube, the air gap can be minimised. Further, by cooling the tubes from the base (by standing them in approximately 2 cm depth of cold water), the gel solidified from the base upward so that contraction left a gap at the top of the tube for adding the “top-up”. This practical step was essential to avoid the mid-gel contraction gaps that otherwise form when the gel is allowed to set naturally, because it then sets earlier along the tube sidewalls. Such mid-gel gaps tend to cause a tear down the middle of the gel-filled tube making it unusable for ROI placement in images. The dissolving and solidifying temperatures of agarose gel show hysteresis, dissolving fully only near boiling-point, but requiring cooling to around 45 °C for solidification. The hysteresis assists practically, for example when pouring molten gel around the HDPE beads needed for the main matrix fill.

Of the 18 tubes used in the 1.5 T and 3 T phantoms, 4 are 1.5 T specific, 4 are 3 T specific (because tissue native T1 is longer at 3 T) and 5 (the post-GBCA tubes) are common to both field strengths (Fig. 4). Although some difference in post-GBCA T1 values does occur between 3 T and 1.5 T, this difference is absorbed within the very wide range of GBCA doses, post-GBCA times, GBCA types etc. in clinical use. Therefore 13 individual recipes were made. The 9 tubes in each field-specific phantom generate 9 different T1|T2 combinations (Fig. 5) modelled to cover the physiological range of native and post-GBCA, blood and myocardium in health and disease. No macromolecules were added and no attempt was made to model magnetisation transfer [22].

Fig. 4

T1 and T2 values in T1MES. T1 and T2 values in the phantom mimic those of myocardium and blood pre and post-GBCA at 1.5 T (Panel a) and 3 T (Panel b). The 13 relaxometry scopes (refer to Table 2) are explained in the figure. Slow scan reference data for T1|T2 is displayed in green (for T1 by slow IRSE and for T2 by slow SE, RR interval 900 ms at 21 ± 2 °C), T1 values shown in orange represent the mean value per tube derived from tests on five of the E-model phantoms (using a 5(3)3 256-matrix RR = 900 ms at 21 ± 2 °C variant of MOLLI adapted for native T1 mapping; Siemens WIP 448B at 1.5 T and WIP 780B at 3 T), and in blue are T1|T2 values obtained by the manufacturer in Australia using a 1.4 T Bruker minispec relaxometer at 22 °C. Tube arrangement is such that long T1 tubes potentially suffering from more artifacts are kept towards the middle of the phantom and away from the corners. GBCA = gadolinium-based contrast agents; IRSE = inversion recovery spin echo; myo = myocardium; RR = inter-beat interval; SE = spin echo. All T1|T2 values are stated in ms. Other abbreviation as in Fig. 2

Fig. 5

T1 and T2 relaxation times versus ingredients at 1.4 T: agarose and NiCl2. Grid represents results of the model. Red points represent single measurements. a Longitudinal relaxation time constant (T1), RMSE in R1 compared to the linear model was 4.8 × 10−5 /ms. b Spin–spin relaxation time (T2), RMSE in R2 compared to the linear model = 5.3 × 10−4 /ms. Since the x and y axes of both fits are comparable, the ingredient that contributes most can be identified. RMSE = root mean square error

After pouring in the resin base and leaving it to set, we adhered the 9 filled tubes on top of this base using an uncoloured hotmelt based on an ethylene vinyl acetate and polypropylene mixture, of the kind typically applied from a “hot glue gun”. We then packed the compact HDPE pellets into the bottle and poured in the agarose/NiCl2 mixture (typically at a temperature ~ 50–60 °C), taking care to prevent air pockets from forming in the matrix gel fill.

The T1MES phantom has a volume of 2 l, inner length of 187 mm and inner body cross section 122 mm by 122 mm. The labels show an isocenter cross mark, the correct orientation for positioning it under an anterior chest coil, and a serial number and date of manufacture. Also attached to the outside of the phantom is a liquid crystal display (LCD) thermometer of 1 °C resolution. Notably some pigments used on plastic tubes distort the magnetic field [12] (Fig. 2h), so all components were tested carefully, rigorously sourced and documented to avoid unexpected changes which could affect future production batches. Even with the efforts to optimise B0 and B1 uniformity, some T1|T2 combinations are more sensitive to off-resonance errors so these tubes were placed centrally in the phantom avoiding corner locations of greater B0/B1 error (explaining the otherwise somewhat counterintuitive ordering of tubes according to their T1 values).

Production of one phantom took on average 5 h (distributed over batch production not serial manufacture). As the phantom build was all by manual labour and not automated, it took 3 weeks and four full-time members, 340 h in total to produce the 69 phantoms in this batch.

Prototype and production batch testing and quality control

Reproducible manufacturing was established for all tubes. Three prototypes (models A to C) had unsatisfactory B0 and B1 uniformities before the satisfactory model-D design. Between June and August 2015, 10 D-model phantoms (five for each of 1.5 T and 3 T) were characterized at ten experienced CMR centers for artifacts and for initial verification of the tube T1|T2 values. In September 2015, the final batch of artifact-free (Fig. 2i, j) T1MES phantoms (E-models) were mass-manufactured and shipped to CMR centers worldwide.

All aspects of phantom production conducted at the RH laboratory were performed in accordance with their certified quality management system including the recruitment and training of staff and the quality control checks of final phantoms. Prior to the mass manufacturing, extensive experiments were done in order to set up the standard operating procedures and working instructions to ensure final phantom integrity. Quality control was ensured at three levels: operator level (e.g. careful choice of materials), engineering level (e.g. the responsible process engineer conducted in-production tests/measurements and inspections, such as checks for bubbles in the tubes and bottle seals, and based on the outcome of this analysis, initiated improvement activities) and management level (e.g. by facilitating training and identifying better measurement or production equipment that could be used for future batches). Operator level quality control evaluated phantoms in real-time during the production process through visual inspection to ensure production ran smoothly, predictably, and to the required standards (e.g. by ensuring a flat resin surface, correctly sealed tubes, tight bead packing of the outer matrix gel, etc.). Overall phantom integrity was also visually checked for any production defects prior to shipment (e.g. precise alignment of isocenter cross label correctly offset from the upper surface of the resin base, no distortion of the outer bottle due to excessively hot gel etc.).

Phantom calibration and validation has limitations as phantoms do not fully model tissue (see Discussion). Nonetheless, ‘ground truth’ values in phantoms were measured using slow scanning ‘gold standard’ sequences that have previously demonstrated accuracy in phantom work. Of the 69 final E-model phantoms, 10 (14.5 %; 5 at each of 1.5 T and 3 T) underwent ‘gold standard’ slow T1 measurements by IRSE (8 TIs from 25–3200 ms) and T2 measurements by slow SE (8 TEs from 10–640 ms) at a single center (Royal Brompton Hospital; Siemens, 1.5 T Aera and 3 T Skyra; Fig. 6). These slow T1|T2 measurements were only performed once and the results used as ‘ground truth’ for the subsequent measurements. In addition, all tubes were relaxometer-certified pre-assembly.

Fig. 6

Reference T1|T2 values. Variation in the mean T1 (red dots) and T2 (blue dots) reference values and standard deviation (whiskers) of the nine tubes averaged for the ten final batch T1MES phantoms that underwent ‘gold standard’ slow T1 and T2 measurements by IRSE and SE respectively at 1.5 T (a) and 3 T (b). T1 values obtained by the MOLLI 5(3)3 [256] pre-GBCA sequence (WIP# 448B at 1.5 T and WIP# 780B at 3 T; green dots) are also shown. Abbreviations as in Figs. 2 and 4

Scanning protocol for T1MES

A fundamental aspect of T1MES was to invite each site to submit phantom data acquired with whichever T1 mapping sequence it was using clinically. We did not pre-specify any aspect of the T1 mapping sequence. We asked only that sites carefully replicate the phantom position and setup, leave the clinically used parameters unaltered, and keep the chosen T1 mapping protocol fixed throughout the period of T1MES repeat scans (as specified in the JCMR Consensus Guidelines for T1/ECV). If changes were inevitable, for example due to scanner upgrades, a method of informing T1MES has been implemented and is described in the manual (Additional file 1). Instructions for adjusting and sizing the shim volume did need to be vendor-specific, and these are explained in the appendix of the T1MES user manual circulated to all participants.

At all participating T1MES sites, the final phantom is currently scheduled for fortnightly scanning for 1 year using a fixed protocol for inter-scan test-retest analysis. Some centers are additionally repeating the scan using the same sequence at the same position, providing the data necessary for short-term intra-scan test-retest analysis. Results from this longitudinal data collection are expected to be published in 2017. The T1MES user manual and QA protocol [23] stipulate that the T1MES phantom be kept in the MR magnet room (for stability and also so that its internal temperature will match that displayed by the surface LCD label) and imaged every 2 weeks for 1 year using a consistent coil and phantom arrangement. The T1MES user manual emphasises that image parameters be kept unchanged for serial scans except for automatic adjustments of FA and reference frequency. The user manual specifies the range of acceptable positioning of the phantom in the scanner, aligned with the main magnetic field. The phantom is scanned axially halfway along the length of the 9 internal tubes, corresponding to halfway along the length of the main bottle, imaging only that slice, to avoid z-end B0 distortion. To ensure consistent adjustments of B0 and scanner reference frequency over the phantom at each repeat scan, the shim volume (also referred to as adjustments volume, adjust region, shim region, shim box) is identically sized and positioned on the phantom bottle for each scan (see Additional file 1). The scan protocol is kept identical for serial scans at each center. Centers were requested to use the same standard anterior chest coil each time.

The minimum fortnightly contribution to T1MES consists of conventional CMR scans: A) the initial localizers; B) at least one T1 mapping sequence with the simulated electrocardiogram set at 67 beats per minute (inter-beat [RR] interval 900 ms). The T1MES QA program generates three main types of multicenter data: 1) raw data from long reference scans for T1 (IRSE) and T2 (SE), which we reconstruct on receipt; 2) raw T1 mapping data from those centers unable to reconstruct their own maps locally, for which we reconstruct the maps on receipt; 3) reconstructed T1|T2 maps (majority of sites). T1|T2 values were taken as the mean within circular ROIs of fixed diameter placed in each of the nine tubes on the pixel-wise maps.
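The ROI measurement described above can be sketched as follows. This is an illustrative example only, not the T1MES analysis software: the function name `circular_roi_mean`, the pixel-centre inclusion rule and the synthetic 7 × 7 map are assumptions introduced here.

```python
def circular_roi_mean(pixel_map, centre_row, centre_col, radius_px):
    """Mean of all pixels whose centre lies within radius_px of the ROI centre.

    pixel_map is a list of rows (e.g. a pixel-wise T1 map in ms).
    """
    values = []
    for r, row in enumerate(pixel_map):
        for c, value in enumerate(row):
            if (r - centre_row) ** 2 + (c - centre_col) ** 2 <= radius_px ** 2:
                values.append(value)
    if not values:
        raise ValueError("ROI contains no pixels")
    return sum(values) / len(values)


# Synthetic 7 x 7 map: a disc of 1000 ms (the 'tube') on a 0 ms background.
# These numbers are illustrative and are not T1MES data.
synthetic_map = [[0.0] * 7 for _ in range(7)]
for r in range(7):
    for c in range(7):
        if (r - 3) ** 2 + (c - 3) ** 2 <= 9:
            synthetic_map[r][c] = 1000.0

# An ROI of radius 2 px sits entirely inside the tube of radius 3 px,
# so edge pixels and the background do not contaminate the mean.
roi_mean = circular_roi_mean(synthetic_map, centre_row=3, centre_col=3, radius_px=2)
print(roi_mean)
```

Keeping the ROI diameter fixed and smaller than the tube, as in this sketch, avoids partial-volume pixels at the tube wall.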

Within the network are sites using identical magnets, coils and protocols providing an opportunity for a wide range of inter-sequence and inter-site analyses (scheduled for 2017).


Statistical analysis was performed in the R programming language (version 3.0.1, The R Foundation for Statistical Computing). Descriptive data are expressed as mean ± standard deviation except where otherwise stated. Distribution of data was assessed on histograms and using the Shapiro-Wilk test. The coefficient of variation (CoV) between repeated scans was calculated as a measure of reproducibility. To define the model relating the ingredients to the relaxation rates (R1|R2), the parameters were found by fitting a surface for both T1 and T2 using the MATLAB (The MathWorks Inc., Natick, MA, USA, R2012b) curve fitting tool and the linear least-squares approach. Incoming T1MES datasets are analysed using a MATLAB graphical user interface. Mean T1 and T2 values were measured from each of the nine contrast tubes: using the ROI measurement tool in MATLAB, the mean signal intensity of the central 50 % area of each tube was calculated.
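For reference, the CoV between repeated scans can be computed as in the short sketch below. The text does not state whether the sample or the population standard deviation was used; the sample standard deviation (n − 1 denominator) is assumed here, and the three repeat values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CoV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CoV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical T1 values (ms) for one tube from three repeat scans.
repeat_t1_ms = [1000.0, 1010.0, 990.0]
cov_percent = coefficient_of_variation(repeat_t1_ms)
print(round(cov_percent, 2))
```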


Model predictions of T1 and T2

Linear models for longitudinal and transverse relaxation rates R1|R2 in terms of the ingredients agarose and NiCl2 can be written following similar work previously published [14]:

$$ {R}_x/{\mathrm{ms}}^{-1} = {a}_x+{b}_x\ {C}_{w, agarose}/\%+{c}_x{C}_{{\mathrm{Ni}}^{2+}}/\ \mathrm{m}\mathrm{M} $$

where x = 1, 2; \( {C}_{w, agarose} \) and \( {C}_{{\mathrm{Ni}}^{2+}} \) are the weight and molar concentrations of agarose and Ni2+, respectively; and \( {a}_x \), \( {b}_x \) and \( {c}_x \) are found by surface fitting (Fig. 5):

$$ {a}_1 = 3.750\times {10}^{-4},\kern0.5em {b}_1 = 8.790\times {10}^{-6},\kern0.5em {c}_1 = 6.683\times {10}^{-4} $$
$$ {a}_2 = 1.645\times {10}^{-4},\kern0.5em {b}_2 = 7.622\times {10}^{-3},\kern0.5em {c}_2 = 7.201\times {10}^{-4} $$

From these relationships, replacing relaxation rate \( {R}_x \) by relaxation time \( {T}_x = 1/{R}_x \), we calculated the required agarose % (by weight) and Ni2+ concentration (equal to the added molar concentration of NiCl2·6H2O, as it is highly dissociated) for each of the 13 tube stock solutions, as shown in Table 2.
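The two linear relations form a 2 × 2 system in the agarose and Ni2+ concentrations, so a target (T1, T2) pair can be converted to a recipe directly. The sketch below uses the fitted coefficients quoted above; the function names and the example target of T1 = 1000 ms, T2 = 50 ms are hypothetical and are not taken from Table 2.

```python
# Fitted coefficients as quoted in the text.
# Units: ms^-1, ms^-1 per % agarose (w/w), ms^-1 per mM Ni2+.
A1, B1, C1 = 3.750e-4, 8.790e-6, 6.683e-4  # longitudinal rate R1
A2, B2, C2 = 1.645e-4, 7.622e-3, 7.201e-4  # transverse rate R2


def predict_relaxation_times(agarose_pct, nickel_mm):
    """Forward model: return (T1, T2) in ms for a given recipe."""
    r1 = A1 + B1 * agarose_pct + C1 * nickel_mm
    r2 = A2 + B2 * agarose_pct + C2 * nickel_mm
    return 1.0 / r1, 1.0 / r2


def recipe_for_targets(t1_ms, t2_ms):
    """Inverse model: solve the 2 x 2 linear system by Cramer's rule.

    Returns (agarose % by weight, Ni2+ concentration in mM).
    """
    rhs1 = 1.0 / t1_ms - A1
    rhs2 = 1.0 / t2_ms - A2
    det = B1 * C2 - C1 * B2
    agarose_pct = (rhs1 * C2 - C1 * rhs2) / det
    nickel_mm = (B1 * rhs2 - rhs1 * B2) / det
    return agarose_pct, nickel_mm


# Hypothetical target inside the stated validity range
# (T1 = 300-1900 ms, T2 = 40-300 ms).
agarose_pct, nickel_mm = recipe_for_targets(1000.0, 50.0)
t1_check, t2_check = predict_relaxation_times(agarose_pct, nickel_mm)
print(agarose_pct, nickel_mm, t1_check, t2_check)
```

Because agarose dominates R2 and Ni2+ dominates R1, the system is well conditioned. Targets outside the fitted range can still return negative concentrations, which should be rejected.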

The presented model was accurate to within the root-mean-square errors (RMSE) given in the Fig. 5 caption over the range T1 = 300–1900 ms and T2 = 40–300 ms, which covers the relaxation times expected in healthy and diseased myocardium pre- and post-GBCA.

Reference T1 and T2 values

Comparison of ‘gold standard’ T1 and T2 values (Fig. 6) between the ten E-model phantoms tested confirmed the reproducibility of manufacturing. Across the 9 tubes, the CoV for T1 ranged from 0.17 to 1.25 % at 1.5 T and 0.08 to 1.0 % at 3 T, while the CoV for T2 ranged from 0.74 to 2.12 % at 1.5 T and 0.40 to 1.72 % at 3 T.

B0 uniformity

Final phantoms were free of air bubbles and susceptibility artifacts at both field strengths. T1 maps were obtained in the specified mid-phantom slice at the specified scan setup, and were free from off-resonance artifacts (Fig. 2i, j). Provided the bottle was placed coaxial with the z-axis, imaged as a transverse slice halfway along its length, and shimmed as specified in the T1MES manual, B0 uniformity was delivered (Fig. 7a) to within ±30 Hz at 3 T.
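To put the ±30 Hz figure in context, an off-resonance in Hz can be expressed in parts per million of the proton resonance frequency. The worked value below assumes a nominal field of exactly 3 T and a proton gyromagnetic ratio of about 42.58 MHz/T; the true operating frequency of a given scanner may differ slightly, so the result is approximate:

$$ \frac{\Delta f}{f_0} \approx \frac{30\ \mathrm{Hz}}{42.58\ \mathrm{MHz}/\mathrm{T}\times 3\ \mathrm{T}} \approx \frac{30\ \mathrm{Hz}}{127.7\ \mathrm{MHz}} \approx 0.23\ \mathrm{ppm} $$

This is of the same order as the approximately 0.25 ppm quoted in the Fig. 7 caption.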

Fig. 7

B0 and B1 field homogeneity. a B0 field homogeneity across the nine phantom compartments as a measure of off-resonance in Hz at 3 T (single E-model phantom results). These are extremely small shifts in frequency (30 Hz = 0.25 ppm) at 3 T and should not be regarded as significantly different between the tubes. b Diagonal profile of the B1 field (as per green discontinuous line in the inset) comparing relative flip angles on a Siemens 3 T system. Variance of B1 across the 9 compartments was smallest (CoV 1.54 %) for HDPE beads, which consisted of smooth, semi-translucent, colourless compact discs (as colouring in plastics has the potential to distort the B0 magnetic field [12], see Fig. 2h) with a melt index >60 °C. We chose pellets that had not been reground or reblended and were not composite for this purpose. The highly monosized microbeads measured 6 μm and were composed of crosslinked PMMA polymer. Neither microbeads, sucrose nor NaCl were comparably effective in flattening the B1 field. PMMA = poly methyl methacrylate. Other abbreviation as in Fig. 4

B1 uniformity

The compact HDPE beads (~1 kg of compact pellets per phantom bottle) flattened the B1 field at 3 T adequately (Fig. 7b), and more effectively than the PMMA microbeads, sucrose or sodium chloride. The HDPE beads cause a speckle of dark regions in the gel matrix as they generate no MR signal that is normally detectable. The beads are expected to have similar diamagnetism to the gel and so to have no impact on the B0 field.

Temperature dependency experiments

Collectively the results (Fig. 8) by slow SE scanning methods show that over the range 15–30 °C the short-T1 tubes are more stable with temperature than the long-T1 tubes where T1 increased more strongly with temperature. T2 values also change significantly with temperature (Fig. 8b), decreasing as temperature increases.

Fig. 8

Temperature experiments in T1MES. Temperature dependency experiments (Test 1 in methods) performed on a D-model whole phantom (tube nomenclature differed from that used in E-models) comparing the stability of T1 (a) and T2 (b) values between two repeat experiments (2 days apart) at various temperatures between 15 °C and 32 °C on a 3 T Siemens Verio system. Whiskers represent mean ± standard error. (c) Temperature dependency experiment (Test 2 in methods) comparing T1|T2 values in tubes A, B, C, D, E and I (middle right insert) from a final E-model phantom across five temperatures

Short-term reproducibility

  • Test 1: Six loose tubes as used in the 1.5 T E-model (Fig. 9) showed a CoV of ≤1 % for both T1 and T2 reproducibility. Tube B with the longest T1 and T2 showed the greatest variability between repeated scans.

    Fig. 9

    Short-term reproducibility. Short-term reproducibility (three runs) at the NIST laboratory (Test 1 in methods) for phantom T1 values in six loose tubes (top left insert) from a final E-model phantom showing CoV of 1 % or less. Tube B with the longest T1|T2 showed the greatest variability between reads. CoV = coefficient of variation

  • Test 2: Test-retest evaluation of one of the final phantoms for 3 T by cardiac T1 mapping, including complete repositioning and readjustments, also gave a short-term repeatability CoV for T1 of ≤1 % (Table 3 details the results for 3 T). For T2 measured by fast T2-prepared single-shot methods, the CoV was usually below 1 %, with an exceptionally large value of 4.1 % in tube B, which has the longest T1.

    Table 3 Short-term reproducibility experiments in a 3 T final phantom (E-model)*

Production, distribution and initiation of trial

On 1st September 2015 the E-model T1MES phantoms (batch numbers TTP15-001 and TTP30-001 for 1.5 T and 3 T respectively) received regulatory clearance by the Food and Drug Administration (FDA) and Conformité Européenne (CE) marking as a Class I Medical Device (GMDN 40636). This initial experience of mass manufacturing phantoms was not always 100 % successful and important quality control lessons have been learnt. For example, two different fill solutions for tubes were accidentally mislabelled initially and had to be discarded and remade; individual tubes with visible bubbles on inspection had to be corrected with appropriate procedures; and any solution stock with T1 and/or T2 not falling within ±3 % of our pre-specified target range had to be adjusted.

A total of 75 multi-vendor CMR scanners (four systems: Siemens, Philips, General Electric [GE] and Agilent) across five continents (Table 4) are currently using T1MES phantoms for their local T1 mapping QA as part of the international T1MES program. This amounts to an initial 53 individual CMR centers and 69 phantoms, with six centers using the same field-specific phantom for QA scans on more than one local machine.

Table 4 Quality assurance of T1 mapping: the initial T1MES CMR centers


Results obtained thus far demonstrate that: 1) mass production of phantoms to regulatory standards and in accordance with a rigorously repeatable process is feasible, 2) based on the sequences used, T1|T2 times in gels are highly reproducible in the short-term, 3) a significant temperature dependency of measured T1|T2 values exists in tubes with longer T1 values that will require the use of a correction model.

The T1MES program seeks to advance the field of quantitative CMR relaxometry and the use of imaging biomarkers like T1 mapping and ECV in clinical trials and clinical practice. Our aim was to collaborate with industry, with leading CMR academics and clinical centers with an interest in T1 mapping, so as to develop and test a multicenter QA infrastructure, to protect normal reference data at centers and also potentially to improve consistency of T1 mapping and ECV results across imaging platforms, clinical sites, and over time. Key to the achievement of accurate and reproducible T1 mapping/ECV results in CMR is the accelerated development and adoption of rigorous hardware and software standards.

However, this proposal is subject to further limitations. The phantoms do not model other aspects of tissue, in particular magnetisation transfer in myocardium [22]. Nor does the proposal address the mapping techniques’ ability to discriminate T1 values between adjacent regions of interest (the clinical challenge of discriminating tissue T1 values in adjacent myocardial segments); for example, the signal-to-noise ratio in the phantoms is unrealistically high as the surface coils are typically nearer, so evaluating such an ability is beyond the scope of T1MES. The only realistic aim may prove to be that of providing individual (or genuinely identical) centers with a QA phantom that could protect normal reference data and assure (or even permit correction of changes in) stability of protocols during a long study.

The 1-year study, now running, is also expected to give information about gel stability. It seems reasonable to expect sudden steps in T1 values from genuine changes in the acquisition, and scatter from any remaining uncontrolled parameters or imperfect temperature correction, whereas a change in gel water content would appear as a gradual monotonic drift. Agarose gel is inherently unstable even within a well-sealed tube, because the gel contracts as water leaves it; the expelled water appears as droplets in the gap left by the contraction, often visible on the inner wall of the tube. This effect is unrelated to contamination because agarose without added nutrients does not support mould growth. Over time, this shrinkage may also occur in the matrix fill, leading to air gaps and B0 distortion; if these occur near the tubes they could contribute to an apparent drift in T1 values. For the first time, the 1-year study will give large-scale initial data on the durability of this type of phantom. At study end, we aim to recall approximately 10 % of the phantoms, which will be inspected for flaws in the gel using high-resolution 3D imaging, with collection also of long reference T1|T2 data. Centers are free to keep and use the T1MES phantoms after the 1-year study ends. There is no provision for return shipment to the coordinating site, nor any knowledge of how long the gels will remain usable.

The field and temperature dependence of T1 is much smaller for phantoms containing Ni2+ than for those containing other paramagnetic ions such as Cu2+. As T1 increases above 500 ms (in tubes with a low concentration of Ni2+), the tube’s T1 becomes more temperature-sensitive as it is increasingly dominated by the temperature-sensitive T1 of water in the gel [24, 25]. Temperature monitoring at each fortnightly session is therefore essential. Our results enable us to integrate a temperature-correction model into our multicenter T1MES analysis, to be published at the end of the project. The temperature sensitivity of T1 revealed in the present work may not be a concern for clinical T1 mapping in healthy volunteers (as the human body is homeothermic, with a temperature of about 37 °C) but it may be a concern for hypothermic or febrile patients. Furthermore, T2 temperature dependence could also affect measured T1, as some fast T1 methods have considerable T2 sensitivity.


We report on the establishment of a collaboration to develop CMR phantoms to CE/FDA standards and an initial multicenter repeat scanning program aiming for global QA of T1 and ECV protocols. A rigorous and reproducible manufacturing process for the phantoms has been established. The temperature sensitivity, short-term stability and inter-phantom consistency have all been assessed in support of the main project. An initial 69 phantoms with a multi-vendor user manual are now being scanned fortnightly in centers worldwide, permitting the academic exploration of T1 mapping sequences, platform performance and stability over a year.