Introduction

Haemochromatosis is the most common cause of iron overload. In the primary or genetic form, there is excessive absorption of iron in the intestine, which cannot be eliminated by the body and accumulates in various organs, causing irreversible damage. In the secondary form, the surplus iron comes from multiple blood transfusions or blood disorders, such as haemolytic anaemia or ineffective erythropoiesis, and again there is progressive accumulation in various organs. In both forms, the liver and the heart are the most damaged organs. Iron accumulation can be reversed through treatment, by phlebotomies in primary haemochromatosis cases and using iron-chelating agents in patients with the secondary form [1, 2].

Indirect markers have been identified, such as serum ferritin and transferrin saturation, and these are sensitive but are not specific [3]. Further, the diagnosis of primary haemochromatosis has improved since various associated genetic mutations were discovered some years ago. Nevertheless, the key indicator is the measurement of liver iron concentration (LIC). Haemochromatosis is the only condition known to be able to cause the LIC to rise to above 80 μmol Fe/g or to twice the age of the patient. In its secondary form, it generally involves LIC values even higher than those seen in the primary one.

This is currently an important field: in the case of primary haemochromatosis because it is being confirmed that genetic mutations, widely studied in recent years, are not conclusive for the definitive diagnosis of the disease; and in the secondary form due to the emergence of new iron chelating agents, which are changing the clinical management of the disease.

Chemical analysis of a liver biopsy is the “gold standard” for the measurement of LIC, but patients are resistant to this approach as it is invasive and, moreover, results vary widely: the coefficient of variation in the quantification obtained by liver biopsy ranges from 19% in a healthy liver up to 40% in the case of cirrhosis [4–7]. In this context, having at our disposal a non-invasive tool such as magnetic resonance imaging (MRI) to accurately measure the LIC, and thereby assess the need to treat a patient as well as monitor their response to any treatment, represents a great step forward.

MRI-based techniques for assessing LIC can be classified into signal intensity ratio (SIR) and relaxometry methods. Various different techniques have been described, including: (1) relaxometry methods measuring absolute T2; (2) relaxometry methods measuring T2*, which is also an absolute value but measured with gradient echo (GRE) sequences; (3) methods measuring SIR between the liver and other tissues in which iron is not generally deposited, usually paraspinal muscles.

For MRI to meet the current challenges in clinical practice, a universally accepted MRI method is needed that is accurate, reproducible, standardised and widely available.

T2 relaxometry methods

Assessment of iron overload is based on decreases in the T2 relaxation times induced in the liver due to the paramagnetic properties of iron. This acceleration of the T2 relaxation is proportional to the quantity of iron and leads to a decrease in the MRI signal intensity (SI) from the liver (Fig. 1).

Fig. 1

T2* transverse relaxation decay curves of signals from the liver in four theoretical examples with different LIC values. a Liver without iron overload, at the lower limit of the normal range (T2* = 6.7 ms). b Liver with slight iron overload (T2* = 4.5 ms). c Liver with high iron overload (T2* = 2 ms). d Liver with very high iron overload (T2* = 1 ms)

Relaxometry methods calculate T2 [8, 9] or T2* [10–13] by fitting a decay model to the average signal intensity at various echo times (TEs). These values may also be expressed as relaxation rates R2 (1/T2) or R2* (1/T2*). To obtain reliable measurements of T2 at different levels of iron overload, acquisition sequences with many different TEs are necessary (Fig. 2). SI is plotted as a function of echo time and a T2 parametric map is automatically obtained.
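As a minimal sketch of the fitting step, the following Python example fits a single-exponential decay to signal intensities sampled at several TEs by linear regression on the logarithm of the signal. The data are synthetic and noise-free, the function name is hypothetical, and the echo timing is only borrowed from the Fig. 2 caption for illustration; it is not the fitting procedure of any of the cited studies.

```python
import math

def fit_monoexponential(te_ms, signal):
    """Least-squares line through ln(S) versus TE: ln(S) = ln(S0) - TE/T2."""
    n = len(te_ms)
    logs = [math.log(s) for s in signal]
    mean_te = sum(te_ms) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_te) ** 2 for t in te_ms)
    sxy = sum((t - mean_te) * (y - mean_log) for t, y in zip(te_ms, logs))
    slope = sxy / sxx                      # equals -1/T2, in 1/ms
    s0 = math.exp(mean_log - slope * mean_te)
    t2_ms = -1.0 / slope
    rate_per_s = 1000.0 / t2_ms            # relaxation rate (R2 or R2*) in 1/s
    return s0, t2_ms, rate_per_s

# Synthetic example (assumed values): 20 echoes, first TE 1.22 ms, spacing 1 ms
te = [1.22 + i * 1.0 for i in range(20)]
true_t2_star = 2.1                          # ms
sig = [1000.0 * math.exp(-t / true_t2_star) for t in te]

s0, t2_star, r2_star = fit_monoexponential(te, sig)
print(f"S0 = {s0:.1f}, T2* = {t2_star:.2f} ms, R2* = {r2_star:.0f} 1/s")
```

With real data, noise in the later echoes biases a plain log-linear fit, which is why more elaborate decay models have been proposed.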

Fig. 2

Multi-echo sequence of two patients with different LICs (TR 21 ms; flip angle 35°; TE first = 1.22 ms; TE interval = 1 ms; 20 echoes; acquisition time 14.5 s; matrix 244/145). a Patient without iron overload; isointensity of the liver for all the echoes; T2* = 17.4 ms. b Patient with iron overload with loss of the signal from the first echoes; T2* = 2.1 ms

Many studies have shown a high correlation of T2 and T2* values with LIC measured in liver biopsies for the assessment of all levels of LIC [12, 13] (Fig. 3). Furthermore, it has been confirmed that the technique is reproducible with different machines. Some mathematical models have also been developed to transform T2 and T2* values into LIC values in mg Fe/g [11, 14]. In addition, they allow the measurement of the iron content in the myocardium, which is a very important parameter for the management of patients with secondary haemochromatosis.

Fig. 3

R2 and R2* versus LIC measured in liver biopsies. a R2-LIC versus biopsy LIC in 105 patients. “R2 LIC” corresponds to values of LIC estimated from a calibration equation with R2 values (see [8]). b R2* versus biopsy LIC in 21 patients (r = 0.97) (see [11])

However, T2 and T2* relaxometry methods still have notable limitations. Measurements depend on both the MRI sequence parameters and the image analysis procedure, and there is still no consensus concerning which is the most appropriate MRI technique. Accordingly, to ensure reproducible measurements, it is important that universally accepted MRI protocols are established.

In particular, there is no general consensus on which index (R2, R2*) is best for the measurement of LIC [14–16]. For T2* calculations, GRE sequences are more sensitive to low iron content but they suffer from inaccuracies at high iron overload [17]. Christoforidis et al [18] compared two different MRI models that calculate R2 and R2* respectively, in a group of 98 patients with thalassemia, and found a lack of correlation between the two methods.

To minimise the uncertainty in the measured T2, a relatively large number of echoes is required in order to sample the entire exponential decay of the transverse magnetisation. Further, the first TE is key and should be as short as possible: 5 ms or less for T2 measurements [9, 14] and 1 ms or less for T2* methods [13, 18]. Hankins et al [13] compared R2* values to the LIC measured by liver biopsy using three different MRI methods, finding that the estimations obtained with the shortest first TE were the most closely correlated with the directly measured values (Fig. 4). T2 measurements are also influenced by echo spacing [9].
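To illustrate why the first TE matters, the short Python sketch below computes the fraction of the initial signal that remains at the first echo, assuming a pure single-exponential decay. The T2* values are those of the theoretical examples in Fig. 1 and the two first-echo times are those quoted for the R2* methods compared in Fig. 4; the function name is hypothetical.

```python
import math

def remaining_fraction(te_ms, t2_star_ms):
    """Fraction of the initial signal left at echo time TE for exp(-TE/T2*)."""
    return math.exp(-te_ms / t2_star_ms)

# T2* values from the theoretical examples (ms) and two first-echo times (ms)
for t2_star in (6.7, 4.5, 2.0, 1.0):
    for first_te in (1.0, 2.3):
        frac = remaining_fraction(first_te, t2_star)
        print(f"T2* = {t2_star} ms, first TE = {first_te} ms: "
              f"{100 * frac:.0f}% of the signal remains")
```

Under this assumption, a liver with T2* = 1 ms retains roughly a third of its signal at a first TE of 1 ms but only about a tenth at 2.3 ms, leaving few usable points on the decay curve.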

Fig. 4

R2* values calculated with three different MRI methods with respect to the LIC measured in liver biopsies for the same group of patients. The models of Hankins et al and Wood et al have a better correlation than the model of Anderson et al. The first echo is 2.3 ms in the Anderson et al model and 1 ms in the other two (see [13])

In the image analysis procedure, a critical aspect is the choice of the decay model to be used for fitting the SI-time curve. The single-exponential models used in the earliest studies [10] had some limitations and various other data-fitting models have been proposed: single exponentials with truncation or with progressively less weight given to weaker echoes, single exponentials with a constant offset or with subtraction of measured image noise, or bi-exponentials [9, 11, 14, 19, 20]. Each model has advantages and disadvantages, and consensus has not yet been reached on which should be used. Further research is needed to identify the most accurate model.
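The effect of the decay model can be sketched with synthetic data. The Python example below simulates a fast decay sitting on a constant noise floor and compares three of the approaches listed above: a plain single exponential, a single exponential with truncation, and a single exponential after subtraction of a known noise level. All numbers (S0, T2*, noise floor, truncation threshold) and function names are assumptions chosen for illustration, and the fit is a simple log-linear regression rather than the method of any cited study.

```python
import math

def loglinear_t2(te_ms, signal):
    """T2 (ms) from a least-squares line through ln(signal) versus TE."""
    n = len(te_ms)
    logs = [math.log(s) for s in signal]
    mt, ml = sum(te_ms) / n, sum(logs) / n
    slope = (sum((t - mt) * (y - ml) for t, y in zip(te_ms, logs))
             / sum((t - mt) ** 2 for t in te_ms))
    return -1.0 / slope

def truncated_t2(te_ms, signal, floor):
    """Single exponential with truncation: ignore echoes at or below `floor`."""
    kept = [(t, s) for t, s in zip(te_ms, signal) if s > floor]
    return loglinear_t2([t for t, _ in kept], [s for _, s in kept])

def noise_subtracted_t2(te_ms, signal, noise):
    """Single exponential fitted after subtracting a measured noise level."""
    kept = [(t, s - noise) for t, s in zip(te_ms, signal) if s - noise > 0]
    return loglinear_t2([t for t, _ in kept], [s for _, s in kept])

# Synthetic decay (assumed values): S0 = 1000, T2* = 2 ms, noise floor = 20
true_t2 = 2.0
noise_floor = 20.0
te = [1.22 + i for i in range(20)]
sig = [1000.0 * math.exp(-t / true_t2) + noise_floor for t in te]

plain = loglinear_t2(te, sig)
trunc = truncated_t2(te, sig, floor=5 * noise_floor)
subtr = noise_subtracted_t2(te, sig, noise_floor)
print(f"true {true_t2:.2f} ms | plain {plain:.2f} | "
      f"truncated {trunc:.2f} | noise-subtracted {subtr:.2f}")
```

In this idealised setting the plain fit overestimates T2* because the late echoes contain only the noise floor, truncation reduces the bias at the cost of using fewer echoes, and subtraction recovers the true value only because the noise level is known exactly.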

Another serious limitation of T2 and T2* relaxometry methods is the limited access to the technique [15, 21–23]. New pulse sequences and software capable of generating T2* images are not compatible with older hardware [22]. The number of machines capable of performing these calculations automatically remains small. For example, the MIOT (Myocardial Iron Overload in Thalassemia) network in Italy has grown since 2006 to reach 49 hematological or pediatric centers specialised in thalassemia, but has only six radiological centers with standardised MRI acquisition and image processing protocols. These groups point out that “MRI multi-echo T2* sequences able to quantify iron overload are still non-commercial and their installation requires the special involvement of the MRI scanner vendor”. Moreover, they state that other MRI centers need to be set up in the near future in order to “satisfy the requests from thalassemia centers”; currently, in the existing network, the average distance between the clinical centers and their corresponding radiological center is over 238 km [23].

There are now various types of post-processing solutions on offer. In particular, St Pierre et al [8, 14] have developed a service for calculating T2 (www.ferriscan.com) and the method has been approved by the Food and Drug Administration (FDA) in the USA. It requires previous external validation with a phantom and the data analysis is centralised. This involves an economic cost and takes two working days. Commercial post-processing software, approved by the FDA, is also available (CMR Tools, London, UK), but annual licensing fees may be prohibitive for some institutions [22].

In summary, in 2011, T2 relaxometry methods are already accurate and reproducible but are not yet standardised, nor are they widely available.

SIR methods

In this method, an indication of LIC is obtained by calculating the ratio between the SI of the liver and the SI of paraspinal muscle, which does not accumulate iron. The SI measurements are performed in the same slice using large regions of interest (ROIs), and GRE sequences are used due to their greater sensitivity to the paramagnetic effect of iron. In order to be able to quantify all levels of iron overload, more than one sequence is necessary [24–28].

The estimation of LIC by an SIR method is easier to perform than T2 relaxometry. The most widely recognised method is that developed by Gandon et al [24, 29] at the University of Rennes, France. They have designed a set of five breath-hold GRE sequences with fixed TR and different TE and flip angles, optimised for 0.5, 1.0 and 1.5 Tesla (T) magnetic fields. The method can easily be implemented on virtually all machines in the world (Table 1). On each sequence, the liver SI is measured at three ROIs in the right lobe, while muscle SI is measured at ROIs in the right and left paraspinal muscles.
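The ratio calculation itself is simple and can be sketched in a few lines of Python. The example assumes that the liver SI is taken as the mean of the three liver ROIs and the muscle SI as the mean of the two paraspinal ROIs; the averaging rule, the sequence labels and the SI values are all hypothetical and serve only to show the arithmetic.

```python
from statistics import mean

def liver_to_muscle_ratio(liver_rois, muscle_rois):
    """L/M signal intensity ratio from mean ROI values on one sequence."""
    return mean(liver_rois) / mean(muscle_rois)

# Hypothetical mean SI values per ROI for two GRE sequences of one patient
measurements = {
    "sequence A": {"liver": [182.0, 175.0, 190.0], "muscle": [260.0, 255.0]},
    "sequence B": {"liver": [64.0, 70.0, 61.0], "muscle": [240.0, 236.0]},
}

for name, rois in measurements.items():
    ratio = liver_to_muscle_ratio(rois["liver"], rois["muscle"])
    print(f"{name}: L/M = {ratio:.2f}")
```

A lower L/M ratio on a given sequence indicates a greater loss of liver signal relative to muscle, and hence a higher iron load.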

Table 1 Different GRE sequences with fixed TR and variable TE and flip angle, for MRI with fields of 0.5, 1, and 1.5 Tesla, used in the method of Gandon et al from the University of Rennes, France (see [24, 29])

Gandon et al [24] published a description of this method in the Lancet in 2004. In a group of 139 patients with LIC ranging from 3 to 375 μmol Fe/g, they estimated the LIC with high accuracy (mean difference of 0.8 μmol Fe/g; 95% confidence interval −6.3 to 7.9). They confirmed the results in a validation group of 35 patients (Fig. 5). The University of Rennes has now provided a free online worksheet for calculating LIC using this method [29]. By combining liver and muscle (L/M) SIRs from the set of five sequences, it automatically calculates an LIC value in μmol Fe/g.

Fig. 5

MRI-estimated LIC versus biopsy-measured LIC in 174 patients by the method of Gandon et al (see [24])

This recommended protocol is in general use across the world and many studies take the results obtained with it as the reference value for LIC. Christoforidis et al [18] compared R2, R2* and the values from the University of Rennes model in a group of 94 patients with thalassemia. The correlation between R2 and the Rennes values was better than that between R2 and R2*.

The method does, however, have some limitations. It saturates with very high iron overload and does not give a value of LIC higher than 350 μmol Fe/g. It always identifies these cases as very high iron overload (>350 μmol Fe/g). Such patients always have a sufficiently high iron overload to require treatment; nevertheless, not recognising all the range of LIC values with clinical significance is a limitation of the model. Specifically, many patients with secondary haemochromatosis have LIC levels in this range.

Responding to this issue, a research group led by Rose et al [30] from the University of Lille, France, has designed an algorithm with two T1-weighted sequences for cases which saturate at 350 μmol Fe/g in the University of Rennes model, significantly improving the correlation with the true LIC (R = 0.81). They designed a computer-based algorithm to obtain the corresponding LIC automatically and have made it available free of charge online (from: http://oernst.f51vg.free.fr/liver/iron.html).

The protocol of the University of Rennes is now widely used in clinical practice, despite the fact that there has been relatively little research comparing the results of this model to the “gold standard”, the LIC measured by biochemical analysis of a liver biopsy. One recently published study did compare the LIC quantified by the University of Rennes model with the “gold standard” in a group of 171 patients [31]. It was found that the model correctly classified 105 (61%) of the patients according to the various levels of iron overload (normal <36 μmol Fe/g, moderate 37–79 μmol Fe/g and high >79 μmol Fe/g) (Table 2) (r = 0.87). A moderate tendency to overestimate LIC was observed: 43% of normal patients were classified as having iron overload and 45% of the patients with moderate iron overload were diagnosed with high iron overload, while LIC was underestimated in very few patients. Cut-off points with high predictive value (PV) for high iron overload diagnosis were calculated: (1) >170 μmol Fe/g revealed a positive PV of 100% (sensitivity 69%) and (2) <60 μmol Fe/g, a negative PV of 100% (specificity 75%). With these cut-off points, the model was reliable for ruling out or for supporting high iron overload diagnosis in 127 (74%) patients. For intermediate values (between 60 and 170 μmol Fe/g), however, the diagnosis remained uncertain; 44 (26%) patients were in this group, of whom 12 had normal iron levels and eight had high iron overload [31].
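The way these two cut-off points partition the estimated values can be written as a small decision rule. The Python sketch below uses the thresholds reported in [31] (170 and 60 μmol Fe/g); the function name, the wording of the categories and the example values are hypothetical, and the last lines simply recompute the percentages quoted in the text from the patient counts.

```python
def triage_high_iron_overload(estimated_lic_umol_per_g):
    """Apply the cut-off points reported for the Rennes model estimate."""
    if estimated_lic_umol_per_g > 170:
        return "high iron overload supported"   # positive PV 100% in [31]
    if estimated_lic_umol_per_g < 60:
        return "high iron overload ruled out"   # negative PV 100% in [31]
    return "uncertain"                          # 60 to 170 umol Fe/g

# Hypothetical estimated LIC values (umol Fe/g)
for lic in (25, 60, 120, 170, 240):
    print(lic, "->", triage_high_iron_overload(lic))

# Consistency check of the reported proportions (171 patients in total)
total = 171
for label, count in (("correctly classified", 105),
                     ("reliable with cut-offs", 127),
                     ("uncertain", 44)):
    print(f"{label}: {count}/{total} = {100 * count / total:.0f}%")
```

The two decisive groups and the uncertain group add up to the full cohort (127 + 44 = 171), which agrees with the figures reported above.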

Table 2 Correlation between LIC values estimated by MRI using the method of Gandon et al [24] and true LIC measured in biopsies for three groups of patients with different levels of LIC (normal: <36 μmol Fe/g; moderate iron overload: 37–80 μmol Fe/g; high iron overload: >80 μmol Fe/g) (see [24])

In addition, our working group, from 1999 to 2001, compared the L/M ratio of all sequences of the method from the University of Rennes to the LIC measured in liver biopsies in 112 patients (Fig. 6). When linear regression was performed between LIC and L/M ratios, we observed that the T2 and proton density (PD) sequences provided the best correlation. We created a model to estimate LIC: LIC = exp[5.808 − (0.877 × T2) − (1.518 × PD)], where T2 and PD are the L/M ratios of these two sequences respectively [PD 120 ms/4 ms/20° (TR/TE/flip angle) and T2 120 ms/14 ms/20°] [25]. The correlation with the true LIC for this model was very high (r = 0.937) (Fig. 7a), while in the same group of patients the correlation with the model of the University of Rennes was less strong (r = 0.887) (Fig. 7b). We also calculated cut-off points with high predictive value for high iron overload. These were closer to true LIC values: 100% positive PV for estimated LIC > 85 μmol Fe/g (sensitivity 86%) and 100% negative PV for estimated value <40 μmol Fe/g (specificity 75%) [25].
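As a worked illustration of this model, the Python sketch below evaluates the exponential expression for a pair of L/M ratios and applies the two cut-off points given above. The coefficients and cut-offs are those quoted in the text; the function names and the example ratios are hypothetical, and the result is assumed to be in μmol Fe/g as elsewhere in this article.

```python
import math

def estimate_lic(t2_ratio, pd_ratio):
    """Estimated LIC from the L/M ratios of the T2 and PD sequences."""
    return math.exp(5.808 - 0.877 * t2_ratio - 1.518 * pd_ratio)

def interpret(estimated_lic):
    """Apply the cut-off points reported for this model."""
    if estimated_lic > 85:
        return "high iron overload (positive PV 100%)"
    if estimated_lic < 40:
        return "no high iron overload (negative PV 100%)"
    return "indeterminate"

# Hypothetical L/M ratios (T2 sequence, PD sequence) for three patients
for t2_ratio, pd_ratio in ((1.0, 1.0), (0.6, 0.8), (0.3, 0.4)):
    lic = estimate_lic(t2_ratio, pd_ratio)
    print(f"T2 L/M = {t2_ratio}, PD L/M = {pd_ratio}: "
          f"LIC ~ {lic:.0f} -> {interpret(lic)}")
```

Because both coefficients are negative, the estimate rises as either ratio falls, consistent with liver signal loss increasing with iron content.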

Fig. 6

MRI sequences of the method of Gandon et al in three patients with different levels of LIC. a Patient without iron overload. b Patient with moderate iron overload. c Patient with high iron overload. d Scatterplots of L/M ratio and LIC for each MRI sequence. There is a maximal decrease in liver SI with most T2-weighted sequences. SE spin echo T1 sequence, PD proton density sequence (see [25])

Fig. 7

MRI-estimated LIC versus biopsy-measured LIC in 112 patients. a By Osatek’s model (r = 0.937). b By the model of Gandon et al (r = 0.887) (see [18])

Our hospital is a referral center for a population of 400,000 and since 2002 we have processed more than 900 patients with alterations in iron metabolism. The examination is straightforward, with patients having to lie in the machine for only 10 min. Then, with a simple Excel worksheet, we automatically obtain the iron concentration in micromoles [32].

Since 2007, we have had a multi-echo sequence to calculate R2* (20 different echoes with TE ranging between 1.1 and 19.1 ms) (TR/TE/ΔTE/flip angle, 188 ms/1.14 ms/0.9 ms/35°) (Fig. 2). We compared the LIC estimated by the aforementioned SIR model and by R2* in a group of patients, obtaining a good correlation between the two values (r² = 0.86). Currently, we are carrying out studies using the SIR method, which has demonstrated a high level of precision in clinical practice and provides us with a meaningful LIC value in μmol Fe/g to include in clinical reports. In parallel, we continue to use the multi-echo sequence to measure T2* to assess iron concentration in the myocardium.

In summary, SIR methods are sufficiently accurate for many cases in clinical practice and, what is more, they are also reproducible, standardised and already widely available.

Conclusion

T2 relaxometry is the best method to quantify LIC using MRI. It is accurate and reproducible at all levels of iron overload, and also allows myocardial iron concentration to be measured. However, T2 relaxometry models are not yet standardised and, in our opinion, there are considerable barriers to them becoming widely available in the near future. On the other hand, SIR methods, although they are less accurate with values of LIC > 350 μmol Fe/g, should continue to be used since they have high specificity at all levels of iron overload, they have been standardised, and are reproducible and, moreover, they are already widely available.