Introduction

Orthopaedic surgery in Norwich first gained recognition with Tommy Brittain, who pioneered extra-articular arthrodesis for arthritis [7], but it was his one-time registrar Ken McKee who set the scene for the later experience in imaging of metal-on-metal (MOM) arthroplasties. Ken McKee had also worked for Philip Wiles, who in 1938 performed what is widely recognized as the first documented total hip replacement (THR) at the Middlesex Hospital in London, using stainless steel acetabular and femoral prostheses [11]. McKee was hooked. Although his medical career was interrupted by active service with the Royal Army Medical Corps during World War II, he continued to think about the possibilities of hip arthroplasty, writing in 1940 that “If one could replace the bearing of a motor car then it must be possible in human joints” (personal communication: Mr Hugh Phillips, Consultant Surgeon, Norwich, 2002).

After the end of the war, McKee was appointed as a consultant at the Norfolk & Norwich University Hospital. At this time, the wisdom of his peers was that arthrodesis was the best treatment for the arthritic hip and that artificial joints were doomed to fail. Despite the prevailing sentiment, McKee designed and improved a series of MOM total hip replacements culminating in 1965, with the assistance of John Watson-Farrar, in the McKee–Farrar: a cobalt–chromium–molybdenum alloy MOM THR with a studded “sputnik” acetabular cup and a cemented collared femoral prosthesis [3, 5, 11] (Fig. 1a).

Fig. 1
Plain radiographs illustrating a McKee–Farrar (a), a first-generation MOM THR, and a modern MOM THR, a hybrid Ultima TPS (b) with minimal resorption of the medial calcar (c) as the only sign in a patient with extensive ARMD.

The McKee–Farrar became the first widely implanted total hip replacement, but it was not without its problems. Early- to medium-term failure was caused by mechanical loosening with two thirds of these hip replacements demonstrating radiographic loosening followed by a revision rate of 10% to 17% by 10 years [3, 11]. The cause of this relatively high rate of early aseptic loosening has been attributed to both surgical technique and biomechanical problems of the bearing itself, which were felt to reflect the limits of the engineering techniques of the time prior to 1971. The McKee–Farrar was an equatorial loaded bearing as opposed to later polar bearing designs, and this was considered to be a major cause of loosening in the earlier implants due to high friction. Elevated serum metal ion levels were found in patients with MOM prostheses, and the concern that this may be associated with carcinogenicity contributed to the decline in popularity of the McKee–Farrar and the coming of age of Charnley’s metal on polyethylene (MOP) “low friction arthroplasty” in the early 1970s [23] despite the fact that the intermediate- and long-term results of the McKee–Farrar and Charnley were comparable [3, 24].

After the predominance of MOP systems in the late 1970s and 1980s, aseptic loosening caused by osteolysis from polyethylene wear debris became an increasingly common problem. In an attempt to find an alternative, manufacturers revisited the MOM concept. Long-term data revealed that when MOM implants were successful, outcomes were excellent with significantly lower wear (two orders of magnitude lower) of the metal bearing components than MOP arthroplasty [2, 24, 39] with reports of MOM bearings lasting for over two decades [2]. The expectation was that with improved tolerances resulting from new machining techniques, the short- to medium-term failures of the McKee–Farrar could be addressed, and so by 1988, MOM hip replacement systems had been reintroduced and by 1997 MOM arthroplasty had returned to Norwich.

This review describes the experience of imaging these modern MOM THRs in Norwich and compares it with that of other groups. In particular, it focuses on four specific questions: (1) What are the MR appearances of the normal post-operative hip? (2) What are the characteristic MR features of adverse reactions to metal debris (ARMD)? (3) How do these features correlate with other biomarkers of disease? (4) What is the prognosis for patients with mild disease or normal MRI early in their post-operative course?

Method and Materials

Search Strategy

On 14 July 2013, a word search of MEDLINE was performed using the terms “metal-on-metal” AND “MRI”. This search produced 51 articles. Case reports, reviews, letters and lecture notes (12) were excluded, along with publications that did not specifically report analysis of radiological findings in their results (11), any publications from Norwich (6) and a single paper not in English. This left 21 original scientific articles from institutions other than our own. The comparison of these papers with the results from Norwich forms the basis of this review.
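The exclusion arithmetic in the search strategy can be checked with a short tally. This is only an illustrative sketch: the counts are those quoted in the text, while the dictionary keys and variable names are our own shorthand.

```python
# Counts quoted in the search strategy; the labels are shorthand, not formal categories.
retrieved = 51
excluded = {
    "case reports, reviews, letters and lecture notes": 12,
    "no analysis of radiological findings in results": 11,
    "publications from Norwich": 6,
    "not in English": 1,
}

# Articles remaining after all exclusions are applied.
included = retrieved - sum(excluded.values())
print(f"Excluded: {sum(excluded.values())}, included for comparison: {included}")
```

The tally reproduces the 21 articles that form the comparison group.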

Types of MOM THR

Experience of imaging failing MOM arthroplasties in Norwich covers several different types: total hybrid, total uncemented, large and small bearing, and resurfacing arthroplasties.

Hybrid Total Hip Arthroplasty: Small Bearing

Between 1997 and 2004, 545 patients underwent 652 total hip replacements using an Ultima hybrid (then Johnson and Johnson Professional, now DePuy International Ltd, UK) MOM THR, initially as part of a multicentre clinical investigation into the safety and efficacy of the Ultima MOM THR. By January 2008, nearly 14% had been revised. Pain was the most common indication for revision (49%) followed by peri-prosthetic fractures (19%) [9]. At revision surgery, the findings included peri-prosthetic cavities containing milky fluid under pressure, soft tissue necrosis, tendon avulsion and osteonecrosis of the proximal femur (Fig. 2). Histology demonstrated extensive necrosis, fibrosis and dense peri-vascular lymphocytic infiltrates in keeping with aseptic lymphocytic vasculitis-associated lesions (ALVAL) [10, 12, 13]. Various terms have been used to describe this disease. ALVAL is probably best restricted to histopathological findings whereas adverse reactions to metal debris (ARMD) is more suitable to the range of bone and soft tissue changes seen at operation and on MRI.

Fig. 2
Intra-operative photograph demonstrating severe ARMD with a large head metal-on-metal THR. Extensive soft tissue destruction is associated with complete detachment of the abductors from the greater trochanter, which allows visualization of the prosthesis.

Uncemented Large Bearing: Total and Resurfacing

Between 2005 and 2008, 79 ASR-XL hip arthroplasties were performed in 68 patients (17 resurfacing and 62 total). Cobalt-chrome (CoCr) acetabular components and large bearing CoCr femoral heads were used in all cases, with the addition of a Corail titanium femoral prosthesis in the total hip arthroplasties [14]. Following the experience with the Ultima TPS hybrid THR, and an index case that was revised early for pain and soft tissue abnormalities on MRI, the whole cohort underwent annual clinical assessment with self-assessment questionnaires, radiographs and MRI. There was a high early revision rate of 11% at 40 months for the ASR cohort [14], which was comparable with published results from other groups [15, 17].

From 2001 to 2007, 463 Birmingham hip resurfacing arthroplasties (BHR) were performed in Norwich and a local district general hospital. The revision rate at 5 years was 3.1% with risk factors identified as female gender, high acetabular inclination, obesity and small femoral components [18].

The BHR is the most successful MOM prosthesis in the National Joint Registry of England & Wales [19], with a revision rate of 5.09% at 7 years, but even this compares poorly with a cemented Exeter MOP THR at 1.32%. The BHR is now reserved, by most British hip surgeons, for younger men with large femoral heads, who appear to have the lowest risk of revision.

Imaging Protocols

Radiography

One of the key radiological findings in ARMD is that conventional radiographs are usually normal. The most commonly recognized radiographic finding is resorption of the medial calcar, which is often subtle and does not reflect the sometimes extensive soft tissue changes [26, 47] (Fig. 1b, c). Interestingly, some of the features that are now widely recognized on MRI were first described using conventional arthrography in 1975, although at the time ALVAL and ARMD were not recognized as a unique disease process [1].

CT

Our experience of using CT for the detection of ARMD is limited to those patients with absolute contraindications to MR although some institutions appear to use it routinely with success [6]. Others have used it for 3D measurements of the position of the prosthesis in which there appears to be no difference between symptomatic and asymptomatic patients supporting the hypothesis that patient-specific factors are more influential [16, 17, 29].

MRI

The MR protocol in Norwich has evolved since MR of THR started in 2005. Machines have changed offering stronger gradients, but our basic protocol has remained the same for most of this time and includes fast spin echo and short tau inversion recovery (STIR) sequences in three planes with alternating phase and frequency encoding directions used to “steer” artefact away from tissues of interest (Table 1).

Table 1 The Norwich MAR MR protocol at 1.5 T (Siemens Symphony, Erlangen, Germany)

The matrix sizes and receiver bandwidth were determined following phantom studies. Both can be used to reduce artefact, but each comes with a penalty: increased acquisition times with smaller voxel sizes and reduction of signal-to-noise ratios with increasing receiver bandwidths. In fact the reduction in SNR associated with increasing bandwidth does not significantly adversely affect the diagnostic quality of the images and so increasing the receiver bandwidth has been our primary tool for metal artefact reduction. The matrix is determined by anatomical resolution requirements alone [43] (Fig. 3).

Fig. 3
Graph demonstrating the reduction in metal artefact with increasing receiver bandwidth for different matrix sizes. For any matrix size, a receiver bandwidth of 600 Hz/pixel or more produces 90% of the achievable reduction in artefact.
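The trade-off between artefact and signal-to-noise ratio can be illustrated numerically. The sketch below is not the phantom analysis used in Norwich. It assumes two textbook relationships: that the in-plane displacement along the frequency-encoding direction, in pixels, equals the local off-resonance frequency divided by the receiver bandwidth per pixel, and that SNR scales with the inverse square root of the bandwidth. The off-resonance value and the reference bandwidth are hypothetical figures chosen only for illustration.

```python
import math


def displacement_pixels(off_resonance_hz: float, bandwidth_hz_per_pixel: float) -> float:
    """In-plane shift along the frequency-encoding axis, in pixels (textbook model)."""
    return off_resonance_hz / bandwidth_hz_per_pixel


def relative_snr(bandwidth_hz_per_pixel: float, reference_hz_per_pixel: float) -> float:
    """SNR relative to a reference bandwidth, assuming SNR is proportional to 1/sqrt(bandwidth)."""
    return math.sqrt(reference_hz_per_pixel / bandwidth_hz_per_pixel)


OFF_RESONANCE_HZ = 3000.0   # hypothetical field perturbation near the implant
REFERENCE_BW = 150.0        # hypothetical baseline bandwidth in Hz/pixel

for bw in (150.0, 300.0, 600.0):
    shift = displacement_pixels(OFF_RESONANCE_HZ, bw)
    snr = relative_snr(bw, REFERENCE_BW)
    print(f"{bw:5.0f} Hz/pixel: shift {shift:5.1f} pixels, relative SNR {snr:.2f}")
```

Under these assumptions, quadrupling the bandwidth cuts the displacement to a quarter while only halving the SNR, which is consistent with the choice of receiver bandwidth as the primary tool for artefact reduction.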

Our preferred method of fat suppression is to use STIR imaging. STIR images are limited by signal, and as a result, some authors advocate using a two-point Dixon technique producing separate fat and water sensitive images [8]. However, in our experience, using comparable imaging parameters, the near metal artefact is worse with two-point Dixon-based techniques than with STIR (Fig. 4).

Fig. 4
Fat suppression using coronal STIR (a) and two-point Dixon IDEAL (b) sequences demonstrating optimal control of susceptibility artefact with STIR.

Not all radiologists use T2W sequences as part of a MARS MRI protocol. The longer echo times are more susceptible to dephasing, but we find these sequences useful for identifying microscopic metal debris, which is not always visible on very short echo T1W or PD sequences.

This protocol comprises basic modifications to standard spin echo and STIR sequences. There are a number of more advanced techniques for metal artefact reduction, which include view angle tilting and multi-spectral imaging techniques such as MAVRIC and SEMAC, but we have not used these as part of our routine practice [15, 21, 40].

Results

MRI of the Normal THR

Early work optimizing MR for imaging concentrated on imaging the predominantly soft tissue complications of THR [35, 36, 45]. Soon after we began MR imaging of large numbers of patients with MOM THR, we realized that the normal post-operative MR appearances for THR were not clearly defined. This meant that some abnormalities such as muscle wasting might be attributable to the pre-existing disease or the operation. The presence of gluteus minimus atrophy and fluid collections around the greater trochanter had been previously demonstrated on MR imaging of asymptomatic patients following lateral transgluteal arthroplasty [34].

In order to define the normal spectrum of post-operative MRI findings, a cross-sectional observational study of 22 asymptomatic hips after MOP (n = 10) and MOM (n = 12) arthroplasty was performed. This revealed that short external rotator muscle atrophy was a standard finding in THRs performed using a posterior approach, even when the tendon was reattached, which it was in all these cases. It was also apparent that small simple thin-walled periprosthetic fluid collections were commonly found in the surgical bed and presumably represented post-operative seromas [30]. Similar results have been described by other authors [22].

One of the original features described as part of the spectrum of findings in ARMD on MRI was bone marrow oedema on the STIR sequences, but it became apparent that the inter-observer reliability for this feature was poor and it was dropped from routine clinical reporting practices.

MRI of ARMD

A case report of two patients with soft tissue masses associated with MOM THR had been published shortly before, in 2007; however, ALVAL was not identified as the histological diagnosis, although the described histological features of a hypersensitivity reaction would be consistent with ALVAL [14]. MAR MRI of the first 20 MOM THRs in 19 patients from Norwich demonstrated soft tissue changes of ALVAL that had not previously been systematically described on MRI [41]. The typical MR appearance in the Norwich Ultima TPS THR was one of periprosthetic collections extending from the neck of the femoral component into the surrounding soft tissues, most commonly the gluteal muscles. These collections are typically isointense to muscle on T1W images. On T2W images, the fluid is usually hyperintense (rarely iso- or hypointense) and enclosed by a thick irregular pseudocapsule, which is isointense on T1W and of very low signal intensity on T2W, and is sometimes referred to as synovitis by other authors [20]. This very low signal is caused by susceptibility artefact from microscopic metal particles in the pseudocapsule, which is only apparent on long TE sequences [41, 47] (Fig. 5). Other MR findings included gluteal tendon avulsion, muscle oedema and atrophy, bone marrow signal abnormalities and periprosthetic fractures [34, 47]. The MR reports have been largely validated by surgical findings, but if anything the impression is that MRI tends to underestimate the extent of the disease (Fig. 6).

Fig. 5
Axial T1W (a) and T2W (b) fast spin echo in a patient with bilateral MOM THR demonstrating the typical appearance of ARMD in the Ultima TPS hip with intermediate signal on T1W and fluid signal enclosed in a thick ragged very low signal pseudocapsule on T2W (arrow).

Fig. 6
Coronal T1W MRI (a, b) in a patient with an Ultima TPS THR, and a normal plain radiograph, demonstrating intermediate signal abnormalities (arrows) in the marrow of the proximal femur on T1W. Anecdotal reports from the revision procedures suggest that the MR underestimates the extent of the marrow disease found at surgery in patients with ARMD.

This case series of MRI findings in ARMD was followed shortly by similar experience from Oxford [13]. In 17 patients with MOM resurfacing arthroplasty (RSA), the Oxford group reported what they described as solid pseudotumours. Solid lesions were uncommon in the Norwich Ultima TPS cohort and have been almost exclusively reported with MOM RSA, more commonly in female patients [10, 33]. We have not systematically analysed our experience of imaging the Birmingham RSA as we have the other prostheses, but similar patterns of solid and cystic periprosthetic soft tissue masses are commonly identified. Lymphoreticular spread of polyethylene particles is a well-recognized phenomenon in MOP THR; comparable spread of metal debris was first described using MRI in a Birmingham RSA [42] and has subsequently been identified in other implants (Fig. 7). These early reports have since been replicated in other symptomatic cohorts [20, 38].

Fig. 7
Sagittal (a) and axial (b) T2W MR of an ovoid soft tissue lesion lying between the right gluteus maximus and medius muscles demonstrating very low signal with areas of subtle blooming which turned out to be a histiocytoma containing microscopic metal particles accounting for the signal characteristics.

Our experience with the ASR cohort was different to that with the Ultima TPS. Debris and heterogeneous T2-weighted signal within periprosthetic cystic structures were common findings on MRI in this cohort, in contrast to the homogeneous fluid signal of the Ultima TPS. The very severe disease seen with the Ultima TPS was not seen with the ASR; gluteal tendon avulsion and myositis were not features of this group [13, 41, 46]. It was noted that iliopsoas extension of the disease was more common in the ASR cohort, and this may be a feature of the larger bearing (Fig. 8).

Fig. 8
Axial T1W (a) and sagittal T2W (b) MR in a patient with a large bearing uncemented total ASR hip replacement. A large solid iliopsoas soft tissue lesion is present which is isointense on T1W and hypointense on T2W. Iliopsoas ARMD appears to be more common in large bearing THRs.

The Norwich ASR and Ultima TPS cohorts are not directly comparable. The process of case selection was different: a cross-sectional study compared to a symptomatic case series. The ASR patients were also imaged considerably earlier in their postoperative course than the Ultima TPS group, and it is now clear that the mechanism of production of debris is different: crevice corrosion in the TPS and wear in the ASR.

It seems clear from our work and the work of others that the spectrum of ARMD on MRI does vary with the type of prosthesis. It seems likely that the bearing size, composition, method of fixation and replacement of the femoral neck may all contribute to the severity, distribution and MRI appearances of ARMD.

Grading ARMD with MRI

After the initial work of diagnosing and treating those patients who had presented with severe early disease, it became apparent that there were large numbers of patients appearing who were asymptomatic or who had mild symptoms with relatively minor changes of ARMD on MRI. These patients required long-term monitoring, which in turn required a standardized method of describing the severity of changes on MRI. The system that was subsequently devised was based on what the surgeons felt were clinically significant features that would persuade them to monitor conservatively, revise electively or revise urgently. These three stages of the disease were defined as mild, moderate and severe (grades C1 to C3) (Table 2). To test the reproducibility of this grading system, 73 MAR MR examinations of THRs, comprising known ARMD admixed with normal controls (grade A) and patients with proven infection (grade B), were anonymized and analyzed [31]. This grading system demonstrated substantial interobserver agreement at our institution, with kappa = 0.61–0.8 between the three observers (two experienced at reporting MAR MRI and one not). Other authors have since reported lower levels of agreement, with Chang et al. reporting moderate interobserver agreement (kappa = 0.44) [4, 9].
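For readers unfamiliar with the kappa statistic quoted above, the following is a minimal sketch of an unweighted Cohen's kappa for one pair of observers, with the conventional Landis and Koch descriptors. It is not the analysis performed in the reproducibility study, which may have used a different variant of kappa; the grades assigned below are hypothetical and serve only to show the calculation.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two observers grading the same examinations."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both observers must grade the same non-empty set of scans")
    n = len(rater_a)
    # Proportion of scans on which the two observers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each observer's grade frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is complete")
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Conventional Landis and Koch descriptor for a kappa value."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")):
        if kappa <= upper:
            return label
    raise ValueError("Kappa cannot exceed 1")


# Hypothetical grades for ten scans (A normal, B infection, C1 to C3 ARMD).
observer_1 = ["A", "A", "A", "B", "C1", "C1", "C2", "C2", "C3", "C3"]
observer_2 = ["A", "A", "C1", "B", "C1", "C2", "C2", "C2", "C3", "C3"]

kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

With these invented grades the observers agree on 8 of 10 scans, chance agreement is 0.21, and kappa is about 0.75, which falls in the "substantial" band; a value of 0.44 would be labelled "moderate".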

Table 2 The key features of the Anderson staging system for severity of ARMD on MRI

This grading system is by no means universally accepted. The units at Oxford and Imperial College, London have both described their own scoring methods based on MR signal characteristics. The Oxford group have reported that their score correlates with symptoms, in that patients with solid lesions are more likely to experience pain [19]. This is a fundamentally different approach to the Norwich grading system, which stages the severity of the disease in much the same way as local staging of a cancer would be performed.

Correlation of MRI with Clinical Symptoms

In the Norwich cohort of ASR patients, 27 (34%) of the 75 imaged hips showed MRI features of ARMD. The prevalence of disease on MARS MRI in asymptomatic patients has been reported by other groups in a range of 4% to 68%, with the highest prevalence in THR and the lowest in RSA [6, 17, 18, 25, 29, 32, 44]. When the grade of disease on MR, or simply the presence or absence of ARMD, was compared with clinical symptom scores, there was no statistically significant correlation in the Norwich ASR cohort. The mean Oxford hip score (OHS) was 23 for patients with a normal MRI (original OHS, with “best” 12 and “worst” 48) and 19 for those with MR findings of ARMD. Severe ARMD was also observed in asymptomatic patients with an OHS of 12 [46]. These findings, which suggest that clinical scores cannot be used as a reliable tool for diagnosing or monitoring ARMD, have since been replicated by other groups [9]. Other groups have taken a different approach to staging ARMD by using continuous measures of disease, rather than ordinal data, and have demonstrated an association of synovial volume and maximal synovial thickness with the severity of ALVAL graded histologically [31], as well as a correlation between the severity of synovitis and symptoms in the presence of adverse local tissue reaction [32].

Correlation of MRI with Metal Ion Levels

In response to concern over the increasing numbers of revisions for ARMD, the UK Medicines and Healthcare Products Regulatory Agency (MHRA) released a Medical Device Alert in April 2010 [37]. Its recommendations included regular clinical follow-up of patients and investigation of those with painful MOM arthroplasties, with regular measurement of serum Co and Cr ion levels. Radiological follow-up was particularly recommended for those patients with serum metal ion levels above 7 ppb (7 μg/L).

To assess whether serum metal ion levels are useful in monitoring ARMD, a retrospective cohort study was performed of all patients who had received ASR implants and had not undergone revision. Sixty-two hip replacements in 57 patients were included [27]. There were statistically significant associations between elevated serum metal ions and acetabular inclination greater than 50° (there was no association with acetabular version), larger heads, female gender (although head size may be a confounding influence), THR rather than resurfacing, and bilateral implants.

Although there was a linear correlation between serum metal ions and the severity of disease on MRI, the sensitivity and specificity of serum metal ions for predicting ARMD on MRI were limited. This result has been supported by other studies from the UK [28, 29]. The sensitivity of serum Cr and Co was 56% for both, and the specificity was 83% and 76%, respectively. Reducing the threshold serum level to 4 μg/L increased sensitivity to 61% and 66%, respectively, but at the cost of a drop in specificity. Between 20% and 25% of patients with serum CoCr < 7 μg/L had evidence of ARMD on MR, and all of these were asymptomatic [27] (Fig. 9). The conclusion of this study is that serum metal ion measurements are not, on their own, sufficiently accurate for monitoring the presence of ARMD in asymptomatic patients.

Fig. 9
Graphs comparing the frequency distribution of serum Co (a) and Cr (b) ions in patients with ASR THRs with and without ARMD. The patterns of distribution demonstrate no clinically useful correlation.
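The effect of lowering the metal ion threshold on sensitivity and specificity can be sketched as follows. This is an illustration only: the ion levels and MRI findings below are invented and are not data from the ASR cohort, and the function name is our own. MRI is treated as the reference standard, and a serum level above the threshold counts as a positive test.

```python
def sensitivity_specificity(levels, has_armd, threshold):
    """Sensitivity and specificity of 'level > threshold' against MRI findings."""
    tp = fn = tn = fp = 0
    for level, diseased in zip(levels, has_armd):
        positive = level > threshold
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Need at least one hip with ARMD and one without")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical serum levels (micrograms per litre) and MRI findings for ten hips.
serum_levels = [1.2, 2.5, 3.1, 4.8, 5.5, 6.0, 8.2, 9.9, 12.0, 15.3]
armd_on_mri = [False, False, True, False, True, False, True, False, True, True]

for threshold in (7.0, 4.0):
    sens, spec = sensitivity_specificity(serum_levels, armd_on_mri, threshold)
    print(f"threshold {threshold} ug/L: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this invented sample, lowering the threshold from 7 to 4 μg/L raises sensitivity from 60% to 80% but halves specificity from 80% to 40%, the same direction of trade-off as reported for the ASR cohort.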

Prognosis

Several questions arose from the MHRA Medical Device Alert and the recommendation for routine imaging of all patients with MOM THR. The minimal frequency of MR imaging required to adequately screen populations of relatively asymptomatic patients with MOM THR was not known because the natural history of the disease was not understood.

To assess this, a retrospective study was performed of all patients who had undergone MAR MRI of their MOM THR on at least two separate occasions, with every MR re-reported and staged according to the Anderson criteria [12]. Eighty patients with 103 THRs who had a total of 239 MRIs were included in the study. All 80 patients had had two MR examinations, 29 had had three, and four had had four serial MRIs. The range of time from operation to MRI was 0.8 to 13.4 years (mean 7 years, SD 2.4 years). For those patients whose first MRI was classified as stage A or “normal postoperative appearances”, 9.5% developed ARMD. This occurred at 7 to 11 years after the initial operation. For those patients with ARMD of any grade, on any MR, 15% deteriorated on a subsequent MR. Nineteen percent of patients staged C1 were reclassified as normal on a subsequent MR at 5 to 7 years. This either reflects a reduced specificity for MR in detecting mild ARMD or true resolution of the disease (Fig. 10).

Fig. 10
Scatterplot of grades of severity of ARMD measured with serial MRI in patients in whom the first MR was normal. The plot demonstrates that most subsequent MRIs are also normal, but the 10% who develop ARMD do so between 7 and 11 years after surgery.

This study did not include a substantial number of patients whose MOM THR failed and was revised early, and so the conclusion for this cohort is that ARMD appears to develop in the early post-operative phase. Those patients with severe disease present early on. Those with mild disease are often stable for many years, and only a small proportion of those with normal MRIs will develop ARMD, but when they do it is usually 7 to 11 years after the operation. In light of this, annual assessment of asymptomatic patients with MOM THR with MAR MRI, as recommended by the MHRA, would seem to be an adequate frequency for screening, but our evidence does not necessarily apply to all types of prostheses.

Conclusion

ARMD is often a silent complication of MOM THR. Metal artefact reduction MRI allows adequate visualization of the periprosthetic soft tissues on most standard clinical MR systems. MR characteristics are now well described and appear to vary from prosthesis to prosthesis. There are different approaches to the grading of the severity of ARMD on MRI; however, there does not appear to be any good correlation between the severity of changes at MRI and either clinical symptom scores or metal ion levels. Therefore, MRI appears to be the most useful tool for diagnosing and monitoring ARMD at present. Although severe early ARMD is associated with significant morbidity, mild disease is often stable for years. Depending on the type of prosthesis, a patient with a normal post-operative MRI may have a 10% chance of developing ARMD, and if they do it seems to occur 7 to 11 years after surgery. A 1-year interval between MRI examinations is reasonable in asymptomatic patients.