Introduction

The first-generation hip resurfacing arthroplasty (HRA), developed in the 1970s, used a cobalt-chromium or titanium alloy femoral component bearing against a metal-backed polyethylene acetabular component. These HRAs generated large volumes of polyethylene wear debris [18], were highly susceptible to osteolysis, and were largely abandoned as a result. The reintroduction of metal-on-metal (M-M) bearings in THAs [36] has encouraged their use in HRA as well [1, 28]. One design, the Birmingham Hip Resurfacing (BHR), has been in wide use in the United Kingdom, Europe, and Australia for a decade [34] and received FDA approval for use in the United States in 2006. A dozen or more HRA designs are available in Europe; some of these are undergoing clinical trials and are pending FDA approval for use in the United States. Targeted to young and active patients, HRAs are expected to account for an increasing number of hip arthroplasties in the future [5]. Thus, it is important to understand the limitations and complications associated with these devices before their widespread use.

Femoral neck fractures and aseptic loosening account for the majority of HRA failures [6, 12]. In contrast to the historical metal-polyethylene designs, osteolysis is not a common cause of failure in modern M-M HRAs [2, 19, 34]. This is consistent with their ability to operate with very low wear if factors such as surface smoothness and diametric clearance (the difference between the diameters of the femoral head and acetabular cup) are optimized [17, 32]. However, it is becoming evident that socket placement outside of a recommended range (30° to 50° of abduction and 15° to 25° of anteversion) [9, 10, 23] can lead to a greater amount of metal release, particularly in small-diameter components [10, 22, 29]. Under such conditions, large quantities of particulate cobalt-chromium debris and associated corrosion products can lead to a variety of adverse reactions, including osteolysis [6, 9], periprosthetic soft tissue masses [9, 13, 14], and extensive necrosis [4, 29].

Recent reports from one large-volume resurfacing surgery center described “pseudotumors” forming in the hips of some female patients with M-M HRAs [30] which led the authors to speculate that a preoperative sensitization to metal may be a factor. This complication was estimated to occur in 1% of patients undergoing HRA within 5 years, but the incidence could be higher with longer followup and in patients with bilateral implants [25]. Subsequent studies by this group reported higher metal wear in patients with pseudotumors compared to patients without pseudotumors [20]. Pseudotumor-like, enlarged, fluid-filled bursae in HRAs with malpositioned acetabular components had been previously reported by our group [6, 9] and others [8]. However, pseudotumor-like reactions have also been reported in M-M HRAs without evidence of high wear or metal hypersensitivity [27] as well as in non-M-M bearing hips [16, 24]. The histology of pseudotumors includes features consistent with metal wear reactions (eg, macrophages with particles [9, 30]) as well as metal hypersensitivity (eg, lymphocytic aggregates, granulomas [26, 31]) although both may occur together or extensive necrosis may prevent detailed histological characterization [31].

The aim of this study was to compare the histopathologic features (synovial lining integrity, inflammatory cell infiltrates including lymphocytes, macrophages, plasma cells, giant cells, as well as tissue organization, necrosis and metal wear particles) in pseudotumor-like tissues from M-M hips revised for suspected high wear with pseudotumor-like tissues from M-M hips revised for unexplained pain and suspected metal hypersensitivity.

Materials and Methods

We selected from archived M-M hip retrievals 32 specimens that were submitted with an unusual soft tissue reaction described by the revising surgeons as an aseptic “soft tissue mass,” “enlarged bursa,” or a “cyst” which could be considered as “pseudotumor-like.” Twenty-seven of the 32 cases were hip resurfacings: four articular surface replacements (ASR; DePuy International, Leeds, UK), 20 Birmingham Hip Resurfacings (BHR; Smith and Nephew, Memphis, TN), two Conserve Plus hip resurfacings (Wright Medical Technology, Memphis, TN), and one McMinn resurfacing (Corin, Cirencester, UK). The remaining five cases were conventional total hip arthroplasties: one Biomet M2 THA (Biomet, Warsaw, IN), one big femoral head THA (Wright Medical Technology), and three Metasul bearing total hips (Zimmer, Warsaw, IN). There were 23 females and nine males with average ages of 54 years (range, 18–68 years) and 62 years (range, 48–82 years), respectively. As documented by the revising surgeons, the reasons for revision were acetabular malposition (steep abduction angle, excessive or insufficient anteversion, n = 15), unexplained pain (i.e., in the absence of infection, radiographic loosening or malposition, and where metal sensitivity was suspected, n = 9), and aseptic loosening (n = 5).

For the 24 of the 32 explanted components (including HRAs) that had not been sectioned and were still intact, we calculated the wear depth by digitizing 300 to 400 points on the bearing surface with a coordinate measurement machine (BMT 504; Mitutoyo, Aurora, IL). The remainder had been sectioned without prior wear measurements for a separate study. The resolution of this equipment was approximately 4 μm, so wear depths at or below this level were considered “undetectable.” Acetabular cup abduction angles were measured by the revising surgeons on AP radiographs in 28 of the 32 cases using standard radiographic techniques [33] (four cases had poor-quality radiographs deemed unsuitable for this analysis). This involved measuring the angle between a line connecting the ischial spines and another line drawn tangent to the opening of the cup, representing the large diameter of the ellipse. This method is widely used in clinical practice for postoperative measurement of cup position [35].
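The radiographic measurement described above reduces to the acute angle between two lines on the AP film. The following Python sketch illustrates that geometry only; it assumes each line has been digitized as two (x, y) points, and the function names and sample coordinates are hypothetical rather than taken from the study. The 30° to 50° limits are the recommended abduction range quoted in the Introduction.

```python
import math

def acute_angle_between_lines(p1, p2, q1, q2):
    """Return the acute angle (degrees) between line p1-p2 and line q1-q2."""
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]
    bx, by = q2[0] - q1[0], q2[1] - q1[1]
    if (ax == 0 and ay == 0) or (bx == 0 and by == 0):
        raise ValueError("each line needs two distinct points")
    cross = ax * by - ay * bx   # proportional to sin(angle)
    dot = ax * bx + ay * by     # proportional to cos(angle)
    # Absolute values fold the result into the range 0 to 90 degrees.
    return math.degrees(math.atan2(abs(cross), abs(dot)))

def within_recommended_abduction(angle_deg, low=30.0, high=50.0):
    """True if the angle lies in the abduction range cited in the Introduction."""
    return low <= angle_deg <= high

# Hypothetical digitized points: a horizontal pelvic reference line and a
# line along the long axis of the cup-opening ellipse.
reference_line = ((0.0, 0.0), (100.0, 0.0))
cup_line = ((20.0, 10.0), (70.0, 60.0))
angle = acute_angle_between_lines(*reference_line, *cup_line)
print(round(angle, 1), within_recommended_abduction(angle))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` shows for nearly parallel lines.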

Twenty-two of the tissue samples submitted were large, smooth-walled sacs, and some of the tissues were clearly metal-stained (Fig. 1). All of the tissues were fixed in 10% formalin immediately after removal. They were weighed, measured, photographed, and their gross appearance was noted. Two to five tissue samples, taken from several sites (particularly where the specimen varied in color or texture), were embedded in paraffin blocks for routine sectioning and staining with hematoxylin and eosin. Three of us (PC, SN, KT) examined at least six tissue specimens per case semiquantitatively for lymphocytes, macrophages, plasma cells, giant cells, necrosis and metal wear particles using the method of Doorn et al. [11], in which a score from 0 to 3+ is assigned as the feature of interest becomes more numerous in a high-power (×40) microscopic field of view. This type of method reportedly has an interobserver agreement level of 0.91 [3].

Fig. 1

An enlarged fluid-filled bursa excised from the hip of a male patient during revision surgery for acetabular malpositioning 13 months after metal-on-metal hip resurfacing arthroplasty is shown. There is light gray discoloration and the wear measurement of the explanted component showed an annual femoral wear rate of 12.8 μm.

Each case was also given an ALVAL (aseptic lymphocytic vasculitis-associated lesion) score of 1 to 10 (Table 1); the term ALVAL has been applied to a unique, lymphocyte-dominated reaction in M-M periprosthetic tissues [37]. To check the reproducibility of scoring, two of us (PC, KT) performed the scoring in a blinded fashion on two separate occasions. The kappa coefficient was 0.71 for agreement between the two observers and 0.68 for agreement between the two separate measurements of each observer. Using only the histologic features and the ALVAL scores, each observer predicted whether each case failed in association with high wear, suspected metal hypersensitivity, or some other cause. The kappa coefficient was also used to determine the validity of these predictions against the actual wear measurements from the retrievals. The kappa coefficient for the agreement between the first observer’s prediction and the retrieval findings was between 0.69 and 0.81 and the second observer’s between 0.43 and 0.73. The kappa coefficient for the prediction regarding the association of a case with high wear was higher for the observer with many years of experience with histologic analysis of retrieval tissues (PC) than for the second, less experienced observer.
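For readers unfamiliar with the kappa statistic, the sketch below shows how an agreement coefficient of this kind can be computed. It assumes the unweighted Cohen's kappa for two raters; the text does not state which kappa variant or software was used, and the category labels and ratings in the example are invented for illustration only.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)

# Hypothetical predictions for four cases (not study data).
observer_1 = ["wear", "wear", "hypersensitivity", "hypersensitivity"]
observer_2 = ["wear", "wear", "hypersensitivity", "wear"]
print(cohen_kappa(observer_1, observer_2))  # 0.5
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching ratings (0.75 in this example).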

Table 1 Histologic scoring criteria for ALVAL score

The independent variable considered in this study was whether the patients were revised with suspected high wear or with suspected metal hypersensitivity. The dependent variables were the histologic features related to the intensity of the inflammatory reaction: ALVAL score, lymphocytes, macrophages, plasma cells, giant cells, necrosis, and metal particles. Univariate analysis was used to determine the mean, median, SD, and distribution for each variable as necessary. The Mann-Whitney test was used to compare the femoral wear and wear rate of patients with suspected high wear with those suspected to have metal hypersensitivity. Likewise, histologic ratings for ALVAL were compared in these two groups. The Mann-Whitney test was determined to be appropriate because the dependent variables were not normally distributed.
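The rank-based comparison described above can be sketched as follows. This is a minimal illustration of the Mann-Whitney U statistic with a tie-corrected normal approximation for the two-sided p value; it is an assumption that this mirrors the computation used in the study, since the statistical software is not named, and dedicated packages typically use exact methods for small samples. The scores in the example are invented and are not study data.

```python
import math
from collections import Counter

def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p) via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    # Tie correction reduces the variance when scores repeat.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all observations are identical")
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_two_sided

# Hypothetical ordinal scores for two groups (illustrative only).
group_a = [3, 4, 4, 5, 6]
group_b = [7, 8, 8, 9, 10]
u, p = mann_whitney_u(group_a, group_b)
print(u, round(p, 4))
```

A rank-based test suits ordinal histologic scores because it makes no assumption that the scores are normally distributed.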

Results

When comparing the histologic features in the tissues of patients revised for suspected high wear with those revised for pain and suspected metal sensitivity, the high-wear cases had a lower ALVAL score (p < 0.001) and fewer lymphocytes, but more macrophages and metal particles (Table 2). Histologically, there was considerable variability in the amount and distribution of metal debris, the number, type, and arrangement of inflammatory cells, and the degree of necrosis. Very few tissues demonstrated an intact synovial lining and there was often a layer of adherent fibrin, organized fibrin, or necrosis on the joint cavity side of the tissue. Macrophages and lymphocytes were present in all cases, but those with extensive infiltrates of macrophages tended to have smaller lymphocyte aggregates (Fig. 2). This was in contrast to the appearance of very large, dense lymphocyte aggregates, often arranged distal to the surface (Fig. 3), which were more often seen in association with small to moderate amounts of macrophages. The highest ALVAL scores occurred in patients revised for pain and suspected metal sensitivity. Most of the tissues had focal to moderate necrosis and one case had necrosis that dominated most of the tissue sections (Fig. 4).

Table 2 Results of the semiquantitative evaluation of histologic features for cases revised for suspected high wear and for unexplained pain/suspected metal sensitivity

Fig. 2

Light micrograph showing typical histologic features of high wear cases, including organized fibrin (F), a diffuse, extensive infiltration of slate blue/gray macrophages, and a small aggregate of lymphocytes (arrows) (Stain, hematoxylin and eosin, original magnification ×40). This received an ALVAL score of 5 (2 for synovial lining, 2 for inflammatory infiltrate, and 1 for tissue organization).

Fig. 3

Light micrograph showing typical histologic features of cases revised for suspected metal sensitivity, including a thick, mostly acellular tidemark area lined by fibrin (F) and thick, dense aggregates of lymphocytes at the rear of the tissue (arrows) (Stain, hematoxylin and eosin, original magnification ×40). This received an ALVAL score of 10 (3 for synovial lining, 4 for inflammatory infiltrate, and 3 for tissue organization).
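The ALVAL totals quoted in the captions of Figs. 2 and 3 are sums of three component scores. The sketch below reproduces that arithmetic. The component maxima (3, 4, and 3) are inferred from the Fig. 3 caption, where they sum to the top score of 10; treating them as upper bounds, and zero as the lowest component score, is an assumption because the criteria of Table 1 are not reproduced here. The class and field names are hypothetical.

```python
from dataclasses import dataclass

# Assumed upper bounds per component, inferred from the Fig. 3 caption.
ASSUMED_MAXIMA = {
    "synovial_lining": 3,
    "inflammatory_infiltrate": 4,
    "tissue_organization": 3,
}

@dataclass(frozen=True)
class AlvalComponents:
    synovial_lining: int
    inflammatory_infiltrate: int
    tissue_organization: int

    def total(self) -> int:
        """Sum the three component scores after a simple range check."""
        for name, maximum in ASSUMED_MAXIMA.items():
            value = getattr(self, name)
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} must be between 0 and {maximum}")
        return (self.synovial_lining
                + self.inflammatory_infiltrate
                + self.tissue_organization)

# Component scores as given in the captions of Figs. 2 and 3.
print(AlvalComponents(2, 2, 1).total())  # 5
print(AlvalComponents(3, 4, 3).total())  # 10
```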

Fig. 4

Light micrograph showing dense lymphocyte aggregates behind a thick necrotic fibrous tissue layer. The tissues were from a male patient with a THA that was revised for pain after 2 years. Extensive necrosis was found at revision, but the component wear was within normal range (Stain, hematoxylin and eosin, original magnification ×40).

The average femoral wear rate for the components from patients revised for suspected high wear was 19.9 μm per year (SD 18.1; range, 3.1–76.2 μm per year) and was higher (p = 0.003) than that for components from patients revised for pain and suspected metal sensitivity (average 3.7 μm per year; SD 2.2; range, 1.5–6.7 μm per year).
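An annual wear rate of the kind reported above can be illustrated with a simple calculation. The sketch assumes the rate is the measured wear depth divided linearly by the time in situ, which the text does not state explicitly, and it treats depths at or below the approximately 4 μm resolution given in the Methods as undetectable. The function name and example values are hypothetical.

```python
DETECTION_LIMIT_UM = 4.0  # approximate measurement resolution from the Methods

def annual_wear_rate_um(wear_depth_um, months_in_situ):
    """Return wear rate in micrometres per year, or None if undetectable.

    Assumes wear accumulates linearly over the implantation period.
    """
    if months_in_situ <= 0:
        raise ValueError("months_in_situ must be positive")
    if wear_depth_um <= DETECTION_LIMIT_UM:
        return None  # at or below the resolution of the measurement
    return wear_depth_um * 12.0 / months_in_situ

# Hypothetical component: 13 micrometres of wear after 13 months in situ.
print(annual_wear_rate_um(13.0, 13))  # 12.0
print(annual_wear_rate_um(3.5, 24))   # None
```

A linear rate is a simplification, since any early run-in wear would inflate the annual figure for implants revised after a short time.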

Discussion

Pseudotumors, masses and enlarged bursae have been reported in hips with M-M bearings associated with pain and swelling. The cause of these reactions is unclear but several authors have suggested it is a reaction to high wear [20, 30] or to metal hypersensitivity [15, 26, 30]. This study was conducted to compare the histology of pseudotumor-like tissues from hips suspected to have high wear with those from patients with pain and suspected metal sensitivity. Our results support the formation of pseudotumors from both wear and hypersensitivity reactions.

We acknowledge several major limitations. First, we did not provide any specific morphologic criteria for the tissue specimens we studied because we were confident experienced orthopaedic surgeons would recognize an unusual, adverse reaction. For this reason, we have used the term “pseudotumor-like” and we are confident that the submitted specimens, even though labeled as masses, cysts or enlarged bursae, were comparable to the cystic or solid pseudotumors described by Pandit et al. [30]. Similar histological features were noted in pseudotumor and pseudocapsule tissues in our analysis (results not shown), a finding also reported by Mahendra et al. [26]. Thus, even if some of our samples were actually misclassified thickened capsules, the results of our analysis remain valid. Second, we cannot prove a presumptive diagnosis of metal sensitivity. Unlike component wear or serum ion levels, which can be measured with a known degree of accuracy, there are currently no definitive blood tests or histopathologic criteria to diagnose metal hypersensitivity. However, we devised a working postulate to diagnose hypersensitivity: early onset of pain, the absence of other reasons for pain (such as loosening, impingement, infection, or high wear), and the resolution of symptoms after the removal of the cobalt-chromium components. Other clinical reports have noted similar features in patients suspected to have a metal hypersensitivity reaction [7, 37]. We recognize that it is possible for metal hypersensitivity to coexist with any of these other causes of pain and we also recognize that there is variability in the clinical presentation that will confound this working definition. Third, we observed a range of intra- and interobserver statistics for light microscopic tissue features. This is to be expected given that we were using a semiquantitative scoring system and that the importance of histologic features is subject to individual interpretation. The histologic rating is meant to be used in conjunction with the case history, radiographic findings, and retrieval findings and when so used, the interpretation of histologic features is more likely to predict their cause.

Our semiquantitative analyses demonstrated substantial differences in the histological features of pseudotumor-like tissues from patients with high wear compared with those tissues from patients suspected to have metal hypersensitivity. The tissues from both groups contained macrophages and lymphocytes in variable amounts and distributions but applying the ALVAL rating allowed clear patterns to emerge. In particular, there was generally less disruption of the synovial surface, and greater preservation of the normal tissue architecture in the high wear group. In contrast, the most extensive damage to the tissues and the densest lymphocyte aggregates occurred in patients suspected to have a metal hypersensitivity reaction and typically this occurred in the absence of high wear. The variability we noted is consistent with other histologic reports; Pandit et al. [30] noted scattered, focally heavy macrophage and lymphocytic infiltrates, including lymphoid aggregates, in formal biopsy samples of pseudotumors from 10 hip resurfacings revised for pain and/or pseudotumor formation. Metal particles were present but not prominent in the tissues. Similar findings were reported in two female patients with masses causing femoral neuropathy around unilateral hip resurfacings [15]. Wear measurements were not provided for the implants associated with these pseudotumors and it is not clear if the patients had risk factors for high wear such as small component size and implant malposition.

One recent study by Langton et al. [21] analyzed tissues from 17 patients with M-M hips following revision for an adverse response to metal debris including pseudotumor formation. They reported substantially higher component wear and blood ion levels in these patients compared with those revised for other reasons. Their histological examination noted ALVAL features such as synovial ulceration and perivascular lymphocytes which ranged from absent to moderate. The lack of lymphocytes in some of their cases and the absence of high levels of lymphocytic infiltrates in this group of patients with high wear are consistent with our observations. We suggest that using the ALVAL score will promote more standardized reporting of the histological features of tissues removed from M-M hips.