Introduction

Anterior surgery of the cervical disc with fusion using autograft from the iliac crest was introduced in the 1950s [10, 40]. The clinical results are typically satisfactory, with at least 75–80% of the patients satisfied, reporting reduced pain intensity, improved function and neurological restitution [9, 24, 27, 30, 32, 35]. However, infections, hematomas and longstanding pain [12, 25, 28, 33, 36, 37, 50] are frequently reported complications from the donor site.

Allografts have been widely used, but carry a risk of provoking an immunogenic response in the host, which might disturb fusion healing, and a risk of transmitting infections [2, 5, 16, 26]. The risk of transmission is low, but can still be important if the infection is severe, as in the case of HIV [7]. The ideal substitute for autograft should provide all three of its fundamental properties: osteogenicity, osteoconductivity and osteoinductivity. Several implants have been tested, but no ideal substitute or surgical method has been found [44]. We have previously reported a low fusion rate for cervical interbody fusion with a carbon fiber cage (Brantigan) [45]. Trabecular Metal (TM) is a porous tantalum biomaterial with a structure and mechanical properties similar to those of trabecular bone (Fig. 1) and has been shown to be more osteoconductive than other commercially available biomaterials [4, 11, 22].

Fig. 1 Trabecular Metal™ implant

The objectives of the study were to measure and compare the radiological and clinical outcomes of anterior cervical decompression and fusion (ACDF) with Trabecular Metal (TM) devices and the traditional Smith–Robinson (SR) procedure with autograft.

Methods

All patients scheduled for single-level anterior cervical decompression and fusion (ACDF) and fulfilling the criteria for the study were consecutively invited to participate during the period February 2002 to September 2003. Five patients declined participation and the remaining 80 provided informed consent. Study inclusion criteria were cervical radiculopathy with or without myelopathy due to degenerative disc disease (including disc herniation and/or spondylosis) with compatible MRI and clinical findings. Exclusion criteria were previous cervical spine surgery, posttraumatic neck pain, inflammatory systemic disease, another neurological disease and drug or alcohol abuse. No patient had spontaneous fusion at the adjacent segments. A flowchart of the study is presented in Fig. 2.

Fig. 2 Flowchart for the study

After discectomy, the subchondral bony end plates were roughened with a burr until they bled, taking care that they could still function as a bearing surface for the implant. The posterior longitudinal ligament was removed in the majority of cases, and osteophytes, if present, were subsequently removed. After decompression had been completed, randomization to fusion group was performed in the theater by a nurse using sealed envelopes. This late randomization was used to avoid surgeon bias toward a treatment group during as much of the surgical procedure as possible. An implant size that could be positioned between the end plates by light tapping was chosen. The tricortical autografts were taken from the iliac crest using a saw with a twin blade. A subcutaneous catheter was placed at the donor site for administration of ropivacaine hydrochloride (Narop®) for 2 or 3 days postoperatively, to reduce the pain. All patients used a soft collar for 6 weeks postoperatively.

The randomization procedure yielded similar group distributions of age, gender and smoking habits (Table 1). The operated segment was C3/4 in 2 patients (both SR), C4/5 in 4 patients (2 TM, 2 SR), C5/6 in 50 patients (26 TM, 24 SR), C6/7 in 23 patients (11 TM, 12 SR) and C7/T1 in 1 patient fused with TM. See Table 2 for implant size. The operations were performed by five senior surgeons, and 70 out of the 80 patients were operated on by one of the two authors (HL, LV).

Table 1 Patient data
Table 2 Trabecular Metal implant size

Clinical follow-up

Pain intensity in the neck, arms and pelvis/hip was rated by patients on a visual analog scale (VAS, 0–100), and neck function was rated using the Neck Disability Index [46] (NDI, 0–100) the day before surgery, and 4, 12 and 24 months postoperatively. Pain drawings were obtained at the same intervals. Follow-ups at 12 and 24 months were performed by an unbiased observer (ME), and patients also assessed their global outcomes at these same follow-up intervals.

Radiological follow-up

Digitized plain radiographic images of 78 (98%) patients were obtained preoperatively, immediately postoperatively and at 2-year follow-up, and were subsequently evaluated by two senior radiologists. Consensus about fusion or non-fusion was reached after the first evaluation in 49 cases (62%) and after the second evaluation in the remaining 29 cases (38%). The second measurement was used to calculate intra-observer variability. The entire data set was analyzed to assess inter-observer variability of radiographic measurements and associated precision of the measurements.

Fusion/non-fusion was classified by visual evaluation of the A–P and lateral views in forced flexion/extension of the cervical spine, i.e., (1) by the presence/absence of bone bridging or interface lucencies between TM and bone, and (2) by measuring the difference between the angles of the spinous processes of the fused vertebrae at flexion and extension. Fusion was classified as clearly fused (I), probably fused (IIA), probably non-fused (IIB) or clearly non-fused (III) (Figs. 3, 4, 5). Finally, the material was dichotomized so that groups I and IIA were combined as fused, and groups IIB and III as non-fused. The same classification had been used in a previous study of the Brantigan carbon fiber cage [45]. For classification in group I, radiological signs of bone bridging were required and mobility of up to 1.0° was accepted. Cases classified in group II had uncertain signs of bone fusion. Group IIA had mobility of 2.0° or less and group IIB had more than 2.0°. Group III required both the absence of bone bridging and mobility of 3.0° or more.
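The classification rules above can be illustrated with a minimal sketch. It is only one possible reading of the text, not the procedure the radiologists followed: the function names, the three-valued `bone_sign` input and the handling of combinations the text does not spell out (for example, bone bridging with mobility above 1.0°, which is treated here as group II) are assumptions made for illustration.

```python
from typing import Literal

# Hypothetical encoding of the visual assessment (an assumption, not from the study).
BoneSign = Literal["bridging", "uncertain", "absent"]


def classify_fusion(bone_sign: BoneSign, mobility_deg: float) -> str:
    """Return 'I', 'IIA', 'IIB' or 'III' from the visual sign and measured mobility."""
    # Group I: bone bridging required, mobility of up to 1.0 degrees accepted.
    if bone_sign == "bridging" and mobility_deg <= 1.0:
        return "I"
    # Group III: absence of bone bridging and mobility of 3.0 degrees or more.
    if bone_sign == "absent" and mobility_deg >= 3.0:
        return "III"
    # Assumption: every remaining case is treated as group II,
    # split at 2.0 degrees of mobility as stated in the text.
    return "IIA" if mobility_deg <= 2.0 else "IIB"


def is_fused(group: str) -> bool:
    """Dichotomize: groups I and IIA count as fused, IIB and III as non-fused."""
    return group in {"I", "IIA"}


if __name__ == "__main__":
    # Invented example values, not patient data.
    for sign, mobility in [("bridging", 0.5), ("uncertain", 1.5), ("absent", 4.0)]:
        group = classify_fusion(sign, mobility)
        print(sign, mobility, group, "fused" if is_fused(group) else "non-fused")
```

The sketch keeps the dichotomization separate from the four-level grading, which mirrors the two-step description in the text.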

Fig. 3 Clearly fused (group I) after surgery with autograft (SR)

Fig. 4 Clearly fused (group I) after surgery with Trabecular Metal (TM)

Fig. 5 Clearly non-fused (group III) after surgery with Trabecular Metal (TM). Note the radiolucent zone above the implant. Mobility is seen between images in flexion and extension

MRI was performed on 20 consecutive TM cases at 2-year follow-up. Several parameter sets suggested for TM in the published literature were tested [21, 47] in addition to our standard protocol for the degenerative cervical spine, and the parameter sets below were ultimately chosen. The scans were performed on a Siemens Vision 1.5 T MRI scanner using a cervical spine coil with a protocol consisting of T1-sagittal images (TR 500, TE 12, se), T2/PD sagittal images (TR 4000, TE 128, tse and TR 1300, TE 120, se), PD sagittal images (TR 1300, TE 60, se), T1 axial images (TR 600, TE 15, se), T2 axial images (TR 620, TE 10, 25°, Fl2d), T2 oblique images (TR 4000, TE 128, tse), T2 coronal images (TR 1485, TE 120, se) and finally PD coronal images (TR 1485, TE 60, se). Slice thickness was 4 mm in all images.

Statistical methods

A rank-invariant non-parametric method for analysis of paired ordered categorical data was used to compare the pain ratings (VAS) and NDI for the groups. The method makes it possible to separately measure order-preserved individual changes attributable to the group change, as well as an individual change in category that is different from the change of pattern in the group [41, 42]. Clinically relevant improvement, defined as a change of at least 10, was calculated for VAS and NDI. χ2 tests were used to compare the groups. Student’s t test was used to analyze the operative and hospital time. Fisher’s exact test was used to compare the fusion rate in the groups. Inter- and intra-observer correlation was calculated using kappa analysis. A value of P < 0.05 was considered to be statistically significant.
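Two of the calculations named above can be sketched in a few lines: the proportion of patients with a clinically relevant improvement (a reduction of at least 10 in VAS or NDI) and a two-sided Fisher's exact test on a 2×2 table. This is a sketch under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, the two-sided P value uses the common convention of summing all tables no more probable than the observed one, and the input values are invented.

```python
from math import comb


def relevant_improvement_rate(pre, post, threshold=10):
    """Share of patients whose score fell by at least `threshold` points."""
    improved = sum(1 for before, after in zip(pre, post) if before - after >= threshold)
    return improved / len(pre)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    probabilities = [prob(x) for x in range(lowest, highest + 1)]
    # Sum every table that is as probable as, or less probable than, the observed one.
    return sum(p for p in probabilities if p <= p_observed * (1 + 1e-9))


if __name__ == "__main__":
    # Invented scores, not patient data.
    print(relevant_improvement_rate([60, 50, 40, 30], [40, 45, 30, 30]))
    # Invented counts: rows = treatment groups, columns = fused / non-fused.
    print(fisher_exact_two_sided(3, 0, 0, 3))
```

For real analyses an established statistics package would normally be preferred; the hand-written version is shown only to make the arithmetic behind the test explicit.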

Results

Surgery

Operation times were shorter for fusion with TM as compared to autograft; mean times were 100 min (SD 18) and 123 min (SD 23), respectively (P = 0.001). There was no difference in intra-operative bleeding between the implant groups. Of the 80 patients, 72 had less than 50 ml of bleeding. There was no statistically significant difference in the hospital time between the groups; TM mean was 3.6 days (SD 1.1) and SR mean, 4.1 days (SD 1.7) (P = 0.18).

Clinical outcome

For patients receiving TM, the maximal pain (VAS) was reduced from median 57 in the neck and 45 in the arm before surgery, to 40 and 14 at 1 year and to 41 and 24 at 2 years postoperatively. In the group with autografts, the corresponding VAS ratings were reduced from median 66 in the neck and 60 in the arm before surgery to 36 and 28 at 1 year, and to 24 and 28 after 2 years (Fig. 6a, b). The number of patients showing clinically relevant improvement (set to at least 10 for VAS) in neck pain was 39% with TM and 63% with SR (P = 0.07), and in arm pain 50 and 58%, respectively.

Fig. 6 Pain rating (VAS) and Neck Disability Index preoperatively and at the follow-ups. The box plots illustrate the 25th and 75th percentiles with the median value marked in between. The range is shown by whiskers, but extreme outliers are separately shown by circles. There were no statistically significant differences between the surgical methods

NDI improved from median 36 preoperatively to 30 after 2 years in patients with TM, and from 44 to 25 in the SR group (Fig. 6c). Clinically relevant improvement in NDI (set to at least 10) was found in 53% of the patients with TM and 61% with SR. The patients’ global assessments of their neck and arm symptoms 2 years postoperatively in the TM group were: 41% much better, 38% better, 10% unchanged, 8% worse and 3% (one patient) much worse. In the SR group, the assessments were: 42% much better, 33% better, 13% unchanged and 12% worse. Thus, 79% were much better or better after fusion with TM and 75% after fusion with autograft.

At all follow-ups of 4, 12 and 24 months, pain scores (VAS) in both neck and arm, and NDI scores were significantly improved in both groups when compared with baseline, except for neck pain (VAS) at 12 months in patients fused with TM (P = 0.06).

No statistically significant difference was found between the Trabecular Metal and autograft techniques for pain scores, NDI or the patients’ global assessments at any follow-up interval. A trend toward a higher proportion of patients with clinically relevant improvement in neck pain (at least 10 mm VAS) was measured after 2 years in patients with autografts (P = 0.07). The clinical results and corresponding P values are presented in Table 3.

Table 3 The clinical results and corresponding P values

No differences in clinical outcomes were seen between patients who appeared radiologically fused or non-fused (P = 0.6). There was a tendency toward poorer clinical outcome for smokers compared with non-smokers, estimated by the patients’ global assessments (P = 0.07).

Pelvic pain

There was no difference in pelvic/hip pain (at the donor site) preoperatively and at 4, 12 or 24 months, between patients fused with and without autograft. Further analysis of the pain drawings showed eight patients with markings at the right iliac crest (four SR, four TM). However, the majority had marked this as related to the pain caused by lumbago/sciatica or generalized pain. Only one patient, who had been fused with TM, marked localized pain in this area.

In summary, no remaining donor site pain was marked in the pain drawings, and none was seen in the VAS scoring.

Complications

Further surgery

Three patients were reoperated: two of them because of non-fusion and one due to graft dislocation. All had been primarily fused with autografts (SR). They were all clearly fused 2 years postoperatively. One patient fused with TM was operated on at the adjacent segment after 19 months.

The only patient with remaining symptoms due to complications 2 years after surgery had a sensory deficit below the donor site at the iliac crest (SR). Further complications in the SR group included three patients with wound infections at the iliac crest, and one of them with an infected hematoma. One patient had pneumonia and one had a lower urinary tract infection (cystitis). All infections were cured after antibiotic treatment. One patient developed a fissure in the autograft during the primary surgery. A plate was added to the fixation directly, and the fusion healed without further complications. Among the patients fused with TM, two had transient hoarseness, and one of them also had swallowing disturbances. One patient was treated with antibiotics for a urinary tract infection. In summary, nine patients in the SR group and three patients in the TM group had complications, but only one patient (SR) had symptoms 2 years after surgery.

Radiological outcome

The fusion rate shown by the radiological analysis is presented in Table 4. All patients in group III (clearly non-fused) showed at least 4° of mobility (the limit for the group set by the classification was 3.0°). There was no statistically significant difference in the fusion rate between smokers (92%) and non-smokers (74%) (P = 0.2). Smokers operated on with TM showed an 87% fusion rate. Kappa analysis showed 0.63 and 0.66 for the intra-observer correlation, and 0.58 for inter-observer correlation.
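The kappa values reported above measure agreement beyond chance between two sets of ratings. A minimal sketch of unweighted Cohen's kappa follows. Whether the study computed kappa on the four-level grading or on the dichotomized fused/non-fused outcome, and whether any weighting was applied, is not stated, so the unweighted form, the function name and the example ratings are all assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long sequences of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases rated identically.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented ratings, not study data.
    first = ["fused", "fused", "non-fused", "non-fused"]
    second = ["fused", "non-fused", "non-fused", "non-fused"]
    print(cohen_kappa(first, second))
```

The same function serves for intra-observer agreement (one radiologist's first and second readings) and inter-observer agreement (two radiologists' readings of the same images).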

Table 4 Radiological fusion

Magnetic resonance imaging

MRI of 20 TM cases was successfully used to assess decompression of the neural structures, but was not helpful in determining fusion/non-fusion owing to metal artifacts in the area immediately surrounding the implants.

Discussion

Autograft is referred to as the gold standard for spinal fusion [43] due to its unique combination of osteogenicity, osteoconductivity and osteoinductivity. We had hypothesized that similarly high fusion rates for TM as for autograft could be obtained, but without the risk of complications from the donor site associated with autograft. The fusion rate of carbon fiber cages used in the treatment for the degenerative cervical spine was 62% in our previous study of the Brantigan cage [45], which led us to discontinue the use of the device. TM cages were chosen for the present study because of the unique microstructure of the material and because of the published affinity of osteocytes to tantalum metal [22]. These factors were hypothesized to promote bone ingrowth and enhance fusion. The fusion rate for TM in the present study was higher than for the carbon fiber cages, but lower than that of the SR group.

In a recent study, Fernández-Fairen et al. [14] compared TM used as a stand-alone cage with autograft used with plate. The fusion rate was 89% for TM and 85% for autograft with plate. No statistically significant difference in radiological fusion or in clinical outcome was found between the groups. Criteria for fusion were that “segments were deemed fused when there was evidence of bony bridging around the implant and/or <2° of variation of Cobb’s angle on F/E radiographs or <2 mm of variation in the interspinous distance, in the absence of periimplant radiolucency”. We used similar criteria for fusion, and additionally measured the movement between the spinous processes. The criterion accepting <2 mm of movement is probably less stringent than the criterion of <2° variation of the angle [8, 13]. With less stringent criteria for fusion (more motion allowed), the apparent fusion rates increase, as demonstrated by Fassett [13].

Smokers were excluded from the study by Fernández-Fairen et al. [14], while we had 40% smokers in the group fused with TM. We found no statistically significant difference between smokers and non-smokers in our study, but it is still possible that smoking had some adverse influence on the fusion healing. On the other hand, the fusion rate in our control group with autograft was 92%, while it was 85% after autograft with plate in the study by Fernández-Fairen et al., even though 25% of our patients operated on with autograft were smokers. Our patients were not randomized to fusion group until the major part of the surgical procedure, including the decompression, was completed, which reduced the risk of surgeon bias, while the preoperative randomization in the study by Fernández-Fairen et al. might have had an adverse influence on the control group.

Wigfield et al. [49] have presented a study with a tantalum interbody implant, where inclusion of patients was halted after radiographs 6 weeks postoperatively had shown inferior end-plate lucency, raising concerns about delayed fusion or non-fusion. However, of the 17 patients operated on with the tantalum implant, fusion was subsequently noted in all 15 who were available for follow-up at 12 months, but the study numbers were too small for statistical analysis. Fusion was defined as less than 4° angulation between flexion and extension radiographs and absence of radiolucency extending over more than 50% of the implant/end plate interface. Baskin and Traynelis [3] compared TM with autograft in an RCT that was terminated due to concerns over delayed fusion after 39 patients had been enrolled. Of the 28 patients operated on with TM, 6 out of 16 patients (37%) who were examined with radiographs at 24 months were fused. A low fusion rate with TM was found by Zoëga and Lind [51] as well. Two years after ACDF with TM cage, none of the 13 patients had fusion. Those authors used radiostereometric analysis (RSA) for the follow-up, which is a very sensitive method for detecting motion [17, 29, 53].

Clinical outcome data showed no statistical difference between non-fused and fused patients in the present study. Earlier studies have shown divergent results concerning correlation between fusion and clinical outcome, with some pointing to the importance of the fusion for the clinical outcome [9, 27, 48] and others denying such a connection [12, 25, 31]. Addressing the fusion rate alone (without considering the clinical outcome), the use of TM as stand-alone device does not seem sufficient.

The fusion rate with TM might be enhanced with an anterior plate, considering published results of TM with and without pedicle screws used in the porcine lumbar spine [54], as well as fusion rates for TM with allograft and anterior plate [35]. The use of an anterior plate in these studies suggests that initial stability may be an important factor in achieving fusion [36]. Because of the results obtained, we now use TM together with an anterior plate. Because our earlier study of the Brantigan cage showed closer correlation between radiological fusion and clinical outcome 5 years postoperatively as compared with 2 years [27], the present study will be extended.

It has been advocated that a fusion cage can avoid subsidence better than an autograft, due to collapse of the latter. Some studies support this [45], while others show similar subsidence with the cage as well [15, 23]. This question was not addressed in the present study, where the radiological evaluation focused on whether the operated segment was fused or not.

The accuracy of measurements of motion on digitized radiograph images was considerably higher in the present study (2.4°, 95% CI) than in our previous experience measuring on conventional radiographic films. The accuracy of measurements on conventional radiograph images has previously been estimated at 5° [17] and the cutoff for mobility has been set at 4° for studies of cervical implants. The described method using digitized radiograph images has reduced the difference in accuracy compared to the much more complex and expensive radiostereometry (RSA). We estimated the accuracy for RSA in the cervical spine at about 1° in a previous study [25, 29], which is less accurate than in the lumbar spine, mainly owing to the small size of the cervical vertebral bodies. With distortion-compensated roentgen analysis (DCRA), another technique for computerized analysis of conventional radiographs, Leivseth found an error of 2.4° [20].

MRI of 20 TM cases was successfully used to assess decompression of the neural structures, but was not helpful in determining fusion/non-fusion. The artifacts from the implants were limited to the structures immediately surrounding the implants. Hence, the spinal canal and the foramina could be visualized, and the decompression assessed, but interpretation of the interface between implant and vertebral body was disturbed. This is in contrast to the experience in the lumbar spine [personal communication, D Robertson] and is mainly due to the smaller size of the cervical vertebrae.

The primary advantage of using an implant rather than autograft bone for ACDF is that it avoids complications associated with the donor site. Several previous studies have reported persistent pain in 15–40% of the patients 2 years after surgery [6, 12, 25, 33, 37, 50], though some studies show that this is less frequent [1, 34]. In our earlier study of the Brantigan cage, we found more donor site pain immediately postoperatively when using a conventional graft from the iliac crest as compared with using a percutaneous technique [45]. In the present study, no residual donor site pain was found at 4 months or later after surgery. In the early postoperative period, donor site pain is frequent, and it was experienced by our patients, but no assessment of the pain was made in this period. The administration of ropivacaine hydrochloride (Narop®) subcutaneously for the first few postoperative days resulted in pain reduction in our patients. This postoperative pain reduction might have reduced the tendency to persisting pain as well, due to less central sensitization caused by the postoperative pain [18, 19]. Singh et al. [38] have shown the good effects of continuous local anesthetic infusion on the acute graft-related pain as well as a remaining effect 4 years postoperatively [39]. In both groups in our study, 10% of the patients marked the pelvic/hip region on the pain drawing at 2 years, which illustrates the importance of having a control group for all follow-ups. Patients operated on with autografts were at risk of rare complications, such as neuralgia, although this did not occur in the moderate number of studied patients. Despite the absence of chronic pain, we still found donor site morbidity; one patient had lasting sensory disturbance and three were treated for local infections.

The clinical outcome in the present study showed 28 and 22 mm reduction in pain rating (VAS) in the neck and arm, respectively, 12 points improvement in NDI, and improvement for 77% of the patients in the global assessment. This is in accordance with earlier studies [9, 24, 25, 30, 32, 35, 45, 52].

Conclusions

This study of uninstrumented single-level ACDF (without plating) showed a lower fusion rate with Trabecular Metal than with the Smith–Robinson technique with autograft. There were no differences in the clinical outcomes between the groups, and there were no differences in outcomes between patients who appeared radiologically fused or non-fused. The operating time was shorter with Trabecular Metal implants. No remaining donor site pain at the iliac crest was seen at 4 months or later.