Introduction

The gold standard for surgical treatment of cervical radiculopathy has been anterior cervical discectomy and fusion (ACDF). However, studies on cervical fusion have reported increased adjacent-level intradiscal pressure and range of motion [1–4]. The procedure has thus been perceived to accelerate adjacent segment degeneration (ASD) [5, 6]. Accordingly, anterior cervical discectomy and arthroplasty (ACDA), which aims to preserve motion at the operated level, has attracted growing interest among spinal surgeons. However, this interest is accompanied by concern about heterotopic ossification (HO), a well-known phenomenon in arthroplasty of the hip and knee [7, 8]. It is also known to occur after arthroplasty in the lumbar spine, and was described and classified in this context by McAfee et al. [9]. After the introduction of cervical arthroplasty, Mehren et al. [10] published their classification system based on McAfee et al. [9]. The degree of HO is described as low (grade 0–2) or high (grade 3–4) [10], and in the last decade several reports have been published in which the occurrence rate of HO varies according to the disc prosthesis used [11–13]. Cervical arthroplasty devices are usually manufactured to be semiconstrained or nonconstrained. Semiconstrained devices allow for motion similar to normal physiological movement. Nonconstrained devices have no mechanical stop, and extremes of motion are prevented by the perispinal soft tissue and inherent compression across the disc space [14]. Previous reports regarding fusion (Mehren grade 4) have focused mainly on semiconstrained devices [15]. Few studies have assessed the occurrence of heterotopic ossification and complete fusion in cervical nonconstrained arthroplasty [16–18]. Heary et al. [16] presented a case report where complete fusion was found 5 years after surgery, and Skeppholm et al. [17] demonstrated complete fusion in 5 % of their patients at an average of 40 months follow-up.

The present study was conducted within the framework of the Norwegian Cervical Arthroplasty Trial (NORCAT) to assess to what degree motion was preserved 2 years after surgery, and to compare our results with previous reports. In addition, we wanted to investigate whether high-grade HO had an impact on clinical outcome compared with low-grade HO.

Material and method

Study population

NORCAT is a prospective, randomized controlled, single-blinded, multicenter trial of one-level ACDA versus ACDF. One hundred and thirty-six patients were included at five university hospitals in Norway from November 2008 to January 2013. Seventy-nine of the 136 patients were included at the Department of Neurosurgery, Oslo University Hospital Rikshospitalet, Oslo, Norway. Of these, 39 were randomized to arthroplasty. Two patients had their arthroplasty device removed before 2 years follow-up due to loosening and anterior migration of the prosthesis. The present study is based on the remaining 37 patients 2 years after surgery.

Inclusion criteria were age 25–60 years, clinical C6 or C7 root radiculopathy with corresponding radiological findings with or without neurological deficits, Neck Disability Index (NDI) ≥ 30 %, no response to non-operative treatment and no sign of improvement during the last 6 weeks prior to surgery.

Exclusion criteria were significant spondylosis involving more than one level, adjacent-level ankylosis, intramedullary changes on magnetic resonance imaging (MRI), clinical suspicion of myelopathy, chronic generalized pain syndrome, mental illness, infection, active cancer disease, rheumatoid arthritis involving the cervical spine, previous trauma involving the cervical spine, pregnancy, allergy to the contents of the cage/artificial disc, previous neck surgery, abuse of medication/narcotics, and inability to understand oral or written Norwegian.

Methods

Computed tomography (CT) of index level was performed 2 years after surgery on all 37 patients. The CT scans were carried out with a multidetector scanner using bone algorithm, dFOV 15–18 cm, between 80 and 100 mA and 120 kV, and 1 mm increment with coronal and sagittal reconstructions. The images were evaluated twice by two experienced neuroradiologists and assessed by consensus. The radiologists were blinded with respect to clinical outcomes. To assess the degree of HO, the Mehren classification system was used (Fig. 1) and classified as either low grade (grade 0–2) or high grade (grade 3–4).
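The grouping rule described above can be written out as a minimal sketch. The function name `ho_group` is hypothetical and only illustrates the low/high dichotomy stated in the text (grades 0–2 versus 3–4); it does not reproduce the radiological criteria for assigning each grade.

```python
def ho_group(mehren_grade: int) -> str:
    """Map a Mehren grade (0-4) to the low/high grouping used in this study.

    Hypothetical helper for illustration: grades 0-2 are treated as low-grade
    HO and grades 3-4 as high-grade HO, as stated in the Methods text.
    """
    if mehren_grade not in range(5):
        raise ValueError("Mehren grade must be an integer from 0 to 4")
    return "low" if mehren_grade <= 2 else "high"


if __name__ == "__main__":
    # Print the grouping for every possible grade.
    for grade in range(5):
        print(grade, ho_group(grade))
```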

Fig. 1

The grade of heterotopic ossification (HO) was assessed using the Mehren classification system. Illustration by K. C. Toverud

The arthroplasty device

The arthroplasty device used in the present study was the DISCOVER® Cervical Arthroplasty Disc Replacement System (DePuy Spine Inc., Raynham, MA). It is a nonconstrained device comprising a titanium alloy superior endplate and an ultrahigh molecular weight polyethylene core that is mechanically fixed to the inferior titanium alloy endplate (Fig. 2). The hard polymer core on the inferior endplate articulates with the superior metal endplate to form a ball-and-socket type of joint. Flexion–extension and axial rotation are limited by musculoligamentous restraints and the articulating surfaces. Lateral motion is limited to 21°. Because it combines a fixed inlay with a ball-and-socket superior articulation and limited lateral motion, the device has been described as minimally constrained [19].

Fig. 2

Illustration of the DISCOVER® disc prosthesis (DePuy Spine Inc., Raynham, MA). Illustration by K. C. Toverud

The surgical procedure

A standard discectomy via the anterolateral approach was performed. The posterior longitudinal ligament was opened to visualize the dura mater and the nerve roots were decompressed. The endplates were trimmed with a diamond burr. A fluoroscope was used to ensure that the prosthesis was placed in the midline and sufficiently close to the posterior edge of the vertebra. The appropriate size of the prosthesis was determined with the use of templates.

Clinical outcome and baseline variables

Primary outcome measure

The primary outcome was the NDI [20], a self-rated disability score for neck and arm pain. It is composed of ten items: pain, personal care, lifting, reading, headache, concentration, work/daily activities, driving, sleep, and recreation. Each item is scored from 0 to 5. The total score was expressed as a percentage, with higher scores representing worse function.
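As a worked illustration of the percentage score, the sketch below sums ten item scores and divides by the maximum of 50. The function name `ndi_percent` is hypothetical, and the sketch assumes that all ten items were answered; how incomplete questionnaires were handled is not described in the text.

```python
def ndi_percent(item_scores):
    """Return the NDI as a percentage from ten item scores (each 0-5).

    Assumption for illustration: all ten items are answered, so the
    maximum raw score is 50 and the percentage is raw / 50 * 100.
    """
    if len(item_scores) != 10:
        raise ValueError("the NDI has ten items")
    if any(not 0 <= score <= 5 for score in item_scores):
        raise ValueError("each item is scored from 0 to 5")
    return 100.0 * sum(item_scores) / (5 * len(item_scores))


if __name__ == "__main__":
    # Hypothetical answers with a raw score of 15 out of 50.
    print(ndi_percent([3, 2, 1, 2, 1, 1, 2, 1, 1, 1]))  # 30.0
```

A raw score of 15 thus corresponds to 30 %, the threshold named in the inclusion criteria.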

Secondary outcome measures

  1. The Short Form-36 (SF-36) [21] is a generic health-related quality of life questionnaire that measures health along eight dimensions: physical function, role limitations due to physical problems, bodily pain, general health, vitality, social function, role limitations due to emotional problems, and mental health. There are two summary measures: Physical Component Summary (PCS) and Mental Component Summary (MCS). Scores range from 0 to 100, where a higher score is related to better health. We used the Norwegian (chronic) version 2.0 [22], and for scoring the questionnaire, we used QualityMetric Health Outcomes Scoring Software 2.0 (QualityMetric Incorporated, Lincoln, USA).

  2. The EuroQol-5 Dimension-3 level (EQ-5D-3L) [23] questionnaire is applicable to a wide range of health conditions and treatments, and provides a simple descriptive profile and a single index value for health status. The calculated index ranges from −0.59 to 1, where a higher score represents better health. For conversion to utilities, we used the time trade-off method (TTO) and the UK tariff [24].

  3. The Numeric Rating Scale 11 (NRS 11) [25] is a one-dimensional pain scale from 0 to 10 where the two extreme categories are labeled “no pain at all” and “worst imaginable pain”. It was used for the description of arm and neck pain.

Statistics

Continuous data are described as mean and standard deviation (SD) or median and interquartile range (IQR) as appropriate, and were compared between the groups with independent-samples t tests or Mann–Whitney U tests, depending on the distribution of the data. Categorical data are described as number of patients and percentage, and were tested with Pearson chi-square tests or Fisher's exact tests as appropriate. To assess differences in baseline characteristics between patients with low- and high-grade HO, the demographics and baseline outcome measures were analyzed. The level of significance was defined as a p value < 0.05. SPSS version 18.0 (IBM Corporation, Armonk, New York) was used for all analyses.
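For readers unfamiliar with the exact test for categorical data, the following is a minimal sketch of a two-sided Fisher's exact test for a 2 × 2 table, written with the Python standard library only. It is not the SPSS procedure used in the study. The function name and the example counts are hypothetical and are not study data; the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties are not lost to rounding error.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical toy table, not taken from the trial.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```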

Ethical considerations

The Regional Committee for Medical and Health Research Ethics and the data protection official for research approved the study. The study is registered at clinicaltrials.gov [26]. All patients included in the trial gave their written informed consent to participate.

Results

The mean age at inclusion was 44 years (range 33–59 years). Twenty patients (54.1 %) were female. The disc level C5/C6 was operated in 59.5 % and C6/C7 in 40.5 %. There were no statistically significant differences in baseline characteristics between patients with low- and high-grade HO (Table 1). HO was encountered in all patients 2 years after surgery. Twenty-three patients (62.2 %) had developed high-grade HO and complete fusion was found in six (16.2 %). The remaining 14 patients (37.8 %) were classified as grade 2 (Table 2; Fig. 3).

Table 1 Baseline characteristics of the patients with low- and high-grade HO 2 years after surgery
Table 2 Distribution of patients according to the HO grade
Fig. 3

Distribution of heterotopic ossification (HO) 2 years after surgery. Classification from 0 (no HO) to 4 (fusion) according to Mehren [10]. No patients were classified as Mehren grade 0 or 1

At 2 years follow-up, NDI scores (±SD) in patients with low- and high-grade HO were 27.0 (±19.6) and 26.8 (±20.3), respectively (Fig. 4). The difference was not statistically significant, p = 0.98; nor were there any significant differences in any of the secondary clinical outcome measures (Table 3).
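The reported comparison can be checked approximately from the summary statistics alone. The sketch below computes a pooled-variance independent-samples t statistic and a two-sided p value by numerically integrating the t density. Several points are assumptions made for illustration: that this test, rather than the Mann–Whitney U test, was the one applied to the NDI; that the values are means; and that all 14 low-grade and 23 high-grade patients contributed NDI data. The function names are hypothetical.

```python
from math import gamma, pi, sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def t_pdf(x, df):
    """Density of Student's t distribution (adequate for moderate df)."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=1000):
    """Two-sided p value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1 - 2 * central_area)


if __name__ == "__main__":
    # Low-grade HO: 27.0 (SD 19.6), assumed n = 14.
    # High-grade HO: 26.8 (SD 20.3), assumed n = 23.
    t_stat, dof = pooled_t(27.0, 19.6, 14, 26.8, 20.3, 23)
    print(t_stat, dof, two_sided_p(t_stat, dof))
```

Under these assumptions the t statistic is close to zero and the p value rounds to 0.98, in line with the value reported above.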

Fig. 4

Primary clinical outcome measure (Neck Disability Index) in patients with low- and high-grade HO 2 years after surgery

Table 3 Clinical outcome for low- and high-grade HO 2 years after surgery

There were no major perioperative complications. However, of the 39 patients who were randomized to arthroplasty at Oslo University Hospital Rikshospitalet, 2 patients (5.1 %) had undergone index-level reoperation before 2 years follow-up, leaving 37 patients in the present study. The reason for reoperation was loosening and anterior displacement of the arthroplasty device. Reoperation consisted of removal of the prosthesis, followed by fusion with cage and anterior plating.

Discussion

Cervical arthroplasty aims to preserve motion, but heterotopic ossification is an undesirable phenomenon in arthroplasty surgery which may cause reduced or absent mobility at the operated level. The objective of this study was to investigate the occurrence of heterotopic ossification 2 years after arthroplasty surgery, and to assess whether the degree of ossification had an impact on the clinical outcome. The study was conducted within the framework of the Norwegian Cervical Arthroplasty Trial.

A limitation of the trial is the small number of patients included, which precludes firm conclusions about the study results. Further, the images were assessed by consensus between two neuroradiologists, so interobserver agreement was not quantified with kappa statistics.

HO was found in all patients 2 years after surgery. High-grade HO was found in 62.2 % of the patients and complete fusion in 16.2 %. There were, however, no significant differences in either the primary or secondary outcome measures between patients with low- and high-grade HO.

HO around cervical disc prostheses has been reported with both semiconstrained and nonconstrained devices [11–13, 15, 18]. However, the occurrence of complete fusion (grade 4) with nonconstrained devices is rare. Tu et al. [18] have previously reported this phenomenon with the Bryan cervical disc (Medtronic Spine and Biologics). Recently, it was also assessed in one clinical trial and in one case report using the same arthroplasty device as in the present study [16, 17].

A possible mechanism for HO development is related to increased height and range of motion (ROM) of the operated level [27]. Devices that are nonconstrained, as used in the present trial, cannot stop motion mechanically and are reliant on perispinal soft tissue and compression across the disc space to hinder extreme motion. The occurrence of HO in the present trial was higher than previously reported [10–13, 18, 28]. Zhou et al. [11] included nine studies in their meta-analysis of HO in both semiconstrained and nonconstrained devices. In six of the studies, nonconstrained implants were used. Follow-up ranged from 24 to 96 months and the occurrence of HO ranged from 37.5 to 62 %. An early study of HO with the Bryan Cervical Disc [12] found that 17.8 % of the patients had developed HO 12 months after surgery. Yi et al. [13] assessed HO after 20 months in both nonconstrained and semiconstrained devices. With the nonconstrained devices, Bryan and Mobi-C (LDR Medical, Troyes, France), HO was found in 21.0 and 52.5 %, respectively. With the semiconstrained device, ProDisc-C (Synthes, Inc., West Chester, PA), HO was found in 71.4 %. Skeppholm et al. [17] used the same arthroplasty device as in the present study and found that HO caused complete fusion and very limited motion in 5 and 8 %, respectively, after 40 months. On the other hand, Qizhi et al. [29], who used the Discover device in two-level disc surgery, found no HO after 32.4 months. Tu et al. [18] reported 50 % HO with the Bryan device at a mean 19 months follow-up, and Suchomel et al. [28] found high-grade HO in 63 % after 4 years with the ProDisc-C. Thus, different degrees of constraint seem to influence the development of HO, but HO also differs among devices with the same degree of constraint and in different reports concerning the same device.

In the present study, HO occurred in all implanted devices and the proportion of high-grade HO was approximately the same as reported by Suchomel et al. [28], but after a much shorter observation period. Possible explanations include implant design, suboptimal implantation of the prosthesis, incorrect size of the device, or the individual surgical technique, even though all surgeons were experienced with the particular arthroplasty device.

A recent meta-analysis comparing multi-level and single-level ACDA found that the occurrence of HO did not depend on the number of levels operated on [30]. The presence of ASD, on the other hand, has recently been found to significantly correlate with the development of HO [31].

Park et al. found that surgical technique influenced the development of HO [32]. In their study, two surgeons performed all operations; however, they had different techniques for trimming endplates. One surgeon used a fluted ball-type burr, while the other used a diamond-type burr. The study found that the use of fluted ball-type burr resulted in significantly more HO. In the present study, only diamond burrs were used to trim the endplates. Nevertheless, HO was seen in all patients 2 years after surgery.

Several other possible causal factors regarding HO have been discussed, such as not treating patients with nonsteroidal anti-inflammatory drugs (NSAIDs) after surgery. The use of NSAIDs to prevent HO after total hip replacement has been reported previously [33]. The study protocols of clinical trials for cervical arthroplasty undertaken by the US Food and Drug Administration included the perioperative use of NSAIDs as an attempt to prevent the occurrence of HO. One study has reported a trend toward decreased HO formation in patients who used NSAIDs after cervical disc arthroplasty compared with those who did not, but the difference was not statistically significant [34]. NSAIDs were not used routinely in the present trial and further studies should assess the role of NSAIDs in the development of HO after cervical disc arthroplasty.

Other predisposing factors that have been discussed are age and gender. Male gender has previously been reported to correlate with HO formation [35] and could be a contributing factor regarding the observed difference in HO occurrence compared with other reports. However, the present male/female ratio was not much different from the other studies. There was no relationship between high- and low-grade HO and age or gender in the present study.

Motion between two vertebrae occurs around a point described as the instantaneous axis of rotation (IAR). The IAR is commonly located in the posterior half of the upper portion of the inferior vertebral body, the central region of the intervertebral disc, or the middle region of the subjacent vertebra. However, there is not one axis of rotation in the cervical spine. The IAR identifies the rotation of one vertebra relative to another at a given point in time and will change when the motion of the vertebral body consists of both a translational and a rotational component. Artificial cervical discs should have an axis of rotation that mimics the kinematics of the normal spine to restore the physiologic range of motion and disc height and to transmit axial loading forces from the superior to the inferior vertebral body [36]. Some arthroplasty devices, like the DISCOVER® and the ProDisc-C, have a ball-and-socket single-articulating design with a fixed center of rotation (COR). With such devices, a posterior positioning of the implant in the disc space is important. Others, like the Bryan prosthesis, have double articulation surfaces and independent translation, which allows for a mobile COR. With a mobile COR, various positions of the device may theoretically maintain physiologic kinematics. Whether an altered COR due to implantation of an arthroplasty device has clinical consequences in the long term is not yet known. However, imprecise positioning of the implant may alter the COR unfavorably and perhaps influence HO formation. Assuming that replication of a physiologic COR is an important design feature of a cervical prosthesis, Koller et al. found that ideal surgical preparation and ideal positioning of the implant are most important to preserve the segmental COR and balance [19].

Although the difference in HO formation was considerable, the grade of HO did not influence the clinical outcome in the present trial. However, our results do not correspond with the findings of a recent meta-analysis where patients with high-grade HO felt significantly less pain than patients with low-grade HO [10]. The same study also concluded that the presence of HO did not influence the clinical outcome.

Fusion naturally prevents motion, and very limited motion has recently been shown with HO [17]. Based on the results from the present trial, it seems that the benefit of cervical disc arthroplasty, namely preservation of physiological motion, is of limited relevance with respect to the clinical outcome. This is consistent with previous results regarding arthroplasty in the lumbar spine [37].

Conclusion

High-grade HO and spontaneous fusion 2 years after surgery were seen in a substantial number of patients, but the degree of ossification did not influence the clinical outcome. However, conclusions about the effect of HO on the development of adjacent-level degeneration and on long-term clinical outcome cannot be drawn. Trials with longer follow-up are needed for more definite answers concerning HO and its impact on mobility as well as clinical outcome. The main goal of cervical arthroplasty is preservation of motion, which for a substantial number of patients in the present study was not achieved with the device and surgical techniques used in NORCAT.