Introduction

Anterior cervical discectomy and fusion (ACDF) is widely performed for the treatment of cervical diseases and is estimated to provide relief for more than 90% of radicular and myelopathic complaints [1]. However, ACDF has been associated with the development of new degeneration at levels adjacent to the fused segments [2, 3]. The operation alters normal cervical spine alignment, and loss of mobility at one functional spinal unit increases the load sustained by the remaining units [4]. Hilibrand et al. [5] classified degeneration of adjacent segments as “adjacent segment degeneration” (ASDeg) and “adjacent segment disease” (ASDis) to establish unified standards and avoid confusion when researching these problems. ASDeg was defined as radiographic changes at the adjacent segments, whereas ASDis referred to the development of new clinical symptoms, such as mechanical neck pain or coronal-sagittal imbalance. However, the etiology and symptomatology of these adjacent segment changes remain intensely controversial. Some experts believe they reflect the natural progression of cervical disc disease [6,7,8], while others maintain that fusion changes the biomechanics of adjacent segments and accelerates adjacent segment degeneration/disease (ASD) [2, 9].

Cervical total disc replacement (TDR) has improved greatly over the last decade. Extensive biomechanical testing suggests that TDR can theoretically decrease the incidence of adjacent segment degeneration by maintaining normal disc kinematics [10,11,12]. However, few clinical studies, especially controlled trials, have specifically investigated ASDeg, ASDis, or reoperation after TDR or ACDF, and the evidence-based literature is limited: several systematic reviews could not determine whether TDR is superior with respect to ASD because of the poor quality of the included studies [13,14,15].

Many studies on ASD have been published in recent years, and a meta-analysis is needed to synthesize their results. The present study aimed to determine whether TDR is superior to ACDF in reducing the incidence of ASD.

Materials and methods

We performed this meta-analysis using the guidelines of the Cochrane Collaboration [16].

Criteria for selecting studies for the meta-analysis

Types of studies

Because randomized controlled trials (RCTs), the highest grade of evidence, comparing TDR with fusion are now available, only RCTs were evaluated.

Types of participants

The study population included patients (>18 years) with cervical spondylotic radiculopathy or myelopathy, or other degenerative diseases. The two treatment groups were demographically similar, with no statistically significant differences in age, sex, or work status. Patients had failed active conservative management for at least 6 months.

Types of interventions

Jawahar et al. [17] reported that whether one or two segments were affected by degenerative disc disease had no statistically significant effect on the prevalence of ASDeg and ASDis. Therefore, we compared the results of surgical treatment of single- or double-level disc disease by TDR or ACDF.

Types of outcomes studied

Following Hilibrand’s [5] definitions of ASDeg and ASDis, the incidences of ASDeg and ASDis were treated as direct results and primary outcomes. Reoperation on adjacent segments indirectly reflects the rate of ASD, and we adopted it as a secondary outcome.

Search strategy and selection criteria

The databases searched were PUBMED, EMBASE, Cochrane Library, CBM (China Biological Medicine Database), CNKI, and Wanfang Data.

Since the first study describing a commercially available TDR device was published in 2002 [18], the search range was from January 2002 to December 2017. The following keywords were used: cervical vertebrae, total cervical disc replacement OR TDR OR arthroplasty OR prostheses OR dynamic stabilization device AND anterior cervical discectomy and fusion OR ACDF OR cervical arthrodesis AND “randomized controlled trials.”

The inclusion criteria were (1) RCTs of cervical degenerative diseases involving single or double segments treated by TDR or ACDF; (2) definite, diagnostic, and direct evidence for ASD; (3) a minimum follow-up of 2 years; (4) at least 30 patients per study population; and (5) specific data on ASD suitable for meta-analysis. Studies were excluded if they were (1) case reports; (2) reviews; (3) studies with a follow-up time of less than 2 years; or (4) studies that referred to ASD but reported unusable results, such as those mentioning ASD without specific data, reporting secondary surgeries not entirely attributable to ASD, or conflating ASDis and ASDeg so that data could not be extracted (Additional file 1: Table S1).

Data extraction and management

Two reviewers (SX and ZQZ) independently assessed potentially eligible trials and extracted information from each potential study. Any discrepancies were resolved by a third reviewer (YLQ) to reach consensus. Extracted data included the general characteristics and outcome measures. General characteristics included study design, first author, sample size, intervention, and type of artificial disc. Outcome measures included the numbers of ASDeg, ASDis, and reoperation events (Additional file 2: File S1).

Risk of bias assessment

Two investigators independently graded each eligible study. We used the Cochrane Handbook for Systematic Reviews of Interventions, version 5.0 [19] for RCTs. The following domains were assessed: randomization, blinding (of patients, surgeons and assessors), allocation concealment, and follow-up coverage. Each domain of quality assessment was classified as adequate (A), unclear (B), or inadequate (C). If all domains were A, the study was A-level; if at least one domain was B, the study was B-level; if at least one domain was C, the study was C-level (Additional file 3: Table S2).
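The grading rule above can be sketched in a few lines of Python. This is only an illustration under one stated assumption: the text does not say which rule applies when a study has both B and C domains, so the sketch assumes the worst rating wins. The function name, the domain keys, and the example ratings are hypothetical and are not taken from the included trials.

```python
from typing import Mapping

# Domain ratings as described in the text: A = adequate, B = unclear, C = inadequate.
VALID_RATINGS = {"A", "B", "C"}


def study_level(domains: Mapping[str, str]) -> str:
    """Return the overall quality level (A, B, or C) of one study."""
    ratings = set(domains.values())
    if not ratings or not ratings <= VALID_RATINGS:
        raise ValueError("each domain must be rated A, B, or C")
    # Assumption (not stated in the text): any C outranks any B.
    if "C" in ratings:
        return "C"
    if "B" in ratings:
        return "B"
    return "A"


# Hypothetical ratings for illustration only.
example = {
    "randomization": "A",
    "blinding": "B",
    "allocation_concealment": "A",
    "follow_up_coverage": "A",
}
print(study_level(example))  # B
```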

GRADE approach

The GRADE (Grading of Recommendations, Assessment, Development and Evaluation) approach was used to evaluate the strength of evidence [20]. The quality of evidence was classified as very low, low, moderate, or high according to the GRADE handbook (version 3.2), using the GRADE profiler software (version 3.6). A Summary of Findings table (SoF table) was used to present the final results.

Data analysis

Review Manager Software (RevMan Version 5.3) was used to conduct the statistical analysis.

Measures of treatment effect

Only dichotomous outcomes were presented in this study; the odds ratio (OR) and 95% confidence intervals (95% CI) were calculated for outcomes.
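For readers unfamiliar with the calculation, the following minimal sketch shows how an odds ratio and its 95% CI can be obtained from the 2×2 table of a single trial. It uses the standard log-odds (Woolf) interval and adds 0.5 to every cell when a zero is present; whether this matches the exact RevMan settings used here is an assumption. The function name and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(events_tdr, total_tdr, events_acdf, total_acdf, z=1.96):
    """Odds ratio (TDR vs ACDF) with a confidence interval on the log scale."""
    a = events_tdr
    b = total_tdr - events_tdr
    c = events_acdf
    d = total_acdf - events_acdf
    if min(a, b, c, d) == 0:
        # Continuity correction so that the ratio and its variance are defined.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 10/100 events with TDR, 20/100 events with ACDF.
estimate, lower, upper = odds_ratio_ci(10, 100, 20, 100)
print(f"OR = {estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An OR below 1 favors TDR; the result is statistically significant at the 5% level only if the whole interval lies below 1.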

Assessment of heterogeneity

Results were regarded as statistically significant if P < 0.05. I² was used to estimate the size of the heterogeneity [21]. I² < 50% indicated low heterogeneity, in which case the results of comparable groups were pooled using a fixed-effects model.
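The pooling and heterogeneity steps can be illustrated with a short sketch. It computes a Mantel-Haenszel fixed-effects pooled OR and an I² statistic derived from Cochran's Q. As a simplifying assumption, Q is calculated around the inverse-variance pooled log OR and all cells are taken to be non-zero, so values may differ slightly from RevMan output. Function names and example counts are hypothetical.

```python
import math

# Each table is (a, b, c, d):
# a = TDR events, b = TDR non-events, c = ACDF events, d = ACDF non-events.


def mantel_haenszel_or(tables):
    """Fixed-effects Mantel-Haenszel pooled odds ratio."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


def i_squared(tables):
    """I² (in percent) from Cochran's Q, using inverse-variance weights."""
    log_ors = [math.log((a * d) / (b * c)) for a, b, c, d in tables]
    weights = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in tables]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(tables) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical trials for illustration only.
trials = [(10, 90, 20, 80), (12, 88, 22, 78), (8, 92, 15, 85)]
print(f"pooled OR = {mantel_haenszel_or(trials):.2f}, I2 = {i_squared(trials):.0f}%")
```

I² is truncated at 0% when Q is smaller than its degrees of freedom, which is why homogeneous sets of trials report I² = 0%.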

Subgroup analysis

Subgroup analysis was considered worthwhile where it could reduce statistical heterogeneity and help identify contributing factors. Even when the overall heterogeneity was low (I² < 50%), studies could still be divided into subgroups on the basis of professional principles and clinical meaning.

Publication bias

We constructed a funnel plot for overall outcomes to assess publication bias and to examine the relationship between sample size and the effect.
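As a rough sketch of what a funnel plot displays, the snippet below computes one point per trial: the log OR as the effect size and its standard error as the precision axis. Plotting is left out, and the choice of the standard error of the log OR as the vertical axis is an assumption. Function names and counts are hypothetical.

```python
import math


def funnel_points(tables):
    """Return one (log OR, standard error) pair per 2x2 table (a, b, c, d)."""
    points = []
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        points.append((log_or, se))
    return points


# Hypothetical small and large trials with the same underlying odds ratio.
small_trial = (10, 90, 20, 80)
large_trial = (100, 900, 200, 800)
for log_or, se in funnel_points([small_trial, large_trial]):
    print(f"log OR = {log_or:.3f}, SE = {se:.3f}")
```

Larger trials have smaller standard errors and cluster near the pooled effect, while smaller trials scatter more widely; a markedly asymmetric scatter would suggest publication bias.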

There was no protocol.

Results

Description of the studies

The process of identifying relevant studies is summarized in Fig. 1. Three hundred seventy-eight references were obtained from the databases mentioned, and a total of 11 studies [17, 22,23,24,25,26,27,28,29,30,31] met the inclusion criteria, with a total of 2632 patients: 1185 underwent ACDF and 1447 underwent TDR. As some studies were continuations of previous articles, we used the latest publication to avoid duplication. Thus, the search range was from 2002 to 2016, but the 11 included studies were published between 2010 and 2016. We recorded the characteristics of the 11 included RCTs in Table 1.

Fig. 1 Selection process for meta-analysis of the studies

Table 1 Characteristics of the included studies

Risk of bias in the studies

According to the quality assessment criteria recommended by the Cochrane Handbook for Systematic Reviews of Interventions [19], 9 of the 11 studies were of high quality. One study was A-level [22], 8 were B-level [17, 23, 25, 26, 28,29,30,31], and 2 were C-level [24, 27] (Fig. 2).

Fig. 2 Risk of bias summary. The review authors’ judgments about each risk of bias item for each included study: + is “yes”, − is “no”, ? is “unclear”

Measures of overall outcomes

In this meta-analysis, the rates of ASDeg and ASDis were treated as the direct outcomes, and reoperation on adjacent segments was adopted as the indirect outcome. All 11 RCTs used unified standards for ASD in line with Hilibrand’s [5] definitions. None of the studies reported all three outcomes (ASDeg, ASDis, and reoperation) simultaneously; 7 RCTs [23,24,25,26,27, 30, 31] reported ASDeg, 3 RCTs [17, 22, 24] reported ASDis, and 5 RCTs [22, 23, 28, 29, 31] reported reoperation. If a study reported both direct and indirect outcomes, the former was preferentially adopted.

The meta-analysis showed no statistical heterogeneity among the 11 studies (I² = 0%). Under the fixed-effects model, the overall rate of ASD was lower in the TDR group (20.2%) than in the ACDF group (25.6%), and the difference was statistically significant (OR = 0.6; 95% CI [0.38, 0.73]; P < 0.00001), as shown in Fig. 3.

Fig. 3 Results of the meta-analysis for the incidence of adjacent segment degeneration/disease and reoperation. M-H Mantel–Haenszel, CI confidence interval

Adjacent segment degeneration (ASDeg)

Seven studies [23,24,25,26,27, 30, 31] reported ASDeg. The rate of ASDeg was lower in patients who underwent TDR, and the difference was statistically significant (OR = 0.58, 95% CI [0.46, 0.72]; P < 0.00001), as shown in Fig. 4.

Fig. 4 Results of the meta-analysis for adjacent segment degeneration (ASDeg). M-H Mantel–Haenszel, CI confidence interval

Adjacent segment disease (ASDis)

Three studies [17, 22, 24] reported ASDis. The rate of ASDis was similar in the two groups (8.8% and 7.6%, respectively), with no statistically significant difference (OR = 0.97, 95% CI [0.56, 1.69]; P = 0.91), as shown in Fig. 5.

Fig. 5 Results of the meta-analysis for adjacent segment disease (ASDis). M-H Mantel–Haenszel, CI confidence interval

Reoperation

Five studies [22, 23, 28, 29, 31] reported reoperation on adjacent segments. The reoperation rate was lower in patients who underwent TDR (2.4%) than in patients who underwent ACDF (4.5%) (OR = 0.52, 95% CI [0.30, 0.87]; P = 0.01), as shown in Fig. 6.

Fig. 6 Results of the meta-analysis for reoperation for adjacent segments. M-H Mantel–Haenszel, CI confidence interval

Subgroup analysis

Subgroup analysis was performed according to follow-up time. As listed in Table 2, the average follow-up time ranged from 24 to 84 months. We divided the follow-up time into two periods: < 5 years and ≥ 5 years.

Table 2 Subgroup analysis according to follow-up time

Eight studies with < 5 years of follow-up [17, 22,23,24, 26, 27, 29, 30] showed that the rate of ASDeg was lower in the TDR group (P = 0.001), whereas the rates of ASDis and reoperation were not significantly different (P = 0.98 and P = 0.1, respectively). Three studies with ≥ 5 years of follow-up [25, 28, 31] showed that the rate of ASDeg was much lower in the TDR group (P = 0.0002).

Subgroup analysis was also performed according to trial site. The 11 RCTs were conducted in two countries: the USA and China. We divided the 11 studies into two subgroups accordingly (Table 3).

Table 3 Subgroup analysis according to study sites

Seven studies performed in the USA [17, 22,23,24, 28, 30, 31] showed that the rate of ASDeg and the reoperation rate for adjacent segments were lower with TDR (P < 0.0001 and P = 0.03, respectively), whereas the rate of ASDis was not (P = 0.98). Four studies performed in China [25,26,27, 29] showed that the rate of ASDeg was lower in patients who underwent TDR than in those who underwent ACDF (P = 0.03), whereas the rate of reoperation was not (P = 0.13).

The funnel plot for ASD in Fig. 7 showed no evidence of publication bias.

Fig. 7 Funnel plot for the occurrence of ASD

The GRADE of this meta-analysis

The SoF table (Table 4) presents the importance rating of the ultimate outcome (ASD) under the TDR and ACDF interventions, based on academic and clinical experience, together with the quality grade of this meta-analysis. According to GRADE [20], the ultimate outcome (ASD) was rated as critical, and the overall quality of evidence of our meta-analysis was moderate.

Table 4 Preview SoF table of the GRADE for this meta-analysis

Discussion

ACDF has been recognized as a classic surgical treatment of cervical disease [32, 33], but limited range of motion and increased stress on adjacent segments mean that it reproduces normal physiology poorly. Many publications have reported such observations and summarized the pathological causal factors of ASD [33,34,35]. Some experts believe it is a natural process; Hilibrand’s results indicated that ASD was indeed a common problem but may reflect the natural history of the underlying cervical spondylosis [5]. Others have suggested that the alignment, curvature, and activity of the cervical spine are relevant factors that result in ASD [33, 35]. Takeshima [36] concluded that cervical dynamic change may increase stress on the adjacent intervertebral levels and accelerate degeneration of adjacent segments.

Since 2002, the theoretical biomechanical advantages of TDR have been progressively applied in clinical practice, and the technique has gained acceptance as the technology has matured. The mechanical differences between TDR and ACDF have drawn increasing attention to ASD and made a comparison of the two procedures meaningful. However, poor-quality studies have introduced considerable bias, the enrollment of patients poorly suited to the experimental or control group has led to unsatisfactory efficacy, and substantial costs have made multi-center RCTs difficult to perform.

Several articles have compared ASD after TDR and ACDF, but few found a statistically significant radiological or clinical difference [37,38,39]. Probable reasons are that (1) ASD is a natural process unrelated to surgery, consistent with the view of Hilibrand and Herkowitz [7, 8]; and (2) inconsistent surgical indications for TDR and ACDF, non-controlled designs, and demographic differences introduce clear bias. High-grade RCTs are therefore the optimal basis for a meta-analysis, followed by prospective cohort studies. Many published meta-analyses addressing ASD after TDR have had ambiguous outcomes. Yang et al. [14] acknowledged the limitations of including only five studies, without stratification by factors or exclusion of publication bias.

This meta-analysis included two poor-quality studies [24, 27] graded inadequate (C); most of the remaining studies were of high quality with a low risk of bias. Overall, the occurrence of ASD was significantly lower in the TDR group, so TDR could be considered a treatment that defers ASD in comparison with ACDF; the same held for ASDeg and reoperation when analyzed separately. The favorable rates may partly reflect patients' cost expectations: because TDR costs more, patients may believe that it should be more effective than ACDF, which biases them against attributing relevant symptoms to ASDis after TDR and makes them less inclined to choose a second surgery. With extended follow-up, the difference in reoperation rate may be difficult to explain; Coric [23] noted that it was hard to draw conclusions given the small number of patients and the low incidence of ASDis. The non-significant difference in ASDis likely resulted from the small number of affected patients; in addition, ASDis, being symptomatic, appears later than ASDeg, and no difference between the two groups had emerged by the current endpoint.

In the subgroup analysis, the difference in the incidence of ASDeg remained statistically significant whether follow-up was shorter than 5 years or not. Furthermore, the results were more favorable with at least 5 years of follow-up than with shorter follow-up, which may imply that the advantage of TDR grows with longer follow-up. However, Davis [30] noted that the mechanism underlying the decreased radiographic degeneration in patients treated with TDR remains unclear, and that further long-term follow-up is needed to correlate these results. The incidence of ASDeg differed significantly between TDR and ACDF in both the USA and China, suggesting that TDR may have an advantage over ACDF. However, in both countries clinical trials tend toward positive results and are subject to considerable selection bias, so the reported incidence of ASD may not be fully representative; this may also be related to cost reimbursement policies.

This meta-analysis has some limitations. First, with only 11 studies, our assessment of potential publication bias remains limited, and more relevant studies should be included. Second, the type of disc may affect the occurrence of ASD and should be stratified in further analysis. However, the literature referred to too many types of disc: 8 kinds were mentioned in the 11 articles, the kind of prosthesis used by Nunley [24] is unclear, and the study by Jawahar et al. [17] included three discs (Kineflex-C/Mobi-C/Advent) at the same time. Given this inconsistency, a stratified analysis was difficult to perform.

In addition, we did not perform a cost-effectiveness analysis, although one is of vital importance because most countries face nearly unaffordable health care costs. Qureshi [40], comparing single-level TDR with ACDF, suggested that TDR must remain functional for at least 14 years to be more cost-effective than ACDF. Ament et al. [41, 42], in cost-effectiveness analyses of two-level procedures, reaffirmed the stability of their model and the sustainability of TDR as an intervention. In this meta-analysis, the difference in ASDis between TDR and ACDF was 2.1% and the number needed to treat (NNT) was 43.5, but none of the 11 articles provided detailed cost-effectiveness calculations. Incurring a large cost to offset a small disadvantage would be of little value, and ignoring cost could lead to an overestimation of the effectiveness of the interventions.
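The NNT is the reciprocal of the absolute risk difference, as the short sketch below illustrates. The function name is hypothetical, and the example uses the reoperation rates reported in the Results (4.5% with ACDF and 2.4% with TDR) purely to show the arithmetic. A difference of exactly 2.1% gives an NNT of about 47.6, whereas an NNT of 43.5 corresponds to a difference of about 2.3%; the gap presumably reflects rounding of the underlying event rates, which are not reproduced here.

```python
def number_needed_to_treat(risk_control: float, risk_treated: float) -> float:
    """NNT = 1 / absolute risk reduction, with risks given as proportions."""
    absolute_risk_reduction = risk_control - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("the treated group must have the lower risk")
    return 1 / absolute_risk_reduction


# Rates taken from the reoperation result in the text, for illustration only.
print(round(number_needed_to_treat(0.045, 0.024), 1))
```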

The overall quality of evidence of our meta-analysis was moderate according to GRADE, and we suggest that TDR may be adopted to reduce the incidence of ASD to a certain degree. Compared with ACDF, TDR can reduce the rates of reoperation and ASDeg, and its advantages will probably become more obvious with longer follow-up, provided that this is confirmed in larger samples; the result should nevertheless be accepted with caution.

Conclusion

TDR decreased the rates of ASDeg and reoperation compared with ACDF, and its superiority may become more apparent over time. TDR can be selected purely on the grounds of mitigating ASD, but its overall efficacy still needs to be evaluated through cost-effectiveness analysis. The overall quality of evidence of our meta-analysis was moderate according to GRADE, and we cautiously offer a weak suggestion to adopt TDR.