Background

Anterior cervical discectomy and fusion (ACDF) has long been recognized as the gold standard for severe cervical spondylosis, although cervical artificial disk replacement, which can maintain motion at the treated spinal level, has since been developed [1, 2]. The most important implant in ACDF is the one placed in the disk space to facilitate interbody fusion. ACDF was first performed with CICA in 1954 by George Smith and Robert Robinson and reported in 1957 at the 24th AAOS (The American Academy of Orthopedic Surgeons) [3]. In June of that same year, Ralph Cloward reported ACDF using SA derived from a fresh cadaver [4]. Because of the wide range of complications of CICA and the scarcity of SA, a metal interbody fusion cage, called the BAK (Bagby and Kuslich) cage, was first used in patients in 1983, after the first interbody fusion cage had been applied to cervical interbody fusion in horses in 1979 [5, 6]. Intervertebral fusion cages have flourished and evolved since then, alongside an ongoing debate on implant selection in ACDF. CICA has been universally recognized as the gold standard for ACDF because of its high degree of histocompatibility and osteoinductive ability. However, our clinical experience suggested that the superior osteogenic properties of CICA were reflected in early postoperative fusion but not in the long-term fusion rate. If confirmed, this would be strong evidence for refining the indications for CICA/SA [7]. Therefore, our study explored whether CICA or SA was better than a cage, based mainly on fusion rates and supplemented by other clinical parameters.

Methods

This meta-analysis was carried out in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [8] (https://prisma-statement.org/) and the Cochrane Handbook for Systematic Reviews of Interventions (https://training.cochrane.org/handbook). The meta-analysis was registered in PROSPERO (Identifier: CRD42022345247).

Search strategy

To ensure a comprehensive search, literature was retrieved from PubMed, Embase, Cochrane Library and Web of Science from the inception of each database to May 2022, using a small set of terms such as “cervical”, “iliac crest autograft”, “allograft” and “cage”. The specific search strategies for each database are shown in Table S1.

Inclusion and exclusion criteria

The following eligibility criteria were adopted for the selection of studies: (1) patients who were diagnosed with cervical spondylosis according to clinical symptoms combined with imaging examination, had no response to conservative treatment and received ACDF using CICA/SA or cage, with or without fixation; (2) RCTs (randomized controlled trials), case–control studies and cohort studies comparing interbody fusion rates between patients using CICA/SA and cage in ACDF. All studies were then assessed against the following exclusion criteria: (1) patients with vertebral fracture, tumor or infection; (2) ACDF surgery performed on animals; (3) the autograft applied in ACDF was bi-/tricortical autogenous iliac crest bone, instead of cancellous iliac autograft or cortical autogenous bone from other parts of the body, such as fibula, humerus, tibia, femur and sternum; (4) the surgical procedure included vertebrectomy, en-bloc resection or corpectomy; (5) the operative segments involved the lumbar/thoracic spine, atlantoaxial joint or occipitocervical joint; (6) the hardness of the implant material in the cage group was uncertain, such as artificial bone combined with gelatin sponge; (7) a posterior approach was applied in addition to ACDF; (8) a small patient population (n ≤ 5); (9) reviews, systematic reviews, meta-analyses, case reports, comments and patents; (10) studies whose full text was not available.

Literature selection

Two researchers conducted study selection independently according to the search strategy and the inclusion and exclusion criteria. After screening, the shortlisted studies were cross-checked against the final eligibility criteria. Any inconsistency was resolved through discussion. For serious disagreements, a third senior researcher was involved and made the final decision on the studies in question.

Data extraction

A pre-designed Excel spreadsheet was used for data extraction. The extracted items included the first author, publication time, study design, number of patients and segments, cage material and the substance packed in cages. The primary outcome was the fusion rate of the iliac crest autograft group and the cage group. Other outcomes included hardware (implants, screws and plates) complications other than nonunion and subsidence, such as graft collapse, graft dislodgement and screw breakage; subsidence of implants; operative duration; hospitalization time; blood loss; neck VAS; arm VAS; JOA; interbody height; disk height; Odom rating and NDI. Interbody height was defined as the length between the inferior end-plate of the caudal vertebral body and the superior end-plate of the cranial vertebral body, and disk height represented the length between the anterior border and inferior border of the responsible disc. For the Odom rating, the rates of grades “excellent” and “good” evaluated by Odom’s criteria were extracted for meta-analysis. All parameters were collected and checked by two independent authors, with disagreements resolved by consensus. Outcomes other than fusion rates were extracted from the final follow-up or the latest data available.

Where a study reported two or more subgroups, they were combined into a single group, repeatedly if necessary. According to the Cochrane Handbook for Systematic Reviews of Interventions, for dichotomous outcome data, including fusion rates, hardware complication, subsidence of implants and Odom rating, the total samples and the numbers of events in the two groups can simply be summed. For continuous outcome data, the following formulae were used to combine the sample number (N), mean (M) and standard deviation (SD) of group 1 and group 2.

$$N_{{\text{combination}}} = N_1 + N_2$$
$$M_{{\text{combination}}} = \frac{N_1 M_1 + N_2 M_2 }{{N_1 + N_2 }}$$
$${\text{SD}}_{{\text{combination}}} = \sqrt {{\frac{{(N_1 - 1){\text{SD}}_1^2 + (N_2 - 1){\text{SD}}_2^2 + \frac{N_1 N_2 }{{N_1 + N_2 }}(M_1^2 + M_2^2 - 2M_1 M_2 )}}{N_1 + N_2 - 1}}}$$

In our study, when more than two subgroups had to be merged (for example, patients in the iliac crest autograft group with plating, patients in the iliac crest autograft group without plating, patients in the cage group with plating and patients in the cage group without plating), they were combined by the simplest strategy, namely applying the above formulae sequentially.
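The sequential combination described above can be sketched in a few lines of Python. This is only an illustration of the formulae, not the authors' actual workflow; the function names and the example subgroup values are hypothetical.

```python
import math


def combine_two(n1, m1, sd1, n2, m2, sd2):
    """Merge two subgroups (sample size, mean, SD) into one group."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # (m1 - m2)**2 equals M1^2 + M2^2 - 2*M1*M2 in the SD formula.
    numerator = (
        (n1 - 1) * sd1**2
        + (n2 - 1) * sd2**2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    )
    return n, mean, math.sqrt(numerator / (n - 1))


def combine_sequentially(subgroups):
    """Fold a list of (n, mean, sd) tuples into one group, left to right."""
    n, mean, sd = subgroups[0]
    for n2, m2, sd2 in subgroups[1:]:
        n, mean, sd = combine_two(n, mean, sd, n2, m2, sd2)
    return n, mean, sd


# Hypothetical subgroups (not data from any included study).
example = [(3, 2.0, 1.0), (4, 5.5, math.sqrt(5 / 3))]
print(combine_sequentially(example))
```

Because the two-group formula reproduces the SD of the merged raw data exactly, applying it sequentially gives the same result regardless of the order in which the subgroups are merged, up to floating-point rounding.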

Assessment of risk of bias

Risk of bias was assessed by two independent authors using the Cochrane risk-of-bias tool for randomized trials (version 2.0) (RoB 2.0) [9] for RCTs and the Newcastle–Ottawa scale (NOS) (https://www.ohri.ca//programs/clinical_epidemiology/oxford.asp) for case–control and cohort studies. However, owing to the nature of the intervention, the patients, surgeons and primary outcome assessors were all aware of the implants received by the patients. Patients were required to sign a consent form for any implant before surgery, and the outcome was measured or ascertained by imaging examination, through which the assessors could identify the specific implants the patients had received. Therefore, no score was given for “Selection of Controls” in the selection module of the NOS for case–control studies, resulting in a maximum NOS score of 8 for case–control studies. Nevertheless, the difference in fusion rates caused by the lack of blinding was considered negligible, as the fusion rate is an objective, imaging-based indicator. When 10 or more studies were included, Egger’s test was carried out to assess publication bias.
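As an illustration of how Egger’s test works, the sketch below regresses the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error) and returns the intercept, whose departure from zero indicates funnel-plot asymmetry. It is a minimal standard-library sketch, not the Stata routine used in this study; the function name is hypothetical, and the inputs are assumed to be log odds ratios (or mean differences) with their standard errors.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, standard error of the intercept, t statistic).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("need at least three studies")
    x = [1.0 / se for se in std_errors]
    y = [e / se for e, se in zip(effects, std_errors)]
    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with k - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar**2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat
```

The two-sided P value is then obtained by comparing the t statistic with a t distribution with k − 2 degrees of freedom, where k is the number of studies.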

Statistical analyses

Stata version 13.1 was used for the meta-analysis. Statistical heterogeneity among studies was estimated by the Cochran Q test (P < 0.1 representing a statistical difference) and the I2 statistic (I2 > 50% representing statistical heterogeneity, I2 > 75% meaning large heterogeneity). The fixed effect model was applied to estimate the pooled effect when no statistical heterogeneity was shown (P > 0.1 and I2 < 50%); otherwise (P < 0.1 or I2 ≥ 50%), the random effect model was used. The odds ratio (OR) or the weighted mean difference (WMD), each with its 95% confidence interval (CI), was generated as the pooled effect for dichotomous or continuous outcome data, respectively. In addition, no quantitative summary was performed when the literature was limited (n < 3) [10].
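The model-selection rule above can be illustrated with the following Python sketch for dichotomous outcomes. It is not the Stata code used in this study. Several details are assumptions made only for illustration: inverse-variance weighting of log odds ratios, the DerSimonian-Laird estimator for the random effect model, a 0.5 correction for zero cells, and all function names.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence on the regularized upper incomplete gamma function.
    while a < df / 2.0 - 1e-9:
        q += y**a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def log_odds_ratio(events_a, total_a, events_b, total_b):
    """Log OR and its variance from a 2x2 table (0.5 added if any cell is 0)."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(effects, variances):
    """Pool study effects, choosing the model by the Q test and I2."""
    df = len(effects) - 1
    w = [1.0 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    tau2 = 0.0
    if p_het > 0.1 and i2 < 50:
        model, est, se = "fixed", fixed, math.sqrt(1 / sum(w))
    else:
        # DerSimonian-Laird between-study variance (an assumed estimator).
        c = sum(w) - sum(wi**2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w_re = [1.0 / (v + tau2) for v in variances]
        est = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
        model, se = "random", math.sqrt(1 / sum(w_re))
    return {
        "model": model, "estimate": est, "se": se,
        "ci": (est - 1.96 * se, est + 1.96 * se),
        "q": q, "p_het": p_het, "i2": i2, "tau2": tau2,
    }


# Hypothetical 2x2 data: (fused, total) in group A vs. group B per study.
studies = [(18, 20, 14, 20), (25, 30, 20, 30), (40, 45, 33, 44)]
pairs = [log_odds_ratio(*s) for s in studies]
result = pool([p[0] for p in pairs], [p[1] for p in pairs])
print(result["model"], math.exp(result["estimate"]))
```

For odds ratios, the pooled estimate and its confidence limits are on the log scale and are exponentiated for reporting; for WMD, the same pooling function can be applied directly to the mean differences and their variances.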

Results

Identification of studies

The comprehensive literature search yielded 3366 articles from the above four databases. Literature selection was conducted according to PRISMA (Fig. 1). Finally, 34 studies [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44] were included. Of these, 30 articles [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30, 34,35,36,37,38,39,40,41,42,43] that reported fusion rates by segment were used for all outcome analyses, whereas the other four studies [31,32,33, 44], which reported fusion rates by case, were used only for pooling the secondary outcomes.

Fig. 1 PRISMA 2020 flowchart for the literature search

Study characteristics

A total of 34 studies, comprising 9 RCTs [11, 13, 17,18,19, 23, 27, 32, 36], 6 cohort studies [14, 16, 20, 24, 38, 41] and 19 case–control studies [12, 15, 21, 22, 25, 26, 28,29,30,31, 33,34,35, 37, 39, 40, 42,43,44], were included for meta-analysis (Table 1). Since the fusion rates in all three groups (iliac crest autograft group, SA group and cage group) were estimated in the study conducted by Kao and coworkers, there were 23 studies [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33] for the comparison between iliac crest autograft and cage and 12 studies [21, 34,35,36,37,38,39,40,41,42,43,44] for the comparison between SA and cage. Some outcomes, such as hospitalization time and blood loss in Zhou et al. [29], were not collected because the sample sizes of the subgroups were unknown, and these are indispensable for data combination using the above formulae. Some parameters, such as segmental angle and Cobb angle, were measured in only part of the literature and were not extracted for meta-analysis because of the insufficient number of studies or entirely different parameter definitions.

Table 1 Characteristics of the included studies comparing iliac crest autograft/allograft with cage

Risk of bias

Owing to the nature of the intervention, ACDF, all nine RCTs were rated as high risk, mostly because of domain 2 of RoB 2.0, which evaluates deviations from intended interventions (Fig. 2). All 6 cohort studies and 19 case–control studies were considered high quality, with NOS scores > 6 points (Table 2), even though the maximum NOS score for case–control studies was 8 for the reason given above. Ten or more studies were available for the fusion rate at 12 months postoperatively, hardware complication and Odom rating in the comparison of CICA with cage; the P values of Egger’s tests for these parameters were 0.452, 0.013 and 0.923, respectively, which suggests possible publication bias for hardware complication at the conventional 0.05 level but not for the other two parameters.

Fig. 2 Risk-of-bias assessment by RoB 2.0

Table 2 NOS scores of case–control studies and cohort studies

Sensitivity analysis

The one-by-one elimination (leave-one-out) method was employed to perform a sensitivity analysis assessing the stability of the results of the meta-analysis. According to the Cochrane Handbook for Systematic Reviews of Interventions (https://training.cochrane.org/handbook), the best way to report a sensitivity analysis is a summary table. We therefore summarized all the 95% CIs of the sensitivity analysis in Table 3. As these results show, no matter which study was omitted, the conclusions on the fusion rate (at 3, 6 and 12 months postoperatively in both the CICA vs. cage and SA vs. cage sections) remained unchanged; the only exception was the fusion rate at 24 months postoperatively in the CICA vs. cage section. Therefore, our sensitivity analysis shows that the results of our meta-analysis were largely stable.
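The one-by-one elimination procedure can be sketched as follows. For brevity the sketch pools log odds ratios with a fixed effect, inverse-variance model, which is an assumption and may differ from the model used for a given outcome in this study; the function names and the example values are hypothetical.

```python
import math


def pooled_fixed(effects, variances):
    """Inverse-variance fixed effect estimate with its 95% CI (log scale)."""
    w = [1.0 / v for v in variances]
    est = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return est, est - 1.96 * se, est + 1.96 * se


def leave_one_out(labels, log_ors, variances):
    """Re-pool after omitting each study in turn.

    Each row holds: omitted study, pooled OR, lower CI, upper CI,
    and whether the CI excludes OR = 1.
    """
    rows = []
    for i, label in enumerate(labels):
        rest_e = log_ors[:i] + log_ors[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        est, low, high = pooled_fixed(rest_e, rest_v)
        significant = low > 0 or high < 0
        rows.append((label, math.exp(est), math.exp(low), math.exp(high), significant))
    return rows


# Hypothetical log ORs and variances (not data from the included studies).
for row in leave_one_out(["Study A", "Study B", "Study C"],
                         [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]):
    print(row)
```

A result is regarded as stable when the last column is the same in every row, that is, when omitting any single study does not change whether the 95% CI excludes the null value.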

Table 3 Sensitivity analysis for the meta-analysis of fusion rates

Fusion rate

After excluding studies whose fusion rates were 100% in both groups, there were 3 [12, 19, 24], 6 [11, 13, 16, 19, 22, 24], 10 [11, 15, 16, 20,21,22,23,24,25,26] and 3 [11, 17, 18] studies comparing CICA with cage for the summary of segmental fusion rates at 3, 6, 12 and 24 months postoperatively, respectively. Likewise, 4 [34, 36, 38, 41], 3 [36, 38, 41], 5 [21, 35, 39,40,41] and 2 [37, 39] studies comparing SA with cage were used for the summary of segmental fusion rates at 3, 6, 12 and 24 months postoperatively, respectively. No difference in fusion rate was found when CICA and SA were combined and compared with cage (Fig. 3). The forest plots (Fig. 4) showed that the segmental fusion rates of CICA were higher than those of cages at 3 months (heterogeneity: P = 0.184, I2 = 40.9%) and 6 months (heterogeneity: P = 0.147, I2 = 38.8%) postoperatively, but not at 12 months (heterogeneity: P = 0.988, I2 = 0.0%) or 24 months (heterogeneity: P = 0.055, I2 = 65.6%). There was no significant difference in segmental fusion rates between SA and cage at any of 3 months (heterogeneity: P = 0.047, I2 = 62.2%), 6 months (heterogeneity: P = 0.179, I2 = 41.9%) and 12 months (heterogeneity: P = 0.049, I2 = 58.0%) after operation (Fig. 5).

Fig. 3 Forest plots of fusion rates between CICA combined with SA and cage. No difference in fusion rate was found for CICA combined with SA vs. cage

Fig. 4 Forest plots of fusion rates between CICA and cage. The forest plots showed that the segmental fusion rates of CICA were higher than those of cages at 3 (A) and 6 (B) months postoperatively, but not at 12 (C) and 24 (D) months postoperatively

Fig. 5 Forest plots of fusion rates between SA and cage. There was no significant difference in segmental fusion rates between SA and cage at any of 3 (A), 6 (B) and 12 (C) months after operation

Secondary outcomes between CICA and cage

Besides fusion rates, statistically significant differences between CICA and cage were also found in several outcomes (Table S2): hardware complication (OR = 9.66, CI = 3.76 to 24.87, P < 0.001), operative duration (WMD = 16.05, CI = 5.03 to 27.07, P = 0.004), Odom rating (OR = 0.60, CI = 0.39 to 0.91, P = 0.016), hospitalization time (WMD = 0.50, CI = 0.11 to 0.89, P = 0.013), blood loss (WMD = 23.62, CI = 2.25 to 44.99, P = 0.030), interbody height (WMD = −1.73, CI = −3.23 to −0.23, P = 0.023) and disk height (WMD = −0.70, CI = −1.31 to −0.10, P = 0.023). In contrast, subsidence (OR = 0.68, CI = 0.24 to 1.98, P = 0.469), neck VAS (WMD = 0.20, CI = −0.15 to 0.55, P = 0.263), JOA (WMD = −0.08, CI = −0.34 to 0.17, P = 0.517) and arm VAS (WMD = 0.03, CI = −0.33 to 0.40, P = 0.861) showed no significant difference between CICA and cage in ACDF. In other words, the iliac crest autograft was inferior to cages in terms of hardware complications, operative time, blood loss, hospitalization time, interbody height, disk height and Odom rating.

Secondary outcomes between SA and cage

The hardware complication rate with SA was significantly higher than with the cage (OR = 10.48, CI = 2.97 to 36.96, P < 0.001), whereas no significant difference was found in hospitalization time (WMD = 0.05, CI = −0.37 to 0.47, P = 0.662), disk height (WMD = 0.09, CI = −0.30 to 0.48, P = 0.649), NDI (WMD = −0.01, CI = −0.65 to 0.64, P = 0.624) or Odom rating (OR = 0.76, CI = 0.19 to 3.01, P = 0.695) (Table S2).

Discussion

CICA vs. cage

In our study, the fusion rates with CICA at 3 and 6 months after surgery were higher than with a cage in ACDF, and the significant difference vanished at 12 and 24 months postoperatively. This indicates that patients who underwent ACDF with CICA fused earlier but had no advantage in the long-term fusion rate compared with cages. This is consistent with the study by Tantammaroj and coworkers [16], in which the fusion rate of the CICA group was 97.5% at 6 months postoperatively and remained unchanged until 24 months, whereas the fusion rate of the cage group was only 70% at 6 months but reached 96.77% at 1 year. However, compared with the cage, CICA showed no clinical advantage in any of the other parameters in our study. These results imply new indications for implant choice in ACDF: cages are suggested for most older patients because of their long-term fusion rate, which is similar to that of CICA, and their better secondary outcomes, whereas younger patients who urgently need a rapid return to work, athletic competition or society may be better served by fusion with CICA. Future high-quality RCTs on hardware complications of CICA versus cage in younger patients would further test these proposed indications.

SA vs. cage

SA is also an osteoinductive, osteoconductive and osteogenic scaffold that facilitates new bone formation, with a high fusion rate similar to that of autograft but without donor-site complications [45, 46]. A significant difference (P < 0.05) in one-year fusion rates was found between 4063 patients using iliac crest autograft and 2067 patients using allograft in ACDF, including in subgroup analyses by number of segmental levels, diabetes status and tobacco use [47]. Vadim Goz and coworkers carried out a retrospective cohort study comparing complications between 7135 patients using SA and 10,648 patients using cages and found that the rate of revision within 2 years was higher in the SA group than in the cage group (P < 0.05) [48]. This is consistent with our finding that hardware complications were more frequent in the SA group than in the cage group. Paradoxically, our meta-analysis favored neither SA nor cage in fusion rates at 3, 6 and 12 months postoperatively. Therefore, more high-quality RCTs comparing fusion rates between SA and cage are warranted.

In our study, statistically significant heterogeneity was found in the fusion rate at 3 months (I2 = 62.2%, P = 0.047) and 12 months (I2 = 58.0%, P = 0.049) postoperatively in the SA vs. cage section and at 24 months (I2 = 65.6%, P = 0.055) in the CICA vs. cage section. This heterogeneity was related to factors such as the presence or absence of cage filler and internal fixation, the materials used, and differences among studies in judging fusion on X-ray or CT. In addition, pooling RCTs with observational studies, including cohort and case–control studies, also contributed to the high heterogeneity.

Several limitations of the current study need to be considered. (a) Only 9 of the 34 included studies were RCTs, which provide the highest level of evidence in clinical research; the shortage of RCTs and their pooling with observational studies caused higher heterogeneity in the fusion comparison between SA and cage. (b) We did not search for or include gray literature. (c) A comprehensive comparison among CICA, SA and cage was not carried out because of insufficient literature for a CICA vs. SA section. (d) Even though it was based on imaging results, the judgment of complete fusion involves a certain degree of subjectivity, which may have contributed to differences across studies.

Conclusion

Applying CICA/SA offers faster fusion than using cages in ACDF but no advantage in the long-term fusion rate, from which it might be inferred that younger patients who urgently need a rapid return to work, athletic competition or society may be better served by fusion with CICA. Future high-quality RCTs on hardware complications of CICA versus cage in younger patients are warranted to test this proposed indication.