Predictive factors for long-term outcome of anterior cervical decompression and fusion: a multivariate data analysis
- Cite this article as:
- Peolsson, A. & Peolsson, M. Eur Spine J (2008) 17: 406. doi:10.1007/s00586-007-0560-2
We conducted a prospective randomized study to investigate predictive factors for short- and long-term outcome of anterior cervical decompression and fusion (ACDF) as measured by current pain intensity on the Visual Analogue Scale (VAS) and by disability using the Neck Disability Index (NDI). Current understanding about how preoperative and short-term outcome data predict long-term outcome is sparse, and there are few studies involving analysis of short-term follow-up using multivariate approaches with quantification of the relative importance of each variable studied. A total of 95 patients were randomly allocated for ACDF with the cervical intervertebral fusion cage or the Cloward procedure. The mean follow-up time was 19 months (range 12–24) for short-term follow-up and 76 months (range 56–94 months) for long-term. Background factors, radiologically detected findings, physiological measurements, treatment type, pain, and disability were used as potential predictors. Multivariate statistical analysis by projection to latent structures was used to investigate predictors of importance for short- and long-term outcome of ACDF. A “preoperative” low disability and pain intensity, non-smoking status, male sex, good hand strength, and an active range of motion (AROM) in the neck were significant predictors for good short- and long-term outcomes. The short-term outcome data were better at predicting long-term outcome than were baseline data. Radiologically detected findings and surgical technique used were mainly insignificant as predictors. We suggest that the inclusion criteria for ACDF should be based on a bio-psycho-social model including NDI. NDI may also be regarded as an important outcome measurement in evaluation of ACDF.
Keywords: Prognostic factors, Outcome, Cervical radiculopathy, Cloward, Cage
Anterior cervical decompression and fusion (ACDF) for cervical disc disease has been shown to be successful, but still a large number of patients remain symptomatic afterwards [13, 18, 22, 23, 28, 36]. Therefore, tools for determination of predictive factors of surgical outcome in cervical radiculopathy are of great importance. The best results from ACDF, mainly based on pain intensity or Odom, have been reported for young male patients with soft disc disease in one segmental level and a short duration of symptoms [3, 7, 10], radicular pain without additional neck or lumbar pain [9, 10], and correlation between radiologically detected and clinical findings .
The preoperative predictive value of objective variables such as radiologically detected findings, active range of motion (AROM) in the neck, handgrip strength, and factors of importance for functional outcome on the Neck Disability Index (NDI) has been determined only in a small study of 23 patients  and for the short-term follow-up period (mean 19 months) of the present series . The short-term follow-up of this series  and the small study  also provided the basis for the only studies applying multivariate statistical analysis with quantification of the relative importance of each variable studied. The results of the linear multiple regression analysis (MLR) showed that, in terms of postoperative pain intensity and NDI, the preoperative predictors for a good outcome in ACDF are male sex, non-smoking status, greater segmental kyphosis, and low pain and disability . We have also shown, in terms of postoperative arm pain, neck pain, NDI, and general health that non-smoking status, a low pain level, and a normal rating using the Distress and Risk Assessment Method (DRAM) are the best preoperative predictors of a good 3-year outcome of ACDF .
Information about preoperative factors predicting long-term outcome, however, is sparse. To our knowledge, no group has explored the use of short-term outcome data for predicting long-term outcome of ACDF, which would aid in identifying patients needing further treatment.
Multivariate statistics were used in this study. One benefit of this approach is that variables can be scaled and mean-centered, which gives each variable the same opportunity to influence the model. Another benefit is that a patient or cluster of patients can be related to a variable profile rather than to a single variable. This also means that clusters of subjects can be related to a specific response variable (or clusters thereof). Further, correlation patterns can be identified, and the variables can be ranked from most to least influential.
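The scaling and mean-centering step can be illustrated with a minimal sketch. It assumes unit-variance scaling (each variable is centered on its mean and divided by its sample standard deviation), which the text does not spell out; the function name `autoscale` and the example values are hypothetical and are not study data.

```python
from statistics import mean, stdev

def autoscale(column):
    """Mean-center a variable and scale it to unit variance (assumed scaling)."""
    m = mean(column)
    s = stdev(column)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("a constant variable cannot be scaled")
    return [(x - m) / s for x in column]

# Hypothetical values for two variables measured on different scales.
ndi_percent = [20.0, 40.0, 60.0]   # disability in percent
grip_kpa = [50.0, 75.0, 100.0]     # handgrip strength in kPa

scaled_ndi = autoscale(ndi_percent)
scaled_grip = autoscale(grip_kpa)
```

After scaling, both hypothetical variables have mean 0 and standard deviation 1, so neither dominates the model simply because of its unit of measurement.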
Three terms are used in the analysis: principal component analysis (PCA), projection to latent structures by means of partial least squares (PLS), and variable influence on projection (VIP). PCA creates a correlation model of all X-variables, showing how the observations are related and whether there are outliers. PCA also provides an understanding of the relationships among the variables. PLS is a regression extension of PCA that models the relationship between two blocks of data (the X and Y blocks). The aim of using PLS is to predict one or more response variables (Y) from the predictor data (X). PLS thus aims to answer the question of which X gives (explains) Y. The VIP parameter is a summarizing tool that ranks the relative importance of each X predictor in relation to the chosen response variable(s).
PLS is more stable than MLR in the presence of high correlations among the independent variables, and fewer patients are needed to achieve high power than with MLR . PLS can also handle several outcome variables (Y-variables) and describe the spatial structure, and thus the internal relationships, among several Y-variables .
The purpose of the present study was to investigate predictive factors for short- and long-term outcomes of ACDF with the cervical intervertebral fusion cage (CIFC) and the Cloward procedure (CP) with autograft, as measured by current pain intensity using the Visual Analogue Scale (VAS) and disability using NDI.
Patients and methods
Patients and inclusion/exclusion criteria
After obtaining informed consent, we randomized 103 patients (years 1995–1998) into either the CIFC (n = 52) (AcroMed, Cleveland, Ohio)  or the CP groups (n = 51) . To ensure randomization, for each patient the attending nurse blindly drew a note from a pair of notes indicating either CP or CIFC. Thus, throughout the investigation, each patient enrolled had a 50% likelihood of being operated on using CIFC or CP. The randomization resulted in a similar distribution of age, gender, number of levels operated on, duration of symptoms, and smoking habits between the two groups .
All patients had preoperative MRI and clinical findings of cervical nerve root compression. Eighty-nine percent of patients had undergone conservative treatment before surgery. The inclusion criteria were at least 6 months’ duration (mean 26 months, SD 21) of radiculopathy and neck pain of degenerative origin with compatible MRI and clinical findings (arm pain was the primary symptom for most patients). Exclusion criteria were myelopathy, psychiatric disorder, drug abuse, and previous spine surgery. Eight patients (three randomized to the CIFC group and five to the CP group) changed their minds and did not undergo an operation, leaving 95 patients in the study.
Preoperatively and at the 1- and 2-year follow-ups, all patients underwent a standard clinical examination, had radiographs (antero-posterior, lateral, and oblique) taken, and answered questionnaires. About 86% of the patients completed the 1- and/or 2-year follow-ups (mean 19 months, range 12–24), as reported previously [23, 32]; for the present study, we call these follow-up periods the “short-term follow-up.”
At a mean long-term follow-up of 76 months (range 56–94 months), a questionnaire was sent to all 95 patients who had undergone surgery . The mean age at follow-up was 53 years (range 36–73). A total of 83 patients (87%), 40 in the CP group and 43 in the CIFC group, answered the questionnaires. Of the 12 patients not responding (six from the CIFC group and six from CP), eight patients did not return questionnaires despite several reminders, three had died from causes unrelated to the surgery, and one man had sustained a whiplash injury 6 weeks after ACDF and was therefore excluded.
A total of 52 of the 83 patients were operated at one level, 28 patients at two levels, and three patients at three levels. Eighty-three percent of the patients were operated on C5/6, C6/7 or C5/7 levels. Postoperatively all patients used a Philadelphia collar for 6 weeks, and most of them received general physiotherapy in primary care after removal of the collar.
Table. Current pain intensity on the Visual Analogue Scale and disability on the Neck Disability Index (NDI) before surgery (baseline) and at long-term follow-up. Rows: pain, mean (SD); NDI, mean (SD). The values are not preserved here.
The study was approved by the Ethics Committee at the Faculty of Health Sciences, Linköping University.
Measurements used in the prediction model
Background data included: sex (1 = male, 2 = female), age, smoking habits (1 = yes, 2 = no), localization of current problems (1 = neck, 2 = arm/neck and arm), duration of the current episode in months (1 = 6 to <12 months; 2 = 12 to <36 months; 3 = 36 months or more), and use of analgesics (1 = yes, 2 = no).
The kind of treatment, either CP (=1) or CIFC (=2), and number of levels operated on (one = 1, or two/three = 2) was noted.
Cervical Measurement System (CMS) (David Back Clinic International, Vantaa, Finland) equipment was used to measure AROM in the neck in the three conventional movement planes of the cervical spine (sagittal, frontal, and transverse). The CMS helmet consists of a plastic frame with two gravity goniometers, a compass, and two inclinometers attached to the frame. The dial meters are marked in two-degree increments. The use of CMS has been shown to be reliable and valid . The placement of the CMS, the test position, and the test procedure were standardized .
Strength for the right and left handgrips was measured with a Vigorimeter (Gebrüder Martin, Tuttlingen, Germany) with a large-sized bulb in kiloPascals (kPa). The Vigorimeter consists of a rubber bulb connected to a manometer and has been shown to be reliable .
Radiographs (antero-posterior, lateral, and oblique) were obtained preoperatively and postoperatively at short-term follow-up. One radiologist and one spine surgeon independently assessed fusion status with no knowledge of the clinical outcome. In case of a different opinion between the two observers, a combined assessment was made and classification agreed upon. The fusion was classified into four types according to presence or absence of bridging bone in the front of the fusion device and/or through the disc space. Type 1A was defined as bridging bone anterior and through disc space; 1B as bridging bone anteriorly but not through disc space; 2A as no bridging bone anteriorly but through disc space; and 2B as no bridging bone at all. The result was classified as pseudarthrosis (=2) if the 2B condition was observed at any level; otherwise, it was classified as fused (=1) .
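The classification rule described above can be written out as a short sketch. The function names and the boolean encoding of the radiographic findings are hypothetical; only the four types and the coding fused = 1, pseudarthrosis = 2 come from the text.

```python
def classify_level(bridging_anterior, bridging_through_disc):
    """Return the fusion type (1A, 1B, 2A, or 2B) for one operated level."""
    if bridging_anterior and bridging_through_disc:
        return "1A"  # bridging bone anteriorly and through the disc space
    if bridging_anterior:
        return "1B"  # bridging bone anteriorly only
    if bridging_through_disc:
        return "2A"  # bridging bone through the disc space only
    return "2B"      # no bridging bone at all

def fusion_status(levels):
    """Return 2 (pseudarthrosis) if any level is type 2B, otherwise 1 (fused).

    `levels` is a list of (bridging_anterior, bridging_through_disc) pairs,
    one pair per operated level.
    """
    types = [classify_level(anterior, disc) for anterior, disc in levels]
    return 2 if "2B" in types else 1

# Hypothetical two-level patient: one level fully bridged, one not bridged.
example_status = fusion_status([(True, True), (False, False)])
```

In this hypothetical two-level case the patient is coded as pseudarthrosis, because a single 2B level is enough to determine the overall classification.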
Segmental height was measured in millimeters with a ruler at the most anterior aspect of the treated segment. Variations in the magnification were compensated for by relating the treated segment height to the antero-posterior length of C2 or C7 .
Segmental lordosis/kyphosis was measured with a protractor at the motion segment that was operated on and defined as the angle between the cranial and caudal endplates of the upper and lower vertebrae, respectively. If several adjacent segments were treated, the segmental lordosis/kyphosis was defined as the angle between the end plates cranially and caudally to the levels operated on .
Current pain intensity (before surgery and at short- and long-term outcomes) was quantified by a horizontal 100-mm VAS (0 = no pain, 100 = worst imaginable pain) . Pain drawings of the front and the back of the body were coded by a senior orthopedic surgeon as organic (=1), possibly organic (=2), possibly non-organic (=3), and non-organic (=4) .
Neck-specific disability (before surgery and at short- and long-term outcomes) was quantified using the NDI. The 10 sections of the NDI (pain intensity, personal care, lifting, reading, headaches, concentration, work, driving, sleeping, and recreation) are scored from 0 to 5, added together, and transformed into percentages (0% = no pain or difficulties, and 100% = highest score for pain and difficulty on all items) .
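As a worked example of the NDI scoring described above, the following sketch sums the ten section scores and converts the total to a percentage of the maximum of 50 points. The function name `ndi_percent` is hypothetical, and the sketch assumes that all ten sections were answered; how unanswered sections were handled is not stated in the text.

```python
def ndi_percent(section_scores):
    """Convert ten NDI section scores (each 0 to 5) into a percentage."""
    if len(section_scores) != 10:
        raise ValueError("the NDI has exactly 10 sections")
    if any(score < 0 or score > 5 for score in section_scores):
        raise ValueError("each section is scored from 0 to 5")
    # Maximum possible sum is 10 sections x 5 points = 50.
    return 100.0 * sum(section_scores) / 50

# Hypothetical patient scoring 2 on every section.
example_ndi = ndi_percent([2] * 10)
```

A hypothetical patient who scores 2 on every section has a total of 20 out of 50 points, which corresponds to an NDI of 40%.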
The global outcome “effect of surgery,” as assessed by the patient, was at long-term follow-up measured on a six-grade scale (1 = complete relief of problems, 2 = much better, 3 = better, 4 = unchanged, 5 = worse, and 6 = much worse).
Fulfillment of the expectations for surgery at long-term follow-up was measured on a four-grade scale (1 = yes, completely; 2 = yes, partially; 3 = no, not at all; or 4 = do not know).
This study is based on PCA  and PLS . The statistical tool used was SIMCA-P+ 11.5 (Umetrics). PCA detects whether a large number of variables can be summarized by a few latent ones through the use of linear combinations. Such latent variables are called principal components (PC). Cross-validation is used to decide the number of significant components to include in the model. The procedure omits part of the data, generates a new model from the remaining data, and predicts the omitted data with the newly developed model. This is repeated until all data have been omitted and predicted once. Hence, cross-validation is a stability check that stops the automatic calculation of further components. The predictive power is calculated from the squared difference between predicted and observed values.
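The leave-out-and-predict loop can be sketched as follows. This is an illustration only: a one-predictor least-squares line stands in for the PCA/PLS component model, and the round-robin assignment of observations to groups is an assumption, not a description of how SIMCA-P forms its cross-validation groups. All names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares (stand-in for the real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation in the training data")
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b

def cross_validated_press(xs, ys, n_groups):
    """Sum of squared differences between observed and cross-validated predictions."""
    press = 0.0
    for group in range(n_groups):
        # Omit every observation assigned to this group (round-robin assignment).
        train = [i for i in range(len(xs)) if i % n_groups != group]
        held_out = [i for i in range(len(xs)) if i % n_groups == group]
        # Build a new model from the remaining data only.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Predict the omitted data and accumulate the squared errors.
        press += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return press

# Hypothetical data: an exactly linear response and a noisy one.
x_values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_exact = [2.0 * x + 1.0 for x in x_values]
y_noisy = [1.0, 3.5, 5.0, 6.5, 9.0, 11.5]

press_exact = cross_validated_press(x_values, y_exact, 3)
press_noisy = cross_validated_press(x_values, y_noisy, 3)
```

Every observation is held out exactly once, so the accumulated sum reflects how well the model predicts data it was not fitted to.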
To summarize, when interpreting the two plots, the key questions are: (A) How are the observations and/or variables related and are there correlation patterns? (B) Which variables have the strongest impact on the model, i.e., which variables are best explained according to structured variance captured in the model plane?
Two further concepts are used to describe the model results: R2 and Q2. R2 describes the goodness of fit; the fraction of the sum of squares of all the variables explained by a principal component. Q2, in turn, describes the goodness of prediction; the fraction of the total variation of the variables that can be predicted by a PC using cross-validation methods. Outliers were identified using the two powerful methods available in SIMCA-P: score plots in combination with Hotelling’s T2, which identifies strong outliers, and distance to model in X space (DModX), which identifies moderate outliers.
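The two quantities can be written out as follows. This is the usual textbook formulation and is given here as an assumption, since the formulas themselves are not stated in the text; the symbols are introduced only for illustration.

```latex
R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{SS}_{\mathrm{tot}}},
\qquad
Q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{SS}_{\mathrm{tot}}},
\qquad
\mathrm{RSS} = \sum_{i} \left( y_i - \hat{y}_i \right)^2,
\qquad
\mathrm{PRESS} = \sum_{i} \left( y_i - \hat{y}_{(i)} \right)^2
```

Here $\mathrm{SS}_{\mathrm{tot}}$ is the total sum of squares of the mean-centered data, $\hat{y}_i$ is the value fitted by the model built on all observations, and $\hat{y}_{(i)}$ is the value predicted for observation $i$ by a model built without the cross-validation group that contains it. Because the prediction residuals are normally at least as large as the fitted residuals, $Q^2$ is normally no larger than $R^2$.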
PLS is a regression extension of PCA providing information about “what X gives Y?” The task is to investigate whether there is a relationship between a point in the predictor space (X) and the same point in the response space (Y). In addition to regression coefficients used in describing the relationship between the X and Y spaces, “variable influence on projection” (VIP) is used. VIP is a parameter that summarizes the importance of the X-variables for both the X and Y models . VIP is a weighted sum of squares of the PLS weights, taking into account the amount of explained Y-variance in each dimension. In this way, the VIP parameter identifies the relative importance of the X when predicting Y.
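The weighted sum of squares described above is commonly written as shown below. The notation is an assumption made for illustration ($K$ X-variables, $A$ components, $w_{ak}$ the PLS weight of variable $k$ in component $a$, and $\mathrm{SSY}_a$ the Y sum of squares explained by component $a$); the text does not give the formula.

```latex
\mathrm{VIP}_k = \sqrt{ K \cdot
  \frac{ \sum_{a=1}^{A} \mathrm{SSY}_a \, \left( w_{ak} / \lVert w_a \rVert \right)^2 }
       { \sum_{a=1}^{A} \mathrm{SSY}_a } }
```

With this definition the squared VIP values average to 1 across the $K$ variables, so a variable with VIP above 1.0 contributes more than the average variable to the prediction of Y.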
PLS was used to predict pain intensity and NDI at short- and long-term follow-up (Y-variables). For prediction of short-term outcome (Y-variables), baseline/preoperative data (sex, age, smoking habits, pain localization, symptom duration, analgesics, neck AROM, hand strength, segmental height, segmental kyphosis/lordosis, pain intensity, NDI, kind of treatment, and number of levels operated on) were used (X-variables). For prediction of long-term outcome (Y-variables), both baseline/pre-operative and short-term data (pain intensity, NDI, healing status, and development of kyphosis) were used as predictors.
VIP values >1.0 were regarded as significant . The VIP parameter is not used to set an absolute cut-off, but to highlight the variables that are most important in explaining the variables being predicted. In the same vein, a cut-off of 0.20 was used in the loading plot to present the most influential variables in the model. Descriptive statistics have been presented in earlier studies [23, 26, 32].
Overall PCA analysis
Table. Loadings of all data for principal components 1 (P1) and 2 (P2) (cut-off 0.20). Variables appearing in the table: segmental height (baseline); horizontal AROM (baseline); hand strength, right (baseline); hand strength, left (baseline); pain intensity (short-term); pain intensity (long-term); effect of surgery (long-term); expectations fulfilled (long-term); number of levels operated on. The loading values and the assignment of variables to P1 and P2 are not preserved here.
Prediction of short-term outcome of pain intensity and NDI
Table 3. Outcome regressors ranked by variable influence on projection (VIP) coefficients (principal component 1) for short-term and long-term prediction. VIP values >1.0 were regarded as significant. Variables appearing in the table: horizontal AROM (baseline); sagittal AROM (baseline); pain intensity (baseline); pain intensity (short-term); hand strength, right (baseline). The VIP values and the assignment of variables to short- and long-term prediction are not preserved here.
The second component was insignificant and did not improve the predictive power.
Prediction of the long-term outcome of pain intensity and NDI
For predicting the long-term outcome of NDI and pain intensity, as above, a new two-component model was calculated. The first component used 19% of the X-variation (R2X = 0.19) to describe 37% of the Y-variance (R2Y = 0.37) and predict 29% of the Y-variance (Q2Y = 0.29). The VIP analysis showed that short-term NDI and pain intensity were the most important predictive variables, followed by pre-operative NDI and pain intensity, smoking status, sagittal and horizontal AROM, sex, and hand strength (right) (Table 3).
The second component was insignificant and did not improve the predictive power of NDI and pain intensity.
In analysis of the drop-outs at the long-term follow-up, there were no significant differences in background data or subjective or objective measurements either before surgery or at the short-term follow-up between those who answered the questionnaire and those who did not.
In many studies, a global outcome measurement of ACDF has been used [3, 10, 13, 26, 36], in spite of criticism . In the present study, in predicting the long-term variables “global effect” and “expectations of surgery fulfilled,” the model could not capture a strong structured variance according to these variables, and thus, these variables were not taken into consideration in the results. The prediction analysis (PLS) was instead performed according to the well-explained variables of “pain intensity” on VAS and “NDI.”
In summary, NDI was the most influential variable in explaining the model. NDI was also the most important preoperative as well as short-term predictor of the short- and long-term outcomes of pain intensity and NDI. These findings show that low disability on the NDI before surgery, and even more so at the short-term follow-up, is a useful predictor of a successful long-term outcome of ACDF. The preoperative NDI seems to be important in patient selection for ACDF. A high NDI at the short-term follow-up identifies patients at risk who need further treatment and support. Consequently, these results may have implications both for the individual patient’s future and from a socioeconomic perspective. The importance of NDI confirms previously reported results from MLR analysis .
When predicting long-term outcome of pain intensity and NDI, preoperative and short-term low NDI and pain intensity, good neck AROM and hand strength, non-smoking status, and being male were the most important variables. The same factors, with the addition of a high segmental kyphosis, appear as the most important for short-term outcome. These results verify earlier reported predictors for the short-term follow-up .
Apart from ACDF patients, pain and disability have previously been established as important predictors for outcome, for patients with non-specific neck pain, whiplash-associated disorders, or low-back pain [4, 6, 11, 16, 30].
Smoking has earlier been shown to be a negative factor in the clinical outcome of ACDF , as well as a risk factor for developing disc disease . A further question is whether smoking also is a risk factor for developing pseudarthrosis after ACDF [2, 24, 35]. An earlier short-term MLR analysis  showed that healing status is of minor importance for short-term outcome of pain intensity and did not explain the variability of NDI or Odom. However, based on unpaired comparisons of the long-term outcome, CIFC patients with a healed fusion have less pain and NDI than either CIFC patients with pseudarthrosis or CP patients with a healed fusion . In the present study, healing status had no significant predictive influence on the clinical outcome either at short- or long-term follow-up. Possibly, smoking habits may be associated with outcome based on factors other than the obvious biological effects of tobacco.
In agreement with earlier studies [3, 7, 10, 23], male sex was reported to be of importance in a better outcome of ACDF. In this study, sex also was an important variable when predicting short- and long-term outcomes. Men’s higher fusion rate (p = 0.02) and tendency for less widespread pain (p = 0.08) in the present study and their earlier reported greater neck muscle strength  and endurance  may be related to these differences between the sexes.
In an earlier prediction of the short-term outcome using MLR analysis, neck AROM and handgrip strength were weak predictive factors for outcome . However, in this study using PCA and PLS analysis, these variables proved to be of greater importance. A possible explanation is the use of a method that works well with a higher variable-to-patient ratio and also provides a tool for handling inter-correlations between variables. Hand strength can be regarded either as a measure of general health or as a measure of injury or disease [15, 17]. The latter interpretation would support hand strength as a specific measure for patients with cervical disc disease with radiculopathy, and one that is easy to obtain in clinical practice. Men were earlier reported to have twice the handgrip strength of women ; in MLR analysis this factor could consequently be a reflection of sex rather than of performance. Neck AROM has earlier been reported to be similar for both sexes in healthy individuals . The objective variables, and probably also NDI and pain intensity, may to some extent reflect psychological factors such as fear of movement and coping with pain, and may thus be of importance for the outcome. Consistent with this, in another patient sample  we showed preoperative DRAM to be an important factor in the 3-year outcome of ACDF with respect to arm and neck pain, NDI, and general health. In the present study, preoperative DRAM was unfortunately not obtained. Other factors not quantified in the present study, such as neck muscle strength and endurance, stress-related factors, leisure time, and general health, might also be important predictors of outcome.
In line with Peolsson et al. , the radiologically detected findings and treatment variables were of minimal importance as predictors for the short- or long-term outcome. Single-level surgery has earlier been shown as not influencing the outcome of pain intensity, NDI, and Odom in the short-term outcome of ACDF . Thus, in both the short-term  and long-term outcomes of pain intensity and NDI in the present study, the number of surgery levels had no importance. Zoëga et al.  reported patients who had two-level surgery with the Smith-Robinson technique to be improved in the Million index, Oswestry index, and pain intensity in both the arm and neck. For patients who had undergone single-level surgery, there was no significant improvement . However, several other studies have reported more successful results on pain intensity or Odom in patients with disc disease in one level [3, 7].
In the MLR analysis of the short-term outcome of ACDF, the degree of preoperative kyphosis was the most important factor for pain intensity . In the PLS analysis of the short-term outcome of pain intensity and NDI in the present study, preoperative kyphosis still had some importance. The identification of higher preoperative kyphosis as a predictor of importance for the short-term outcome cannot easily be explained. The kyphosis may reflect a truly symptomatic segment by the disengagement of the facet joints. When PLS analysis in this study was used, neither preoperative kyphosis nor development of kyphosis at short-term outcome was influential on the long-term outcome.
Duration of symptoms seems to have minor importance for the clinical outcome in the present study. This result may be due to the inclusion criterion of at least 6 months’ duration of symptoms before ACDF. The long duration might have jeopardized the outcome of surgery. Earlier predictive studies [3, 7, 10] have shown a short duration of symptoms to be a predictor of a good outcome of surgery. In these studies [3, 7, 10], the duration of symptoms before surgery varied from a couple of days up to 25 years, and because univariate statistical analysis was used, there was no control for other inter-correlated confounding factors. The results of the present study can therefore only be generalized to patients with long-standing symptoms before surgery.
The explained Y-variance of the outcome variables could of course be seen as low. However, given that the outcome of a biological system is influenced by a multitude of possible factors, we argue that a model whose X-variables predict nearly 30% of the Y-variation can also, to a certain extent, be seen as high. The results are also in line with previous studies of predictors both after ACDF and in patients with non-specific neck pain in primary care [16, 23]. It is also true that other variables not included in the present study may be of importance for the outcome.
A limitation of the PCA methodology as well as for MLR and bivariate correlation analysis is that they presume a linear relationship between variables. Thus, non-linear relations may be present but are not captured by the PCA methodology.
In non-specific neck pain patients, Kjellman et al.  reported that different predictive factors appeared depending on the kind of outcome variable chosen. That fact, together with the different statistical analyses used, could explain differences in predictors among studies and shows that it is important to use a broad assessment in the evaluation of predictive factors; in the present study, both pain intensity and disability were used as outcome measures.
A “preoperative” low neck-specific disability, low pain intensity, non-smoking status, male sex, good preoperative hand strength, and neck AROM were significant predictors for a good long-term outcome of pain intensity and NDI after ACDF. Short-term outcome measures of NDI and pain intensity were better predictors of the long-term outcome than were baseline values for these parameters. Radiologically detected findings and the surgical technique were, except for preoperative kyphosis in the short-term outcome, insignificant as predictors of both short- and long-term outcomes. NDI was not only overall the most important factor in explaining short- and long-term outcomes, but also was the factor with the highest impact in explaining the total prediction model. NDI may be regarded as an important outcome measurement in evaluation of ACDF. In addition, we suggest that the inclusion criteria for surgery should be based on a bio-psycho-social model including NDI. We also suggest that other variables than those studied may be important for the outcome of ACDF.
The authors especially thank secretary Inga-Lill Lindberg and MD Carl-Henrik Hybbinette for their support. The study has received financial support from the Faculty of Health Sciences at Linköping University and from the Research Council of Southeastern Sweden (FORSS).