Predictors of surgical outcome and their assessment
The relatively high rate of failed back surgery has prompted the search for “risk factors” to predict the result of spinal surgery in a given individual. However, the literature reveals few unequivocal predictors and they often explain a relatively low proportion of variance in outcome. This suggests that we have a long way to go before being able to rest easily, having refused someone surgery on the basis of unfavourable baseline characteristics. The best recommendation is to ensure, firstly, that the indication for surgery is absolutely clear-cut (i.e. that surgically remediable pathology exists) and then to consider the various factors that may influence the “typical” outcome. Consistent risk factors for a poor outcome regarding return-to-work include long-term sick leave/receipt of disability benefit. Hence, every effort should be made to keep the individual in the workforce, despite the ongoing symptoms and plans for surgery. In patients with a particularly heavy job, consultation with occupational physicians might later ease the patient’s way back into the workplace. Patients with degenerative disorders and/or comorbidity should be counselled that few of them will have complete/lasting pain relief or a complete return to pre-morbid function. Patients with a high level of distress may benefit from psychological treatment, before and/or accompanying the surgical treatment. The opportunity (time), encouragement (education and positive messages), and resources (referral to appropriate support services) to modify risk factors that are indeed modifiable should be offered, and realistic expectations should be discussed with the patient before the decision to operate is made.
Keywords: Predictors, Risk factors, Surgery, Outcome, Questionnaires
Failed back surgery is a problem that has become sufficiently widespread to warrant its own special conferences, with recent reviews reporting failure rates ranging from 5 to 50%. The substantial suffering of patients with failed back surgery syndrome, the associated costs to society, and the not inconsiderable complication rates associated with spinal surgery per se have prompted the search for predictors of outcome, in an attempt to better identify individuals who are likely to benefit from surgery. The development of “pre-screening” tools has also been encouraged, to assist with the patient selection procedure and the promotion of realistic expectations on the part of the patient [57, 65].
In interpreting the findings of predictor studies, a number of factors must be borne in mind:
the design of the study and the statistical methods used to identify predictors;
the outcome measures employed, the means by which a “successful outcome” is defined, and the proportion of patients in the investigated group that typically achieve a successful outcome;
the number and type of predictor factors subjected to examination in any given study, and their prevalence within the group under investigation;
the specific pathology or surgical procedure under investigation and the defining characteristics of the patients with that pathology.
Predictor study designs
Retrospective studies (“looking back in time”) involve the examination of results at a given time after surgery, analysed in relation to the available data recorded at baseline in the patients’ medical records. Occasionally, additional data will be collected at the time of the investigation regarding the current status or the situation (recalled) as it was before the surgery. As this type of study is not pre-planned, the most appropriate baseline information has often not been obtained. Further, the method of data collection has not been standardized and data are frequently missing. Often, if some particular attribute has not been noted in the medical records, it is assumed not to have been present, which is not always the case. It is also not known whether any putative risk factors may already have been used in clinical practice as part of the selection criteria, hence reducing both the frequency of observation of these factors and their chances of achieving a statistically significant association with the outcome. Finally, if the “predictors” themselves are collected retrospectively, then the post-surgical status can heavily bias the accuracy of recall of the situation before surgery. Retrospective studies are hence the least robust type of study for identifying risk factors.
As their name suggests, prospective studies (“looking forward in time”) involve characterizing the baseline status of patients before surgery, and then following up the patients over time, in order to relate the baseline characteristics to the ultimate outcome. Prospective studies are carried out with a planned research design, allowing sufficient information to be gathered to investigate the phenomenon in question. Patient record keeping may be modified, specific questionnaires may be introduced, or additional measurements may be taken during routine patient management in order to collect the necessary data. As patients are made aware that they are participating in a study, their permission can be sought to obtain additional contact information, in order to reduce losses to follow-up. The duration of the follow-up may influence both the proportion of patients showing a successful outcome and also the predictors of the outcome. With new surgical techniques/implants, a follow-up of at least 2 years is typically required, as the development of inadvertent or unexpected complications must be included as part of the overall outcome assessment. However, for other procedures, in which the operation seeks to remedy a physical, mechanical obstruction (e.g. decompression surgery), the outcome can arguably be examined as soon as the patient has recovered from the direct effects of the operation per se. This can be expected to occur within a maximum of 6 months after surgery. In fact, longer follow-ups in these patients, although of interest for documenting the natural history of treatment of the condition, may introduce additional sources of error as far as the identification of predictors is concerned. 
It is conceivable that other factors (independent of the original surgical intervention) may influence the patient’s rating of “outcome” in the long term, especially if the latter is based on self-ratings of current pain, disability, or quality of life (see later; outcome measures). Further, it cannot be assumed that all the predictors assessed at the baseline are necessarily intrinsic to the patient and will remain stable over time; especially changes in work-status/social situation or the development of comorbidities may influence the patient’s rating of their status at the time of the follow-up, such that the assessments made at baseline no longer accurately characterize the patient. Arguably, predictor analyses should be carried out at repeated follow-ups (to identify “stable” and consistent predictors), and any potentially important but “labile” variables should be re-assessed simultaneously.
Statistical methods used to identify predictors
The simplest, but also most limited, studies of predictors involve splitting patients into (usually) two groups, good outcome and bad outcome, and examining, for each putative predictor, the significance of the difference between these groups in the mean values or presence/absence of various baseline variables (sometimes referred to as “case-control” studies). In doing so, one can learn something about the distinguishing features of the two groups, or the relative risk of a poor outcome in the presence of a given risk factor, but only little about an individual’s overall likelihood of a certain outcome. Nonetheless, the variables identified may be of use in guiding the development of more precise screening tools in further studies. When the outcome is measured as a continuous variable (e.g. improvement in pain, reduction in disability score), correlational analyses may be performed between the individual putative predictors and the chosen outcome measure to identify the strength of the relationship between the two. With this simple bivariate approach, any interactions between the effects of two or more variables on the outcome will not be identified; sometimes, the combination of two variables may represent a risk factor, although neither alone is significant. Further, carrying out repeated individual analyses tends to result in the identification of numerous risk factors that are, in reality, delivering the same information. For example, duration of symptoms and depression may each be identified as important predictors of outcome, but they are so closely related to one another (patients who have endured persisting symptoms are often those that become depressed) that they end up delivering almost identical information in explaining the variance in outcome.
Only multivariate analyses can account for this overlap of information (sometimes referred to as “collinearity”), since variables are entered into the predictive model only if they are able to uniquely explain any variance in outcome, i.e. able to add something more to the predictive model.
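The overlap of information between two putative predictors can be illustrated with a simple correlation check. The following is a minimal sketch only: the patient values are invented for illustration, and the threshold of 0.8 for flagging overlap is an arbitrary assumption, not a value taken from the literature reviewed here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical baseline data for six patients (illustrative values only).
symptom_duration_months = [3, 6, 12, 24, 36, 48]
depression_score = [5, 8, 11, 19, 27, 33]

r = pearson_r(symptom_duration_months, depression_score)

# Assumed, arbitrary threshold: above it, the two predictors are treated
# as carrying largely the same information about outcome.
overlapping = abs(r) > 0.8
print(f"r = {r:.3f}, overlapping = {overlapping}")
```

When two predictors correlate this strongly, separate bivariate tests are likely to flag both, whereas a multivariate model would typically retain only the one that adds unique explanatory value.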
The most commonly utilized multivariate analysis method is multiple logistic regression. This is used to predict a categorical outcome (good/bad) from a range of continuous and categorical variables. Often, bivariate analyses (outcome vs each predictor) are carried out first, in order to determine which of the many predictor variables should be examined in the multivariate model (i.e. to reduce the number of variables entered). The aim of logistic regression analysis is to find a subset of explanatory variables that can be combined to best predict outcome. For future patients, values for these explanatory variables can then be entered into the logistic regression equation (or “predictive model”) to estimate the likelihood of a good (or bad) outcome for the given patient. The coefficients from the predictive model are sometimes used to assign the variables a “cut-off score” in order to develop simple screening tools for use in clinical practice. Clearly, the sensitivity, specificity, and positive and negative predictive value of such tools must be adequate, and the model must be validated in an independent cohort, or its reliability and generalizability to other clinical settings may be limited. Such validation is rarely done. In interpreting the results of logistic regression analyses, it is important to note how much variance in outcome is explained by the chosen predictor variables, rather than just whether the association achieves a given level of statistical significance. There are abundant studies in the literature in which highly statistically significant predictors of outcome have been identified, but on closer examination these actually explain a very low proportion of the variance in outcome (e.g. [54, 75, 76]). This makes such predictors of limited use in clinical practice.
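How a fitted logistic regression equation yields an individual likelihood, and how the performance of a resulting screening tool is summarized, can be sketched as follows. This is an illustration under stated assumptions: the intercept, the coefficients, the variable names (`sick_leave_months`, `distress_score`), and the 2×2 counts are all hypothetical and do not come from any of the studies cited.

```python
from math import exp

def predicted_probability(intercept, coefficients, values):
    """Probability of a good outcome from a logistic regression equation.

    p = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2 + ...)))
    """
    linear = intercept + sum(coefficients[k] * values[k] for k in coefficients)
    return 1.0 / (1.0 + exp(-linear))

def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the counts of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical model: coefficients are invented for illustration only.
intercept = 1.5
coefficients = {"sick_leave_months": -0.10, "distress_score": -0.05}

# Hypothetical new patient.
patient = {"sick_leave_months": 6, "distress_score": 20}
p_good = predicted_probability(intercept, coefficients, patient)
print(f"estimated probability of a good outcome: {p_good:.3f}")

# Hypothetical validation counts for a screening tool.
metrics = screening_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

In practice the coefficients would be estimated from a development cohort and the 2×2 counts obtained from an independent validation cohort, as the text above recommends.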
What constitutes a “successful outcome?”
The proportion of patients that can be considered a success after surgery as well as the factors that might predict a good outcome depend to a large extent on how success is defined [3, 80]. The success of outcome is likely best considered in relation to the predominant aim of the surgery. Hence, for decompression surgery for a herniated disc or spinal stenosis, the most important outcome may be the reduction of leg pain or sensory disturbances and/or the improvement of walking capacity, whereas for “chronic degenerative low back pain,” the relief of low back pain (LBP) will primarily govern the degree of success. For all of these conditions, the ability to regain normal function in activities of daily living will also be of importance, although this typically follows with time once the main symptoms have resolved. In the case of deformity surgery, pain or disability may not be an issue, and factors other than symptoms (such as cosmetic appearance, prevention of progressive worsening, and associated systemic complications) may determine the “success” of surgery. The success may also depend on the age-group and working status of the group under investigation as well as the answer to the question “who’s asking?”: when viewed from the economic point of view, outcomes concerned with work capacity may be of greatest importance for younger patients of working age.
Global assessment scores often give the most direct answer to the question “did the operation help?” and allow the patient to interpret the question in relation to his or her own particular pre-surgical problems and expectations of surgery. For the purposes of predictor studies, multiple response categories for this question (commonly between three and seven responses, ranging from “the surgery helped a lot” through to “the surgery made things worse,” or “excellent result” through to “bad result”) are often collapsed to dichotomize the data into “good” and “poor” outcome groups. Some authors consider that all responses greater than a “neutral” outcome (i.e. no change) should be considered as a positive result, while others argue that for elective surgical procedures a notable improvement should be required (i.e. more than “helped a little” or “fair result”) to consider the operation a success.
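The effect of the two dichotomization rules described above can be made concrete with a short sketch. The five response labels below are a hypothetical example of such a global outcome item; actual instruments differ in the number and wording of their categories.

```python
# Hypothetical five-category global outcome item, ordered from best to worst.
RESPONSES = [
    "helped a lot",
    "helped",
    "helped a little",
    "did not help",      # treated here as the "neutral" (no change) response
    "made things worse",
]

def dichotomize(response, strict=True):
    """Collapse a global outcome response into 'good' or 'poor'.

    strict=True : only responses better than "helped a little" count as good.
    strict=False: any response better than the neutral one counts as good.
    """
    rank = RESPONSES.index(response)
    cutoff = 2 if strict else 3
    return "good" if rank < cutoff else "poor"

# The same patient is classified differently under the two rules.
print(dichotomize("helped a little", strict=True))   # poor
print(dichotomize("helped a little", strict=False))  # good
```

Because the proportion of “good” outcomes shifts with the rule chosen, the rule should be stated explicitly when results of different predictor studies are compared.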
In predictor studies in which continuous variables such as the Roland Morris score, Oswestry Disability Index, or pain visual analogue scales (VAS) are used as the primary outcome measure, some indication of the cut-off value corresponding to a “good outcome” is required, i.e. the value of the minimal clinically relevant change score. To determine the value of such cut-off scores, the method of Receiver Operating Characteristics (ROC) is commonly used. The ROC curve synthesizes information on sensitivity and specificity for detecting improvement (according to some dichotomized external criterion) for each of several possible cut-off points in change score. Thus, sensitivity and specificity can be calculated for a change score of one point, two points, and so on. This method is analogous to evaluating the predictive power of a diagnostic test in which the instrument (questionnaire) change score is the diagnostic test and the global outcome (dichotomized as described above) is used to represent the gold standard. Using such methods, it has been shown that a “good outcome” cut-off score for the 0–100 Oswestry Disability Index is approximately 10 points or an 18% reduction of the pre-surgery score; for the pain VAS, approximately 20 points (on a 100-point scale); and for the 0–24 point Roland Morris disability score, approximately 4 points [11, 62]. The minimal clinically relevant changes for generic health scales, such as the SF36, and other secondary outcome measures, such as psychological distress, have been less well investigated. However, these tend to be less responsive to surgery [10, 42] and often the minimal clinically relevant change borders on the value for the minimal detectable difference (i.e. 95% confidence intervals for the measurement error) for these instruments, making it difficult to distinguish “real change” from “random error” in a given individual.
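The ROC procedure for deriving a cut-off in change score can be sketched as follows. The change scores and the dichotomized global outcomes are invented for illustration, and the selection rule (maximizing sensitivity + specificity − 1, i.e. the Youden index) is an assumption; the studies cited may have used other criteria for choosing the optimal point on the curve.

```python
def roc_points(change_scores, improved, cutoffs):
    """Sensitivity and specificity of the rule 'change >= cutoff'
    against a dichotomized external criterion (improved: True/False)."""
    points = []
    pairs = list(zip(change_scores, improved))
    for c in cutoffs:
        tp = sum(1 for s, g in pairs if g and s >= c)
        fn = sum(1 for s, g in pairs if g and s < c)
        tn = sum(1 for s, g in pairs if not g and s < c)
        fp = sum(1 for s, g in pairs if not g and s >= c)
        points.append((c, tp / (tp + fn), tn / (tn + fp)))
    return points

def best_cutoff(points):
    """Cut-off maximizing sensitivity + specificity - 1 (assumed criterion)."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]

# Hypothetical reductions in disability score (points) for twelve patients,
# with the patient's dichotomized global outcome as the external criterion.
change_scores = [9, 12, 14, 18, 22, 11, 2, 4, 6, 8, 5, 13]
improved = [True] * 6 + [False] * 6

points = roc_points(change_scores, improved, cutoffs=range(1, 25))
cutoff = best_cutoff(points)
print(f"cut-off for a 'good outcome': change of at least {cutoff} points")
```

With these invented data the rule selects a change of 9 points, at which all improved patients and five of the six non-improved patients are classified correctly.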
Instruments used in the prediction of outcome
Psychometric testing with reliable and valid questionnaire instruments is a very efficient way to collect a great deal of information about the patient undergoing spinal surgery. Questionnaires provide a means of cross-checking clinical interview data. It is recommended that the latter be carried out prior to questionnaire assessment, as the trusting relationship built up during the clinical interview increases the chances of patients responding honestly to the questionnaires. Further, an adequate process of informed consent, in which patients are clearly informed why the questionnaire data are being collected and who will have access to them, increases the chances of patients giving truthful answers as opposed to answers they may perceive to be “desirable.” Nevertheless, it is a common finding that the information derived from the clinical interview and the psychometric testing is not always consistent, e.g. a patient not showing any concerns about surgery during the interview may actually reveal himself to be quite fearful as assessed by the psychometric testing. It is this additional information derived from the psychometric testing that renders it so useful. Other advantages include its objectivity, i.e. the results do not depend on the person who analyses the information, and its standardized nature. A knowledge of normative values for each questionnaire allows individual scores to be interpreted in relation to the scores of many others and hence provides clinicians with information as to whether an individual’s scores are extreme, i.e. should be considered to be prognostic regarding the outcome or not. In acknowledgement of the evidence amassed in recent years from high quality research, decisions regarding treatment should consider the results of psychometric assessments.
Many of the instruments commonly used to assess risk factors for a poor surgical outcome have been discussed at length in other contributions in this special issue or in recent review articles. For instance, instruments assessing demographic and medical factors are discussed in the article by Häfeli and colleagues. The contribution by Elfering covers work-related instruments. Other recently published reviews have covered in detail the use of self-report questionnaires for current pain and pain history, disability, and general health status [47, 66]. Hence, the instruments that will be discussed in the present article are restricted to the most important individual psychosocial characteristics, i.e. personality characteristics, emotional antecedents and reaction to pain, and the way individuals attribute pain to certain work or physical activity factors.
This description of instruments is by no means exhaustive, but includes those that have consistently displayed prognostic value in relation to the outcome of spine surgery.
Personality is defined by the American Psychological Association as “deeply ingrained patterns of behaviours, which include the way one relates to, perceives, and thinks about the environment and oneself.” From this definition it is understandable that personality may influence individual reactions to pain and spinal surgery. The Minnesota Multiphasic Personality Inventory (MMPI) and its revision, the MMPI-2, have been used in the area of chronic pain and surgery outcome for more than 50 years. The core and most commonly used scales of the MMPI and MMPI-2 comprise three validity scales and ten clinical scales. In the context of chronic pain and the outcome of spine surgery, two scales, hypochondriasis (Hs) and hysteria (Hy), have been shown to be valuable predictors of outcome. Both scales were originally constructed to identify patients whose psychopathology is manifested in physical symptoms in the absence of organic pathology. In spine surgery, however, where the indisputable presence of organic pathology is a pre-requisite for surgery, the psychoanalytical theory behind the scales is neglected and both scales are instead considered to address sensitivity to pain. In one study, it was shown that when Hs and Hy were assessed prior to discography of both disrupted and normal discs, patients with significant Hs and Hy scores (T>75) were more likely to report pain regardless of the status of the discs injected [7, 8]. The MMPI takes 45–75 min to administer, some items are difficult to respond to, and the calculation of test scores is complicated. An alternative, the Maudsley Personality Inventory (MPI), also known as the Eysenck Personality Questionnaire, comprises 80 items for three scales of introverted/extroverted personality, emotional stability (neurotic tendencies), and false discovery. The MPI takes around 10 min to complete and is available in many languages. The most recent and widely accepted approach to personality is the Big-Five factor model.
The five-factor model of personality comprises the dimensions of neuroticism, extraversion, openness, agreeableness, and conscientiousness [19, 63]. Short measures include the 45-item bipolar adjective rating list developed by Ostendorf and colleagues [72, 73], which was further reduced to 30 items by Schallberger and Venetz. The latter demonstrated that the reduced version is satisfactory in terms of factorial structure and internal consistencies of the scales. Each scale consists of six bipolar items rated on a six-point scale, with the poles graded from “very” (1 and 6) through “quite” (2 and 5) to “rather” (3 and 4). Principal components analysis with a forced five-factorial solution replicated the factorial structure, with 27 out of 30 items loading on the appropriate factor.
Patients who suffer from (chronic) pain and undergo surgery are often depressed. While depression, chronic pain, and surgical outcome seem to be linked, the direction of causal pathways is not clear. Depression appears as an antecedent and as a consequence of chronic pain in approximately equal measure (for further details, see later): Atkinson et al. showed depression to precede chronic pain in 42% of patients and to be a consequence in 58%. The MMPI includes a depression subscale, but the instruments more commonly used to assess depression are the 21-item Beck Depression Inventory, the CES-D scale, and the 23-item Zung Depression Inventory. All instruments display adequate psychometric properties and have been translated into several languages. Following the work of Greenough et al., the modified Zung scale is now commonly combined with the 13-item Modified Somatic Perception Questionnaire (MSPQ) by simple addition of the scores of each questionnaire to provide an accurate measure of psychological disturbance. The so-called Distress and Risk Assessment Method (DRAM), developed in a sample of 567 chronic pain patients in Scotland, also includes the combination of both instruments. Completion of the 45-item instrument takes approximately 10 min. The DRAM is used to categorize patients into one of four groups: normal; at risk; distressed-depressive with high Zung scores but moderate MSPQ scores; and distressed-somatic with the reversed pattern of scores. It is considered to be a good screening tool for patients at risk of a poor outcome in spinal surgery, although not all studies have been able to confirm its predictive power in all patient groups (see later). Depression has also been assessed using the Psychological General Well-Being Questionnaire. This questionnaire consists of a total of 22 questions on the following six subscales: anxiety, depression, well-being, self-control, health, and vitality.
Patients rate each question on a six-point Likert scale. The three-item measure of depression asks whether participants (a) felt depressed, (b) felt downhearted and blue, and (c) felt sad, discouraged, and hopeless during the past month.
Several studies have explored the validity and the predictive power of fear-avoidance beliefs (about activities of daily living and work) in relation to outcome. There is increasing evidence in the literature that the Fear-Avoidance Beliefs Questionnaire (FABQ) is a valid and reliable tool that can be helpful in predicting treatment outcome in LBP and in surgically treated patients. Pfingsten et al. and Staerkle et al. both reported the validation of a German version of the FABQ, and Chaory et al. have validated it in French. Further research is required to examine whether a splitting of the FABQ work scale into “work as a cause” and “work prognosis” factors, as proposed by Pfingsten et al., is better in predicting surgical outcome than the single scale proposed by the original authors and supported in the German version of Staerkle et al.
To some extent, the diversity of available instruments and the versions within instruments represents a considerable obstacle in attempting to compare results across studies. As such, some sort of standardization is highly recommended. This is the motivation behind current efforts to standardize both extended batteries of questionnaires (e.g. the Deutsche Schmerzfragebogen, DSF [25, 68, 90]), and the very brief “core” or screening sets for use in the clinical routine (e.g. [61, 95]).
Predictors of outcome of spinal surgery
The present review focuses specifically on predictors of the outcome of surgical treatment for spinal disorders, although there is some suggestion that there may be a certain amount of overlap with the factors that determine the prognosis of non-operative treatment of LBP, especially in relation to the chronic stage and for some of the psychological attributes such as depression and fear-avoidance beliefs and certain medical factors such as the number of previous treatments undertaken.
In examining the literature on predictors of surgical outcome, we will focus mainly on studies carried out in the last 10–15 years (i.e. since 1990). Imaging modalities and operative techniques have advanced so much since the 1980s that negative explorations are now quite rare and the clinical presentation is more straightforward; hence, studies using diagnostic techniques and/or operative methods that are no longer state-of-the-art may identify predictors that are of little relevance today.
The most commonly examined predictors of surgical outcome can be loosely categorized into the following groups: biological/demographic, work-related, psychosocial, and medical (Table 1). In addition to these, and increasing in popularity as a relatively unexplored avenue for explaining some of the variance in outcomes, is the notion of the “patients’ expectations of surgery” [57, 65].
As alluded to in the earlier sections, one must bear in mind a number of factors when examining the agreement between studies for the variables identified as “predictors.” Firstly, predictors can only be found among the variables that are examined in the first place; and secondly, the failure to evaluate potentially important predictor variables in some studies can lead to over-estimation of the importance of the variables that are examined or to emphasis being placed on different but closely related variables carrying similar information. Further, in studies of very small groups of patients, the sample sizes for different outcome groups may be too small (especially in relation to the size of the “poor outcome” group, which tends to contain just a minority of patients) to sufficiently power the study and allow it to identify potentially relevant, real differences.
In most of the following studies only the statistical significance of the associations between outcome and risk factors was given, and, only rarely, the extent of the variance in outcome accounted for. The implications of this will be discussed again later.
Biological/demographic and health behaviour/lifestyle variables
Numerous retrospective studies have shown a negative association between the patient’s age at surgery and outcome, although most of the prospective studies have shown no influence of age (see Table 1) or have even found improved outcomes in older patients (cervical spine). In part, the role of age may be explained by the outcome measure being investigated: where work issues are concerned, then it is more likely that older age at operation will result in less positive results with regard to return-to-work. It is also unclear in many studies (especially when bivariate analyses were used) whether the duration of symptoms was controlled for. The latter is one of the strongest predictors of a poor outcome (see later), and especially in chronic disorders tends to show a correlation with age. Hence, age may be acting in part as a marker for symptom duration, where the latter has not been simultaneously accounted for.
Gender is also highlighted by many retrospective studies as a potential predictor of outcome, although most prospective studies have failed to find such an association. Those that do tend to show that men have a better outcome than women (see Table 1). An association with “maleness” is difficult to explain: postulated mechanisms include the notion of gender acting as an indirect marker for various (negative) psychological factors, biological differences in the healing potential of men and women, or (with respect to fusion) gender-related differences in the mechanical loading/muscle compressive forces promoting new bone growth.
Health behavioural/lifestyle factors
Few studies have examined “health behavioural” or “lifestyle” factors as predictors of outcome, although it is conceivable that these could be important in determining an individual’s response to major surgery. Intuitively, one might imagine that a higher level of pre-surgical physical fitness would allow a more rapid return to normal functioning after surgery. To the authors’ knowledge, fitness or the participation in regular exercise has been examined in only one retrospective study and was not found to be associated with outcome after percutaneous lumbar discectomy. Results from the authors’ own studies suggest that the regular participation in exercise/physical activity for many years prior to the operation (but not necessarily exercise habits at the time of the intervention), i.e. exercise as a “lifetime habit”, is significantly associated with a positive outcome after decompression surgery (unpublished observations).
Smoking is a relatively frequently examined predictor factor, especially in relation to the outcome after spinal fusion. In some studies it has been shown to have a negative impact on outcome, whereas in many others it has had no effect (Table 1). It has been suggested that tobacco use must be examined as a dose–response relationship in order to reveal associations that can be obscured by expressing it as a dichotomous variable (yes/no to a smoking habit). While the inhibitory effects of nicotine on fusion itself have been established [2, 31], it is also possible that smoking may simply reflect other factors, such as negative health behaviour (low physical activity levels, alcohol use), lower education/social level, or a manual job, and thereby act as a marker for these in determining the outcome. Interestingly, even in a subgroup of patients with no signs of pseudoarthrosis, smoking still predicted clinical outcome and return-to-work in patients undergoing fusion.
Work-related predictors include such variables as worker’s compensation, disability pension, work-status before surgery, duration of sick leave, and heaviness of the job.
The majority of studies that have examined the effect on outcome of the involvement in disability pension claims or worker’s compensation issues have confirmed that these have a negative impact on the result of surgery, especially in relation to return-to-work or “global outcomes” (see Table 1 and also [21, 35, 36, 54, 55, 92]). In one large high quality study, however, workers’ compensation showed no association with the outcome in multivariate models. The authors suggested that the strength of such an association may in part depend on the social insurance system in the given country. One large retrospective study showed that while compensation status was predictive of the 2-year outcome after fusion, it no longer had any influence (in terms of back-specific function scores) after 10 years.
Although rarely examined in prospective studies, retrospective studies have shown that the involvement of a lawyer in compensation claims has a consistent negative predictive value for various outcomes after spinal fusion [21, 22, 54]. Cynics may interpret this finding as evidence for the premeditated instruction to magnify symptoms for the purposes of secondary gain; some studies have even shown that lawyers may advise their clients how to respond to psychological assessments in order to better their chances of success with their disability claims. Others have suggested that litigious patients experience an increased somatic sensitivity to pain as a consequence of financial incentives and social–contextual variables.
Long pre-operative sick leave is a consistent negative predictor of return-to-work [40, 71, 89] and of global outcome, overall satisfaction or back-specific function [48, 82]. This highlights the importance of providing timely intervention once a clear-cut diagnosis that can be remedied by surgery has been made (see later).
Job heaviness (physically strenuous work) has been examined as an independent predictor in only a few studies, and the results appear to be somewhat conflicting: in one retrospective study on herniated disc patients, heavy manual work was a negative predictor of overall outcome and post-operative work-status 10 years after lumbar discectomy . A prospective study of patients with chronic degenerative LBP revealed a similarly negative relationship in relation to outcome measured with a combined global score , whereas a further study on fusion patients  and two others on discectomy patients showed no influence of heavy work on the outcome [16, 98]. Intuitively, it may be expected that, while work-status may not necessarily govern the degree of pain and disability reported after surgery, it may well influence an individual’s chances of returning to a job requiring the performance of heavy manual duties.
Psychological and sociological factors
Psychological factors are among the most commonly investigated predictors of surgical outcome, although their overall importance remains equivocal and may depend on the spinal disorder in question .
Some of the early studies carried out in the 1980s showed slight to moderate associations between certain scales on the MMPI (most commonly Hs, Hy, depression, and admission of symptoms scales) and outcome after disc surgery/fusion. These studies encouraged the development of scoring systems, which included MMPI measures, to assist in predicting surgical outcome from various baseline indicators [8, 83, 91]. In view of the various psychometric and practical problems associated with use of the MMPI in pain patients , new or modified methods of assessing psychological characteristics have been introduced, which focus primarily on the measurement of depression, anxiety, and/or heightened somatic awareness. More recently, other psychological characteristics have become of interest as potential predictor factors, such as coping strategies [8, 33], fear-avoidance beliefs (about work and physical activity) , and various workplace psychological factors (stress, satisfaction, “resigned” attitude, etc.) . Overall, these have led to mixed results, in terms of their ability to reliably predict the outcome.
Using pain drawings and inappropriate signs, Greenough et al. [35, 36] reported in two retrospective studies that “psychological distress” was predictive of a poor outcome after anterior fusion. Van Susante et al.  used a “psychogenic back pain score” to examine prospectively the outcome after lumbosacral fusion of three types of patient group: organic, uncertain, and psychogenic. It was shown that the “organic” group had a much better outcome in terms of pain, disability, and medication use than did the “psychogenic” group. In patients undergoing discectomy, depression was found to be a significant predictor of global outcome [53, 80] and return-to-work . A recent prospective study by Trief et al.  investigated the influence of baseline depression, state anxiety, somatic anxiety, and hostility on the outcome after lumbar spine surgery [mostly fusion (68%) and decompressive laminectomy (30%)]: using multivariate analyses, the DRAM, which classifies patients as either “normal,” “at-risk,” or “distressed,” was found to be a significant predictor of outcome in terms of work-status, change in back pain and leg pain, and the “daily activities” and “work–leisure activities” scales of the Dallas Pain Index.
Junge et al.  found that certain aspects of pain behaviour (search for social support) were significantly associated with a poor global outcome in patients undergoing disc surgery; although depression did not show a significant association, there was a tendency for higher baseline values in patients with a poor outcome and depression was therefore included in the pre-screening tool developed by the group. In prospectively studying patients undergoing discectomy  or fusion , two studies failed to reproduce the findings of Trief et al. , in that the DRAM scores were found to have no predictive power in relation to back-function (Oswestry Disability Index). Similarly, neither depression  nor pain drawings  were able to predict outcome (any domain) after fusion for chronic LBP (Table 1). Greenough et al.  were also unable to reproduce their earlier findings  in a later retrospective study on patients undergoing posterolateral surgery. Notably, in all these studies, psychological disturbance was improved after surgery in patients with a good outcome. No association could be found between depression and outcome in studies on spinal stenosis patients undergoing decompression [51, 64].
In a large group of patients followed up after spinal surgery (for mixed diagnoses), Staerkle et al.  showed that fear-avoidance beliefs at baseline were a significant predictor of work-loss at 6 months. They uniquely explained 12% of the variance in the outcome, which was approximately one-third of the total variance explained by the whole predictive model (socio-demographic variables, pain variables, and fear-avoidance beliefs together explained 37% of the variance), representing a moderate effect size.
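The arithmetic behind the "approximately one-third" statement can be checked with a short sketch. The function and variable names below are hypothetical and chosen only for illustration; the two input figures (12% unique variance, 37% total explained variance) are those quoted in the text, and nothing here reproduces the original authors' statistical analysis.

```python
def unique_share(unique_r2: float, total_r2: float) -> float:
    """Fraction of a model's explained variance that one predictor explains uniquely.

    Both arguments are proportions of outcome variance (0..1).
    """
    if not 0 < total_r2 <= 1:
        raise ValueError("total_r2 must lie in (0, 1]")
    if not 0 <= unique_r2 <= total_r2:
        raise ValueError("unique_r2 must lie in [0, total_r2]")
    return unique_r2 / total_r2


# Figures quoted in the text (hypothetical variable names).
fear_avoidance_unique = 0.12   # variance uniquely explained by fear-avoidance beliefs
full_model = 0.37              # variance explained by the whole predictive model

share = unique_share(fear_avoidance_unique, full_model)
unexplained = 1.0 - full_model  # variance in work-loss left unexplained by the model

print(f"Share of explained variance: {share:.0%}")
print(f"Unexplained variance: {unexplained:.0%}")
```

On these figures, most of the variance in work-loss remains unexplained even by the full model, which is consistent with the article's broader point that known predictors account for a relatively low proportion of the variance in outcome.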
It has been suggested that the poor results of surgery reported in psychologically disturbed patients may reflect intervention in patients who did not have surgically remediable pathology , and this appears to have been verified by the many recent studies of Carragee (see ). This group has shown that patients with acute and subacute sciatica in association with a clearly identifiable, severe disc herniation have a very high chance of dramatic and lasting improvement with surgery and that standard psychometric tests in these patients fail to predict the outcome. Even severe emotional distress in patients who underwent early, appropriate surgical intervention did not correlate with adverse outcomes, although the same psychometric profile in patients with chronic sciatic pain and disability did predict worse outcomes compared with less emotionally distressed patients with the same level of chronicity. It was concluded that with prolonged pain and emotional distress, adverse and possibly self-perpetuating psychological and social changes may significantly decrease the impact of disc surgery .
All in all, and in view of the conflicting evidence, it would not appear prudent to recommend that patients with a surgically remediable pathology be denied surgery simply on the basis of their pre-operative psychological status. Nonetheless, it may be a useful strategy to identify patients with long-lasting symptoms and a high level of distress who might benefit from an additional psychological treatment before and/or accompanying surgical treatment; decreased levels of distress may then increase the impact of surgical treatment.
Low social functioning (as measured with quality of life instruments) was identified as a significant negative predictor of re-operation rate in a retrospective study on fusion patients , and of global outcome, pain, and quality of life in a mixed group of spine-surgery patients .
In patients undergoing lumbar disc surgery, job level was found to be a significant predictor of combined global outcome . An interesting study on military personnel undergoing cervical disc surgery showed that both position (rank) and duration of the individual’s military career (but not economic forms of secondary gain per se) were significant predictors of return to active duty . In some studies, a low education level and/or low income have been shown to predict a negative surgical outcome in terms of either the total costs associated with workers’ compensation , return-to-work , or global outcome/function [48, 56, 98]. It has been suggested that because individuals with a better education, a higher income, and at a higher level on the job-ladder tend to have greater responsibilities, personal investment may override the discomfort caused by any residual post-operative symptoms and encourage a return-to-work .
Occupational mental stress and job-related resignation have been shown to be negatively associated with return-to-work and post-operative pain relief/disability, respectively . Job-related resignation reflects a “resigned” attitude to work-related troubles, job continuation despite dissatisfaction, the notion that the current situation must be accepted because things might otherwise be worse, and that expectations are limited as an employee . The significance of the impact of job satisfaction on return-to-work is well documented in the back-pain literature [20, 95].
Social support from the spouse , search for social support (as a pain behaviour) , and family reinforcement of pain  have all been associated with a more negative outcome after surgery. It is suggested that this kind of “support”—in which relatives take over the patient’s jobs or responsibilities, encourage rest, and provide more attention when the pain appears greatest —serves to reinforce the illness status and thereby encourages the adoption of “passive” behaviour [27, 80].
Diagnosis-specific clinical factors
Few studies have been able to identify clinical variables that are predictive of outcome after spinal surgery. Hagg et al.  reported no significant predictive effect on the outcome after fusion of various baseline pain-provocation (flexion/extension), trunk flexibility, and neurological tests, with the exception of abnormal motor function, which was associated with a poorer outcome. One study has shown that pre-operative sensory deficit is associated with a good outcome (in terms of back-specific function), but the relationship was only evident 28 months after surgery and not at the 3 or 12 month follow-ups , suggesting it may have been a spurious finding. In the same study, the presence of a positive SLR test at <30° was associated with an unfavourable outcome at each time-point, and significantly so at 12 months. In contrast, Kohlboeck et al.  showed that, pre-operatively, the Lasegue sign was a good indicator of a successful outcome. Junge et al. considered the deficiency of reflexes to be predictive of a better outcome in their pre-screening instrument developed for disc surgery patients .
The recent widespread use of the MRI scan in the assessment of spinal disorders has considerably improved the ability of surgeons to understand spinal pathology, especially in relation to disc herniation . In two studies, Carragee and Kim showed that in patients with sciatica, the anteroposterior length of the herniated disc material and the ratio of disc area to canal area seen on MRI  as well as the degree of anular competence and type of herniation seen intraoperatively  had a stronger association with surgical outcome (pain, function, medication use, and satisfaction) than did any clinical or demographic variables. Other studies have shown that patients with an uncontained herniated disc had a better functional outcome 1 year after surgery than did those with a contained herniation . Using multiple regression analysis of a range of medical variables (including MRI findings) and psychosocial variables, Schade et al.  reported that MRI-identified nerve-root compromise and the extent of herniation were the strongest independent predictors of global surgical outcome 2 years after surgery in patients undergoing lumbar discectomy. In contrast, return-to-work could not be predicted by any clinical or imaging variables and was instead determined by various psychosocial factors.
Sun et al.  retrospectively compared the outcome after adjacent two-level lumbar discectomy in patients with radicular pain attributable to nerve-root impingement either with or without concomitant osseous degenerative changes at the same level. The proportion of patients with an excellent/good global outcome (MacNab classification) was significantly higher in the group with only a herniated disc (86%) compared with the group in which osseous changes were also present (57%).
One large study showed that low disc height (less than 50%) was one of the most significant positive predictors of outcome (back-specific function) in patients with degenerative chronic LBP undergoing spinal fusion . In contrast, Peolsson et al. [75, 76] found that disc space narrowing was without any prognostic significance for functional outcome. In patients undergoing lumbar fusion, a surgical diagnostic severity score, based on pre-surgical imaging, had no predictive power for either disability status, global outcome, or physical or social functioning subscales of the SF20 .
In the study of Peolsson et al. [75, 76], pre-operative segmental kyphosis at the level to be operated was the strongest predictor of pain and disability 2 years after cervical decompression with fusion, although the proportion of the explained variance was low.
A consistent predictor of poor outcome for various different diagnoses and types of outcome is the duration of symptoms prior to the operation (Table 1). In studies that failed to identify this association, closely related variables (e.g. long-term sick leave, work-disability claim) were often chosen for inclusion in the multivariate model, especially in predicting return-to-work [40, 89].
Prior operations on the spine have been identified as a risk factor for poor outcome in a couple of studies [50, 64], although, interestingly, satisfaction with repeat operations is purportedly higher when there is a history of good results from previous operations and no epidural scarring requiring surgical lysis .
The number of affected (or operated) levels is often assumed to be negatively associated with the outcome, although only a few (mostly retrospective) studies have actually demonstrated such a relationship with regard to disability status after fusion [21, 29, 50], the long-term clinical outcome after laminectomy , or the risk of requiring subsequent fusion after discectomy . This relationship is believed by some to be related to resulting post-operative spinal instability . A number of other studies on various diagnostic groups have been unable to confirm this association at all [1, 39, 76, 84]. Again, identifying the correct surgically treatable lesion(s) may be of greater importance; if this is not done, then increasingly poor results can obviously be expected as more levels are wrongly operated on.
Many studies have shown that, especially in older populations of patients, poor general health in terms of other joint problems or systemic diseases (comorbidity) appears to have a significant negative influence on the outcome of spinal surgery [14, 48, 51]. However, some studies have failed to find any clear association [40, 84]. Perhaps the poor patient-rated outcomes in comorbid patients reflect, in part, a cross-contamination of the outcome instruments (especially those assessing function ), leading to an over-estimation of the true back-specific disability. Either way, it is important to make patients with comorbidity aware that the operation is being carried out for the specific spinal lesion identified and that it will not serve as a panacea for all their ongoing medical problems.
All the factors assessed so far for their role in determining the outcome of surgery are somewhat “extrinsic” to the surgical procedure itself. The assumption tends to be that the surgeon himself is infallible and that the only reason for failure relates to inherent characteristics of the patient himself. Certainly, surgical skill is an aspect that is difficult to examine within the context of clinical trials, but we must concede that a certain proportion of failures are attributable not to the patient but to failure of the technique used, the hardware, and surgical complications. Furthermore, it is incumbent upon the surgeon to perform an accurate diagnostic work-up and to critically assess the indications for surgery; any shortcomings in this respect will naturally increase the potential for an unsatisfactory result. A recent study, in which the rates of surgery for herniated disc and spinal stenosis were compared across different spine service areas in the State of Maine (USA), found that the rates varied up to fourfold among the areas examined . Interestingly, the outcomes for patients in the area with the lowest surgery-rate were significantly superior to those in the high surgery-rate areas (79% vs 60% with marked/complete pain relief, respectively) . The patients in the higher-rate areas generally had less severe symptoms at baseline than did those in the lowest-rate area. The authors concluded that the variability may have been related to differences in physicians’ preferences or thresholds for severity with regard to recommending an operation and their criteria for the selection of patients. Waddell et al. have argued that distress may increase the pressure for surgery, and that inappropriate symptoms and signs may obscure the physical assessment, leading to a mistaken diagnosis of a surgically treatable lesion . 
In this instance, psychological factors may affect the outcome of surgery indirectly if inappropriate illness behaviour leads to inappropriate surgery .
As far as technical success is concerned, one of the most commonly assessed surgical outcomes is the achievement of arthrodesis after fusion surgery, although it has long been a matter of debate whether the presence of pseudoarthrosis has any influence on the subsequent patient-orientated outcome. Some studies have shown that pain relief in particular is greater when solid fusion is achieved [13, 76, 97], although it explains only a small proportion of the variance in pain outcome (4% ). In one recent study of interbody cage lumbar fusion, although 84% of patients achieved solid fusion, only approximately 40–50% of patients demonstrated a successful outcome in terms of pain, quality of life, global outcome, and work-disability status . Other retrospective studies have indicated that the presence of radiological arthrodesis has no influence on either back-function [37, 74] or work-disability status  after fusion.
It is extremely difficult to identify unequivocal predictor factors that can be used to accurately predict the outcome of surgery. Many predictor factors are contentious, or are at least very specific to the patient profile, the diagnosis, and the surgical technique under investigation (and perhaps even to the institution in which the investigations were carried out). Moreover, the length and type of follow-up appear to play such a decisive role, as does the scientific quality of the study in which the predictor factors are investigated, that it becomes almost impossible to provide a simple recipe for predicting the outcome of surgery with any certainty on an individual basis. Some predictor models or screening tools have been developed [8, 40, 48, 83], but few  have been investigated in a different patient group or under conditions that differ from those in which they were originally developed, limiting their applicability for general use. Moreover, the proportion of variance in outcome explained by even a combination of the strongest predictors is often relatively low, suggesting that we have a long way to go before being able to rest easily having refused someone surgery on the basis of unfavourable baseline characteristics.
Generally consistent predictors of poor outcome (see also Table 1)
Long duration of symptoms
Severity of pathology on MRI (for disc herniation only)
Comorbidity/other joint problems/poor general health
Psychological distress (e.g. depression, anxiety), especially in patients with chronic pain
Family reinforcement of pain, especially in patients with chronic pain
Smoking (especially for fusion)
Long-term sick-leave/work disability
These modifications, per se, might ultimately result in a greater satisfaction with surgery—if satisfaction is, indeed, determined by having had one’s expectations fulfilled. Most spinal surgery is carried out for disorders that are not life-threatening, and while time may be of the essence for disorders with a very clear-cut diagnosis [69, 71, 79], there are also many that do not require immediate surgical treatment. This is not to suggest that a simple wait and see policy be adopted without further intervention; instead, active measures to minimize risk factors should be taken in order to best prepare the patient for a potential future surgical procedure, and evidence-based conservative treatments should be persevered with in the meantime. Recent studies suggest that many of the latter are as good as surgery for some of the less well-defined indications (e.g. chronic LBP due to degenerative changes) commonly dealt with by spinal fusion [12, 28] and these treatments may be worth considering as an alternative in patients for whom the outcome of surgery is uncertain.
- 9. Block AR, Gatchel RJ, Deardorff WW, Guyer RD (2003) The psychology of spine surgery. American Psychological Association, Washington
- 11. Bombardier C, Hayden J, Beaton DE (2001) Minimal clinically important difference. Low back pain: outcome measures. Pain 28:431–438
- 12. Brox JI, Sorensen R, Friis A, Nygaard O, Indahl A, Keller A, Ingebrigtsen T, Eriksen HR, Holm I, Koller AK, Riise R, Reikeras O (2003) Randomized clinical trial of lumbar instrumented fusion and cognitive intervention and exercises in patients with chronic low back pain and disc degeneration. Spine 28:1913–1921
- 18. COST B13 Action (2004) Guidelines for the management of chronic low back pain. www.backpaineurope.org
- 19. Costa PTJ, McCrae RR (1985) The NEO Personality Inventory manual. Psychological Assessment Resources, Odessa
- 24. Dupuy HJ (1984) The Psychological General Well-Being (PGWB) Index. In: Assessment of quality of life in clinical trials of cardiovascular therapies. Le Jacq, New York, pp 170–183
- 25. Dworkin RH, Turk DC, Farrar JT, Haythornthwaite JA, Jensen MP, Katz NP, Kerns RD, Stucki G, Allen RR, Bellamy N, Carr DB, Chandler J, Cowan P, Dionne R, Galer BS, Hertz S, Jadad AR, Kramer LD, Manning DC, Martin S, McCormick CG, McDermott MP, McGrath P, Quessy S, Rappaport BA, Robbins W, Robinson JP, Rothman M, Royal MA, Simon L, Stauffer JW, Stein W, Tollett J, Wernicke J, Witter J (2005) Core outcome measures for chronic pain clinical trials: IMMPACT recommendations. Pain 113:9–19
- 28. Fairbank JHF, Frost H, Wilson-MacDonald J, Yu LM, Barker K, Collins R (2005) Spine Stabilisation Trial Group: Randomised controlled trial to compare surgical stabilisation of the lumbar spine with an intensive rehabilitation programme for patients with chronic low back pain. The MRC Spine Stabilisation trial. BMJ 330(7502):1233
- 32. Graham JR (1990) The MMPI-2: assessing personality and psychopathology. Oxford University Press, New York
- 35. Greenough CG, Taylor LJ, Fraser RD (1994a) Anterior lumbar fusion. A comparison of noncompensation patients with compensation patients. Clin Orthop 30–37
- 40. Hagg O, Fritzell P, Ekselius L, Nordwall A (2003a) Predictors of outcome in fusion surgery for chronic low back pain. A report from the Swedish Lumbar Spine Study. Eur Spine J 12:22–33
- 41. Hagg O, Fritzell P, Hedlund R, Moller H, Ekselius L, Nordwall A (2003b) Pain-drawing does not predict the outcome of fusion surgery for chronic low-back pain: a report from the Swedish Lumbar Spine Study. Eur Spine J 12:2–11
- 42. Hagg O, Fritzell P, Nordwall A, Group SLSS (2003c) The clinical importance of changes in outcome scores after treatment for chronic low back pain. Eur Spine J 12:12–20
- 43. Hathaway SR, McKinley JC (1951) The Minnesota Personality Inventory manual revised. The Psychological Corporation, New York
- 59. Main CJ, Spanswick CC (1995) Personality assessment and the Minnesota Multiphasic Personality Inventory 50 years on: do we still need our security blanket? Pain Forum 4:90–96
- 61. Mannion AF, Elfering A, Staerkle R, Junge A, Grob D, Semmer NK, Jacobshagen N, Dvorak J, Boos N (2005) Outcome assessment in low back pain: how low can you go? Eur Spine J (in press)
- 62. Mannion AF, Junge A, Grob D, Dvorak J, Fairbank JCT (2005) Development of a German version of the Oswestry Low Back Index. Part 2: Sensitivity to change after spinal surgery. Eur Spine J (in press)
- 71. Nygaard OP, Kloster R, Solberg T (2000) Duration of leg pain as a predictor of outcome after surgery for lumbar disc herniation: a prospective cohort study with 1-year follow up. J Neurosurg Spine 92:131–134
- 72. Ostendorf F (1990) Sprache und Persönlichkeitsstruktur: Zur Validität des Fünf-Faktoren-Modells der Persönlichkeit [Language and personality structure: the validity of the five-factor model of personality]. Roderer, Regensburg
- 73. Ostendorf F, Angleitner A (1992) On the generality and comprehensiveness of the five-factor model of personality. Evidence for five robust factors in questionnaire data. In: Modern personality psychology. Critical reviews and new directions. Harvester Wheatsheaf, New York, pp 73–109
- 81. Schallberger U, Venetz M (1999) Kurzversionen des MRS-Inventars von Ostendorf (1990) zur Erfassung der fünf “grossen” Persönlichkeitsfaktoren [Brief versions of Ostendorf’s MRS inventory for the assessment of the Big-Five personality factors]. Universität Zürich, Zürich: Berichte aus der Abteilung Angewandte Psychologie 30:1–51
- 86. Stärkle R, Mannion AF, Junge A, Elfering A, Grob D, Dvorak J, Boos N (2002) The influence of baseline psychological factors on outcome after spine surgery. SIROT, San Diego
- 90. Turk DC, Dworkin RH, Allen RR, Bellamy N, Brandenburg N, Carr DB, Cleeland C, Dionne R, Farrar JT, Galer BS, Hewitt DJ, Jadad AR, Katz NP, Kramer LD, Manning DC, McCormick CG, McDermott MP, McGrath P, Quessy S, Rappaport BA, Robinson JP, Royal MA, Simon L, Stauffer JW, Stein W, Tollett JJW (2003) Core outcome domains for chronic pain clinical trials: IMMPACT recommendations. Pain 106:337–345
- 95. Waddell G, Burton AK, Main CJ (2003) Screening to identify people at risk of long-term incapacity for work. A conceptual and scientific review. Royal Society of Medicine Press, London