There is a growing body of evidence that psychosocial variables have a significant ability to predict the outcome of medical treatment. In particular, there is considerable evidence that psychosocial variables can affect the outcome of invasive procedures such as spinal surgeries (Boersma & Linton, 2005; DeBerard, Masters, Colledge, & Holmes, 2003; den Boer, Oostendorp, Beems, Munneke, & Evers, 2006; den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006; Gatchel & Mayer, 2008; Gatchel, Mayer, & Eddington, 2006; Hagg, Fritzell, Ekselius, & Nordwall, 2003; LaCaille, DeBerard, Masters, Colledge, & Bacon, 2005) and spinal cord stimulation (SCS) (Burchiel et al., 1996; Giordano & Lofland, 2005; Heckler et al., 2007), especially when the procedure is performed to reduce pain (Gatchel, 2001; Gatchel & Mayer, 2008). The relationship between psychosocial variables and medical outcomes is complex, however, and numerous psychosocial predictors have been identified (Beltrutti et al., 2004; Block, Ohnmeiss, Guyer, Rashbaum, & Hochschuler, 2001; Doleys, Klapow, & Hammer, 1997; Gatchel, 2001; Williams, 1996). Overall there is strong evidence that a collaborative biopsychosocial model is superior to the traditional biomedical model of patient care (Gatchel, Peng, Peters, Fuchs, & Turk, 2007).

A recent extensive review of the literature concluded that psychometric tests are roughly equivalent to medical tests in their ability to diagnose and predict outcomes (Meyer et al., 2001), and are sometimes superior. For example, a recent study found that psychometric assessment was better than either MRI or discography in predicting future back pain disability (Carragee, Alamin, Miller, & Carragee, 2005). Similarly, research sponsored by the World Health Organization found psychopathology to be a stronger contributor to disability than disease severity (Ormel et al., 1994). In another study, psychosocial variables predicted delayed recovery correctly 91% of the time, without using any medical diagnostic information (Gatchel, Polatin, & Mayer, 1995). Psychosocial variables have been found to be especially important in the assessment of chronic pain and pain-related disability (den Boer, Oostendorp, Beems, Munneke, & Evers, 2006; Schultz et al., 2004).

If the focus is narrowed to spinal surgeries and interventional procedures, poor outcomes have been found to be associated with a variety of psychosocial variables. Numerous studies have concluded that psychosocial factors were successful in predicting the results of lumbar surgery (Block et al., 2001; den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006; Epker & Block, 2001; Gatchel, 2001; Schofferman, Anderson, Hines, Smith, & White, 1992), with one study predicting lumbar surgery outcome correctly 82% of the time using psychosocial predictors (Block et al., 2001). Similarly, a recent review of the literature found that psychological factors were able to correctly predict the outcome of SCS over 80% of the time (Giordano & Lofland, 2005). It is not surprising that in a survey conducted in 1996, some type of psychological screening was performed in about 70% of clinics involved in implantable devices (Nelson, Kennington, Novy, & Squitieri, 1996). A similar survey in 2005 found that 100% of clinics used some type of psychological assessment for patients being considered for implantable devices for pain (Giordano et al., 2005), perhaps because psychological evaluation prior to SCS is now required by multiple evidence-based medical guidelines (American College of Occupational, Environmental Medicine, 2008; Colorado Division of Worker Compensation: Chronic Pain Task Force, 2007; Work Loss Data Institute, 2008).

A systematic review of the literature found that the variables with the strongest support as predictors of poor surgical outcome are depression, anxiety, somatization, pain, job dissatisfaction, functioning, days away from work, low education, and passive coping (den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006). Additionally, a number of studies have suggested that litigation (Bernard, 1993; DeBerard, Masters, Colledge, Schleusener, & Schlegel, 2001; Epker & Block, 2001; Junge, Dvorak, & Ahrens, 1995; LaCaille et al., 2005; Taylor et al., 2000) and insurance compensation or worker’s compensation (Bernard, 1993; Deyo, Mirza, Heagerty, Turner, & Martin, 2005; Epker & Block, 2001; Glassman et al., 1998; Greenough, Taylor, & Fraser, 1994; Groth-Marnat & Fletcher, 2000; Klekamp, McCarty, & Spengler, 1998; Mannion & Elfering, 2006; Taylor et al., 2000) are also associated with poor surgical outcome. Other identified risk factors for poor surgical outcome include anger (Dvorak, Valach, Fuhrimann, & Heim, 1988; Herron, Turner, & Weiner, 1988), neuroticism (Hagg, Fritzell, Ekselius et al., 2003), psychological distress (Andersen, Christensen, & Bunger, 2006; Derby et al., 2005; Deyo et al., 2005; Graver, Haaland, Magnaes, & Loeb, 1999; Van Susante, Van de Schaaf, & Pavlov, 1998), psychological trauma in childhood (Schofferman, Anderson, Hines, Smith, & Keane, 1993; Schofferman et al., 1992), chemical dependency (Spengler, Freeman, Westbrook, & Miller, 1980; Uomoto, Turner, & Herron, 1988), spousal reinforcement of pain behaviors (Block et al., 2001), no support from spouse (Schade, Semmer, Main, Hora, & Boos, 1999), self-perception of pre-surgical good health (Katz et al., 1999), fear of movement or reinjury (den Boer, Oostendorp, Beems, Munneke, & Evers, 2006), negative outcome expectancy (den Boer, Oostendorp, Beems, Munneke, & Evers, 2006), lack of optimism (Cashion & Lynch, 1979), job stress (Schade et al., 1999), maladaptive beliefs about pain (Burchiel et al., 1995; den Boer, Oostendorp, Beems, Munneke, & Evers, 2006; Samwel, Slappendel, Crul, & Voerman, 2000), history of maladjustment (Block et al., 2001), and lack of English proficiency (Doxey, Dzioba, Mitson, & Lacroix, 1988; Dzioba & Doxey, 1984).

While a number of psychosocial variables appear to be clearly supported by the literature, some controversy remains about a number of other variables. For example, while some studies have found tobacco use to be a predictor of poor outcome from fusion surgery (Andersen et al., 2001; LaCaille et al., 2005; Manniche et al., 1994), others have not (Christensen et al., 1999). Similarly, while some studies have found pain drawings to be predictive of a poor outcome from spinal surgery (Dzioba & Doxey, 1984; Takata & Hirotani, 1995), other studies have not (Hagg, Fritzell, Hedlund et al., 2003). An area where research is lacking is to what extent the psychosocial risk factors vary across medical procedures such as SCS, discectomy and fusion.

Overall, the evidence strongly supports adopting a biopsychosocial approach to the evaluation of candidates for invasive procedures for spinal pain. Over the last 20 years, several protocols for the psychological selection of candidates for elective invasive procedures for pain have been proposed. Please refer to Tables 1 and 2 for an overview of recommended exclusionary and cautionary criteria by various authors.

Table 1 Summary of exclusionary biopsychosocial risk factors for treatment
Table 2 Summary of cautionary biopsychosocial risk factors for treatment

den Boer’s Criteria

The predictive value of biopsychosocial risk factors with regard to the outcome after lumbar disc surgery was examined in a systematic review by den Boer and colleagues (den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006). This study reviewed all articles examining biopsychosocial risks for poor lumbar surgery outcome, and selected 11 that met strict scientific criteria. The study identified nine variables that were consistently associated with a poor surgical outcome: pain, functioning, depression, anxiety, somatization, passive coping, job dissatisfaction, low education, and longer time off work (Table 2).

These risk factors are especially important, as these are the ones for which, at this point in time, there is the most scientific evidence. It should be recognized that while den Boer’s approach is a rigorously empirical one, there are noteworthy gaps in the literature. At the time of this writing, there are no SCS or spinal surgery studies known to us that have investigated how the outcome of those procedures may be influenced by the presence of specific types of severe psychopathology. For example, we can find no research about how the outcome of invasive procedures for pain or injury might be influenced by being imminently suicidal or homicidal, paranoia, brain injury, mania, borderline personality, methamphetamine addiction, dissociative disorders, posttraumatic stress disorder, obsessive-compulsive disorder, and many other conditions. Since den Boer’s approach is based on the literature, these risk factors are not addressed.

Unfortunately, den Boer’s findings have only limited clinical application. While the variables identified by den Boer appear to be important ones to assess, the studies reviewed utilized a variety of measures. Consequently, no recommendations about measures were made, no instructions were given for generating an overall risk score, and no treatment recommendations were made. Thus, while den Boer and colleagues have published the most empirically based review to date, this approach is not yet at a point where it has clear clinical implications.

Block’s Model of Presurgical Psychological Assessment

One of the most influential methods of presurgical biopsychosocial assessment was developed by Block and colleagues (Block, 1996; Block, Gatchel, Deardorff, & Guyer, 2003; Block et al., 2001). Although Block’s method of presurgical psychological assessment is based on literature review, unlike den Boer, it did not employ a systematic method of literature review. Block and colleagues identified three groups of risk factors, which were psychosocial risk factors, medical risk factors (Block, 1996; Block et al., 2001) and more recently “adverse clinical features” (Block et al., 2003). Unlike den Boer’s criteria, this approach offers a method of assessing risk by tallying the number of risk factors that are present. Block assigns each of the identified psychosocial and medical risk factors a point value based on the judged strength of research findings, and the risk ratings from each of these areas are employed in a clinical algorithm (Block et al., 2003). As with the approach of den Boer and colleagues (den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006), there are no single exclusionary risk factors that in and of themselves are so extreme as to contraindicate an elective surgical procedure. An overview of these criteria is listed in Table 2.

Block’s method (Block et al., 2003) has numerous strengths. First of all, it is based on assessing factors that research studies have found to affect the outcome of spinal surgery. Second, unlike den Boer’s approach, Block’s method incorporates a scoring system. Third, the scores obtained using Block’s method can be used in a clinical decision tree. Fourth, Block and colleagues tested their approach empirically on a group of spinal surgery patients, and found it to be successful 82% of the time (Block et al., 2001). However, Block’s approach is not alone in this success. A review of research studies on psychological predictors of SCS outcome found that a variety of psychological evaluation methods enjoyed a similar success rate with SCS outcome (Giordano & Lofland, 2005), but at this time no one model of assessing patients for SCS seems to be preferred.

Block’s method has some weaknesses and shares den Boer’s empirical Achilles heel. Since there is no research on the impact of severe psychopathology on surgical outcome, many such risk factors are not specifically assessed by Block’s system. Consequently, Block’s system works best when assessing patients without severe or unusual forms of psychopathology, and when weighing the effects of numerous mild to moderate risk factors. However, patients with only a single severe disorder may still receive a positive appraisal. For example, patients exhibiting a paranoid delusion, factitious self-injury, extreme litigiousness, or blatant drug seeking would not be rated as being at risk psychologically using Block’s system, if these symptoms did not appear within the context of a number of other symptoms as well.

Second, although some of Block’s risk factors are measured by psychological questionnaires, the method does not provide clear definitions of what constitutes a positive finding for many of its criteria. For example, while some of Block’s criteria, like worker’s compensation, have clear definitions, other criteria such as “job dissatisfaction” and “abnormal pain drawing” are not clearly defined. This probably limits the inter-rater reliability of these determinations.

Third, Block’s criteria, for the most part, do not consider the degree to which a risk factor might be present. For example, anxiety is scored as being either present or not, without regard to the degree of anxiety present.

Fourth, Block’s method puts the psychologist in the role of rating medical risk factors. However, physicians are better trained to make many of these determinations, and their opinions should be sought if possible.

Fifth, Block’s approach mentions SCS only in passing (Block et al., 2003), and does not reference any of the approaches specifically developed for SCS (e.g., Beltrutti et al., 2004; Doleys & Olsen, 1997a; Kidd & North, 1996; Nelson et al., 1996; Williams, Gehrman, Ashmore, & Keefe, 2003). Instead, Block’s method focuses on general spinal surgery research, and it is unclear how this applies to other medical treatments.

Overall, Block’s method of assessing a combination of mild to moderate risk factors appears to be a useful predictor of lumbar surgery outcome for patients without severe psychopathology. This method also has the distinct advantage of having extensive documentation regarding how to apply it to clinical practice (Block et al., 2003).

Models of Presurgical Psychological Assessment for Spinal Cord Stimulation

In contrast to the empirical approaches of den Boer (den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006) and Block (Block et al., 2003), selection criteria developed specifically for SCS have generally used a clinically based approach that is more loosely based on research. These models pay much more attention to the problem of serious psychopathology, and also address risk factors specific to SCS (Beltrutti et al., 2004; Doleys & Olsen, 1997b; Nelson et al., 1996; Williams et al., 2003).

It was reported by Beltrutti and colleagues (2004) that in 1993, North and colleagues suggested that certain psychological and behavioral characteristics should exclude a patient from consideration for SCS, even if that patient was a good candidate from a medical perspective. Three years later, Kidd and North (1996) proposed more detailed psychological exclusion criteria (Table 1).

In 1996, Block published the first version of his presurgical selection criteria. Evolving on a separate but parallel path that same year, Nelson and colleagues reviewed the literature on patient selection for SCS, and proposed two tiers of psychological criteria for the selection of patients (Block, 1996; Nelson et al., 1996). The first tier involved exclusionary criteria similar to those proposed by Kidd and North (1996). This tier consisted of extreme psychological criteria, any one of which was believed to be sufficient to exclude the patient from consideration for SCS treatment. In contrast, Nelson’s second tier consisted of less serious cautionary risk factors, where the presence of multiple such problematic findings was thought to increase the chance of a poor outcome. This assessment of cautionary risk factors is similar to that proposed by Block and colleagues (Block, 1996; Block et al., 2001, 2003). Surprisingly though, studies utilizing Block’s approach do not reference the SCS literature, and the SCS literature does not reference Block’s approach (Beltrutti et al., 2004; Doleys et al., 1997; Nelson et al., 1996; Williams, 1996).

In 1997, Doleys and colleagues reviewed the literature on the psychological evaluation of SCS patients, and in that same year also published their own criteria for evaluating risk in implantable pain therapies (Doleys & Olsen, 1997a). In a manner similar to Nelson (Nelson et al., 1996), Doleys and colleagues also recommended the use of two tiers of risk factors (Tables 1, 2). More recently, Williams and colleagues (2003) proposed another model similar to those of Nelson (Nelson et al., 1996) and Doleys (Doleys & Olsen, 1997a), except that it was more detailed in nature (Tables 1, 2).

In 2004, the European Federation of International Association for the Study of Pain Chapters presented a consensus document on exclusionary criteria for SCS (Beltrutti et al., 2004). This model was simpler than those proposed by Nelson (Nelson et al., 1996), Doleys (Doleys & Olsen, 1997a) and Williams (Williams et al., 2003) (Table 1), in that it focused only on the first tier of exclusionary risk factors, and not on the second tier of cautionary risks. Overall, the weakness of this and other SCS approaches to patient selection is that they all lack something which is central to Block’s (Block et al., 2003) method: a defined method of tallying the cumulative effect of multiple mild to moderate risk factors, and determining what constitutes a high score. The SCS methods leave the overall estimate of risk entirely to clinical judgment, and this is a significant weakness with regard to clinical applicability.

Moving Towards a Convergent Model

An inspection of Tables 1 and 2 suggests that while there are some differences between the various approaches to the evaluation of candidates for invasive spinal procedures, overall there are extensive commonalities. While it would be premature to say that a consensus exists, these commonalities do suggest that the opinions in the field appear to be converging on a set of criteria that should be evaluated. Even though Block’s approach was developed independently from the other models, many of Block’s criteria have counterparts in the other rating systems. Further, Tables 1 and 2 show significant commonalities between the four SCS approaches listed. In some respects, the format of these tables may make the degree of commonality seem less than it actually is. For example, a variable that is an exclusionary factor in one system may be defined in a less extreme manner as a cautionary risk factor in another. In these cases, the difference in identified risk factors is only a matter of degree. In other cases, differences are the product of alternate aspects of the same construct. For example, “severe doctor-patient conflict” and “threatening behavior” are separate constructs in different systems, which are somewhat different yet clearly conceptually related. When these sorts of commonalities are also considered, the degree of actual convergence of these protocols can be seen to be even greater.

It has previously been theorized that biological, psychological and social factors interact over the natural history of chronic pain disorders (Bruns & Disorbio, 2005). This model used a “vortex” paradigm to illustrate how this interaction can sometimes lead to the development of intractable pain conditions (Fig. 1), where the patient seems to enter a “downward spiral” and does not respond to treatment. The biopsychosocial vortex provides a paradigm of how pain disorders become intractable, and how to intervene. Using the vortex paradigm, risk factors are organized into physical symptoms that appear at onset, affective reactions to illness or injury, psychological vulnerability risk factors, social environment risk factors, and the resulting expression of the illness or injury symptoms (Bruns & Disorbio, 2005).

Fig. 1 Biopsychosocial vortex paradigm

In the present study, this model was used to organize the risk factors identified by research and the biopsychosocial protocols reviewed in this paper. Using the conceptual framework supplied by the vortex paradigm and a two-tiered approach, this list of risk factors attempts to create a synthesis of the risk factors identified by the Beltrutti (Beltrutti et al., 2004), Block (Block et al., 2003), Doleys (Doleys et al., 1997), Kidd (Beltrutti et al., 2004), Nelson (Nelson et al., 1996), and Williams (Williams, 1996) protocols. This resulted in the list of risk factors in Tables 3 and 4. It should be noted that while some of these risk factors are psychosocial in nature, others can be identified only through medical examination.

Table 3 Exclusionary risk score components from BBHI 2 and BHI 2 measures
Table 4 Cautionary risk score components from BBHI 2 and BHI 2 measures

It was hypothesized that the risk factors summarized in Tables 3 and 4 would be able to predict indications of a poor outcome that included both objective (unemployment due to injury) and subjective measures (the perception that treatment has been ineffective) in patients post spinal surgery. Further, it was hypothesized that these predictions would hold true in other groups of medical patients as well. It was also hypothesized that significantly higher risk levels would be observed in identified at-risk populations. Specifically, it was hypothesized that patients would score significantly higher on these measures than members of the community, and that patients with chronic conditions would score significantly higher than patients with acute conditions. Further, it was hypothesized that the resultant estimates of cautionary and exclusionary risks would correlate significantly with psychometric measures that have been associated with delayed recovery. Lastly, it was hypothesized that these risk factors would be unrelated to race or gender, and would exhibit short-term stability.

Methods

Measures

To assess the variables listed in Tables 3 and 4, this study utilized the Battery for Health Improvement 2 (BHI 2), and a shorter version of this test, the Brief Battery for Health Improvement 2 (BBHI 2). Based on information documented elsewhere (Bruns & Disorbio, 2003; Disorbio & Bruns, 2002), these tests were selected because they (1) were developed for the assessment of patients with injury and pain, and underwent an extensive validation process; (2) assess most of the criteria identified by den Boer, Block, and the SCS literature; (3) have both medical patient and community norm groups; (4) have a standardized published form; (5) are short enough to be practical in the clinical setting (35 min for the BHI 2, and 10 min for the BBHI 2); (6) have undergone multiple, favorable independent peer reviews by the Buros Institute (Hayes, 2007; Kavan, 2007; Sime, 2007; Vitelli, 2007); (7) have been integrated into clinical protocols (Bruns, Disorbio, & Hanks, 2007; Disorbio, Bruns, & Barolat, 2006); (8) are based on a biopsychosocial theory (Bruns & Disorbio, 2005); (9) have been found to predict the outcome of multidisciplinary treatment for pain (Freedenfeld, Bailey, Bruns, Fuchs, & Kiser, 2002); and (10) have been identified by various authors as being tests to consider for this purpose (American College of Occupational, Environmental Medicine, 2008; Belar, Deardorff, & American Psychological Association, 2009; Deardorff, 2006a, b; Devlin, Ranavaya, Clements, Scott, & Boukhemis, 2003; Work Loss Data Institute, 2008).

The MMPI-2 (Butcher, 1989) has been widely used to assess medical patients, especially those with chronic pain (Keller & Butcher, 1991). The MMPI-2 was also administered to some of the patient subjects, with particular attention being given to the Hs, D and Hy scales. The MMPI-2 Hysteria-Obvious score (Hy-O) was also utilized in this study, as the MMPI-2 Hy scale includes “subtle” items thought by some to reduce the validity of this scale (Mihura, Schlottmann, & Scott, 2000; Osberg & Harrigan, 1999; Sellbom, Ben-Porath, McNulty, Arbisi, & Graham, 2006). Because the Hy-O scale omits these “subtle” items, it was included on the grounds that it may more closely approximate the core construct of the Hy scale.

Subjects

The BHI 2 was administered to 777 patients undergoing rehabilitation who were in treatment for pain or a physical injury, and were from 30 states in all geographical regions of the continental US. The BBHI 2 consists of a subset of the BHI 2 test items, and was also scored. Patients were recruited by posters or fliers provided to them by their providers, and were drawn from a variety of settings, including acute physical therapy, work hardening programs, chronic pain programs, physician offices, and vocational rehabilitation settings. These patients were also drawn from various payor systems (Medicare/Medicaid, private insurance, worker’s compensation, and auto insurance). A total of 527 of these patients were selected for the BHI 2 and BBHI 2 normative patient sample and this sample was found to approximate U.S. census data for race, education, gender and age (Bruns & Disorbio, 2003). The MMPI-2 was administered to 398 of these patients.

A community norm group was also established by administering the BHI 2 to 1,487 community subjects from 16 states in all geographical areas of the continental USA. These subjects were recruited by newspaper advertisements and posters. They were stratified according to race, education, age, and gender and subjects were recruited to match these demographics. No subject was excluded on the basis of past or present medical or psychological diagnoses. A detailed description of these groups is available elsewhere (Bruns & Disorbio, 2003).

Of the patient subjects in this study, 229 were identified as suffering from chronic pain (defined as lasting longer than 6 months), while 262 had acute conditions lasting less than 6 months. Additionally, 129 patients were suffering from head injuries, 264 were in the worker’s compensation system, and 278 were litigating over their healthcare. Finally, using both community and patient groups, 176 subjects reported having undergone spinal surgery, while 397 reported having undergone arm/hand or leg/foot surgery. Each of these groups was assessed separately.

Besides the completed BHI 2, additional data collected included the following: age, gender, highest level of education (less than high school graduate, high school graduate, some college, or college graduate or higher), employment status (employed, unemployed due to injury, unemployed for other reasons), ethnicity (white versus all others, which were collapsed into a single “nonwhite” group), litigation status (yes versus no), insurance type (Medicare/Medicaid, personal injury, private health insurance, or worker’s compensation), and medical setting (acute physical therapy, pain program, or work hardening). Additionally, subjects were asked whether they felt that doctors had done anything to help them thus far. This item is on the Doctor Dissatisfaction scale, but as it was being used as an outcome variable, it was removed from that scale for the purposes of this study to eliminate this confounding effect.

The rehabilitation and community groups were administered the BHI 2 in a confidential manner. To maintain confidentiality, patients were given a packet of questionnaires that were assigned a random ID number. No records were kept regarding what ID number a patient or non-patient was assigned, and the data was processed by persons having no contact with or knowledge of the patients. All subjects signed an informed consent indicating that the information would be used for research purposes only, and that no results or feedback from the BHI 2 would be given.

As both authors are in independent practice, and not affiliated with a university or medical center, this study was exempt from federal regulations regarding IRB approval. Nevertheless, all ethical principles were observed during the study.

Procedure

Cautionary and Exclusionary Risk Scores were calculated from the measures identified on the BBHI 2 and BHI 2. The method used to do this is as follows: For each Exclusionary Risk Factor identified in Table 3, corresponding measures from the BHI 2 or BBHI 2 were identified where available. If the measure was a scale, an exclusionary risk was scored as positive if it was observed in only about 1% of patients (T > 72 or T < 30). If the measure identified was not a scale score, but rather a content area score, it would be scored as positive if it reached the “Very High” level, which is approximately the 95th percentile or a T-Score of 67 and the highest score indicated for BHI 2 content area measures. Some cells in Table 3 included more than one measure, and risk factors were scored as positive if one or both measures (as indicated) reached a patient T-Score of greater than 72. In some cases, such as litigation and pain range, risk factors were too prevalent by themselves to be considered as Exclusionary Risk Factors, and so they were paired with other risk factors to approximate the 99th percentile. If the measure was a critical item, it was scored as positive if the response to it only occurred in about 5% or less of the patient normative sample.
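The relationship between the T-score cutoffs and the percentiles quoted above can be checked with a short calculation. The following is only an illustrative sketch: it assumes that T-scores are normally distributed with a mean of 50 and a standard deviation of 10, which is the conventional definition of a T-score but may not match the empirical norm tables of the BHI 2. The function name `t_score_percentile` is hypothetical.

```python
from statistics import NormalDist


def t_score_percentile(t_score: float) -> float:
    """Percentile rank of a T-score, ASSUMING a normal distribution
    with mean 50 and standard deviation 10 (an assumption of this sketch)."""
    return 100.0 * NormalDist(mu=50.0, sigma=10.0).cdf(t_score)


# Cutoffs quoted in the text for exclusionary scoring.
for cutoff in (30, 67, 72):
    print(f"T = {cutoff}: about percentile {t_score_percentile(cutoff):.1f}")
```

Under this assumption, a T-score of 67 falls close to the 95th percentile and a T-score of 72 close to the 99th. Where the published norms are empirical rather than normal, the percentiles in the test manuals take precedence over this approximation.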

Similarly, for each cautionary risk factor identified in Table 4, corresponding measures from the BHI 2 or BBHI 2 were identified where available. In contrast to the exclusionary risks, however, measures were rated as indicating a cautionary risk if they were observed in only about 16% of patients (more than about one standard deviation from the mean, i.e., T > 59 or T < 41). If the measure identified was not a scale score, but rather a content area score, it was scored as positive if it reached the “High” level, which is approximately the 84th percentile or a T-Score of 60. The overall Exclusionary Risk score was then the number of cells with positive findings in the BBHI 2 and BHI 2 variable columns in Table 3. Similarly, BBHI 2 and BHI 2 Cautionary Risk scores were calculated by applying the same methodology to the risk factors identified in Table 4.
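The tallying procedure described above can be sketched in code. This is a minimal illustration, not the published scoring algorithm: it covers scale scores only (content area scores and critical items follow the separate rules given in the text), and the `Cell` structure, the `require_all` flag for paired measures, and the example risk-factor names and T-scores are all hypothetical. Only the four T-score cutoffs are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# T-score cutoffs for scale scores as quoted in the text.
CUTOFFS = {
    "exclusionary": {"high": 72.0, "low": 30.0},
    "cautionary": {"high": 59.0, "low": 41.0},
}


@dataclass
class Cell:
    """One risk-factor cell of a scoring table (hypothetical structure)."""
    name: str
    measures: List[Tuple[float, str]]  # (patient T-score, "high" or "low")
    require_all: bool = False          # True when paired measures must all be positive


def measure_is_positive(t_score: float, direction: str, tier: str) -> bool:
    """Apply the tier's cutoff in the scored direction."""
    cutoff = CUTOFFS[tier][direction]
    return t_score > cutoff if direction == "high" else t_score < cutoff


def cell_is_positive(cell: Cell, tier: str) -> bool:
    """A cell is positive if any (or, when required, all) of its measures are."""
    hits = [measure_is_positive(t, d, tier) for t, d in cell.measures]
    if not hits:
        return False
    return all(hits) if cell.require_all else any(hits)


def risk_score(cells: List[Cell], tier: str) -> int:
    """Risk score = number of cells with a positive finding."""
    return sum(1 for cell in cells if cell_is_positive(cell, tier))


# Made-up example data, for illustration only.
example = [
    Cell("depression", [(63.0, "high")]),
    Cell("anxiety", [(55.0, "high")]),
    Cell("paired factor", [(75.0, "high"), (61.0, "high")], require_all=True),
]
print(risk_score(example, "cautionary"), risk_score(example, "exclusionary"))
```

In practice the exclusionary and cautionary tables contain different cells, so the same list would not be scored against both tiers; that is done here only to show how the cutoffs change the count.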

The means of the resulting Cautionary and Exclusionary Risk scores were compared for the following groups: patient versus community, acute patient versus chronic patient, male versus female, and white versus nonwhite. Additionally, the patients were divided into groups based on employment status and perceived treatment efficacy, and the means of these groups were compared for the following patient populations: spinal surgery, upper or lower extremity surgery, worker compensation, acute injury, chronic pain, head injury, and injury litigants.

Results

The Cautionary and Exclusionary Risk scores correlated strongly with each other, with the BHI 2/BBHI 2 Cautionary scores correlating .86, and the BHI 2/BBHI 2 Exclusionary scores correlating .81. These scores also correlated as predicted with MMPI-2 scale scores, with the BHI 2 Cautionary Risk score and the MMPI-2 Hy-O score having the highest correlation. These correlations are listed in Table 5.

Table 5 Correlation of risk scores with MMPI-2 scales often used in patient selection N = 398

Norms for the Cautionary and Exclusionary Risk scores for both the BHI 2 and BBHI 2 were calculated using the BHI 2 normative samples for both patient and community subjects. As expected, the Exclusionary Risk Score for both patients and community exhibited both a median and mode of zero, and the same was true for the BBHI 2 Exclusionary Scores. As noted previously, these are all rarely occurring, extreme indicators. In contrast, the more moderate Cautionary Risk Scores were considerably more prevalent, even in the community norm group (Table 6).

Table 6 Norms and reliability of risk scores for patients and community members

The test–retest reliability of the Cautionary and Exclusionary Risk scores for both the BHI 2 and BBHI 2 was assessed using the 82 patients who had previously been administered these tests twice over the course of about a week for determining the BHI 2 and BBHI 2 scale reliabilities (Bruns & Disorbio, 2003; Disorbio & Bruns, 2002). Overall, the BBHI 2 version of the Cautionary Risk score produced a test–retest reliability of .85, and an Exclusionary Risk score reliability of .92. Similarly, the BHI 2 version of the Cautionary Risk score produced a test–retest reliability of .89, and an Exclusionary Risk score reliability of .91.
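A test–retest coefficient of this kind is commonly computed as the correlation between scores from the two administrations. The sketch below assumes a Pearson product-moment correlation, which the text does not specify; the function name `pearson_r` and the paired scores are hypothetical and are not study data.

```python
from math import sqrt
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Pearson correlation between paired scores from two administrations."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    if ss_1 == 0 or ss_2 == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cross / sqrt(ss_1 * ss_2)


# Made-up risk scores for six patients tested about a week apart.
time_1 = [2, 0, 5, 3, 1, 4]
time_2 = [2, 1, 4, 3, 1, 5]
print(f"test-retest r = {pearson_r(time_1, time_2):.2f}")
```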

When the Cautionary Risk scores of the patient and community normative groups were compared using t-tests, the mean of the patient group was significantly higher than that of the community group for both the longer BHI 2 Cautionary Risk score (p < .001, df = 1,208) and the shorter BBHI 2 Cautionary Risk score (p < .001, df = 1,235; see Table 6). The two Exclusionary Risk scores were also compared for the patient and community normative groups. However, as the two Exclusionary Risk Scores were both highly skewed (Table 6), a nonparametric Mann–Whitney U test was used for comparisons. Using this method, patients were observed to have significantly higher scores (p < .001) than community subjects for both the BHI 2 and BBHI 2 Exclusionary Risks.
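
The rank-based comparison used for the skewed Exclusionary Risk scores can be sketched as follows. This is a minimal illustration, assuming the large-sample normal approximation with a tie correction and no continuity correction; the software settings actually used in the study are not stated, and the scores below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, z, two-tailed p) via the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    pooled = sorted(list(group_a) + list(group_b))

    # Assign each distinct value the average rank of its tied block.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        block = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += block ** 3 - block
        i = j

    rank_sum_a = sum(rank_of[x] for x in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all scores are identical; the test is undefined")
    z = (u_a - mean_u) / sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, z, p

# Hypothetical Exclusionary Risk tallies: mostly zero, as in a skewed score.
patients = [0, 1, 0, 2, 0, 3, 1, 0]
community = [0, 0, 0, 0, 0, 1, 0, 0]
u, z, p = mann_whitney_u(patients, community)
print(f"U = {u}, z = {z:.2f}, p = {p:.3f}")
```

Because the test works on ranks rather than raw values, the large number of zero scores affects it only through the tie correction.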

Using the same methodology, the mean Cautionary and Exclusionary Risk scores were compared for the acute versus chronic patient groups. The mean of the chronic group was significantly higher than that of the acute group for both the longer BHI 2 Cautionary Risk score (t = 5.57, df = 333, p < .001) and the shorter BBHI 2 Cautionary Risk score (t = 6.11, df = 338, p < .001). Using a Mann–Whitney U test, the scores of the chronic group were significantly higher than those of the acute group for both the BHI 2 (p = .003, Z = −2.35) and BBHI 2 (p = .019, Z = −2.93) versions of the Exclusionary Risk score.

The BHI 2 and BBHI 2 Cautionary Risk scores were compared for 1,252 patients and community members by gender (male, N = 564, versus female, N = 688) and by race (white, N = 972, versus nonwhite, N = 280). The t-tests showed no significant differences in the BHI 2 or the BBHI 2 Cautionary Risk scores based on either race or gender. Similarly, the BHI 2 and BBHI 2 Exclusionary Risk scores for the same groups were compared using Mann–Whitney U tests. Here too, there were no significant gender or race-based differences.

Table 7 examines the BHI 2 and BBHI 2 Cautionary Risk Scores of patients who were either employed or had become unemployed due to an injury. Patients who were unemployed for other reasons were excluded from this analysis. The subjects tested were divided into seven groups: subjects who had previously undergone spine surgery; subjects who had previously undergone hand, arm, foot or leg surgery; patients in the Worker’s Compensation system; patients with acute injuries; patients with chronic pain; patients with head injuries; and patients who were also in litigation over their healthcare. Subjects in the worker compensation, acute injury, chronic pain, head injury and injury litigant groups were all patients. In contrast, both community and patient subjects were included in the surgery groups if they had previously undergone the corresponding surgeries. Overall, patients who were unemployed exhibited a significantly higher mean level of Cautionary Risks on 11 of 14 comparisons across seven patient groups (p < .05). During this analysis and elsewhere, Levene’s test for equality of variances was applied to each t-test, and the t-test was corrected whenever the two groups being compared were found to have significantly different variances. Correction for multiple comparisons was not used as these analyses were based on seven different groups of subjects.
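
To make the variance correction concrete, the sketch below contrasts the pooled-variance t statistic with an unequal-variance alternative. It assumes that the correction applied after a significant Levene's test was Welch's t-test with Welch–Satterthwaite degrees of freedom, which is a common convention but is not stated in the text. The outcome of Levene's test is passed in as a flag rather than computed, and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(group_a, group_b, equal_variances):
    """Return (t, df) for an independent-samples t-test.

    equal_variances: True  -> pooled-variance (Student) t-test
                     False -> Welch's t-test (assumed form of the correction)
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    difference = mean(group_a) - mean(group_b)

    if equal_variances:
        df = n_a + n_b - 2
        pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
        std_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    else:
        term_a, term_b = var_a / n_a, var_b / n_b
        std_error = sqrt(term_a + term_b)
        # Welch-Satterthwaite approximation to the degrees of freedom.
        df = (term_a + term_b) ** 2 / (
            term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
        )
    return difference / std_error, df

# Hypothetical Cautionary Risk tallies for two employment groups.
employed = [2, 3, 1, 4, 2, 3]
unemployed = [5, 9, 2, 12, 4, 7]
for flag in (True, False):
    t, df = two_sample_t(unemployed, employed, equal_variances=flag)
    print(f"equal_variances={flag}: t = {t:.2f}, df = {df:.1f}")
```

When the group variances differ, the corrected version yields fewer degrees of freedom and therefore a more conservative test.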

Table 7 Cautionary risk scores and employment status

In Table 8, the same method was employed to determine if these groups differed with regard to whether they perceived any of the treatment offered by their doctors as being helpful or not. In this case, the risk scores of patients who reported that nothing their doctors had done had helped were significantly higher in all 14 comparisons (p < .05).

Table 8 Cautionary risk scores and negative treatment perceptions

While the means of new patient groups were compared using t-tests for the Cautionary Risk Scores, this method was not appropriate for the Exclusionary Risk scores, as their distribution was highly skewed. Consequently, a Mann–Whitney U analysis was determined to be more appropriate. Table 9 shows a Mann–Whitney U analysis of the Exclusionary Risk Factors for employment status. Here, 7 of 14 comparisons determined that unemployed patients had significantly higher Exclusionary Risk scores (p < .05), with two other comparisons trending towards significance (p < .06). Using the same method, 13 out of 14 comparisons made using the seven patient groups found that patients with negative treatment perceptions had significantly higher (p < .05) Exclusionary Risk scores (Table 10).

Table 9 Exclusionary risk scores and employment status
Table 10 Exclusionary risk scores and negative treatment perceptions

Discussion

The reliability, norms and potential race and gender bias of presurgical psychological criteria assessment have not been previously addressed in the literature. In this study, the BHI 2 and BBHI 2 Cautionary and Exclusionary Risk scores were all found to exhibit high short-term reliability. Further, these scores were also based to a considerable extent on scales and measures with established reliability (Bruns & Disorbio, 2003; Disorbio & Bruns, 2002). As always, however, determinations regarding validity are more complex.

Establishing the content validity of the Cautionary Risk and Exclusionary Risk scores began with a literature review, which suggested a convergence of evidence and opinion that a number of biopsychosocial risk factors were likely to negatively impact surgical outcome. To begin to establish construct validity, one must start with a construct that helps to explain how all of the disparate biopsychosocial variables combine to influence treatment outcome, and this was provided by the Vortex Paradigm (Fig. 1). As the data in this study were gathered at one point in time, predictive validity could not be established. However, concurrent validity was supported by significant correlations with relevant MMPI-2 variables and retrospective and concurrent indications of outcome in multiple groups of patients and community members.

All of the psychosocial assessment protocols reviewed in this study were developed for the assessment of patients who were candidates for spinal surgery or SCS. These types of protocols have not been developed for most other types of patients, such as those in acute settings or having upper extremity injuries. However, it would seem that the Cautionary and Exclusionary Risks identified in this study could potentially impact a patient’s response to a wide variety of medical treatments. Consequently, in addition to patients who had undergone spinal surgery, other patient groups were assessed in this study, including: post upper and lower extremity surgery, worker’s compensation, acute injury, chronic pain, head injury, and injury litigants. Overall, the results suggest that Cautionary Risk and Exclusionary Risk scores are associated with both the patient’s subjective perception of treatment efficacy and employment status in a wide range of patient groups. These findings offer additional support for the concurrent validity of the Cautionary and Exclusionary Risk scores.

In general, in this study the Cautionary and Exclusionary Risk scores predicted employment status somewhat less well than satisfaction with treatment. It may be that attempts to use psychological test variables to predict injury-related unemployment are complicated by the fact that societal factors also influence employment status. For example, having worker’s compensation insurance status could guard against unemployment while the patient is in treatment, as the employer may be more motivated to provide light duty work. In other situations, though, even a highly motivated patient may be unable to find employment due to a lack of available jobs. In contrast, the perception of treatment efficacy is a psychological state, which may be more easily predicted by other psychological factors. This may help to explain why in some groups in this study, measures that were strongly associated with perceptions of treatment efficacy did not predict unemployment status. It may be that in the case of some outcome variables, the socioeconomic aspect of the biopsychosocial condition has more influence on outcome than do the medical or psychological ones.

The validation data from the BHI 2 and BBHI 2 tests made it possible to develop both patient and community norms for these risk scores. By using these tests’ normative samples, it is possible to begin to estimate the degree to which both common and extreme psychosocial risk factors are present in both the typical patient and typical person in the community. More importantly, this also makes it possible to begin to apply psychometric rules of measurement, and to add precision when making two important determinations: (1) what level of biopsychosocial risk is unusual in a patient, and (2) to what degree are mild biopsychosocial risks observed in the average patient and “normal” persons in the community? These determinations provide important benchmarks that may help to determine which risk levels are truly unusual, and which are not.

Assessing Risk in Candidates for Invasive Procedures

Clinical determinations about whether to recommend surgery or SCS involve a complex decision-making process. Most authors have employed the concept of a presurgical psychological evaluation (Block et al., 2003; Doleys et al., 1997; Nelson et al., 1996; Williams, 1996). However, if we take a broader perspective, this process may be more accurately described as a collaborative biopsychosocial evaluation performed jointly by psychologists and physicians. As pointed out in Tables 3 and 4, both medical and psychological opinions are required.

When conducting an evaluation of this type, the appearance of an extreme Exclusionary Risk score on a psychometric measure should not in and of itself be regarded as sufficient to exclude a patient from the proposed treatment. For example, if a patient receives a positive Exclusionary Risk score due to reports of homicidal ideation, that should not by itself be regarded as a definitive finding, as false positives can occur due to random responding, literacy problems, or exaggeration. Instead, such a finding should alert the professional to confirm by interview the extent to which the patient may have violent ideation or impulses, and a judgment must be made regarding whether the risks posed by this are serious enough to override any indications for surgery that may be present.

The degree of surgical necessity that is present is a consideration that has not to date been adequately addressed in the presurgical psychological evaluation literature. To the extent that surgery is necessary to preserve life or function, psychosocial risks generally do not play a role in the surgical decision-making process. Take for example an intoxicated patient who presents in the Emergency Department having sustained a burst fracture to a lumbar vertebra during a suicide attempt. Given the risk of paralysis, surgery will likely be performed regardless of the patient’s psychological state. In this case, however, a postsurgical psychological consult could develop a plan to address these psychological concerns during rehabilitation.

On the other hand, if medical imaging reveals only a bulging disc of uncertain significance, the decision to proceed with an elective invasive procedure may be influenced heavily by the patient’s complaints of subjective symptoms such as pain. Under such circumstances, it is important to appreciate that in many cases the goal of these invasive procedures is to change the patient’s verbal behavior, cause the patient to report less pain or greater satisfaction with care, or in other ways behave in a less disabled fashion. Ultimately, the decision about whether or not to perform surgery or other invasive procedures is a medical decision. However, when surgery is thus used to change behavior, the presurgical assessment of psychosocial risk factors is paramount. Between these two examples, however, lies an expansive gray zone, where the objective indications for surgery must be weighed against the medical and psychosocial risks for a poor outcome. This is a process that is best performed collaboratively by both physicians and psychologists.

The Collaborative Decision Making Process

The collaborative biopsychosocial decision-making process cannot succeed if either the psychologist or physician fails to appreciate the other’s role. For example, suppose a surgeon, thinking medically, states to the patient, “You are an excellent candidate for this surgery, and it will alleviate your pain. Unfortunately, though, the insurer requires that you see a psychologist first to make sure the pain isn’t all in your head.” This sets up a very adversarial process with the psychologist, who is then placed in the impossible position of being asked to potentially block a surgical procedure that the patient has already been told is medically indicated. If the surgeon thought of pain from a biopsychosocial perspective, however, the psychological consult would be seen to be as much a part of the evaluation process as an MRI or an X-ray.

Similarly, the psychologist can make the reverse mistake. A psychologist could evaluate a patient with a high level of psychosocial risks, and offer a psychological opinion against surgery without reviewing the medical indications for surgery, or the degree of surgical necessity. The psychologist who is performing such evaluations could benefit greatly from reviewing the medical records, and ideally speaking with the referring physician about the case.

It is worth noting that invasive medical treatments sometimes involve sharp contrasts. While a lumbar fusion is a destructive, irreversible surgical process, the patient can evaluate any benefits of SCS in a trial, an implanted unit can be reprogrammed, and if necessary the unit can be removed. The fact that some medical procedures can have irreversible consequences and significant risks is a matter that must be considered, and when present, may suggest a more cautious consideration. In this regard, the risk score norms in this study may be helpful, as they offer an objective means of quantifying the degree to which consensus risk factors are present.

Lastly, if a patient is excluded from medical care for psychological reasons, it is important to remember that many of the psychological risk factors identified in this study are treatable psychological conditions. For example, a patient with chronic pain could be excluded from an elective surgical procedure due to severe depression and suicidality, as suicidality is a potentially fatal condition, and as such takes priority. It would be incorrect, though, to conclude that this is a permanent disqualification. In this case, if the depression could be adequately treated, it may no longer constitute a risk factor, and the patient should be reassessed.

Future Directions

Of all the methods reviewed in this paper, only den Boer and colleagues recognized that the outcome of surgery is multidimensional in nature by looking separately at changes in pain, disability and work capacity (den Boer, Oostendorp, Beems, Munneke, Oerlemans et al., 2006). Beyond this, successful outcome could alternately be defined in terms of successful fusion, improvement in quality of life, patient satisfaction, decrease in opioid use, or reduction in medical utilization. The multidimensional nature of treatment outcome is illustrated by one study that found while an objectively successful fusion occurred in 84% of lumbar fusion patients, nearly half were dissatisfied with their outcome, and many were totally disabled at follow-up (LaCaille et al., 2005). Further, den Boer’s findings suggest that risk factors that predict one type of outcome may not necessarily predict others.

While the data for this are lacking, it would seem a reasonable hypothesis that if the goal of medical treatment is to help the patient to return to work, the degree of job dissatisfaction might be especially relevant to motivation to return to that place of employment. On the other hand, if treatment is attempted with the hope of reducing a patient’s reliance on opioid pain relievers, addictive tendencies and a past history of substance abuse would seem likely to play a greater role. To the extent that outcome of a particular medical intervention is determined to be closely associated with a specific psychosocial variable, that psychosocial variable will need to be weighted more heavily. Given the complexity of these determinations, it seems unlikely that it will be possible to construct a single set of psychosocial criteria that would be the optimal predictor for all medical procedures. That being said, while sets of criteria like those of den Boer, Block, or the ones developed in this study may have broad clinical utility, the next step in research may be to use methods such as logistic regression to weight risk variables for particular outcome goals. Ultimately, though, the clinical determinations made in these biopsychosocial evaluations remain a complex decision-making process, which cannot be accomplished by the mechanistic application of an algorithm.

There are several weaknesses to this study. While the components of the Cautionary and Exclusionary Risk scores have been suggested by numerous prior empirical studies and clinical consensus to be important variables to assess presurgically, and while this study identified relationships between the Cautionary and Exclusionary Risk scores and both an objective criterion (employment status) and a subjective criterion (patient judgment of the helpfulness of medical treatment), these findings were based on concurrent and retrospective information. The predictive validity of the Cautionary and Exclusionary Risk scores themselves will require further study.

While cautionary risk factors lend themselves readily to prospective research, it is much more difficult to study exclusionary risk factors. For example, if a patient is imminently suicidal, appropriate treatment would likely involve psychiatric hospitalization. It is hard to imagine conducting an SCS research study that removed suicidal patients from a psychiatric hospital for an SCS implantation, in order to determine if imminent suicidality does in fact lead to a poor outcome. Given that research on exclusionary factors could involve significant risks to patients, it is unlikely that these conditions will undergo systematic research with regard to medical outcomes, and may instead need to be established by clinical consensus.

Conclusions

This study attempted to lay down a foundation for a standardized risk assessment process that (1) was derived from a research-based paradigm of delayed recovery, (2) addressed issues related to reliability and validity, (3) included the development of norm-based scores, and (4) addressed practical matters pertaining to the application of these findings to the collaborative healthcare setting.

Based on the studies reviewed here, there appears to be converging evidence and expert consensus regarding what biopsychosocial risk factors can potentially influence the outcome of medical treatments such as spinal surgery and SCS. This study approached the assessment of risk using a two-tiered, standardized convergent model that was organized by a biopsychosocial paradigm. Standardized Cautionary and Exclusionary Risk scores were developed based on the identified risk factors, and the resultant scores can be compared to both community and patient norm groups. Data from multiple groups of patients and community subjects provided evidence of concurrent validity. Additionally, these risk scores were found to be highly reliable, and unrelated to race or gender.

Numerous challenges remain. Most importantly, the risk scores developed in this study are general in nature, and are calculated using only a basic process: a tally of the number of scores above a threshold level. It seems very likely that prediction could be improved for any specific patient group by longitudinally assessing the desired outcome, and using stepwise regression techniques to determine the best predictive equation. Potentially, this process could help to identify biopsychosocial risk levels that could compromise a patient’s ability to benefit from medical treatment. Once identified, appropriate interventions could ameliorate these risks, and leave the patient better prepared to be successful. Alternately, these assessments may lead to the consideration of other treatments that are more likely to be effective. In the long run, this may make a significant contribution to improving patient care through a more effective collaboration of medical and psychological caregivers.
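
The basic tally described above can be sketched in a few lines. The scale names and cutoff values here are hypothetical placeholders chosen for illustration; they are not the actual BHI 2 or BBHI 2 scales or thresholds.

```python
def risk_tally(scale_scores, thresholds):
    """Count the scales whose score is above its threshold level."""
    return sum(
        1
        for scale, cutoff in thresholds.items()
        if scale_scores[scale] > cutoff
    )

# Hypothetical cutoffs and one hypothetical patient's scale scores.
thresholds = {
    "depression": 60,
    "anxiety": 60,
    "pain_complaints": 60,
    "job_dissatisfaction": 60,
}
patient_scores = {
    "depression": 72,
    "anxiety": 55,
    "pain_complaints": 64,
    "job_dissatisfaction": 60,
}
tally = risk_tally(patient_scores, thresholds)
print(f"risk tally = {tally}")
```

Each risk factor contributes equally to such a tally, which is why a regression-derived weighting for a specific outcome could plausibly improve prediction.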