Introduction

Spinal decompression (including removal of herniated discs) and spinal fusion are frequently performed surgical procedures in patients with back and/or leg pain due to degenerative lumbar spine disease. In the UK, the frequency of this type of surgery increased from 25 to 49 per 100,000 in the population over the period 1999–2013 [1]. In the USA, the surgery rate is considerably higher (135 per 100,000 in 2013), but does not show a further increase [2].

Unfortunately, there is a considerable proportion of patients who experience recurrent or remaining pain following initial spine surgery, ranging from 3 to 34% at follow-up between 6 and 24 months after surgery, and 5 to 36% upon long-term evaluation (> 2 years) [3, 4]. A recent population-based cohort study in England reported that over the period 2007–2012 on average 20.8% of lumbar surgery patients experienced persistent post-operative pain [5]. Unsatisfactory results after spinal surgery are often referred to as “failed back surgery syndrome” (FBSS), but this term has been criticised because it unilaterally puts the blame on the operation as the cause of the problem, while the aetiology is much more complex and often multifactorial [6,7,8,9].

In a systematic review of 40 studies on lumbar discectomy, the predictive value of 95 preoperative factors for post-operative clinical outcomes was explored [10]. The study revealed 17 factors associated with a positive surgical outcome including more severe leg pain and better mental health status. A negative association with surgical outcome was seen for some anatomic characteristics, but also for patient-related social factors such as worker’s compensation. For 61% of the factors, including age and sex, the results were not significant or conflicting [10]. A secondary analysis of the data from a randomised controlled trial (RCT) and subsequent cohort study on lumbar discectomy showed that the 1- and 3-year risk of recurrent pain was substantially lower in patients with complete initial resolution of leg pain [11]. Other predictive factors that are currently studied include the origin and nature of the pain (nociceptive versus neuropathic pain) [12,13,14], sagittal balance [15, 16], and anatomic characteristics such as the presence of root compression [17] and the type of stenosis [18].

For patients with persisting or recurrent pain after spinal surgery (PPSS), a variety of treatments is used, including conservative treatments (pain medication, physical therapy, psychological rehabilitation, and graded activity), neurostimulation, minimally invasive treatments (among others, selective nerve root blocks, facet and sacroiliac joint infiltration/denervation, and epidural injections, often primarily used as a diagnostic procedure), and re-operation. Clinical studies on most of these treatment modalities are scarce and often of limited quality. Most (randomised controlled) clinical studies have been performed for spinal cord stimulation (SCS) [19] and some minimally invasive treatments [20]. A recent systematic review suggested epidural adhesiolysis to be effective in the short term [21]. SCS was shown to be efficacious in studies with a follow-up of 2–3 years, and proved to be more efficacious than conventional medical management and re-operation in distinct patient groups [19, 21]. For other treatments, including re-operation, the authors considered the evidence from available studies to be poor or inconclusive [21]. Similar outcomes were found in a recent comprehensive review in this journal, though the researchers suggested that there is also sufficient evidence to recommend active exercise as a treatment option [22].

The availability of many treatment modalities and the limited evidence from clinical studies have led to large practice variations. This variability is further increased by the heterogeneity of the patient population and the (potential) involvement of different specialties in the management of PPSS (general practitioners, pain specialists, neurosurgeons, orthopaedic surgeons, neurologists, and rehabilitation physicians). A number of guidelines and algorithms focusing on, or including recommendations on, the management of PPSS have been published [6, 23,24,25,26,27]. However, these are often either not very specific or too narrowly focused on single treatments or specialties. This study sought to establish patient-specific recommendations for the management of PPSS from a multidisciplinary perspective, combining evidence from clinical studies and practice experience of an international expert panel.

Materials and methods

The appropriateness of treatments for persisting pain after spine surgery (PPSS) was assessed using the RAND/UCLA Appropriateness Method (RUAM) [28]. This modified Delphi method has been used to establish appropriateness criteria for surgical, medical, and diagnostic procedures in various fields of medicine [29]. It aims at integrating scientific knowledge and clinical insights of experts to produce detailed statements “regarding the appropriateness of performing a procedure at the level of patient-specific symptoms, medical history, and test results” [29]. The method is particularly helpful when evidence from clinical studies is insufficient to cover the heterogeneity of patients seen in daily clinical practice [29]. The RUAM consists of a structured and iterative process of individual (independent and anonymous) rating rounds and plenary discussion meetings. Results of various methodological and outcome studies support the reliability, internal consistency, and clinical validity of the RUAM [30,31,32,33,34,35].

Panel composition

Panel composition was based on an equal representation of the 3 specialties that are most involved in the treatment of patients with PPSS: neurosurgery, pain medicine, and orthopaedic surgery. Individual panellists were selected on the basis of their scientific and clinical expertise, as well as their involvement in guideline development in the field of PPSS. Furthermore, a reasonable geographic spread over Europe was pursued. The panel included 6 neurosurgeons, 6 anaesthetists/pain specialists, and 6 orthopaedic surgeons from 9 European countries (Belgium, France, Germany, Italy, Spain, The Netherlands, Sweden, Switzerland, and the UK).

Literature overview

A literature study was conducted to support shaping the starting points of the study, and to ensure that participants had access to the same body of evidence during the panel process. To avoid interpretation bias, materials were provided as an overview, rather than as a review.

Panel process

The flow of the panel study is depicted in Fig. 1. During the first panel meeting (Amsterdam, October 2016), the panel discussed the patient population to be considered, treatments to be included, and factors that may be relevant to treatment choice.

Fig. 1 Flow diagram of the panel study

The panel agreed to restrict the study to patients with:

  (a) Persisting, recurrent, or new pain after previous spinal surgery for degenerative disease

  (b) Symptom duration of ≥ 6 months after disc or decompression surgery, and ≥ 12 months after spinal fusion

  (c) At least moderate symptoms (cf. VAS ≥ 4) with at least moderate impact on daily functioning (based on, e.g. the Oswestry Disability Index or Roland Morris Disability Questionnaire) and quality of life (based on, e.g. the SF36)

  (d) Absence of “red flags” (e.g. infectious or malignant lesions, bowel or bladder paralysis)

  (e) Age ≥ 18 years

  (f) Absence of absolute contraindications for active treatment (e.g. unfit for surgery, pregnancy, spine infection, and coagulation disorder)

  (g) Absence of severe psychological disease and/or distress

The panel acknowledged the importance of psychological aspects of chronic pain, but wished to focus on somatosensory aspects. For that reason, the psychological dimension of PPSS and its impact on treatment choice was not considered in this study.

Selected treatment options for the first round were conservative treatment, spinal cord stimulation, re-operation, and “specialised diagnostic evaluation” (including facet denervation, root blocks, and spinal endoscopy). The panel identified a number of variables that may be relevant to treatment choice: previous spinal surgery (decompression/fusion), onset of pain (remaining/recurrent), location of pain (leg/back/mixed), type of pain (neuropathic/nociceptive/mixed), concordance of signs and symptoms with anatomic abnormalities (yes/no/uncertain), and age (< 50, 50–70, > 70 years).

The panel extensively discussed the definitions of neuropathic and nociceptive pain to be used. It was agreed that the taxonomy of the International Association for the Study of Pain (IASP) is a good concept [36], but that its application in PPSS is sometimes problematic, and may confuse the appropriateness ratings. For that reason, it was decided to use the term neuropathic (like) pain in the ratings, thereby including a broader group of patients with neuropathic signs and symptoms than would fit within the IASP definition. This more closely reflects how patients with PPSS present their complaints in practice. For reasons of brevity, the term neuropathic pain will be used throughout this manuscript.

By permutation of the clinical variables, a set of 324 mutually exclusive scenarios (unique patient profiles) was constructed. Using an electronic rating program, panellists individually assessed the appropriateness of the 4 treatment options for all 324 scenarios using a 9-point scale (reference values: 1 = inappropriate, 5 = uncertain, 9 = appropriate). According to the RUAM definition, a treatment was considered appropriate if the expected benefits outweigh the expected negative consequences by a sufficient margin [29]. Panellists were instructed to take the clinical perspective as a starting point, and to disregard cost and reimbursement of treatments. The rating results were discussed during the second panel meeting (Paris, April 2017). Panellists received feedback on their own ratings in comparison with the anonymous results of their colleagues. The panel discussion led to a number of adaptations to the clinical variables, treatment options, and definitions. The final set included 1 variable with 2 categories, 3 variables with 3 categories, and 1 variable with 5 categories (Table 1), summing up to (2^1 × 3^3 × 5^1 =) 270 scenarios. After exclusion of 60 scenarios that the panel considered to be unrealistic or falling beyond the scope of this study (for example nociceptive leg pain), 210 scenarios remained for which the appropriateness of treatments was assessed during the second round. Final recommendations were established and approved during a web conference (October 2017). An overview of clinical variables and treatment options used in the second round is provided in Table 1.
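The scenario counts can be reproduced with a short sketch. The first-round variables and categories are those named in the text; the dictionary keys and the helper `build_scenarios` are illustrative names only, not part of the study software. For the second round, only the number of categories per variable is stated here, so the sketch checks the arithmetic rather than enumerating the profiles.

```python
from itertools import product

# First-round clinical variables and their categories, as named in the text.
# The dictionary keys are illustrative labels, not the study's own identifiers.
FIRST_ROUND = {
    "previous_surgery": ["decompression", "fusion"],
    "onset_of_pain": ["remaining", "recurrent"],
    "location_of_pain": ["leg", "back", "mixed"],
    "type_of_pain": ["neuropathic", "nociceptive", "mixed"],
    "concordance_with_anatomy": ["yes", "no", "uncertain"],
    "age": ["<50", "50-70", ">70"],
}


def build_scenarios(variables):
    """Return every combination of categories as one dict per patient profile."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


first_round = build_scenarios(FIRST_ROUND)
print(len(first_round))  # 2 * 2 * 3 * 3 * 3 * 3 = 324

# Second round: 1 variable with 2 categories, 3 with 3, and 1 with 5.
second_round_total = 2**1 * 3**3 * 5**1
remaining_scenarios = second_round_total - 60  # 60 scenarios were excluded
print(second_round_total, remaining_scenarios)  # 270 210
```

Because each profile is one element of a Cartesian product, the scenarios are mutually exclusive by construction.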

Table 1 Overview of clinical variables and treatment options used for the construction of patient scenarios in the second rating round

Classification of appropriateness and statistical analysis

Similar to most RUAM studies, appropriateness of treatments was classified using the median panel score and extent of agreement between panellists [29]. The outcome was considered appropriate if the median score was between 7 and 9, and inappropriate if the median was between 1 and 3, without disagreement between panellists. Disagreement was defined as the situation in which at least one-third of the panellists scored in each of the sections 1–3 and 7–9 [29]. All other outcomes were deemed “uncertain”. Frequency tables and cross-tabulations were used to describe the appropriateness outcomes by treatment choice, clinical variables, and specialty. Multivariate logistic regression was used to analyse underlying patterns and to determine the internal consistency of the ratings. All statistical analyses were performed using IBM SPSS for Windows version 25.
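The classification rule described above can be written out as a minimal sketch. The function names and the example ratings are hypothetical, and "between 7 and 9" and "between 1 and 3" are interpreted as a median of at least 7 or at most 3, which is an assumption about how half-point medians are handled.

```python
from statistics import median


def has_disagreement(ratings):
    """True if at least one-third of panellists rated in 1-3 AND at least one-third in 7-9."""
    third = len(ratings) / 3
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    return low >= third and high >= third


def classify(ratings):
    """Classify one scenario/treatment pair from the individual 1-9 panel ratings."""
    if has_disagreement(ratings):
        return "uncertain"
    m = median(ratings)
    if m >= 7:
        return "appropriate"
    if m <= 3:
        return "inappropriate"
    return "uncertain"


# Hypothetical ratings from an 18-member panel (not study data).
print(classify([8] * 12 + [5] * 6))   # appropriate
print(classify([2] * 13 + [5] * 5))   # inappropriate
print(classify([2] * 6 + [8] * 12))   # uncertain: median is 8, but the panel disagrees
```

The third example shows why the disagreement check comes first: a high median alone is not enough when a third of the panel rates the treatment as inappropriate.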

Results

Agreement and appropriateness

For all indications together, disagreement after the second round was 8%. Dispersion of opinions was highest for minimally invasive treatment and re-operation in patients with predominant back pain (disagreement 68 and 88%, respectively). Appropriateness outcomes for the theoretical patient population are shown in Fig. 2.

Fig. 2 Appropriateness of treatment; second round results. Percentage of the 210 clinical scenarios for which the panel outcome was appropriate, inappropriate, or uncertain

Conservative treatment was considered appropriate for around two-thirds of cases. The choice for minimally invasive treatments showed the highest proportion of uncertainty (84%). Inappropriate indications were seen for neurostimulation (15%) and re-operation (37%). The “rating behaviour” of individual panel members showed large variations: the proportion of ratings in the sections 1–3, 4–6, and 7–9 ranged from 0 to 46%, 3 to 65%, and 15 to 65%, respectively. However, differences in appropriateness outcomes across specialties were modest (Fig. 3). The difference between the mean median scores was at most 1.1 points on the 9-point scale.

Fig. 3 Appropriateness of treatments by specialty; CON = conservative treatment, MIN = minimally invasive treatment, NEU = neurostimulation, ROP = re-operation

The appropriateness of treatments by clinical variables is shown in Table 2.

Table 2 Appropriateness of treatments by clinical variables

Evidence for anatomic abnormality appeared to be the most discriminative variable. Conservative treatment was considered appropriate for most patients with pronounced scar tissue or iatrogenic lesion and for those with no or inconclusive evidence for an anatomic abnormality. The same factors were obviously considered as a contraindication for re-operation. Spinal instability proved to be an important factor in favour of re-operation, and against minimally invasive treatment and neurostimulation. The presence of predominant leg pain was the strongest single factor in favour of neurostimulation, while the presence of predominant nociceptive (back) pain was the most important factor against this treatment option. The impact of the type of previous surgery was only significant for minimally invasive treatment: this treatment option was not considered appropriate for most patients with previous instrumented surgery and fusion. Onset of pain was most relevant for the choice of conservative treatment. The appropriateness of this option was substantially higher in patients with new pain, i.e. different from the type of pain for which the initial surgery took place.

Logistic regression analysis, with the outcome appropriate (yes/no) as the dependent variable and including all clinical variables as explanatory factors, confirmed the appropriateness patterns of the 4 treatment options, with predictive values for the statistical models varying between 83 and 94% at a cut-off value of 0.5. Appropriateness outcomes were highly specific. In 48% of the scenarios, only one of the treatment options was appropriate, and in another 39%, at most two options were appropriate. Principal appropriateness patterns are shown in Fig. 4. Detailed tables are provided in “Appendix 1”.
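As an illustration of how a predictive value at a cut-off of 0.5 can be read, the sketch below assumes that it refers to the overall percentage of scenarios correctly classified by the fitted model; this interpretation is an assumption, and the function name and the example numbers are hypothetical rather than study data.

```python
def percent_correct(predicted_probs, observed, cutoff=0.5):
    """Percentage of cases where thresholding the predicted probability matches the outcome.

    predicted_probs: model probabilities that the panel outcome is "appropriate".
    observed: 1 (appropriate) or 0 (not appropriate) for the same scenarios.
    Treating a probability exactly equal to the cut-off as "appropriate" is an assumption.
    """
    if not predicted_probs or len(predicted_probs) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    predicted = [p >= cutoff for p in predicted_probs]
    hits = sum(pred == bool(obs) for pred, obs in zip(predicted, observed))
    return 100.0 * hits / len(observed)


# Hypothetical probabilities and outcomes for four scenarios.
print(percent_correct([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))  # 75.0
```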

Fig. 4 Main appropriateness patterns. Summarised panel outcomes by key variables

Discussion

Due to the heterogeneity of the patient population and the variety of available treatment modalities, the management of PPSS requires an individualised approach, taking into consideration both physical and psychological aspects. This study focused on the somatosensory dimension of treatment choice. The RUAM proved to be helpful in identifying key variables from the perspective of daily clinical practice. Moreover, opinions on the appropriateness of treatments for 210 distinct patient profiles were remarkably in line across the specialties involved (Fig. 3). This differs from other RUAM studies that have shown a significant impact of panellist discipline on the ratings [30]. Furthermore, the appropriateness outcomes tended to be highly specific: in 48% of scenarios exactly one treatment was considered appropriate, and in 87% at most two. The results of regression analysis support the internal consistency of the panel data.

Comparison with guidelines and algorithms

Evidence for anatomic abnormality proved to be the most discriminative variable in relation to the appropriateness of treatments. The absence of an anatomic abnormality or discordance with symptoms clearly and logically rules out surgical intervention, including minimally invasive procedures. Similar results were seen for patients with pronounced scar tissue or an iatrogenic lesion, with the difference that minimally invasive treatments may sometimes be an option here (Fig. 3). Available guidelines and algorithms do not explicitly give recommendations on scar tissue and iatrogenic lesion [6, 23,24,25,26,27]. However, there is general agreement in the surgical community that an already partially or completely damaged nerve or nerve root causing chronic neuropathic pain cannot be restored with any kind of surgery, and symptomatic pain therapy in a multidisciplinary setting is required [37, 38]. For patients with spinal instability, re-operation is the most appropriate option. Conservative treatment in these patients may also be appropriate, particularly if the pain is different from the original pain for which previous spinal surgery took place (see “Appendix 1”). Re-operation was considered an appropriate option for most patients with a recurrent disc herniation or spinal/foraminal stenosis, and having predominant leg or mixed leg/back pain (Table 2, “Appendix 1”). Most currently available guidelines and algorithms either do not mention or are not very specific about the indications for re-operation [6, 23,24,25]. The algorithm by Van Buyten and Linderoth is close to our findings in this respect, as it considers re-operation indicated in patients with (recurrent) disc herniation and leg pain prevailing over back pain [27]. The algorithm suggested by Rigoard and Assaker is also largely in line with our recommendations, although their specification of the related type of symptoms is different (“mixed back pain” or “exclusive/predominant axial back pain”) [26].

According to the panel, neurostimulation is most appropriate in patients with predominant neuropathic leg or mixed leg/back pain, and without clear surgical indications such as spinal instability. This is in accordance with the inclusion criteria of some, but not all RCTs that have studied spinal cord stimulation for “failed back surgery syndrome” [39, 40], although different terms have been used, such as “radicular pain” or “neuropathic pain of radicular origin” [39, 40]. The recommendations are also largely in line with the guidelines and algorithms in which neurostimulation was mentioned [6, 23,24,25,26,27]. Neurostimulation was also considered appropriate in the case of scar tissue or an iatrogenic lesion (Table 2), mostly in patients with a neuropathic leg pain component, but also for patients with exclusively neuropathic or mixed back pain (Fig. 4). The efficacy of SCS for PPSS patients with predominant low back pain is currently under study [41]. Minimally invasive treatments showed some specific indications for patients with (mostly) back or mixed leg/back pain after non-instrumented previous spinal surgery (Fig. 4, “Appendix 1”). Some guidelines and algorithms advise selective root blocks, radiofrequency, and epidural injections as a first (diagnostic) step in specific situations (e.g. suspected facet joint syndrome) [6, 26]. The overall panel recommendations are summarised in Table 3.

Table 3 Summarised panel recommendations

Strengths and limitations

Although the use of detailed clinical scenarios and the involvement of different specialties may be considered strong points of this study, it remains a first step towards an individualised and multidisciplinary approach to managing PPSS.

The panel represented 3 different specialties, but many other disciplines are involved in the management of persisting pain after spine surgery, and all of them could have made a significant contribution to this project. However, the number of participants had to be limited to allow everyone to be involved in the group discussion. In addition, to be able to detect potential differences between specialties, a minimum number per specialty was needed. For these reasons, the panel was restricted to the primary decision-makers from both the spine and pain perspectives.

Although a variety of patient scenarios was used, the recommendations relate to a theoretical population and the applicability of the clinical factors and the distribution of scenarios need to be determined in daily practice. For reasons of practicability, a set of relatively simple variables without extensive definition was chosen. As was also noticed during the panel discussions, some of these may need further refinement. This is particularly true for the concept of neuropathic (versus nociceptive or mechanical) pain for which there is currently much debate on its precise nature [14, 42,43,44,45] and role in the diagnostic evaluation and treatment choice for patients with PPSS [12,13,14, 46, 47]. Further research may also consider the inclusion of additional somatic variables, potentially relevant to treatment choice, such as sagittal balance [15, 16].

The focus of this study was on somatosensory aspects, which is an important limitation. Psychological and social aspects are very important in the management of PPSS, but including this dimension would have substantially increased the complexity and extent of the study. This study should therefore be considered as a foundation on which further refinements can be made.

Conclusions

The lack of coherence in the management of PPSS and the large practice variations call for consensus development from a multidisciplinary perspective. Using the RUAM, an international panel of pain and spine specialists established a set of appropriateness criteria for conservative, minimally invasive, and surgical interventions. These could be a starting point to improve consistency of care, to further design more specific studies, and to reduce undesirable practice variations. The study outcomes, both the summarised recommendations and look-up tables, may be used for reflection on individual clinical decisions, and also for discussion in educational settings. However, validity and applicability of the panel recommendations in daily clinical practice need further study.