Background

Low back pain [LBP] is the leading cause of disability internationally according to the latest Global Burden of Disease study [1]. A key intervention for LBP with radiculopathy is lumbar discectomy surgery. The number of discectomies performed in community hospitals in the United States in 2012 was 184,000 and the cost of these procedures has doubled in the past 10 years to exceed 9 billion dollars in 2012 [2]. In the UK, the number of lumbar discectomies performed increased from 7043 in 2001–2002 to 8478 in 2013–2014 [3].

Systematic reviews support that lumbar discectomy is superior to prolonged non-surgical treatment for short-term pain relief and improvement in function for lumbar radiculopathy [4, 5]. In the most recent synthesis across trials [using a range of outcome measures], surgical success rates have been estimated as 46–75% of patients at 6–8 weeks, and 78–95% of patients at 1–2 years post-surgery [6]. This supports discectomy as an effective procedure for many patients presenting with radiculopathy, but also illustrates variability in outcome. Clinical data also suggest ongoing disability is an issue for some patients, with 30–70% of patients reported to experience residual pain [7]. Recent studies also suggest that recurrent lumbar disc herniation can occur, contributing to reoperation [14% from latest figures in the UK [8]], and can often lead to worse outcomes for patients [9, 10].

It is therefore important to determine prognostic factors predicting patient outcome following lumbar discectomy. Knowledge of prognostic factors would inform selection of patients for surgery and selection of patients for rehabilitation following surgery. Prognosis is a developing field of research [11], and findings can contribute to the clinical decision making and evaluation of new methods of patient management [12]. Although an increasing number of primary studies investigating prognostic factors for patient outcome following lumbar discectomy have been published, there are only 3 systematic reviews to date that have synthesised and reviewed the existing evidence.

The first systematic review by den Boer [13] investigated potential biopsychosocial factors across 11 prospective studies. They found that lower level of education, lower work satisfaction, longer duration of sick leave, higher severity of pre-operative pain, higher level of passive avoidance coping strategies, and higher level of psychological problems were associated with poor outcome for patients following lumbar discectomy. Outcome was defined as pain, disability or work capacity or their combination. However, risk of bias was not assessed for the included studies, and heterogeneity of outcome measures and candidate predictors limited both analyses and confidence in the review’s findings, although a basic rating system of the level of evidence was used. In the second systematic review, Sabnis and Diwan [14] investigated the timing of lumbar discectomy across 21 prospective and retrospective studies, and randomised controlled trials. They found that long duration of pre-operative leg pain was associated with poor outcome for patients. However, patient outcome was not clearly defined and risk of bias was not assessed for the included studies [an unsupported scoring system was used to assess aspects of quality], which limits confidence in the review’s findings, although an early best-evidence rating system was used. In the third systematic review, Schoenfeld and Bono [15] investigated the timing of lumbar discectomy surgery across 11 prospective and retrospective studies. They found that a longer duration of pre-operative symptoms was associated with poor patient outcome and identified 6 months duration of symptoms as the critical point when outcome started to be compromised i.e. symptom duration ≥6 months was associated with poor outcome for patients. 
A range of outcome measures were employed across studies [Short Form Health Survey [SF36], Oswestry Disability Index [ODI], motor weakness, delayed recovery, Visual Analogue Scale [VAS] pain, Japanese Orthopaedic Association Back Pain Questionnaire [JOABPEQ], psychological disorders, degree of return to activities of daily living, pain/disability score [PDS], failed back surgery syndrome, clinical outcome score, good postoperative outcome score, pain and working capacity]. Along with no assessment of risk of bias for the included studies [an unsupported scoring system was used to assess aspects of quality], the heterogeneity of outcome measures and candidate predictors limited analyses and limits confidence in the findings, although an early best-evidence rating system was used.

There is no PRISMA-compliant systematic review of prospective cohort studies with long-term follow-up that synthesises the data on, in particular, the physical factors that may be associated with patient outcome following lumbar discectomy and that are commonly used as indications for surgery [5]. In addition, although early best-evidence rating systems have been used in previous reviews, none has focused on the key issues for this type of review. For example, differences in phase of investigation are very relevant to this field of research to ensure a solid theoretical/conceptual model underpinning studies. Identification of physical prognostic factors, which are utilised in clinicians' decision making [16,17,18], could help inform clinicians which patients are likely to have a more or less favourable outcome. This would allow clinicians to manage their patients' expectations prior to surgery and help their patients make an informed choice about surgery and alternative management strategies.

Objective

To determine whether pre-operative physical factors are associated with post-operative outcomes in adult patients [≥16 years old] undergoing lumbar discectomy or microdiscectomy.

Methods

This review was guided by a pre-defined and registered protocol [CRD42015024168], and followed method guidelines of the Back Review Group of the Cochrane Collaboration [19], Cochrane Handbook [20] and PRISMA-P [21]. This systematic review is reported in line with the PRISMA statement [22].

Eligibility criteria

Types of studies

Prospective observational studies with a minimum of 1 year follow up. No restriction was placed on publication date.

Participants

Patients [≥16 years old] undergoing first time lumbar discectomy for lumbar disc herniation for irradiating leg pain without a rapid progressive severe motor deficit, cauda equina syndrome or severe comorbid conditions [e.g. arthritis or metabolic bone disease], and with no previous history of other lumbar spine operations.

Interventions

Primary, single-level, standard lumbar open discectomy or microdiscectomy.

Physical prognostic factors

Pre-operative physical prognostic factors including low back and/or leg pain intensity, duration of low back and/or leg pain, lumbar spine range of motion, disability, quality of life, clinical signs of motor deficit, sensory deficit, straight leg raise [SLR] test, crossed SLR test, walking distance.

Outcomes

Outcomes recommended in the evaluation of treatment of spinal disorders [23] were included; specifically disability, physical function, pain intensity and health related quality of life [24, 25].

Exclusion criteria were applied (Table 1).

Table 1 Criteria for inclusion and exclusion of studies

Information sources

A comprehensive search was performed from inception to 31st March 2017 using key databases:

  • CINAHL, EMBASE, MEDLINE, PEDro and ZETOC.

  • Hand searches of key journals [Spine, European Spine Journal, The Spine Journal].

  • PubMed

  • Screening reference lists by hand in papers that match the eligibility criteria.

  • Unpublished research: British National Bibliography for Report Literature, Dissertation Abstracts, Index to Scientific and Technical Proceedings, National Technical Information Service, System for Information on Grey Literature.

Search

Searches were not restricted to specific languages. The search strategy was developed by one author [KZ] in discussion with a specialist librarian. It was performed independently by two authors [KZ/AP]. A methodological filter for the identification of prognostic studies, which has the greatest sensitivity in Medline [26], was adapted for this study and used in combination with a variety of MeSH terms and text words. The concepts that were searched included lumbar disc population, with leg pain and/or low back pain presenting symptoms, lumbar discectomy intervention, and studies investigating prognosis as the methodological focus. The Medline OvidSP search is presented in Table 2 as an example.

Table 2 Example of Medline OvidSP Search Strategy

Study selection

After removing duplicates, screening of the titles and abstracts according to the eligibility criteria (Table 1) was performed independently by 2 authors [KZ/AP] to reduce the risk of excluding relevant studies [27]. Full text articles were obtained for the studies that satisfied the inclusion criteria or in any case where eligibility could not be ascertained from the title or abstract. Full text articles were independently screened by 2 authors [KZ/AP]. Discrepancies about inclusion of articles were resolved by discussion, and the third author [AR] was available to resolve any disagreement.

Data collection process

Data were extracted from the studies into standardised forms independently by 2 authors [KZ/AP]. The third author [AR] checked the collected data of the included studies. Investigators were contacted by email to request additional information for missing or unclearly reported data in included studies.

Data items

Data were extracted from each study, including: study population, duration of follow up, prognostic factors, outcome measures and key findings.

Risk of Bias in individual studies

The Quality In Prognostic Studies [QUIPS] tool was used to assess the risk of bias for each individual study. The QUIPS tool was devised for prognostic factor review questions [28] and has demonstrated acceptable inter-rater reliability [median 83.5%] [29]. It consists of 6 domains of potential bias: study participation, study attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting [29]. Each risk of bias domain was rated independently as ‘low’, ‘moderate’ or ‘high’ according to the responses to prompting items, with all domains weighted equally. Overall classification of risk of bias for individual studies was defined as low risk of bias when all domains were rated as low-moderate risk of bias; and high risk of bias when ≥1 domain was rated as high risk of bias [30]. Risk of bias was rated by two authors [KZ/AP] independently. Discrepancies were resolved by discussion and the third author [AR] was available to resolve any disagreement. Inter-rater agreement was planned to be measured with Cohen’s kappa coefficient [31].
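The overall classification rule and the planned agreement statistic can be illustrated with a minimal Python sketch. This is not the authors' software: the function names `overall_risk_of_bias` and `cohen_kappa` and the dictionary layout are hypothetical, and the kappa shown is the standard unweighted form, which is assumed here because the text does not specify a weighting.

```python
from collections import Counter

# The six QUIPS domains as listed in the text.
QUIPS_DOMAINS = (
    "study participation",
    "study attrition",
    "prognostic factor measurement",
    "outcome measurement",
    "study confounding",
    "statistical analysis and reporting",
)
VALID_RATINGS = ("low", "moderate", "high")


def overall_risk_of_bias(ratings):
    """Return 'high' if any domain is rated high, otherwise 'low'.

    `ratings` maps each QUIPS domain to 'low', 'moderate' or 'high'
    (hypothetical data layout, chosen for illustration).
    """
    missing = [d for d in QUIPS_DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"unrated domains: {missing}")
    values = [ratings[d] for d in QUIPS_DOMAINS]
    invalid = [v for v in values if v not in VALID_RATINGS]
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")
    return "high" if "high" in values else "low"


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        # Both raters used one identical category throughout;
        # kappa is undefined, so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1 - expected)
```

Under this rule a single high-rated domain is enough to classify a study as high risk of bias, regardless of how the other five domains are rated.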

Planned method of analysis

According to the protocol and dependent on homogeneity between the included studies, a quantitative analysis was planned. In the situation where a meta-analysis was not justified [owing to high risk of bias and clinical heterogeneity], a qualitative best evidence synthesis of the results was conducted. This synthesis was based on the risk of bias assessment of the included studies, prognostic factors and the strength of the association with the outcome. Consistency of results across studies was reported to contribute to the overall evidence for an individual candidate prognostic factor. Results of multivariable analyses, including odds ratios and 95% Confidence Intervals for dichotomous outcome measures, β and 95% Confidence Intervals for continuous quantitative outcome measures, and p values, were reported where possible. The Grading of Recommendations Assessment, Development and Evaluation [GRADE] method [32] was used to rate the overall quality of evidence for a prognostic factor per outcome [e.g. disability], across studies. The GRADE method criteria have been adapted for prognostic factor research [33]. Huguet et al. [33] modified the GRADE domains to include 6 factors that may decrease [phase of investigation, study limitations, inconsistency, indirectness, imprecision and publication bias] and 2 factors that may increase [moderate or large effect size [standardized mean difference 0.5–0.8, or odds ratios 2.5–4.25] [33] and exposure-response gradient] the quality level of evidence. In contrast to GRADE as used for assessing intervention studies, study design is not a key feature, as longitudinal designs are the only option for prognostic research. Phase of investigation is a distinctive GRADE domain for prognostic research with phase 3 explanatory studies [aiming to understand prognostic pathways] and phase 2 explanatory studies [aiming to confirm independent associations between potential prognostic factor and the outcome measure] providing the highest quality of evidence [33].
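The logic of the adapted GRADE rating can be sketched in Python as follows. This is an illustrative simplification and not the procedure of Huguet et al. [33] in full: it assumes that phase 2 and 3 studies start at a high level and phase 1 studies at a moderate level (the starting point this review reports for its phase 1 studies), that each applicable factor moves the rating by exactly one level, and that the rating is bounded at very low and high. The names `starting_level` and `grade_rating` are hypothetical.

```python
LEVELS = ("very low", "low", "moderate", "high")

# Factors named in the text; phase of investigation is handled
# through the starting level rather than as a separate downgrade.
DOWNGRADE_FACTORS = {
    "study limitations",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
}
UPGRADE_FACTORS = {
    "moderate or large effect size",
    "exposure-response gradient",
}


def starting_level(phase):
    """Assumed starting level: high for phase 2/3, moderate for phase 1."""
    if phase in (2, 3):
        return "high"
    if phase == 1:
        return "moderate"
    raise ValueError("phase must be 1, 2 or 3")


def grade_rating(phase, downgrades=(), upgrades=()):
    """Apply one-level moves per factor (an assumption) and clamp the result."""
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    index = LEVELS.index(starting_level(phase))
    index += len(set(upgrades)) - len(set(downgrades))
    return LEVELS[max(0, min(index, len(LEVELS) - 1))]
```

For example, under these assumptions a phase 1 body of evidence downgraded only for inconsistency would be rated low, and one downgraded for both study limitations and inconsistency would be rated very low.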

Risk of Bias across studies

Visual assessment of potential publication bias with funnel plots was planned if > 10 studies with comparable outcome measures were identified.

Results

Study selection

The initial search resulted in 6567 citations. After exclusion of duplicates, 1189 citations were screened by title and abstract. The full texts of 45 studies were retrieved and assessed for eligibility. Eight studies met the eligibility criteria. Figure 1 shows the number of studies at each stage of selection and the main reasons for exclusion. Studies excluded at the full text stage are detailed in Additional file 1: Table S1. Three non-English studies were excluded at the full text stage. Complete agreement was achieved at each stage of the study selection process following the independent assessments of the 2 authors. Of the 8 included studies, 2 acknowledged that they presented data from the same sample, with the later paper by Lewis et al. reporting data at all timepoints [34, 35]. Two further studies appeared to present data from the same sample, with the later 2011 article focusing on health-related quality of life outcome measures [36, 37]. A request for clarification from the authors did not receive a response. In both cases, data are presented as the same study to ensure appropriate weighting of the evidence in the narrative synthesis. Overall, therefore, 6 studies were included, reflecting 8 articles.
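The counts reported above imply the number of records removed at each stage. The short Python sketch below derives those differences; it assumes that the only removals between consecutive stages are the ones described in the text, and the stage labels and the function name `removed_between_stages` are illustrative rather than taken from Figure 1.

```python
# Stage counts as reported in the text; labels are illustrative.
STAGES = [
    ("citations identified", 6567),
    ("screened after duplicate removal", 1189),
    ("full texts assessed", 45),
    ("articles included", 8),
]


def removed_between_stages(stages):
    """Return (transition label, number removed) for consecutive stages."""
    return [
        (f"{name_a} -> {name_b}", count_a - count_b)
        for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:])
    ]


for label, removed in removed_between_stages(STAGES):
    print(f"{label}: {removed} removed")
```

On these figures, 5378 records were removed as duplicates, 1144 were excluded at title and abstract screening, and 37 were excluded at the full text stage.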

Fig. 1 Study selection flow diagram

Study characteristics

The main characteristics of the 6 included studies are presented in Table 3.

Table 3 Study characteristics

Methods

The studies were conducted in four different countries and published between 1979 and 2011. Five studies were published; 1 was unpublished [38] but had been presented at a conference, and data were acquired through personal communication with the authors. The follow-up period in included studies ranged from 1 to 10 years.

Participants

The total number of participants included across the 6 studies was n = 802 and sample sizes ranged from 82 to 228. Age ranged from 17 to 83 years. After communication with the authors of 3 studies that did not report the age range of the participants [34, 36, 39], it was confirmed that all participants were ≥ 16 years old to enable study inclusion.

Physical prognostic factors

The most common physical prognostic factor that was investigated in 5 studies [34,35,36,37, 39,40,41] was pre-operative duration of leg pain, followed by intensity of pre-operative leg pain investigated in 3 studies [36,37,38, 41], and pre-operative back pain investigated in 2 studies [36, 37, 41].

Outcome measures

The range of outcome measures included: VAS for pain, ODI, EuroQol-5 Dimension [EQ-5D] score, SF-36, Neurogenic Symptom Score [NSS] and PDS for quality of life, Core Outcome Measures Index [COMI], Clinical Overall Score [COS], MacNab classification of postoperative outcome, satisfaction with treatment and change in leg/back pain.

Risk of Bias within studies

Of the 6 included studies, 1 was assessed as low risk of bias and 5 as high risk of bias (Table 4). Complete agreement in the assessment of risk of bias in all domains was achieved between the 2 authors. The domain ‘study attrition’ was rated as high risk of bias in 5 of the studies and only the domain ‘outcome’ was rated as low risk of bias in all studies. Most studies did not account for all of the important potential confounders in their study design and the risk of selection bias was also high due to incomplete reporting.

Table 4 Methodological Assessment according to six domains of potential biases [QUIPS] [27]

Results per physical prognostic factor

Eight different physical prognostic factors were investigated (Table 5). Due to heterogeneity between the included studies [predictors, follow-up timepoints, outcome measures] a meta-analysis was not justified, and a qualitative best evidence synthesis of the results was performed. In particular, there was great diversity in the patient outcomes assessed in the included studies. Using the adapted GRADE method for prognostic research [33] to rate the overall quality of evidence, all included studies were phase 1 predictive modelling or explanatory studies carried out to generate a hypothesis, and consequently the quality of evidence was moderate as a starting point (Table 6) [33]. The level of evidence was downgraded in particular for inconsistency, and only upgraded for effect size for 2 prognostic factors.

Table 5 Overview of Significant Physical Prognostic Factors: synthesis across included studies [bivariate and multivariable analyses when reported are documented here for consistency - reporting was inconsistent across studies]
Table 6 Adapted Grading of Recommendations Assessment, Development and Evaluation [GRADE] [31] table for systematic reviews with meta-analysis of prognostic studies for positive outcome across a range of measures

ODI

The ODI was included in 2 studies [36, 41] as a candidate prognostic factor. There were inconsistencies regarding the association between the ODI and several outcomes. One study [low risk of bias] found no association [leg pain or back pain at 2 or 7 years], while 1 study [high risk of bias] found that higher disability was associated with better patient outcome [ODI at 12 months]. Using GRADE, there is very low level evidence that ODI is not associated with patient outcome.

Duration back pain

Duration of back pain was included in 2 studies [39, 41] as a candidate prognostic factor. Consistent findings from the 2 studies [high risk of bias] found no association with patient outcome [Clinical Overall Score and ODI at 12 months]. Using GRADE, there is very low level evidence that duration of back pain is not associated with patient outcome.

Duration leg pain

Pre-operative duration of leg pain was included in 5 studies [34,35,36,37,38,39, 41] as a candidate prognostic factor. There were inconsistencies regarding the association between the duration of pre-operative leg pain and numerous outcomes. Three studies [1 low risk of bias and 2 high risk of bias] found no association [leg pain, back pain and ODI at 12 months; leg pain and health-related quality of life at 2 and 7 years] while 2 studies [both high risk of bias] found that shorter pain duration was associated with better patient outcome [Pain Disability Score and Clinical Overall Score at 12 months]. Using GRADE, there is low level evidence that duration of pre-operative leg pain is not associated with patient outcome.

Severity leg pain

Severity of leg pain was included in 3 studies [36,37,38, 41] as a candidate prognostic factor. There were inconsistencies regarding the association between the severity of leg pain and several outcomes. Two studies [1 low risk of bias and 1 high risk of bias] found no association [health related quality of life at 2 and 7 years; ODI at 12 months], while 2 studies [1 low risk of bias and 1 high risk of bias] found that higher severity of leg pain was associated with better patient outcome [leg pain at 2 and 7 years; Core Outcome Measures Index at 12 months]. Using GRADE, there is low level evidence that higher severity of pre-operative leg pain predicts better Core Outcome Measures Index at 12 months and better post-operative leg pain at 2 and 7 years.

Severity back pain

Severity of back pain was included in 2 studies [36, 37, 41] as a candidate prognostic factor. Consistent findings from the 2 studies [1 low risk of bias and 1 high risk of bias] found no association with patient outcome [ODI at 12 months; back pain and health-related quality of life at 2 and 7 years]. Using GRADE, there is very low level evidence that severity of back pain is not associated with patient outcome.

Health-related quality of life

Health-related quality of life [EQ5D] was included in 1 study [37] as a candidate prognostic factor. The study [low risk of bias] found that low quality of life pre-operatively was associated with better patient outcome [health-related quality of life at 2 years]. Using GRADE, there is very low level evidence that a lower pre-operative EQ-5D predicts better EQ-5D at 2 years.

Straight leg raise and forward bend

Ipsilateral straight leg raise and forward bend were included in 1 study [34] as candidate prognostic factors. The study [high risk of bias] found that ipsilateral straight leg raise and forward bend were not associated with patient outcome [back pain or leg pain at 5–10 years]. Using GRADE, there is very low level evidence that straight leg raise and forward bend are not associated with patient outcome.

Discussion

This is the first systematic review of physical prognostic factors to evaluate their association with patient outcome following lumbar discectomy. Only 6 studies were included, and risk of bias in the included studies was disappointing with only 1 study at low risk of bias. As a consequence, our current understanding of physical prognostic factors is limited.

Based on the strength of association of the prognostic factors investigated and the overall quality of evidence, the findings indicate that pre-operative severity of leg pain [low level of evidence] and quality of life [very low level of evidence] are associated with patient outcome. Specifically, higher severity of pre-operative leg pain predicts better Core Outcome Measures Index at 12 months and better leg pain at 2 and 7 years; and lower pre-operative EQ-5D predicts better EQ-5D at 2 years. The findings are consistent with den Boer’s previous review, which found that higher severity of pre-operative pain was associated with patient outcome [13]. Greater confidence in low risk of bias studies in situations of inconsistency between study findings contributed to severity of leg pain being identified overall as associated with patient outcome, and this may be a limitation of this review. Interestingly, apart from the Core Outcome Measures Index, for both significant factors the prognostic factor and outcome were the same measure. Patients may therefore have been more likely to report improvement because they were starting from a higher level of pain or a lower level of quality of life.

Other potential predictors examined were pre-operative ODI, duration leg pain, duration back pain, severity back pain, ipsilateral SLR and forward bend, and very low quality of evidence found that they were not associated with patient outcome, except for duration of leg pain where the quality of evidence was low. Consistent findings identified that pre-operative duration of back pain and severity of back pain were not associated with patient outcome [clinical overall score and ODI at 12 months; back pain or EQ-5D at 2 or 7 years and ODI at 12 months respectively]. Findings from 1 study [34, 35] identified that pre-operative ipsilateral SLR and forward bend were not associated with patient outcome, although it is difficult to have any confidence in these findings as they were based on bivariate analyses only [Table 5]. Inconsistent findings identified that pre-operative ODI [1 low risk of bias, 1 high risk of bias study] was not associated with patient outcome [leg pain or back pain at 2 and 7 years; ODI at 12 months]. None of these factors had been examined in previous reviews. Inconsistent findings identified that duration leg pain was not associated with patient outcome [Pain Disability Score, ODI, leg pain, back pain and Clinical Overall Score at 12 months; EQ-5D at 2 and 7 years]. This was in contrast to previous reviews that identified pre-operative duration of leg pain as associated with patient outcome [14, 15]. It is however difficult to have confidence in the findings from previous reviews as they themselves were at risk of bias.

In comparison with other systematic reviews [13,14,15], this review included only prospective cohort studies which are the gold standard design for investigating prognostic factors to enable optimal measurement of outcomes and predictors [42]. Our findings illustrate that the current level of evidence is low/very low. An adequately powered low risk of bias prospective observational study that assesses patient outcome at 12 months following surgery is required to further investigate pre-operative severity of leg pain, EQ-5D and duration of leg pain; and those candidate prognostic factors with inconsistent and very low level evidence to date, specifically ODI, duration back pain, severity back pain, ipsilateral SLR and forward bend. Other physical factors worthy of investigation and examined in studies excluded from this review, include pre-operative motor deficit, sensory loss and walking capacity.

Strengths and limitations

This is the first low risk of bias systematic review [self-assessed using AMSTAR 2 [43]] that has synthesised the evidence for physical prognostic factors predicting patient outcome following lumbar discectomy surgery. However, the review is limited by risk of bias across the small number of available studies, and a lack of comparable outcome measures across studies. This lack of comparable outcome measures meant that the definition of outcome taken into the GRADE analysis was broad, encompassing a range of domains and outcome measures. The exclusion of 3 non-English studies could be a major limitation of this review as key findings may have been missed, particularly as only 6 studies were included. Discussion of this review’s findings is limited by the scarce literature in this area and the quality of reporting of individual study results, which was inconsistent and poor overall.

Conclusions

Results from this systematic review identified low level evidence that higher severity of pre-operative leg pain predicts better Core Outcome Measures Index at 12 months and better post-operative leg pain at 2 and 7 years. There is very low level evidence that a lower pre-operative EQ-5D predicts better EQ-5D at 2 years. Low level evidence supports duration of leg pain pre-operatively not being associated with outcome, and very low-quality evidence supports other factors [pre-operative ODI, duration back pain, severity back pain, ipsilateral SLR and forward bend] not being associated with outcome [range of outcome measures used]. Research to date is however poor, consisting mostly of high risk of bias studies with inadequate reporting of analyses, not enabling full understanding of the prognostic value of physical factors assessed prior to surgery.

An adequately powered low risk of bias prospective observational study, with clear reporting of multivariable analyses, is required to investigate all potential physical factors. Knowledge of the physical prognostic factors is essential to inform clinical decision-making processes regarding selection of patients for surgery and potentially the targeting of patients for rehabilitation following surgery. The results of prospective observational studies can help clinicians to decide which people should receive surgery or rehabilitation. However, a limitation is that a difference in prognosis does not necessarily mean a causal link with the surgery. Therefore, once the prognostic factors are understood, they need to be investigated in a randomised controlled trial as predictors of treatment response.