Background

Symptomatic lumbar disc herniation (SLDH) can be treated non-surgically or surgically. Non-surgical treatment was shown to be effective for SLDH long ago [1], although surgery results in more rapid and effective short-term alleviation of symptoms than non-surgical treatment [2, 3]. However, the long-term effects of the two have not been consistently reported [2, 3, 4], and there is a risk of complications with surgery [5]. Thus, in many cases, the choice between surgical and non-surgical treatment for SLDH is not clear-cut [6].

Since the first case of regression after the non-surgical treatment of SLDH was reported in 1984 [7], the phenomenon of SLDH regression has been widely reported [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46], with the incidence of regression (IR) varying from study to study. Reports on the correlation between the regression of SLDH and clinical outcomes have been contradictory: an early study observed a connection between morphological changes in SLDH and clinical outcomes [41], while later studies found that the regression of SLDH does not correspond with the resolution of symptoms [9, 47]. However, we cannot ignore the physical decompression that occurs during regression in the acute context of SLDH, and the probable regression of SLDH still needs to be considered in clinical practice, according to the guidelines of the North American Spine Society [48]. Understanding the IR of SLDH is clearly of clinical importance. However, scant generalized data regarding the IR are currently available to serve as a reference. When making clinical decisions regarding SLDH, practitioners and patients have little high-level evidence regarding IR to which they can refer.

We therefore performed a systematic review and meta-analysis to provide a comprehensive examination of the IR of SLDH in patients who were treated non-surgically.

Methods

This systematic review and meta-analysis is reported in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [49]. We did not publish a prior protocol for this systematic review and meta-analysis.

Search strategy

For this systematic review and meta-analysis, we searched PubMed, Embase, the Cochrane Central Register of Controlled Trials, and the Web of Science (from inception to September 16, 2019). Search terms included those related to intervertebral disc herniation, regression, comparison, outcome, follow-up, image, and their variants. To avoid missing articles without information about the language in the database records, there was no language limitation in the literature search. A sample search strategy can be found in an additional file. We included studies identified from the references of included articles and other review articles on the topic. Two reviewers performed the searches. Disagreements were resolved by discussion with a third reviewer.

Eligibility and exclusion criteria

Relevant articles pertaining to the phenomenon of the regression of SLDH after non-surgical treatment and potential studies that may have reported morphological changes in lumbar disc herniation (LDH) among the follow-up results for non-surgically treated SLDH patients were included, with the publication language restricted to English. Randomized controlled trials (RCTs) and nonrandomized studies were eligible for inclusion. The following studies were excluded: 1. Studies that only reported the follow-up results of surgery, including percutaneous endoscopic transforaminal discectomy, microendoscopic discectomy, microdiscectomy, fenestration discectomy, open discectomy, lumbar laminectomy, lumbar interbody fusion and radiofrequency ablation; 2. Studies on cervical discs; 3. Studies that did not report the morphological changes in SLDH; 4. Studies that did not report the number of patients exhibiting regression; 5. Studies on only intradiscal injections, including oxygen-ozone therapy, plasma injection and collagenase chemonucleolysis; 6. Studies on asymptomatic LDH; 7. Studies with fewer than 10 patients at follow-up; 8. Animal studies; 9. Reviews; and 10. Studies that did not report the specific non-surgical treatment used.

Quality assessment

The quality of the nonrandomized studies was assessed based on the Methodological Index for Nonrandomized Studies (MINORS) [50]. There is no consensus on when the regression of LDH occurs; thus, item six of the MINORS (follow-up period appropriate to the aim of the study) was not applicable, and the highest total score was 14 (high quality: 10–14; moderate quality: 5–9; low quality: 0–4). The risk of bias of RCTs was evaluated using a tool from the Cochrane Collaboration [51]. Given the nature of RCTs of the non-surgical treatment of SLDH, performance bias was generally not a particular concern and had a minor impact on the study quality. Thus, we considered all the included RCTs to have a low risk of performance bias. RCTs were categorized as having a high, low, or unclear risk according to the following criteria: high risk, any item presented a high risk; low risk, no more than 2 items presented an unclear risk; and unclear risk, more than 2 items presented an unclear risk. Two reviewers independently assessed the quality of the included studies and extracted the data. Disagreements were resolved by consensus with a third reviewer.
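The grading rules described above can be written out as a short sketch. The function names and the string labels are illustrative assumptions, not part of the MINORS or Cochrane tools; only the score bands and the counting rule come from the text.

```python
from typing import Sequence


def minors_quality(score: int) -> str:
    """Grade a nonrandomized study from its MINORS total (item six omitted, maximum 14)."""
    if not 0 <= score <= 14:
        raise ValueError("MINORS total must lie between 0 and 14")
    if score >= 10:
        return "high"
    if score >= 5:
        return "moderate"
    return "low"


def rct_overall_risk(judgements: Sequence[str]) -> str:
    """Combine per-item judgements ('low', 'unclear', 'high') into one overall rating."""
    # Any single high-risk item makes the whole trial high risk.
    if any(j == "high" for j in judgements):
        return "high"
    # Otherwise the rating depends on how many items are unclear.
    unclear = sum(1 for j in judgements if j == "unclear")
    return "low" if unclear <= 2 else "unclear"


# Hypothetical inputs for illustration only.
print(minors_quality(11))
print(rct_overall_risk(["low", "unclear", "low", "unclear", "low"]))
```

Because all included RCTs were treated as having a low risk of performance bias, that item would simply be passed in as "low" under this sketch.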

Data extraction and analysis

Relevant data were extracted using a standardized form that included the publication year, country, study type, study quality or risk of bias, LDH level, regression measurement, imaging method, patient count, total number of SLDH patients at follow-up and number of patients with SLDH regression, as well as age, symptom duration, nerve symptoms, whether regression was defined and follow-up duration. The primary outcome was the IR of SLDH after non-surgical treatment. The IR was estimated based on the total number of SLDH patients at follow-up and the number of patients that experienced regression. For studies that recorded the number of patients according to the regressed proportion or size interval but did not define the interval of non-regression or the number of patients without regression, we regarded the lowest interval as the no-regression range, and the number of patients outside of this interval was considered the number of patients with regression. For studies in which more than two imaging examinations were performed, we used the authors’ final count, and if no final count was provided, the latest imaging examinations with enough information were compared to the baseline examinations. For studies reporting the same cohort or trial, only the latest study was included. For studies with overlapping data, we selected the study with the highest number of patients at the last follow-up. Herniations that developed after baseline were not counted. For RCTs, we calculated the total number of occurrences in the two groups.
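As a minimal sketch of the interval rule described above, the lowest reported interval is treated as the no-regression range and every patient outside it is counted as a regression. The function name and the example counts are hypothetical and are not taken from any included study.

```python
def incidence_from_intervals(counts_by_interval):
    """Estimate the IR from patient counts grouped by regression interval.

    counts_by_interval is ordered from the lowest to the highest interval
    of regressed proportion or size. Returns (regressed, total, incidence).
    """
    counts = list(counts_by_interval)
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one patient is required")
    # The lowest interval is regarded as the no-regression range.
    regressed = total - counts[0]
    return regressed, total, regressed / total


# Hypothetical study: 4 patients in the lowest interval, 6 and 10 in higher ones.
print(incidence_from_intervals([4, 6, 10]))
```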

The I² statistic was used to evaluate the heterogeneity of the pooled data, and the DerSimonian and Laird random effects model was used to pool the IRs with corresponding 95% confidence intervals (CIs). Incidences from studies with zero events were handled by adding 0.5 cases to both the numerator (number of patients with regression) and the denominator (total number of SLDH patients), consistent with recommended practices [52]. Subgroup analysis was performed by stratifying the studies according to the time period, region, study type, LDH level, regression measurement, imaging method and method used to determine the patient count. Potential sources of heterogeneity were explored by meta-regression, with a p value of less than 0.1 considered significant. Sensitivity analysis was performed by including only high-quality non-randomized studies and low-risk RCTs and by sequentially excluding each study. Publication bias was assessed using Egger’s test and visualized with a funnel plot. All statistical analyses were performed using the meta and metafor packages in R (V3.6.1) [53].
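The pooling step can be illustrated with a minimal sketch of the DerSimonian and Laird estimator and the I² statistic. This is not the analysis code used for the review, which was run in R. The sketch assumes untransformed proportions with a binomial variance, whereas the text does not state which transformation was applied. The study counts are hypothetical.

```python
import math


def corrected_proportion(events, total):
    """Return (proportion, variance) for one study on the raw-proportion scale."""
    if events == 0:
        # Zero-event rule as described in the text: add 0.5 to both counts.
        events, total = events + 0.5, total + 0.5
    if events >= total:
        # Assumption of this sketch: all-event studies are not handled,
        # because their raw-proportion variance would be zero.
        raise ValueError("variance is zero when every patient regresses")
    p = events / total
    return p, p * (1.0 - p) / total


def dersimonian_laird(estimates, variances):
    """Pool estimates with the DerSimonian and Laird random effects model.

    Returns (pooled, (ci_low, ci_high), tau2, i2_percent).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2


# Hypothetical (events, total) pairs, not data from the included studies.
studies = [(0, 12), (18, 30), (40, 50), (9, 25)]
ys, vs = zip(*(corrected_proportion(x, n) for x, n in studies))
pooled, ci, tau2, i2 = dersimonian_laird(ys, vs)
print(f"pooled IR = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, I2 = {i2:.1f}%")
```

On the raw scale, the confidence limits can fall outside the interval from 0 to 1, which is one reason meta-analyses of proportions often pool on a transformed scale.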

Results

Study selection and characteristics

Our initial search yielded 13,672 articles, and two were hand-selected from reference lists. A total of 38 articles were included in the final meta-analysis (Fig. 1). The non-surgical treatments used in these studies included bed rest, lumbar support, traction, spinal manipulation, physical therapy, exercise, oral steroids, analgesics, nonsteroidal anti-inflammatory agents, epidural block, caudal epidural injections, traditional Chinese medicine and alternative medicine. These articles included 5 RCTs and 33 nonrandomized studies. The studies were from Asia, Europe and North America and came from a total of 13 countries. Japan contributed seven studies; Korea and the USA each contributed five; Turkey contributed four; China, the UK and Italy each contributed three; France and Finland each contributed two; and Denmark, Germany, the Netherlands and Sweden each contributed one. The imaging examinations used were magnetic resonance imaging (MRI) in 29 studies and computed tomography (CT) in eight studies, and one study used CT at baseline and MRI at follow-up. The characteristics of the included studies are summarized in Table 1. These studies reported patient age (14–78 years), symptom duration (1 day to 10 years) and follow-up time (20 days to 6.1 years) in different formats (Table 2). A total of 16 studies did not report symptom duration, five studies did not report whether nerve symptoms were experienced by all patients or by a subset of patients, and five studies did not describe or define regression (Table 2).

Fig. 1 Study selection

Table 1 Characteristics of the included studies
Table 2 Other characteristics of the included studies

Quality assessment

Of the 5 included RCTs, 4 showed a low risk of bias, and 1 showed an unclear risk of bias. Of the 33 included non-randomized studies, 22 were of high quality, and 11 were of moderate quality (Table 1).

Incidence synthesis and data analysis

The pooled analysis of the IR after the non-surgical treatment of SLDH included 2219 patients, 1425 of whom presented regression. The pooled IR in our study was 63% (95% CI 49–77%), with significant heterogeneity among the studies (I² = 97.7%, p < 0.001; Fig. 2).

Fig. 2 Overall IR after the non-surgical treatment of SLDH. Weights are from the random effects analysis. Grey squares represent the proportional weight of each study in the meta-analysis. Incidences and CIs for studies with zero events were calculated after adding 0.5 cases to both the numerator (number of patients with regression) and the denominator (total number of SLDH patients).

Subgroup analyses (Table 3) showed that studies that measured the regression of SLDH quantitatively yielded a significantly higher (p = 0.02) pooled IR (81%, 95% CI 69–91%) than those that used qualitative methods (54%, 95% CI 37–70%). Subgroup analysis by time period did not identify any secular trend in the IR of non-surgically treated patients: before 2000 (65%, 95% CI 55–75%), from 2000 to 2009 (57%, 95% CI 29–83%), and from 2010 to 2019 (66%, 95% CI 35–91%). We found no significant regional variation among Asia (63%, 95% CI 40–83%), Europe (65%, 95% CI 43–85%), and North America (60%, 95% CI 22–92%). The pooled IR increased progressively from RCTs (37%, 95% CI 0–88%) to prospective studies (67%, 95% CI 57–77%) to retrospective studies (84%, 95% CI 65–97%). Studies of single-level SLDH patients (78%, 95% CI 67–87%) yielded a higher pooled IR than those that included both single- and multiple-level SLDH patients (51%, 95% CI 17–86%). Studies based on MRI yielded the same pooled IR (63%, 95% CI 46–79%) as those based on CT (63%, 95% CI 45–79%); the IR was 82% in Saal’s study [42], in which CT was used at baseline and MRI at follow-up. Studies that reported the number of patients without regression yielded a lower pooled IR (61%, 95% CI 44–77%) than those that did not define regression or did not report the number of patients without regression (72%, 95% CI 59–83%).

Table 3 Subgroup analyses of the regression measurement, time period, region, study type, LDH level, imaging method and patient count

Meta-regression showed that study type (R² = 41.94%, p = 0.02), LDH level (R² = 31.53%, p = 0.05), and regression measurement (R² = 41.94%, p = 0.02) contributed to the heterogeneity. Neither the pooled IR (69%, 95% CI 54–82%) nor the heterogeneity (I² = 97.2%, p < 0.001) changed significantly when only high-quality non-randomized studies and low-risk RCTs were included (Fig. 3). The pooled IR varied from 62 to 66% after the sequential omission of any single study.
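The sequential-omission check can be sketched as a leave-one-out loop. The pooling function is pluggable. The default shown here is a plain inverse-variance mean used only as a stand-in, not the random effects model used in the review, and the inputs are hypothetical.

```python
def inverse_variance_mean(estimates, variances):
    """Stand-in pooling function: inverse-variance weighted mean."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, estimates)) / sum(weights)


def leave_one_out(estimates, variances, pool=inverse_variance_mean):
    """Re-pool the data once per study, each time omitting that study."""
    ys, vs = list(estimates), list(variances)
    if len(ys) < 2 or len(ys) != len(vs):
        raise ValueError("need at least two studies with matching variances")
    results = []
    for i in range(len(ys)):
        kept_y = ys[:i] + ys[i + 1:]
        kept_v = vs[:i] + vs[i + 1:]
        results.append(pool(kept_y, kept_v))
    return results


# Hypothetical study-level IRs and variances.
pooled_without_each = leave_one_out([0.2, 0.5, 0.8], [0.01, 0.01, 0.01])
print(min(pooled_without_each), max(pooled_without_each))
```

The spread between the smallest and largest re-pooled value indicates how strongly any single study drives the overall estimate.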

Fig. 3 Forest plot for the meta-analysis of high-quality nonrandomized studies and low-risk RCTs

Publication bias

Egger’s test suggested that there was no publication bias (p = 0.46). No asymmetric patterns were seen in the funnel plot (Fig. 4).
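For readers unfamiliar with Egger’s test, the following minimal sketch shows the regression behind it: each study’s standardized estimate (estimate divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel-plot asymmetry. The function name and inputs are hypothetical. The p value, which comes from a t distribution with k − 2 degrees of freedom, is left to a statistics library.

```python
import math


def egger_intercept(estimates, standard_errors):
    """Return (intercept, t_statistic, degrees_of_freedom) for Egger's regression."""
    k = len(estimates)
    if k < 3 or k != len(standard_errors):
        raise ValueError("need at least three studies with matching standard errors")
    x = [1.0 / se for se in standard_errors]                    # precision
    z = [y / se for y, se in zip(estimates, standard_errors)]   # standardized estimate
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance of the ordinary least squares fit.
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    df = k - 2
    se_intercept = math.sqrt(sse / df * (1.0 / k + x_bar ** 2 / sxx))
    # A perfect fit gives a zero standard error; report NaN in that case.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_stat, df


# Hypothetical estimates and standard errors.
print(egger_intercept([0.3, 0.2, 0.5, 0.0], [0.1, 0.2, 0.25, 0.5]))
```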

Fig. 4 Funnel plot of incidence

Discussion

We found an IR of 63% after the non-surgical treatment of SLDH in the present systematic review and meta-analysis, with significant heterogeneity among the studies. Our pooled IR needs to be interpreted with caution.

We comprehensively searched for studies that investigated the regression of SLDH or that potentially reported morphological changes in SLDH during clinical follow-up. Because follow-up is a necessary step in studying regression, “follow up”, “follow-up”, “outcome” and “result” were included in the search terms, which led to the retrieval of a large number of articles. Moreover, the non-surgical treatments for SLDH are so numerous that the literature search could not be limited to specific non-surgical treatment methods. In addition, studies that compared the results of surgical and non-surgical treatment may have reported the morphological changes in the herniated discs of non-surgically treated patients, so studies on surgery could not be excluded at the search stage. As a result, 13,672 articles were identified, more than half of which were studies on surgery for SLDH, and only a small number of articles were ultimately included. Both RCTs and non-randomized studies were included in our study. The pooled IR in our study was similar to the IR of 66.66% that was reported in a previous review of 11 studies [54], and these IR values can be considered quantitative data that can inform clinical decisions regarding SLDH.

The highest IR (96%) was documented by Lee with an average follow-up of 341 days [46], suggesting that the probability of SLDH regression deserves serious consideration. Three studies reported no regression with follow-ups of 45 days [10], 20 days [11], and a median of 5 days (3–7 days) [26], suggesting that SLDH regression should not be expected to occur within one and a half months of symptom onset. The average of the IRs reported in the included studies was 63%, which is the same as the pooled IR of the meta-analysis, and 7 studies reported IRs of approximately 63%: Ahn [21] reported an IR of 69% with an average follow-up time of 8.5 months, Delauche-Cavallier [23] reported an IR of 67% with an average follow-up time of 12.5 months, Bozzao [25] reported an IR of 63% with an average follow-up time of 11 months, Matsubara [34] reported an IR of 62% with an average follow-up time of 9.7 months, Komori [41] reported an IR of 64% with an average follow-up time of 262 days, Bush [43] reported an IR of 64% with an average follow-up time of 1 year, and Iwabuchi [44] reported an IR of 62% with an average follow-up time of 4.1 months. Because Iwabuchi’s study reported an IR consistent with the average of the IRs of the included studies at a follow-up time of 4.1 months [44], we suggest that 4 months after onset is an important time point for imaging. The follow-up time of the other 6 studies with IRs of approximately 63% ranged from 8.5 to 12.9 months, with an average of 10.5 months. Therefore, we suggest that 10.5 months after onset is another important time point for imaging.

There were 4 studies that reported long-term follow-up, with an average duration of at least 24 months: Fagerlund [36] reported an IR of 73% with a follow-up of 24 months, Yukawa [35] reported an IR of 57% with an average follow-up of 30 months, Shin [40] reported an IR of 58% with a follow-up of 3 years, and Ilkko [22] reported an IR of 83% with an average follow-up of 5.2 years. The trend in IR over time reported by these articles was inconsistent: in some, the IR increased over time to above the average IR, and in others, it decreased over time to below the average IR; no secular trend was identified for long-term follow-up.

We did not classify SLDH by herniation type during the data synthesis, as most of the studies included in our meta-analysis did not report such classifications; this is in contrast to another review, based on 9 articles, that calculated IRs of 96, 70, 41 and 13% for disc sequestration, extrusion, protrusion and bulging, respectively [55]. These type-specific IRs provide a more detailed reference. The probability of SLDH regression should be considered in clinical practice according to the guidelines of the North American Spine Society [48], and we provided an extensive summary of estimated IRs as evidence. Together with existing evidence, our research shows that the regression of SLDH should be fully considered by clinical decision makers. For patients without absolute indications for surgery, the regression of SLDH can be considered very likely, and surgery may be avoided for most patients. As some SLDH patients who were treated non-surgically did not experience regression, the effective prediction of SLDH regression should be explored in the future.

Our study revealed that the study types, LDH levels and regression measurements contributed to the heterogeneity. The increase in the risk of selection bias across the three study types (RCTs, prospective studies and retrospective studies [56, 57]) was consistent with the increase in the pooled IR across the three types of studies, explaining the heterogeneity observed among different study types. The pooled IR of studies that included only single-level SLDH patients was higher than that of those including both single- and multiple-level SLDH patients. The most likely mechanism underlying regression is an inflammatory response directed against the herniated disc tissues [58, 59]. Because there is usually more herniated disc tissue in single-level SLDH than in multiple-level SLDH, which induces a more robust inflammatory response, patients with single-level SLDH are more likely to experience regression than patients with multiple-level SLDH. We also found that studies that used quantitative measurements tended to report higher IRs for SLDH than studies that measured LDH qualitatively. The quantitative methods used in the included studies comprised 3D volume measurements and cross-sectional area measurements, while the qualitative methods were visual estimations. Quantitative measurements were recorded in millimetres or centimetres, and in some studies, they were accurate to one decimal place. In general, quantitative measurements are better for detecting small dimensional changes than visual assessments. In addition, intervertebral discs are three-dimensional, irregularly shaped tissues, making it difficult to capture small changes in their volume on planar images using visual estimation. Quantitative measurements made it easier to record slight changes in the size of the discs on sagittal and cross-sectional views, as well as changes in volume that are rarely detected by visual inspection because they consist of slight changes in multiple directions.

Both measurement methods have obvious defects that may cause inaccuracies. For quantitative measurements, it was impossible for the follow-up images to use the exact same slices that were initially scanned [60, 61]. For qualitative measurements, unclear borders and the three-dimensional characteristics of LDH made the judgement of regression inaccurate, especially for visual estimations. In the future, a more standardized and reliable method for determining the occurrence of SLDH regression needs to be established. Other factors, such as age, symptom duration, type of non-surgical treatment and follow-up time, may play a role in heterogeneity. These factors were documented in the included studies, but sufficient information was not available to determine whether they contributed to the heterogeneity.

Our study has limitations. We included studies in the meta-analysis without limiting the criteria or measurements for regression to ensure the robustness of IR synthesis, which inevitably led to the inclusion of sources of heterogeneity. Presently, there is no clear definition of the time frame for SLDH regression. The follow-up period of some of the included studies may not have been appropriate or long enough to observe the presence of SLDH regression, making the reported IR lower than the actual IR.

Conclusions

Our meta-analysis results supplement the guidelines of the North American Spine Society on the IR [48]. We found an overall IR of 63% among patients with SLDH who were treated non-surgically, thus providing clinical decision makers with quantitative evidence of IR. The probability of regression after the non-surgical treatment of SLDH should be fully considered before making decisions regarding surgery. Based on our systematic review, we suggest a follow-up timeline with imaging at 4 and 10.5 months after onset when deciding whether to perform surgery for SLDH. Surgery can be considered for patients with severe symptoms who do not experience regression by 4 months after onset, and we highly recommend surgery for those who do not experience regression by 10.5 months after onset.