Introduction

Low back pain (LBP) is a major cause of disability. It was ranked first in terms of years lived with disability (YLDs) and sixth in terms of overall burden measured in disability-adjusted life years (DALYs) [1]. Pharmaceutical and non-pharmaceutical therapies are used extensively to tackle this problem, and guidelines suggest a variety of medicines and practices, such as the short-term use of nonsteroidal anti-inflammatory drugs (NSAIDs) and weak opioids in patients with non-specific or acute LBP [2,3,4,5]. Although antidepressants (ADs) are not recommended as first-line medication for LBP, they are widely used [2, 6,7,8,9]. Evidence on the effect of antidepressants is conflicting: some studies have shown a beneficial role in pain reduction, while others have argued against their use because of the high risk of adverse effects, such as dry mouth, dizziness, nausea, headache, and constipation, and the lack of clear evidence of efficacy [10,11,12,13]. In addition, systematic reviews (SRs) and meta-analyses (MAs) summarizing the available evidence have produced heterogeneous results, which makes it difficult to reach a decision on the efficacy of ADs [14,15,16,17,18].

An SR is a type of literature review that critically evaluates research studies. It can summarize the results of a large number of studies, helping researchers and clinicians keep up with new findings. An MA is a statistical approach for combining the quantitative results of the studies identified by an SR on a specific subject. SRs and MAs provide a reference source that aids experts in decision making. Despite their rapid growth and profound influence in health science, discrepancies between the results of reviews on the same subject have made them less reliable for decision making. One reason is the methodological quality of the reviews [19,20,21]. Evaluating the reliability and methodological quality of these reviews is therefore of great importance. Several technical and methodological approaches exist to strengthen SRs and MAs so that they reach valid results [22,23,24,25,26]. For this purpose, the Assessment of Multiple Systematic Reviews (AMSTAR) scale provides an appraisal tool for measuring the methodological quality of SRs [27, 28]. The purpose of this study was to assess the methodological quality of SRs and MAs on the role of ADs in treating LBP using the updated version of AMSTAR.

Materials and methods

Data sources and study selection

We searched for all SRs and MAs published up to November 2018 in the PubMed, EMBASE, Medline, and Cochrane Library databases. Our search strategy followed the recommendations of the Cochrane Back Review Group [22,23,24]. Combinations of the following keywords were used in the search: “low back pain” AND “chronic low back pain” AND “non-specific low back pain” AND “sciatica” AND “leg pain” AND “antidepressant” AND (“TCA” OR “tricyclic antidepressants”) AND (“SSRI” OR “selective serotonin reuptake inhibitors”) AND (“SNRI” OR “serotonin and norepinephrine reuptake inhibitors”) AND (“TeCA” OR “tetracyclic antidepressants”) AND “meta-analysis” AND “systematic review”. Text words and MeSH terms were adapted to the characteristics of each database. The reference lists of retrieved articles were also screened for additional relevant studies.

Inclusion and exclusion criteria

We included SRs and MAs of the treatment effects of ADs on LBP that were published in English. We included all types of low back pain, such as chronic low back pain (CLBP), non-specific low back pain (NLBP), chronic non-specific low back pain (CNLBP) and sciatica, regardless of the cause of pain (cancer, fracture, inflammatory disease, etc.). There was no restriction on the type of AD, clinical setting, or study population. Non-systematic, qualitative, and narrative reviews were excluded.

Study selection and data extraction

Titles and abstracts of the retrieved studies were screened for inclusion by two independent reviewers (RBY and MHP). The full texts of potentially eligible reviews were then retrieved and evaluated by RBY and MHP to determine whether they met the inclusion criteria. Any disagreements were resolved by consensus through discussion with a third reviewer (FRT). For each study, the following information was extracted: authors, year of publication, study design, type of study and intervention, characteristics of the study population, outcome measurement, and a summary of the results obtained. The PRISMA flow diagram [29] was used to guide the process of inclusion and exclusion of studies.

Assessment of methodological quality of included studies

Quality assessment was performed independently by two authors (RBY and MHP). Any discrepancies were resolved by discussion, and a blinded third reviewer was consulted if necessary. We used the updated Assessment of Multiple Systematic Reviews (AMSTAR 2) appraisal tool to evaluate the methodological quality of eligible SRs and MAs [28]. It has some advantages over the previous version, such as covering SRs that include non-randomized studies and using a different rating scheme, which helps reduce the bias of traditional quality scores obtained by summing item scores into an overall score [30]. AMSTAR 2 contains 16 items, including four domains that are new in this version of AMSTAR. Two of these were adopted directly from the ROBINS-I tool, namely, elaboration of the PICO and the way in which risk of bias was handled during evidence synthesis. Another is the discussion of possible causes and significance of heterogeneity. The last new domain is the justification of the selection of study designs, which deals with non-randomized designs. The domain-specific questions in AMSTAR 2 are framed so that a “Yes” answer denotes a positive result. The “Not Applicable” and “Cannot Answer” options of the original AMSTAR instrument were removed, and “Partial Yes” responses are provided where it is worthwhile to identify partial adherence to the standard. Moreover, the AMSTAR tool has good agreement, reliability, construct validity, and feasibility for assessing the quality of systematic reviews [31].
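As an illustration of how item-level responses can be turned into an overall rating without summing scores, the following minimal Python sketch applies the rating scheme proposed in the AMSTAR 2 guidance (critical items 2, 4, 7, 9, 11, 13 and 15; ratings high, moderate, low and critically low). It is a sketch under stated assumptions, not necessarily the exact procedure used in this study: the function name `rate_review` is hypothetical, only a “No” is counted as a flaw or weakness, and “Partial Yes” is treated as acceptable.

```python
# Critical items as proposed in the AMSTAR 2 guidance; appraisers may adapt this set,
# so treating it as fixed is an assumption of this sketch.
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}
VALID_RESPONSES = {"Yes", "Partial Yes", "No"}


def rate_review(responses):
    """Return an overall confidence rating from a dict {item_number: response}."""
    if set(responses) != set(range(1, 17)):
        raise ValueError("expected one response for each of items 1..16")
    if not set(responses.values()) <= VALID_RESPONSES:
        raise ValueError("responses must be 'Yes', 'Partial Yes' or 'No'")

    # Assumption: only a "No" counts against the review.
    failed = {item for item, answer in responses.items() if answer == "No"}
    critical_flaws = len(failed & CRITICAL_ITEMS)
    non_critical_weaknesses = len(failed - CRITICAL_ITEMS)

    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if non_critical_weaknesses > 1:
        return "moderate"
    return "high"


# Hypothetical review: no protocol statement (item 2) and one non-critical "No".
example = {item: "Yes" for item in range(1, 17)}
example[2] = "No"
example[10] = "No"
print(rate_review(example))  # low
```

Because the rating depends on which items fail rather than on how many, a review with a single critical flaw is rated low even if every other item is answered “Yes”.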

Data analysis

Characteristics of the studies are reported in Table 1. In addition, Tables 2 and 3 show the AMSTAR 2 response (“Yes”, “Partial Yes”, “No”) for each domain of each included study. The secular trend in the number and quality of included reviews is also illustrated.

Table 1 Characteristics of included systematic reviews and meta-analyses
Table 2 Methodological quality of systematic reviews or meta-analyses using AMSTAR 2
Table 3 Methodological quality of the included meta-analyses and systematic reviews

Results

Study identification

Through the initial search of electronic databases and other resources, we identified 3700 potentially relevant articles. After screening the titles and abstracts and removing duplicates, 3646 articles were excluded. The full texts of the remaining 54 articles were read in their entirety. Twenty-five articles were eligible for inclusion; 29 narrative reviews were excluded from the assessment. All included studies were SRs and MAs on the role of ADs in LBP. The PRISMA flowchart summarizes the selection process (Fig. 1).
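The counts reported above can be checked with simple arithmetic. The short Python sketch below recomputes each stage of the flow from the figures given in the text; the function name `prisma_flow` and its parameter names are hypothetical and serve only as an illustration.

```python
def prisma_flow(identified, excluded_at_screening, excluded_at_full_text):
    """Return (full_text_assessed, included) and reject impossible counts."""
    full_text_assessed = identified - excluded_at_screening
    included = full_text_assessed - excluded_at_full_text
    if full_text_assessed < 0 or included < 0:
        raise ValueError("exclusions exceed the records available at that stage")
    return full_text_assessed, included


# Figures as reported in the text: 3700 identified, 3646 excluded at screening,
# 29 excluded after full-text reading.
full_text_assessed, included = prisma_flow(3700, 3646, 29)
print(full_text_assessed, included)  # 54 25
```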

Fig. 1 PRISMA flow diagram of the review search and identification

Characteristics of included SRs

Characteristics of the 25 SRs and MAs are presented in Table 1. The reviews were published between 1992 and 2017. The number of intervention studies on ADs included in the MAs ranged from 4 to 10. The included studies were conducted in relatively homogeneous patient populations with chronic low back pain (CLBP), non-specific low back pain (NLBP), chronic non-specific low back pain (CNLBP) or sciatica. Multiple AD drug categories at different dosages were considered as interventions. Six of the 25 included studies had no specific subgroups of drug intervention; the others covered selective serotonin reuptake inhibitors (SSRIs), serotonin and norepinephrine reuptake inhibitors (SNRIs), tricyclic antidepressants (TCAs), and tetracyclic antidepressants (TeCAs). Regarding study design, most studies included in the MAs or SRs were randomized controlled trials. In addition, the result of the AMSTAR 2 quality assessment is reported for each study.

Assessment of methodological quality of included SRs

The assessments of methodological quality are given in Tables 2 and 3. Of the 25 included studies, 11, 9 and 5 were classified as high [2, 14, 16, 17, 33, 34, 37, 41, 44, 45, 48], moderate [12, 35, 36, 42, 43, 47, 49] and low [32, 38,39,40, 46] quality, respectively.

Table 3 shows the results of the methodological quality assessment for each item. The reviews scored highest on items 1 (“Did the research questions and inclusion criteria for the review include the components of PICO (population, intervention, control group and outcome)?”), 3 (“Did the review authors explain their selection of the study designs for inclusion in the review?”), 8 (“Did the review authors describe the included studies in adequate detail?”), 10 (“Did the review authors report on the sources of funding for the studies included in the review?”), 11 (“If meta-analysis (MA) was justified did the review authors use appropriate methods for statistical combination of results?”) and 16 (“Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review?”). They lost points on items 2 (“Did the report of the review contain an explicit statement that the review methods were established prior to conduct of the review and did the report justify any significant deviations from the protocol?”) and 15 (“If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review?”). On items 9, 12, 13 and 14, which relate to risk of bias (RoB) and heterogeneity, the reviews obtained average scores. Thirteen (52%) of the reviews used a satisfactory technique for assessing RoB; the Cochrane Collaboration’s tool was the most commonly applied. Five (50%) of the MAs assessed the potential impact of RoB in individual studies on the results of the meta-analysis or other evidence synthesis.
Fourteen (56%) of the reviews accounted for RoB in individual studies when interpreting their results, and 16 (64%) provided a satisfactory explanation for, and discussion of, any heterogeneity observed in the results. Only 2 (20%) of the 10 meta-analyses carried out an adequate investigation of publication bias (small study bias) and discussed its likely impact on the results of the review. Trend analysis showed that the number of high-quality publications on this topic has increased since 2016 (Fig. 2).
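The proportions above use two different denominators: the 25 included reviews for items that apply to every review, and the 10 meta-analyses for items that apply only to quantitative syntheses. The following Python sketch recomputes each percentage from the counts reported in the text; the labels and the helper name `percent` are hypothetical and introduced only for illustration.

```python
def percent(count, total):
    """Return count/total as a whole-number percentage."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("count must lie between 0 and a positive total")
    return round(100 * count / total)


# (label, count, denominator, percentage as reported in the text)
REPORTED = [
    ("satisfactory RoB technique", 13, 25, 52),
    ("RoB impact assessed in the MA", 5, 10, 50),
    ("RoB considered when interpreting results", 14, 25, 56),
    ("heterogeneity explained and discussed", 16, 25, 64),
    ("publication bias investigated", 2, 10, 20),
]

# Collect any item whose recomputed percentage differs from the reported one.
mismatches = [label for label, n, total, pct in REPORTED if percent(n, total) != pct]
print(mismatches)  # []
```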

Fig. 2 The secular trend of the number and quality of included reviews

Discussion

Methodological quality assessment

To the best of our knowledge, this is the first study to examine specifically the quality of SRs and MAs on the effectiveness of ADs for LBP using AMSTAR 2. In our study, 11 (44%), 9 (36%) and 5 (20%) studies were classified as high, moderate, and low quality, respectively. The former version of AMSTAR assigns equal weight to each item and produces an overall score, which can give a biased estimate of quality. To overcome this issue, AMSTAR 2 was designed so that it does not produce an overall score. A high score may disguise critical weaknesses in specific domains, such as an inadequate literature search or a failure to assess the risk of bias (RoB). Assessment of RoB is a critical part of the appraisal of any systematic review. It was conducted in 13 (52%) of the reviews, most of which applied the Cochrane Collaboration’s tool. Eight (32%) of the reviews assessed quality instead of RoB; we report them as well to distinguish them from reviews that did neither. In contrast to the original AMSTAR, which focused on the quality assessment of included studies (item 7), AMSTAR 2 considers RoB in three items (items 9, 12 and 13). A study may have the highest possible quality and yet have an important risk of bias. For example, in many situations it is impractical or impossible to blind participants or study personnel to the intervention group. The Newcastle Ottawa Scale, SIGN, and the Mixed Methods Appraisal Tool, as well as the Cochrane instrument and ROBINS-I, are the most comprehensive instruments for assessing RoB. It is important that the impact of RoB be considered in the results of MAs; authors should assess it by meta-regression analysis or by a sensitivity analysis in which pooled effect sizes are re-estimated after excluding studies at high RoB. Sixteen (64%) of the included reviews provided a satisfactory explanation for any heterogeneity observed in the results. Heterogeneity in SRs and MAs refers to the variation in outcomes between studies.
Considering potential sources of heterogeneity, which can be related to the domains of bias or to the PICO description (population, intervention, control group and outcome), is essential. Assessing heterogeneity with the Chi-squared test or the I-squared index, choosing an appropriate model (fixed-effect or random-effects), and applying methods such as meta-regression and sensitivity analysis help to detect sources of heterogeneity and strengthen the results. In addition, 2 (20%) of the included MAs carried out an adequate investigation of publication bias and discussed its likely impact on the results. Methods for exploring publication bias in MAs, such as funnel plots and the Egger and Begg tests, have been described elsewhere [50,51,52,53]. The secular trend showed that since 2007, when AMSTAR was introduced, more reviews of moderate to high quality have been published, and since 2016 most of them have been of high quality. This suggests that authors have become more aware of the items that can improve the quality of their research and consequently provide more precise and reliable results.
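To make the heterogeneity statistics mentioned above concrete, the following Python sketch computes Cochran's Q (the Chi-squared heterogeneity statistic), the I-squared index, and fixed-effect and random-effects pooled estimates. It assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which is one common choice and not necessarily the method used by the reviews assessed here. The function name `pool` and the effect sizes are hypothetical toy values, not data from any included review.

```python
import math


def pool(effects, variances):
    """Inverse-variance pooling with Cochran's Q, I-squared and DerSimonian-Laird tau^2."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies, each with a variance")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]  # fixed-effect weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, in percent.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)

    re_weights = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    random_effect = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return {
        "fixed": fixed,
        "random": random_effect,
        "se_random": math.sqrt(1.0 / sum(re_weights)),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }


# Toy standardized mean differences with equal variances (illustrative only).
result = pool([-0.1, -0.5, 0.2, -0.8], [0.04, 0.04, 0.04, 0.04])
print(round(result["Q"], 1), round(result["I2"], 1))  # 14.5 79.3
```

Q is compared with a Chi-squared distribution on `df` degrees of freedom. A sensitivity analysis of the kind described above can be sketched by calling `pool` again on the subset of studies that are not at high RoB and comparing the two pooled estimates.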

Summary of ADs effect on LBP

Most SRs and MAs in this area found no clear evidence of the effectiveness of ADs for LBP [2, 16, 34, 41, 54,55,56], while others reported contradictory results [18, 35, 36, 57, 58]. Some showed that TCAs had a greater analgesic effect than other types of ADs [15, 17, 32, 40, 42, 59,60,61,62], although contradictory findings exist as well [63]. In addition, some reviews reported a lack of sufficient data to draw a conclusion [33, 55, 64]. Significant side effects of ADs were also observed [2, 12, 15, 18].

Strengths and limitations

The present study is the first to comprehensively assess the methodological quality of SRs on the effect of ADs on LBP. We used the updated version of the AMSTAR appraisal tool (AMSTAR 2), which has some merits over the older version. This evaluation can help experts rely on high-quality studies when faced with conflicting literature. A limitation of our study is that it included only reviews published in English, so publication bias may have been introduced.

Conclusion

Although the publication of high-quality papers on the effect of ADs on LBP has increased recently, more high-quality SRs and MAs in this field, with precise subgroups for the type of pain, the class of drug and the dosage, may provide clearer and more reliable evidence to help clinicians and policymakers.