Introduction

Spine surgery, a common therapeutic option for degenerative spinal diseases, has increased noticeably over the years [1]. This increase has been accompanied by a growing frequency of postoperative complications [2, 3]; more than 30% of patients with degenerative lumbar diseases have undergone revision surgery for complications [4]. Studies have shown that paraspinal muscle degeneration, a universal phenomenon among older people, is implicated in multiple degenerative lumbar pathologies [5,6,7]. Thus, atrophy and fat infiltration (FI) of the multifidus (MF), erector spinae (ES) and psoas major (PS), which serve as the primary extensor and flexor muscles, may be closely related to patients’ clinical outcomes.

The value of preoperative paraspinal muscle morphometry on imaging is increasingly recognized, and it has been assessed as a prognostic factor in several surgical disciplines [8,9,10]. Paraspinal muscle degeneration is characterized by a decreased cross-sectional area (muscle atrophy) and an increased fat content (fat infiltration). With manual definition of the region of interest and one of several threshold methods, the cross-sectional area and FI can be measured on magnetic resonance imaging (MRI). Although substantial work has been carried out to identify potential prognostic factors for spine surgery [11, 12], few unequivocal predictive factors related to the paraspinal muscles have come to light. Previous reviews have examined the degree of preoperative paraspinal muscle degeneration in relation to disability and persistent pain after surgery [13, 14], whereas no review has focused on the predictive value of the paraspinal muscles for complications.

Objectives

The objective of this review is to investigate the association between preoperative paraspinal muscle morphology on MRI and common complications after surgery, including bone nonunion, pedicle screw loosening, adjacent segment degeneration (ASD), proximal junctional kyphosis (PJK) and sagittal imbalance, in adults with degenerative lumbar spine diseases.

Methods

Literature search strategy

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement was used to structure this systematic review. To retrieve relevant articles, we searched three databases (PubMed, EMBASE and Web of Science) from inception through January 2021. All fields were searched for the following terms: “paraspinal muscle”, “paravertebral muscle”, “multifidus”, “erector spinae” or “psoas major”; and “surgery”, “operative”, “complication”, “clinical outcome” or “functional status”; and “lumbar” or “lumbosacral”.
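
As an illustration only, the three term groups above can be combined into a single Boolean query string, as in the minimal sketch below. The helper names and the generic AND/OR syntax are assumptions made for this sketch; the field syntax specific to each database is not reproduced here.

```python
# Term groups quoted from the search strategy. The helper functions and the
# generic Boolean syntax are illustrative assumptions, not the exact query
# submitted to any database.
MUSCLE_TERMS = ["paraspinal muscle", "paravertebral muscle", "multifidus",
                "erector spinae", "psoas major"]
OUTCOME_TERMS = ["surgery", "operative", "complication",
                 "clinical outcome", "functional status"]
REGION_TERMS = ["lumbar", "lumbosacral"]


def or_group(terms):
    """Quote each term and join the terms into one parenthesized OR group."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the OR groups with AND so that every group must match."""
    return " AND ".join(or_group(group) for group in groups)


QUERY = build_query(MUSCLE_TERMS, OUTCOME_TERMS, REGION_TERMS)
print(QUERY)
```

Keeping the term groups as separate lists makes it straightforward to adapt the same terms to the syntax of each database.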

Eligibility criteria

Two authors assessed all titles and abstracts against the review criteria. The inclusion criteria were as follows: (1) inclusion of adults with degenerative lumbar diseases; (2) preoperative assessment of any lumbar paraspinal muscle characteristic on magnetic resonance imaging (MRI); (3) assessment of any complication after lumbar surgery; (4) analysis of the relationship between preoperative imaging data and postoperative outcomes; and (5) publication in English. Studies were excluded if they included subjects < 18 years of age, assessed lumbar muscles through nonconventional MRI (such as functional MRI, MR spectroscopy and chemical-shift MRI), or included only postsurgical data. Eligible study designs were randomized controlled trials, case–control studies, case series and cohort studies.

Study selection

We screened titles and abstracts for relevant articles from the electronic search based on the eligibility criteria. Relevant full-text articles were obtained and then assessed in the same manner. The study selection process is illustrated in Fig. 1.

Fig. 1 PRISMA diagram showing the flow of studies through phases of the review

Assessment of risk of bias

Because there is no single, widely accepted tool for evaluating the risk of bias in prognostic studies, we used the modified Newcastle-Ottawa Scale (NOS) [15] for case–control and cohort studies and the Joanna Briggs Institute (JBI) Critical Appraisal Tools for case series [16]. All articles meeting the review criteria were evaluated independently for risk of bias by two authors, with any differences in assessment resolved by discussion until consensus was reached. For case–control and cohort studies, we regarded studies achieving six or more points as high quality. For case series, we regarded studies achieving eight or more points as high quality.
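
The quality thresholds above amount to a simple rule, sketched below for illustration. The thresholds (six for the NOS, eight for the JBI tool) come from the text; the function names and the design labels are hypothetical and chosen only for this sketch.

```python
# Thresholds stated in the text: NOS >= 6 and JBI >= 8 count as high quality.
# Function names and design labels are illustrative assumptions.
HIGH_QUALITY_THRESHOLD = {"NOS": 6, "JBI": 8}


def appraisal_tool(design):
    """Return the appraisal tool applied to a given study design."""
    if design in ("case-control", "cohort"):
        return "NOS"
    if design == "case series":
        return "JBI"
    raise ValueError(f"no appraisal tool defined for design: {design!r}")


def is_high_quality(design, score):
    """True if the score reaches the high-quality threshold for the design."""
    return score >= HIGH_QUALITY_THRESHOLD[appraisal_tool(design)]
```

Under this rule, a cohort study scoring six points is rated high quality, whereas a case series scoring seven points is not.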

Data extraction

Two authors independently extracted the following information from included studies: study design, participant characteristics, details of MRI assessments of preoperative lumbar muscle characteristics, and clinical outcomes that were relevant to our research question.

Clinical outcome variables

The clinical outcome measures were bone nonunion (measured using dynamic lumbar X-rays or computed tomography), pedicle screw loosening (measured using spine radiographs or computed tomography), adjacent segment degeneration (diagnosed on flexion and extension lateral radiographs or on MRI), proximal junctional kyphosis (measured using full spine radiographs) and sagittal imbalance (defined as deterioration of local or global alignment on full spine radiographs).

Levels of evidence

We performed a qualitative summary of the evidence for lumbar muscle characteristics as predictors of complications, using definitions for levels of evidence applied in previous systematic reviews [17, 18]: “strong” evidence was defined as consistent findings (≥ 80%) in at least two high-quality studies; “moderate” evidence was defined as one high-quality study and consistent findings (≥ 80%) in one or more low-quality studies; “limited” evidence was defined as findings in one high-quality study or consistent findings (≥ 80%) in one or more low-quality studies; “conflicting” evidence was defined as inconsistent findings irrespective of study quality.
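
The definitions above can be read as a decision rule. The following minimal sketch shows one possible reading; it assumes that "consistent" means at least 80% of the studies in a subgroup agree on whether an association exists. The `Study` class and the `evidence_level` function are hypothetical names introduced for illustration and are not part of the cited definitions.

```python
from dataclasses import dataclass


@dataclass
class Study:
    high_quality: bool  # rated high quality in the risk-of-bias assessment
    association: bool   # True if the study reported an association


def evidence_level(studies, consistency_cutoff=0.8):
    """Grade a subgroup of studies under one reading of the definitions.

    Assumption: findings are consistent when the larger of the two camps
    (association reported vs. not reported) covers >= 80% of the studies.
    The grade describes the strength of the evidence, not its direction.
    """
    n = len(studies)
    if n == 0:
        return "none"
    positive = sum(s.association for s in studies)
    agreement = max(positive, n - positive) / n
    if agreement < consistency_cutoff:
        return "conflicting"
    high = sum(s.high_quality for s in studies)
    low = n - high
    if high >= 2:
        return "strong"
    if high == 1 and low >= 1:
        return "moderate"
    # One high-quality study alone, or only low-quality studies.
    return "limited"
```

For example, two high-quality studies that disagree give an agreement of 0.5 and are graded "conflicting", whereas two high-quality studies that agree are graded "strong".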

Synthesis of results

When evaluating the level of evidence, we decided in advance not to require consistency of diseases, surgical procedures and follow-up duration between studies because of their high heterogeneity. To reduce the effect of heterogeneity, we conducted subgroup analyses for each complication according to the paraspinal muscles measured and the method of morphological measurement. Similar subgroup analyses were performed in previous studies [14, 19]. We defined the paraspinal extensor muscle (PSE) group as the MF and ES taken together. The parameters of muscle atrophy were decreases in the cross-sectional area (CSA) of the total and lean paraspinal muscles; the parameters of FI were increases in the percentage of fat content and in the signal intensity of the muscles. Because the data obtained from the included literature were clinically heterogeneous, meta-analysis was precluded. Instead, a narrative synthesis was conducted.
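
To make the parameters above concrete, the sketch below computes them from the pixels inside a manually drawn region of interest using a single intensity threshold. This is an illustration under stated assumptions, not the method of any included study: the function name, the threshold and the pixel area are hypothetical inputs, and fat is assumed to appear brighter than muscle on the images used.

```python
def muscle_metrics(roi_pixels, fat_threshold, pixel_area_mm2):
    """Compute atrophy and FI parameters from one region of interest.

    roi_pixels: signal intensities of the pixels inside the manually
        defined region of interest.
    fat_threshold: hypothetical cut-off; pixels at or above it count as fat
        (assumes fat appears brighter than muscle).
    pixel_area_mm2: area covered by one pixel, in square millimetres.
    """
    if not roi_pixels:
        raise ValueError("region of interest contains no pixels")
    total = len(roi_pixels)
    fat = sum(1 for value in roi_pixels if value >= fat_threshold)
    return {
        "total_csa_mm2": total * pixel_area_mm2,         # total CSA
        "lean_csa_mm2": (total - fat) * pixel_area_mm2,  # lean (fat-free) CSA
        "fat_percent": 100.0 * fat / total,              # percentage of fat
        "mean_signal": sum(roi_pixels) / total,          # mean signal intensity
    }


example = muscle_metrics([10, 20, 200, 220], fat_threshold=100, pixel_area_mm2=0.5)
```

In this toy example, two of the four pixels exceed the threshold, so the fat percentage is 50% and the lean CSA is half of the total CSA.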

Results

Study selection

We identified 5632 studies through database searching. Of these, 122 were deemed eligible for full-text review, and a total of 16 articles were included. All included studies investigated the correlation between a measure of preoperative paraspinal muscles (e.g. greater total cross-sectional area [tCSA] and/or lower FI) and at least one complication at follow-up. The search flow diagram is shown in Fig. 1.

Study characteristics and results of assessment of risk of bias

The characteristics of the included studies are summarized in Table 1. Of the 16 included studies, three investigated participants with lumbar spinal stenosis (LSS) [20,21,22], three investigated degenerative lumbar scoliosis (DLS) [21, 23, 24], one investigated degenerative flat back [25], one investigated spondylolisthesis [26] and nine investigated multiple degenerative lumbar diseases [27,28,29,30,31,32,33,34,35]. Three studies examined the effect of preoperative paraspinal muscles on bone nonunion [20, 27, 28], two examined screw loosening [23, 29], four examined ASD [26, 30, 32, 35], five examined PJK [21, 24, 31, 33, 34], and two examined sagittal imbalance [22, 25]. For paraspinal muscle morphometry, six studies investigated muscle CSA only [20, 27, 29, 30, 33, 34], two investigated FI only [26, 28] and eight investigated both [21,22,23,24,25, 31, 32, 35]. All included studies examined paraspinal muscle morphology on MRI. There were 12 case–control studies [20, 23, 24, 26,27,28,29, 31,32,33,34,35], one cohort study [21] and three case series [22, 25, 30]. According to the risk-of-bias assessment, all studies were rated as high quality (Table 2).

Table 1 Study characteristics
Table 2 Study results and quality of studies

Results of individual studies

Bone nonunion

Strong evidence from two high-quality studies showed that atrophy of both the PSE and PS was related to bone nonunion at follow-up (Table 3). Two studies by Choi et al. [20, 27] included patients with degenerative lumbar diseases who underwent posterior lumbar interbody fusion (PLIF) using stand-alone cages. They found that the nonunion group had a smaller tCSA of the MF, ES and PS at the lower lumbar segments than the union group. In addition, Choi et al. [20] demonstrated that the PSE CSA and PS CSA were negatively correlated with time to fusion. Moreover, limited evidence from one high-quality study indicated that FI of the PSE was associated with the bone nonunion rate (Table 3). Lee et al. [28] reported that the union rate decreased as the fat content of the extensor muscles at L3-S1 increased in patients with degenerative lumbar diseases after instrumented fusion. They further found that paraspinal muscle FI of ≥ grade 2 on the Goutallier scale could serve as a cut-off value.

Table 3 Levels of evidence for paraspinal muscle characteristics as predictors of postoperative complications after lumbar surgery

Pedicle screw loosening

Strong evidence from two high-quality studies indicated that atrophy of the PSE was associated with screw loosening at follow-up. However, evidence from two high-quality studies on whether FI of the PSE was associated with screw loosening was conflicting. As for the PS, limited evidence from one high-quality study showed that FI of the PS, but not its atrophy, was associated with screw loosening (Table 3). A case–control study by Kim et al. investigated patients who underwent lumbosacral interbody fusion and pedicle screw fixation including L5–S1 [29]. Smaller CSAs and higher T2 signal intensity in both the MF and ES at L5–S1 were related to screw loosening, whereas the PS did not differ between the S1 screw loosening group and the non-loosening group at 1-year follow-up. Another case–control study by Leng et al. [23] included DLS patients who underwent corrective surgery and divided them into two groups: group A had six or more fused levels and group B had four or five fused levels. Only in group A did patients with lower instrumented vertebra (LIV) screw loosening have a significantly higher muscle-fat index (MFI) of the PS and a lower relative functional CSA (FCSA) and relative tCSA of the ES at L4-5 and L5-S1.

Adjacent segment degeneration

Strong evidence from three high-quality studies showed that both atrophy and FI of the PSE were associated with the development of ASD at follow-up (Table 3). Kim et al. [31] divided patients who underwent PLIF for degenerative lumbar disease into an ASD group and a non-ASD group. They found that a smaller relative CSA and a higher FI of the PSE at L4-5 were significant factors for ASD. In Chang’s study, patients were included if they had undergone additional surgery for symptomatic ASD after lumbar fusion and were matched to a control group [32]. Similarly, the mean FCSA, the ratio of the FCSA to the tCSA, and the skeletal muscle index of the FCSA of the paraspinal muscles were significantly smaller in patients with ASD than in the control group. Duan et al. reported that, among patients undergoing single-level transforaminal lumbar interbody fusion for spondylolisthesis, the difference in MF FI between those with and without ASD was most significant at L3 [26]. In addition, limited evidence from one high-quality study showed that atrophy of the PS was associated with ASD (Table 3). Verla et al. [30] demonstrated that decreased PS thickness at L4-5 was associated with ASD among patients undergoing lumbar fusion.

Proximal junctional kyphosis

There was strong evidence from five high-quality studies that both atrophy and FI of the PSE were associated with proximal junctional kyphosis (PJK) at follow-up (Table 3). Hyun et al. [31] reviewed 44 patients who underwent multilevel instrumented spinal fusion stopping at the thoracolumbar junction for adult spinal deformity. They identified a smaller relative CSA of the ES and a higher MFI of both the MF and ES at T10 to L2 preoperatively as risk factors for PJK. However, lower muscularity of the MF was not correlated with PJK. Another study by Yagi et al. included 60 surgically treated DLS patients who were followed for at least 2 years [21]. They found that the preoperative average CSA of the MF at L5–S1, but not the PS CSA, was correlated with PJK. This finding was in accordance with a case–control study by Zhu et al. [33], who included patients with lumbar degenerative diseases who underwent fusion of L5 and assessed the relative FCSA of the MF at L4-5. Zhu et al. showed a significant difference in the preoperative relative FCSA of the MF between the PJK and non-PJK groups [33]. They also found that the predictive value of the MF emerged only in the L1–L2 group, not the T9–T12 group. Pennington et al. found that a smaller size of the paraspinal muscles at the upper instrumented vertebra was an independent factor for PJK in patients undergoing thoracolumbosacral fusion of more than 2 levels [34]. Yuan et al. reviewed 84 DLS patients undergoing long instrumented fusion surgery and found that the FCSA of the PSE was significantly smaller and the MFI of the PSE was higher at all levels in the PJK group than in the non-PJK group [24].

Sagittal imbalance

Limited evidence from one high-quality study indicated that both atrophy and FI of the PSE were associated with less improvement in lumbar lordosis (LL) at follow-up (Table 3). Lee et al. demonstrated that the severity of atrophy or FI of the PSE group was correlated with LL improvement in patients with degenerative flat back after corrective surgery [25]. In addition, evidence for an association between the PSE and improvement in the sagittal vertical axis (SVA) at follow-up was limited. Dohzono et al. found that preoperative tCSA and FI of the PSE group at L4–5 were not associated with SVA improvement in LSS patients with preoperative SVA ≥ 40 mm [22].

Discussion

This systematic review identified 16 studies providing evidence for relationships between various lumbar muscle characteristics and five main postoperative complications. First, the review found strong evidence for an association between atrophy of all paraspinal muscles and bone nonunion. This may be explained by the correlation between paraspinal muscle atrophy and poorer function and weakness [36, 37]. Severely degenerated paraspinal muscles may be less able to reduce the mechanical loading on bone, thus increasing the risk of bone nonunion. In addition, Lee et al. found that paraspinal muscle FI of ≥ grade 2 on the Goutallier scale could serve as a cut-off value. They suggested that in cases with paraspinal muscle fat content of ≥ grade 2, more rigid fixation, more graft bone and meticulous fusion bed preparation would be necessary.

There was strong evidence that PSE atrophy was predictive of screw loosening. Notably, Leng et al. found that the screw loosening group had a significantly smaller CSA of the ES than the non-loosening group after fusion of six or more levels for DLS, whereas no difference was found between the two groups after fusion of four or five levels [23]. It has been postulated that the role of the paraspinal muscles in maintaining stability becomes more important as the stress on the LIV increases with longer fusions [23]. When paraspinal muscle atrophy is present preoperatively, especially in patients undergoing long-segment fusion, the screws bear greater stress and are prone to loosening. Thus, we consider that more rigid fixation or more graft bone might be required in cases where PSE atrophy is found during preoperative evaluation.

There was strong evidence that both preoperative atrophy and FI of the PSE could predict the development of ASD. A previous study reported that extensive degeneration and weakness of the PSE after surgery were risk factors for ASD [38]. Our findings indicate that preoperative PSE degeneration also has predictive value for ASD. This might be because paraspinal muscle degeneration potentially adds stress to the adjacent levels, accelerating the degenerative pathway [31]. To reduce ASD, surgeons could use less traumatic techniques to protect the paraspinal muscles in patients with preexisting PSE degeneration.

There was strong evidence that both atrophy and FI of the PSE were predictive of PJK. Surgeons should therefore pay attention to preoperative paraspinal muscle evaluation and take measures to prevent PJK in cases with severe muscle degeneration. However, the spinal level at which PSE degeneration is most predictive remains uncertain. Hyun et al. [31] reported that both MF and ES degeneration at T10 to L2 preoperatively were risk factors for PJK, whereas Zhu et al. [33] found that the predictive value of the MF emerged only in the L1–L2 group, not the T9–T12 group. Yuan et al. found that degeneration of the PSE was more severe in the PJK group than in the non-PJK group at L1–S1 [24]. Therefore, our results suggest that surgeons should evaluate muscularity and fatty degeneration in both the thoracolumbar and lower lumbar areas. Whether patients need fusion up to the upper thoracic area when severe atrophy and fatty degeneration of the PSE are found on preoperative MRI should be examined further. In addition, limited evidence indicated no relationship between either atrophy or FI of the PS and PJK.

Two studies examined the association between the PSE and improvement in LL and SVA, but the association remained indeterminate given the limited evidence. We hypothesize that the PSE provides support for lumbar stability when standing upright, so that this stabilizing effect decreases and sagittal imbalance subsequently develops as the muscles degenerate [39]. Further studies should explore the prognostic value of the paraspinal muscles for the development of spinal curvature.

Limitations

Several limitations of this review should be considered. First, because research on the predictive value of paraspinal muscle morphology is relatively new, the possibility of publication bias could not be eliminated. To minimize this bias, we conducted an extensive literature search and screened more than 5,000 articles. Second, the included studies were highly heterogeneous in lumbar diseases, surgical procedures, follow-up duration, paraspinal muscle parameters and definitions of each complication, which made meta-analysis inapplicable. To reduce the effect of heterogeneity, subgroup analyses modeled on previous systematic reviews were conducted for each complication. In addition, because of the small volume of published literature, one additional study could change the level of evidence for a specific complication. Consequently, more high-quality prospective research demonstrating the association for different complications is needed before clinical application.

Another potential limitation of this review was the suitability of the tools used to assess the risk of bias. The NOS was designed for cohort and case–control studies and lacks items for evaluating prognostic research. Therefore, we employed the modified NOS, which has been used to assess the risk of bias in prognostic studies [18]. Moreover, in all included case–control studies the non-response rate was not reported or matched, which needs to be addressed in future studies to reduce the risk of bias.

Ideally, future research should use a prospective design and control for potential confounding factors in order to demonstrate the predictive value of paraspinal muscle morphology. More studies are also required to assess the consistency of results across different methods of measuring paraspinal muscle morphology [40]. In addition, reporting cut-off values of muscle atrophy and FI related to complications would help clinicians identify patients who are likely to have unsatisfactory outcomes at follow-up.

Conclusions

Regarding the prediction of postoperative complications, we found strong evidence that preoperative paraspinal muscle degeneration was related to the development of bone nonunion, pedicle screw loosening, ASD and PJK. However, the predictive value of the paraspinal muscles for limited improvement of sagittal parameters was indeterminate. In general, assessment of paraspinal muscle degeneration could be a viable method to stratify patients by risk of postoperative complications. Because of the small volume of published literature, more high-quality prospective research demonstrating the association for different complications is needed before clinical application.