The predictive value of preoperative paraspinal muscle morphometry on complications after lumbar surgery: a systematic review

The effect of paraspinal muscle atrophy and fat infiltration (FI) on the complications of spinal surgery has not been established. A review of the literature was conducted from a search of the PubMed, EMBASE, and Web of Science databases from inception through January 2021. The literature was searched and assessed by independent reviewers based on criteria that included an assessment of preoperative paraspinal muscle morphology in addition to measuring its relationship to surgical complications. All relevant papers were assessed for risk of bias according to the modified Newcastle Ottawa Scale and the Joanna Briggs Institute Critical Appraisal Tools. A narrative synthesis was conducted. The initial search yielded 5632 studies, of which 16 studies were included in the analysis. All included studies were at a low risk of bias. There was strong evidence that atrophy and FI of the paraspinal muscles were associated with the development of bone nonunion (two high quality studies), pedicle screw loosening (two high quality studies), adjacent segment degeneration (three high quality studies) and proximal junctional kyphosis (five high quality studies) after lumbar surgery. There was also limited evidence for an association between atrophy and FI of the paraspinal extensor muscles and less local and global curve improvement. Overall, strong evidence was found for an association between preoperative paraspinal muscle degeneration and multiple postoperative complications after lumbar surgery. However, the findings should be interpreted with caution due to the small quantity of the available literature and high heterogeneity among studies.


Introduction
Spine surgery, a common therapeutic option in degenerative spinal diseases, has become noticeably more frequent over the years [1]. This increase has been accompanied by a rise in postoperative complications [2,3]; more than 30% of patients with degenerative lumbar diseases have undergone revision surgery for complications [4]. Studies have revealed that paraspinal muscle degeneration, a universal phenomenon among older people, is implicated in multiple degenerative lumbar pathologies [5][6][7]. Thus, atrophy and fat infiltration (FI) of the multifidus (MF), erector spinae (ES) and psoas major (PS), which serve as the primary extensor and flexor muscles, may be closely related to patients' clinical outcomes.
Currently, the value of preoperative paraspinal muscle morphometry on imaging is being explored, and it has been assessed as a prognostic factor in several surgical disciplines [8][9][10]. Paraspinal muscle degeneration is characterized morphologically by a decreased cross-sectional area (muscle atrophy) and an increase in fat content (fat infiltration). With manual definition of a region of interest and several threshold methods, the cross-sectional area and FI can be measured on magnetic resonance imaging. Although substantial work has been carried out to identify potential factors for spine surgery prognosis [11,12], few unequivocal predictive factors related to the paraspinal muscles have come to light. Previous reviews have examined the degree of preoperative paraspinal muscle degeneration in relation to disability and persistent pain after surgery [13,14], whereas no review has focused on the predictive value of paraspinal muscles for complications.

Han Gengyu and Dai Jinyue contributed equally to this article.
* Li Weishi, puh3liweishi@163.com

Objectives
The objective of this review is to investigate the association between preoperative paraspinal muscle morphology on MRI and common complications after surgery, inclusive of bone nonunion, pedicle screw loosening, adjacent segment degeneration, proximal junctional kyphosis and sagittal imbalance, in adults with degenerative lumbar spine diseases.

Literature search strategy
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement was used to structure this systematic review. To retrieve interrelated articles, we conducted a search in the following three databases: PubMed, EMBASE, and Web of Science databases from inception through January 2021. All fields were searched for these terms: "paraspinal muscle", "paravertebral muscle", "multifidus", "erector spinae" or "psoas major"; and "surgery", "operative", "complication", "clinical outcome" or "functional status"; and "lumbar" or "lumbosacral".
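The three concept groups described above can be combined into a single Boolean search string. The following is a minimal sketch only: the terms are taken from the search strategy, but the helper name `build_query` and the exact operator syntax are assumptions for illustration, since the database-specific syntax and field tags used by the authors are not reported.

```python
# Concept groups copied from the search strategy; the helper name and the
# exact Boolean syntax are illustrative assumptions, not the authors' query.
CONCEPT_GROUPS = [
    ["paraspinal muscle", "paravertebral muscle", "multifidus",
     "erector spinae", "psoas major"],
    ["surgery", "operative", "complication", "clinical outcome",
     "functional status"],
    ["lumbar", "lumbosacral"],
]


def build_query(concept_groups):
    """Join terms with OR inside a group and join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        if not terms:
            raise ValueError("each concept group needs at least one term")
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


query = build_query(CONCEPT_GROUPS)
print(query)
```

Each database would still need its own adaptation of this string (for example, field restrictions), which is outside what the text specifies.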

Eligibility criteria
Two authors assessed all abstracts and titles to rate adherence to review criteria. Inclusion criteria consisted of the following: (1) inclusion of adults with degenerative lumbar diseases; (2) assessment of any lumbar paraspinal muscle characteristic on magnetic resonance imaging (MRI) preoperatively; (3) assessment of any complications after lumbar surgery; (4) analysis of the relationship between preoperative imaging data and postoperative outcomes; and (5) publication in English. Studies were excluded if they included subjects < 18 years of age; assessed lumbar muscles through nonconventional MRI (such as functional MRI, MR spectroscopy and chemical-shift MRI); or included only postsurgical data. Eligible study designs were randomized controlled trials, case-control studies, case series and cohort studies.

Study selection
We screened titles and abstracts for relevant articles from the electronic search based on the eligibility criteria. Relevant full-text articles were obtained and then assessed in the same manner. The study selection process is illustrated in Fig. 1.

Assessment of risk of bias
Because there is no single, widely accepted tool for evaluating the risk of bias in prognostic studies, we used the modified Newcastle Ottawa Scale (NOS) [15] for case-control studies and cohort studies and the Joanna Briggs Institute (JBI) Critical Appraisal Tools for case series [16]. All articles meeting review criteria were evaluated independently for risk of bias by two authors, with any differences in assessment resolved by discussion until consensus was reached. For case-control studies and cohort studies, we regarded studies achieving six or more points as high quality. For case series, we regarded studies achieving eight or more points as high quality.
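The quality cut-offs stated above amount to a simple rule per study design. As a hedged sketch, with the function name `is_high_quality` and the design labels being hypothetical choices for illustration, the rule could be written as:

```python
# Cut-offs as stated in the text: modified NOS >= 6 for case-control and
# cohort studies, JBI >= 8 for case series. Names here are hypothetical.
HIGH_QUALITY_CUTOFF = {
    "case-control": 6,  # modified Newcastle Ottawa Scale
    "cohort": 6,        # modified Newcastle Ottawa Scale
    "case series": 8,   # JBI Critical Appraisal checklist
}


def is_high_quality(design, score):
    """Return True if the appraisal score meets the cut-off for the design."""
    if design not in HIGH_QUALITY_CUTOFF:
        # The text gives no cut-off for other designs, so none is assumed.
        raise ValueError(f"no cut-off stated for design: {design!r}")
    return score >= HIGH_QUALITY_CUTOFF[design]


print(is_high_quality("cohort", 6))       # meets the NOS cut-off
print(is_high_quality("case series", 7))  # below the JBI cut-off
```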

Data extraction
Two authors independently extracted the following information from included studies: study design, participant characteristics, details of MRI assessments of preoperative lumbar muscle characteristics, and clinical outcomes that were relevant to our research question.

Clinical outcome variables
Clinical outcome measures included were bone nonunion (measured using dynamic lumbar X-rays or computed tomography), pedicle screw loosening (measured using spine radiographs or computed tomography), adjacent segment degeneration (diagnosed based on the flexion and extension lateral radiography or on MRI), proximal junctional kyphosis (measured using full spine radiographs) and sagittal imbalance (defined as the deterioration of local alignment or global alignment on full spine radiographs).

Levels of evidence
We performed a qualitative summary of the evidence for lumbar muscle characteristics as predictors of complications, using definitions for levels of evidence applied in previous systematic reviews [17,18]: "strong" evidence was defined as consistent findings (≥ 80%) in at least two high-quality studies; "moderate" evidence was defined as one high-quality study and consistent findings (≥ 80%) in one or more low-quality studies; "limited" evidence was defined as findings in one high-quality study or consistent findings (≥ 80%) in one or more low-quality studies; "conflicting" evidence was defined as inconsistent findings irrespective of study quality.
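The level-of-evidence definitions above can be expressed as a small decision rule. The sketch below is one possible reading, not the authors' procedure: it assumes that consistency is the share of all studies agreeing with the majority direction, pooled across quality levels, and the names `Finding` and `level_of_evidence` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    high_quality: bool  # study met the quality cut-off
    association: bool   # study reported the association


def consistency(findings):
    """Share of studies agreeing with the majority direction (assumed definition)."""
    positive = sum(f.association for f in findings)
    return max(positive, len(findings) - positive) / len(findings)


def level_of_evidence(findings, threshold=0.8):
    """Map a set of study findings to strong/moderate/limited/conflicting."""
    if not findings:
        raise ValueError("at least one study is required")
    # Inconsistent findings are "conflicting" irrespective of study quality.
    if consistency(findings) < threshold:
        return "conflicting"
    n_high = sum(f.high_quality for f in findings)
    n_low = len(findings) - n_high
    if n_high >= 2:
        return "strong"
    if n_high == 1 and n_low >= 1:
        return "moderate"
    return "limited"


# Two concordant high-quality studies versus two discordant ones.
print(level_of_evidence([Finding(True, True), Finding(True, True)]))
print(level_of_evidence([Finding(True, True), Finding(True, False)]))
```

Under this reading, two high-quality studies with opposite results have a consistency of 50% and are therefore classed as conflicting rather than strong.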

Synthesis of results
For evaluating the level of evidence, we decided in advance not to require consistency of diseases, surgeries and follow-up duration between studies, because of the high heterogeneity. To reduce the effect of heterogeneity, we conducted subgroup analyses for each complication according to the paraspinal muscles measured and the method of morphological measurement. Similar subgroup analyses were performed in previous studies [14,19]. We defined the paraspinal extensor muscle (PSE) group as MF and ES taken together. The parameters of muscle atrophy were decreases in the cross-sectional area of total and lean paraspinal muscle; the parameters of FI were increases in the percentage of fat content and in the signal intensity of the muscles. Because the data obtained from the included literature were clinically heterogeneous, meta-analysis was precluded. Instead, a narrative synthesis was conducted.
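To make the morphometric parameters concrete, the sketch below shows how total CSA, lean (functional) CSA and percentage fat content could be derived from pixel intensities inside a manually drawn region of interest with a single intensity threshold. This is an illustrative assumption only: the included studies used several threshold methods that are not detailed here, and the function name, the threshold value and the convention that pixels brighter than the threshold count as fat are hypothetical.

```python
def muscle_metrics(roi_intensities, fat_threshold, pixel_area_mm2):
    """Compute tCSA, FCSA and fat percentage for one region of interest.

    roi_intensities: signal intensities of the pixels inside the ROI.
    fat_threshold:   pixels brighter than this are counted as fat
                     (assumed convention, e.g. for T2-weighted images).
    pixel_area_mm2:  area represented by one pixel, in square millimetres.
    """
    pixels = list(roi_intensities)
    if not pixels:
        raise ValueError("the region of interest contains no pixels")
    total = len(pixels)
    fat = sum(1 for value in pixels if value > fat_threshold)
    lean = total - fat
    return {
        "tCSA_mm2": total * pixel_area_mm2,  # total cross-sectional area
        "FCSA_mm2": lean * pixel_area_mm2,   # lean (functional) area
        "FI_percent": 100.0 * fat / total,   # percentage of fat content
    }


# Toy ROI of four pixels, one of which exceeds the fat threshold.
print(muscle_metrics([10, 20, 200, 30], fat_threshold=100, pixel_area_mm2=0.5))
```

In this picture, atrophy corresponds to a lower tCSA or FCSA and fat infiltration to a higher fat percentage.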

Study selection
We identified 5632 studies through database searching. Of these articles, 122 were deemed to be eligible for full-text review. A total of 16 articles were included. All included studies investigated the correlation between a measure of preoperative paraspinal muscles (e.g. greater tCSA and/

Bone nonunion
Strong evidence from two high quality studies showed that atrophy of both PSE and PS was related to bone nonunion at follow-up (Table 3). Two studies by Choi et al. [20,27] included patients with degenerative lumbar diseases who underwent posterior lumbar interbody fusion (PLIF) using stand-alone cages. They found that the nonunion group had a smaller tCSA of MF, ES and PS at lower lumbar segments than the union group. In addition, Choi et al. [20] further demonstrated that the PSE CSA and PS CSA were negatively correlated with time to fusion. Moreover, limited evidence from one high quality study indicated that FI of PSE was associated with the bone nonunion rate (Table 3). Lee et al. [28] reported that the union rate decreased as the fat content of the extensor muscles increased at L3-S1 in patients with degenerative lumbar diseases after instrumented fusion. Furthermore, they found that paraspinal muscle FI of ≥ grade 2 on the Goutallier scale could be a cut-off value.

Pedicle screw loosening
Strong evidence from 2 high quality studies showed that atrophy of PSE was associated with screw loosening at follow-up. However, there was conflicting evidence from 2 high quality studies as to whether FI of PSE was associated with screw loosening. As for PS, limited evidence from 1 high quality study showed that FI of PS, but not atrophy, was associated with screw loosening (Table 3). A case-control study by Kim et al. investigated patients who underwent lumbosacral interbody fusion and pedicle screw fixation including L5-S1 [29]. The results showed that smaller CSAs and higher T2 signal intensity in both MF and ES at L5-S1 were related to screw loosening, while PS did not differ between the S1 screw loosening group and the non-loosening group at 1-year follow-up. Another case-control study by Leng et al. [23] included DLS patients who underwent corrective surgery and divided them into two groups: group A had six or more fused levels and group B had four or five fused levels. They found that only in group A did patients with lower instrumented vertebra (LIV) screw loosening have a significantly higher muscle-fat index (MFI) of PS and a lower relative functional CSA (FCSA) and relative tCSA of ES at L4-5 and L5-S1.

Adjacent segment degeneration
Strong evidence from 3 high quality studies showed that both atrophy and FI of PSE were associated with the development of adjacent segment degeneration (ASD) at follow-up (Table 3). Kim et al. [31] divided patients who underwent PLIF for degenerative lumbar disease into an ASD group and a non-ASD group. They found that a smaller relative CSA and a higher FI of PSE at L4-5 were significant factors for ASD. Chang's study included patients who had undergone additional surgery for symptomatic ASD after lumbar fusion and matched them to a control group [32]. Similarly, the mean FCSA, the ratio of the FCSA to the tCSA, and the skeletal muscle index of the FCSA of the paraspinal muscles were significantly smaller in patients with ASD than in the control group. Duan et al. reported that, among patients undergoing single-level transforaminal lumbar interbody fusion for spondylolisthesis, MF FI was higher in those with ASD than in those without, with the difference most significant at L3 [26]. In addition, limited evidence from 1 high quality study showed that atrophy of PS was associated with ASD (Table 3). Verla et al. [30] demonstrated that decreased PS thickness at L4-5 was associated with ASD among patients undergoing lumbar fusion.

Proximal junctional kyphosis
There was strong evidence from 5 high quality studies that both atrophy and FI of PSE were associated with proximal [...] [21]. They found that the preoperative MF average CSA at L5-S1 was correlated with PJK, not the PS CSA. This finding was in accordance with a case-control study by Zhu et al. [33]. They included patients with lumbar degenerative diseases who underwent fusion of L5 and performed a relative FCSA of MF assessment at L4-5. Zhu et al. showed that there was a significant difference in the preoperative relative FCSA of MF between the PJK and non-PJK groups [33]. They also found that the predictive value of MF only emerged in the L1-L2 group, not the T9-T12 group. Pennington et al. found that a smaller size of paraspinal muscles at the upper instrumented vertebrae was an independent factor of PJK in patients undergoing thoracolumbosacral fusion of more than 2 levels [34]. Yuan et al. reviewed 84 DLS patients undergoing long instrumented fusion surgery and found that the FCSA of PSE was statistically smaller and MFI of PSE was higher at all levels in the PJK group than in the non-PJK group [24].

Table 3 (fragments recovered from extraction; [...] marks lost text)

[...]
The CSA at segments (L3-4, L4-5) in bone fusion groups and bone non-fusion group were significantly different (p = 0.047, 0.031) for the PS muscle, those at L2-3 and L4-5 segments between groups were significantly different (p = 0.039, 0.015) for the ES and MF
PLIF at L4-5 segment: The CSA at all segments (L3-5, L5-S1) in bone fusion groups and bone non-fusion group were significantly different (p < 0.05) for the PS, those at L4-5 and L5-S1 segments between groups were significantly different (p = 0.011, 0.039) for the ES and MF
Multivariate analysis (Logistic regression test): the PS CSA at L4-L5 was an independent factor for decreased possibility of non-fusion status in both segments (OR = 0.812, p = 0.028, 95% CI 0.402-1.222)
Pearson correlation coefficient:
PLIF at L3-4 segment: PS CSA at L3-4 (p = 0.048) and L4-5 (p = 0.014) were negatively correlated with time to fusion (p < 0.05). ES and MF CSA at L4-5 (p = 0.042) were negatively correlated with time to fusion
PLIF at L4-5 segment: PS CSA at all segments (p < 0.05) were negatively correlated with time to fusion. ES and MF CSA at L3-4 (p = 0.025) and L4-5 (p = 0.048) were negatively correlated with time to fusion
High quality

Choi [27]
Univariate analyses (Mann-Whitney U test): The CSA at all segments (L3-5, L5-S1) in bone union groups and bone nonunion group were significantly different for the PS muscle (p < 0.05), those at L3-4 and L4-5 segments between groups were significantly different for the ES and MF (p = 0.048, 0.021)
Multivariate analysis (Logistic regression test): Differences in the PS CSA at the L4-5 and L5-S1 segments remained significant (p = 0.048, 0.043; OR = 1.098, 1.169; 95% CI 0.998-1.198, 1.002-1.335)
High quality

Lee [28]
Univariate analysis: The grade of fat content of paraspinal muscle was significantly higher in the nonunion group than the union group (p < 0.0001)
Multiple logistic regression analysis: Fat content of paraspinal muscles remained significantly related to union (p < 0.05)
High quality

Pedicle Screw Loosening

Kim JB [29]
Student t-test: Smaller CSA and high T2 signal intensity in both MF and ES were in the S1 screw loosening group at 1 year follow-up after surgery (p < 0.05). PS didn't show any difference in both CSA and signal intensity between the two groups at [...]

[...]
Mean CSA and rCSA of the paraspinal muscles were significantly smaller in the ASD group than in the control group (both p < 0.01). Mean CSA and rCSA of the PT (psoas major muscle) were not significantly different between the two groups (CSA, p = 0.96; rCSA, p = 0.72). The degree of FI in the paraspinal muscles was significantly greater in the ASD group than in the control group (p < 0.01)
Multivariate logistic regression analysis: Smaller rCSA and more FI of the paraspinal muscles preoperatively was a significant factor for predicting the development of ASD (OR = 0.083, p = 0.003; OR = 1.080, p = 0.044). Neither the CSA (p = 0.585) nor the FI of PS had predictive value
High quality

Chang [32]
Independent-sample t test ( [...]

[...] [24]
Univariate analysis:
PSE: The FCSA (all p < 0.05), but not the tCSA (all p > 0.05) of PSE from L1-2 to L5-S1 was significantly smaller in the PJK group than the non-PJK group. The lean MFI and total MFI of PSE from L1-2 to L5-S1 was significantly higher in the PJK group than the non-PJK group (all p < 0.05)
PS and QL: There was no significant difference between the PJK and non-PJK groups for CSA, lean MFI or total MFI of PS and QL from L1-2 to L5-S1 (all p > 0.05)
Multivariate logistic regression: Smaller FCSA (OR = 22.56, p = 0.021) and higher total MFI (OR = 16.44, p = 0.029) of PSE from L1-2 to L5-S1 were independent predictors of radiographic PJK
High quality

Hyun [31]
Mann-Whitney U test: The CSA of ES at each level was significantly smaller in the PJK group at the final follow-up (p < 0.05). There were no significant differences for CSA of MF at each level between the two groups at the final follow-up (p > 0.05). The FI of ES and MF at each level was significantly higher in the PJK group at the final follow-up (p < 0.05)
Multivariate regression analysis (Cox proportional hazards model): [...]
High quality

Sagittal imbalance
Limited evidence from 1 high quality study indicated that both atrophy and FI of PSE had an association with less LL improvement at follow-up (

Discussion
This systematic review identified 16 studies providing evidence for relationships between various lumbar muscle characteristics and five main postoperative complications. First, the review found strong evidence for an association between atrophy of all paraspinal muscles and bone nonunion. This may be explained by the fact that paraspinal muscle atrophy is correlated with poorer function and weakness [36,37]. Severely degenerated paraspinal muscles might be less able to reduce the mechanical loading on bone, thus increasing the risk of bone nonunion. In addition, Lee et al. found that paraspinal muscle FI of ≥ grade 2 on the Goutallier scale could be a cut-off value. They suggested that in cases with a paraspinal muscle fat content of ≥ grade 2, more rigid fixation, more graft bone, and meticulous fusion bed preparation would be necessary. There was strong evidence that PSE atrophy was predictive of screw loosening. Notably, Leng et al. found that the screw loosening group had a significantly smaller CSA of ES than the non-loosening group in fusions of six or more levels for DLS, whereas no difference was found between the two groups in fusions of four or five levels [23]. It is postulated that the role of the paraspinal muscles in maintaining stability becomes more important as the stress on the LIV increases with longer fusions [23]. When atrophy of the paraspinal muscles is present preoperatively, especially in patients with long-segment fusion, the screws bear greater stress and are prone to loosening. Thus, we consider that more rigid fixation or more graft bone might be required in cases where PSE atrophy is found during preoperative evaluation.
There was strong evidence that both preoperative atrophy and FI of PSE could predict the development of ASD. A previous study reported that extensive degeneration and weakness of PSE after operation were risk factors for ASD [38]. Our findings indicate that preoperative PSE degeneration also has satisfactory predictive value for ASD. This might be because paraspinal muscle degeneration potentially adds more stress to the adjacent levels, accelerating the degenerative pathway [31]. To reduce ASD, surgeons could use less traumatic techniques to protect the paraspinal muscles in patients with preexisting PSE degeneration. There was strong evidence that both atrophy and FI of PSE were predictive of PJK. Surgeons should therefore pay attention to preoperative paraspinal muscle evaluation and take measures to prevent PJK in cases with severe muscle degeneration. However, the level at which PSE degeneration matters most remains uncertain. Hyun et al. [31] reported that both MF and ES degeneration at T10 to L2 preoperatively were risk factors for PJK, whereas Zhu et al. [33] found that the predictive value of MF only emerged in the L1-L2 group, not the T9-T12 group. Yuan et al. found that the degeneration of PSE was more severe in the PJK group than in the non-PJK group at L1-S1 [24]. Therefore, our results suggest that surgeons should evaluate muscularity and fatty degeneration in both the thoracolumbar and lower lumbar areas. Whether patients need fusion up to the upper thoracic area when severe atrophy and fatty degeneration of PSE are found on preoperative MRI should be examined further. In addition, limited evidence indicated no relationship between either atrophy or FI of PS and PJK.
Two studies have reported an association between PSE and improvement in LL and SVA, but with only limited evidence the relationship remains indeterminate. We hypothesize that PSE provides support for lumbar stability when standing upright, so that as the muscles degenerate the stabilizing effect decreases and sagittal imbalance subsequently develops [39]. Further studies should explore the prognostic value of paraspinal muscles for the development of spinal curve.

Limitations
A few limitations of this review need to be considered. First, because research on the predictive value of paraspinal muscle morphology is relatively new, the possibility of publication bias could not be eliminated. To minimize this bias, we conducted an extensive search of the literature and screened more than 5,000 articles. Second, the included studies were highly heterogeneous in lumbar diseases, surgical procedures, follow-up duration, paraspinal muscle parameters and definitions of each complication, which made meta-analysis inapplicable. To reduce the effect of heterogeneity, subgroup analyses modeled on previous systematic reviews were conducted for each complication. In addition, due to the small volume of published literature, one additional study could change the level of evidence for a specific complication. Consequently, more high-quality prospective research demonstrating the association for different complications is needed before clinical application.
Another potential limitation of this review was the suitability of the tools used to assess the risk of bias. The NOS was designed for cohort and case-control studies and lacks items for evaluating prognostic research. Therefore, we employed the modified NOS, which has been used to assess the risk of bias in prognosis studies [18]. Moreover, the non-response rate in all included case-control studies was not reported or matched, which needs to be addressed in future studies to reduce the risk of bias.
Ideally, future research should incorporate a prospective design and control for potential confounding factors so as to demonstrate the predictive value of paraspinal muscle morphology. More studies are also required to assess the consistency of results across different methods of measuring paraspinal muscle morphology [40]. In addition, reporting cut-off values of muscle atrophy and FI related to complications can help clinicians readily identify patients who are likely to have unsatisfactory outcomes at follow-up.

Conclusions
We found strong evidence that preoperative paraspinal muscle degeneration was related to the development of bone nonunion, pedicle screw loosening, ASD and PJK. However, the predictive value of paraspinal muscles for reduced improvement in sagittal parameters was indeterminate. In general, assessment of paraspinal muscle degeneration could be a viable method to stratify patients by risk of postoperative complications. Given the small volume of published literature, more high-quality prospective research demonstrating the association for different complications is needed before clinical application.

Availability of data and material Not applicable.
Code availability Not applicable.

Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval Not applicable.

Consent to participate Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.