Introduction

Degenerative Lumbar Spinal Stenosis (LSS) is predominantly caused by arthrosis of the facet joints, ligament hypertrophy and degenerative changes of the intervertebral disks. These changes increase with age, and can lead to narrowing of the spinal canal, nerve root compression and subsequently symptoms of neurogenic claudication with radiating pain or numbness in the legs, aggravated by standing and walking [1]. Lumbar pain can also be a common symptom [2]. The diagnosis of symptomatic LSS is based on the combination of clinical symptoms and stenosis of the spinal canal visualized on magnetic resonance imaging (MRI) [3]. Symptomatic LSS is the most frequent indication for lumbar surgery in the age group above 65 years [4]. Former imaging studies indicate that radiological signs of LSS are also relatively common in asymptomatic elderly persons [5, 6]. Patients with symptoms of LSS can present a wide variety of radiological findings of uncertain clinical relevance [6,7,8]. A systematic review by Burgstaller et al. [9] found a low correlation between MRI findings and pain in patients diagnosed with LSS. This conclusion is in concordance with studies by Schizas et al. and Mannion et al. [10, 11]. However, Ogikubo et al. [12] reported a strong relationship between the grade of stenosis and patient reported disability.

Based on the nearly mandatory use of MRI in preoperative planning, and because walking disturbance and pain are the dominant symptoms, one would expect some association between MRI findings and patient reported disability and pain. It is important for the surgeon to understand the relevance of the MRI findings in this group of patients. Hence, the aim of this study was to explore the MRI findings in patients diagnosed with symptomatic LSS and further to investigate the association between several commonly used MRI findings and patient reported disability and pain, in a carefully selected cohort of LSS patients. The association between MRI findings and patient characteristics was also investigated.

Methods

The patients investigated here were included in the NORwegian Degenerative spondylolisthesis and spinal STENosis (NORDSTEN) study. This study is a large multicenter study evaluating clinical outcomes of different surgical treatment options for LSS. The NORDSTEN study consists of two randomized trials, the Spinal Stenosis Trial (SST) [13] and the Degenerative Spondylolisthesis Trial (DST) [14], and an observational study. The SST includes 437 LSS patients without spondylolisthesis eligible for surgery. The present cross-sectional study comprises preoperative data from the patients included in the SST.

Inclusion process and patient recruitment

All patients had MRI findings and symptoms consistent with LSS and were referred to an orthopedic or neurosurgical outpatient clinic. In total, 2227 patients were referred, and the 437 patients who fulfilled all eligibility criteria (Table 1) were included in the SST (Fig. 1). All patients were enrolled between February 2014 and October 2018.

Table 1 Inclusion and exclusion criteria for the Spinal Stenosis Trial (SST) in the NORDSTEN study
Fig. 1 Flowchart of the NORDSTEN and the SST according to the STROBE-statement

Magnetic resonance imaging

All participants underwent a 1.5 or 3 T MRI of the lumbar spine within 6 months before surgery. The MRI protocol included sagittal T1-weighted and axial and sagittal T2-weighted images with repetition time (TR)/echo time (TE) 1500–6548/82–126 ms for T2-weighted images and 400–826/8–14 ms for T1-weighted images, slice thickness 3–5 mm and FOV 160–350 mm. MRI examinations were anonymized, without any link to demographics or clinical symptoms. The measurement tools integrated in the Picture Archiving and Communication System (PACS IDS7, Sectra, Sweden) were used to assess morphological changes.

Two experienced radiologists established the NORDSTEN study imaging criteria for MRI evaluation according to previously validated classification systems.

To validate the measurements in this study, we performed an inter- and intra-observer agreement analysis. Two orthopedic spine surgeons and two musculoskeletal radiologists evaluated all MRI examinations of the first 102 patients independently according to the predefined criteria. For the remaining 335 patients, the two surgeons and one of the radiologists performed the evaluation of the MRI examinations. For continuous parameters, the mean values of each parameter for all investigators were used. For categorical parameters, the majority score was decisive if any disagreement existed between the readers. The conclusion was that adequate agreement existed. The inter- and intra-observer agreement analyses will be published separately.
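The consensus rule described above (mean across readers for continuous parameters, majority score for categorical ones) can be sketched as follows. This is a minimal illustration, not the study's own code: the function names and the example readings are hypothetical. The text does not say how a tie between readers was resolved, so this sketch simply returns `None` in that case.

```python
from collections import Counter
from statistics import mean


def consensus_continuous(readings):
    """Mean of the readers' measurements for one continuous parameter."""
    return mean(readings)


def consensus_categorical(grades):
    """Majority grade among the readers.

    Returns None when no single grade has the most votes; tie handling
    is an assumption of this sketch, not something stated in the text.
    """
    counts = Counter(grades).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical readings from three readers for one patient (not study data)
dsca_readings_mm2 = [62.0, 58.0, 66.0]
schizas_readings = ["C", "C", "D"]

print(consensus_continuous(dsca_readings_mm2))  # 62.0
print(consensus_categorical(schizas_readings))  # C
```

With four readers, as for the first 102 patients, a two-against-two split is possible, which is why the tie case is made explicit here.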

The index level was defined as the narrowest lumbar level as measured by dural sac cross-sectional area (DSCA). At the index level, the following parameters were investigated: morphological grade of stenosis according to the Schizas grading system from A (no or minor) to D (extreme) [10], quantitative grade of stenosis measured with DSCA according to the method described by Schönström and Hansson [15], disk degeneration according to the Pfirrmann grading system from 1 (normal) to 5 (worst) [16], facet joint angle according to the method described by Noren et al. [17], facet tropism according to the method of Vanharanta [18] and fat infiltration of the multifidus muscle according to the Goutallier classification from 0 (normal) to 4 (severe) [19].

The radiological scores were dichotomized into moderate or severe changes. The following values were classified as severe changes: Schizas grade C and D, cross-sectional area less than 75 mm², Pfirrmann grade 4 and 5, tropism of 15° or more, Goutallier grade 2–4 [10, 15, 18].
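The selection of the index level and the dichotomization can be illustrated with a short sketch. It uses the cut-offs exactly as listed in the paragraph above. The dictionary keys, level labels and example values are hypothetical and serve only as an illustration.

```python
# Cut-offs for "severe" changes as listed in the text above.
# Keys are hypothetical names chosen for this sketch.
SEVERE_RULES = {
    "schizas": lambda grade: grade in {"C", "D"},
    "dsca_mm2": lambda area: area < 75,
    "pfirrmann": lambda grade: grade >= 4,
    "tropism_deg": lambda diff: diff >= 15,
    "goutallier": lambda grade: grade >= 2,
}


def index_level(dsca_by_level):
    """Return the lumbar level with the smallest DSCA (the narrowest level)."""
    return min(dsca_by_level, key=dsca_by_level.get)


def dichotomize(findings):
    """Map each finding at the index level to 'moderate' or 'severe'."""
    return {
        name: "severe" if SEVERE_RULES[name](value) else "moderate"
        for name, value in findings.items()
    }


# Hypothetical measurements for one patient (not study data)
dsca = {"L2/L3": 110.0, "L3/L4": 71.5, "L4/L5": 48.0}
level = index_level(dsca)

findings = {
    "schizas": "C",
    "dsca_mm2": dsca[level],
    "pfirrmann": 3,
    "tropism_deg": 9.0,
    "goutallier": 2,
}
print(level)
print(dichotomize(findings))
```

Keeping the cut-offs in a single table makes it straightforward to rerun an analysis with alternative thresholds.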

Preoperative clinical measures

At admission for surgery, the patients reported disability and pain using a self-administered questionnaire containing commonly used patient-reported symptom severity measures, i.e., the Norwegian version of the Oswestry Disability Index (ODI), the Zurich Claudication Questionnaire (ZCQ) and the Numeric Rating Scale (NRS) for back and leg pain.

The ODI is a low back pain-specific questionnaire consisting of ten questions concerning pain related disability. The ODI score ranges from 0 (no disability) to 100 (most severe disability) [20, 21].
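As an illustration of how ten questions map to a 0–100 score, the sketch below follows the conventional ODI scoring rule: each question is scored 0 to 5, and the sum is expressed as a percentage of the maximum possible for the questions answered. The item range and the handling of unanswered questions are assumptions taken from that convention, not details stated in this paper. The function name and example answers are hypothetical.

```python
def odi_score(item_scores):
    """ODI percentage score from ten item scores.

    Assumes each answered item is scored 0-5 and that unanswered items
    (None) are left out of the denominator, per the usual ODI convention.
    """
    answered = [s for s in item_scores if s is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    if any(not 0 <= s <= 5 for s in answered):
        raise ValueError("item scores must lie between 0 and 5")
    return 100 * sum(answered) / (5 * len(answered))


# Hypothetical answers for one patient (not study data)
answers = [2, 3, 1, 2, 4, 3, 2, 1, 3, 2]
print(odi_score(answers))  # 46.0
```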

The ZCQ is a disease-specific, validated score measuring walking capacity, neurogenic claudication and patient satisfaction. The first sub-scale measures symptoms ranging from 1 to 5 (worst), sub-scale two measures disability ranging from 1 to 4 [22]. Sub-scale three investigates post-treatment effects and was therefore not included in this study.

NRS scores for back and leg pain range from 0 to 10 (worst possible pain).

Statistical analysis

For descriptive statistics, we calculated mean and standard deviation for continuous variables and absolute and relative frequencies for categorical variables. To investigate potential associations we estimated, for each investigated variable, a multivariable linear regression model with the following covariates: Schizas dichotomized into A–B vs C–D, DSCA dichotomized into < 75 mm² vs ≥ 75 mm², Pfirrmann dichotomized into 1–3 vs 4–5, tropism dichotomized into ≤ 15° vs > 15°, fatty infiltration dichotomized into 0–1 vs 2–4, age (continuous), weight (continuous), gender and smoking status (yes/no).

Results were presented as unstandardized regression coefficients (gradients) with corresponding 95% confidence intervals and p-values. The given regression coefficient indicates the change in disability and pain score when going from “moderate” to “severe” for the given parameter and given instrument for patient reported symptom severity. Due to the risk of increasing probability of type 2 errors, no adjustments for multiple tests were done.
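To make the interpretation of the unstandardized coefficients concrete, the sketch below fits an ordinary least-squares model on toy data with two 0/1 covariates. It is not the study's analysis, which was run in STATA with the full covariate set and confidence intervals. The `ols` helper, the variable names and all numbers are hypothetical, and the toy outcome is constructed without noise so that the coefficients can be checked by hand.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
        + [sum(X[k][i] * y[k] for k in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Toy data, NOT study data. Columns: intercept, severe finding (0/1), female (0/1).
X = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
# Outcome built as 35 + 3 * severe + 8 * female, with no noise
y = [35.0, 38.0, 43.0, 46.0]

intercept, b_severe, b_female = ols(X, y)
# b_severe is the adjusted difference in score between "severe" and "moderate";
# here it recovers the value 3 used to build the toy outcome.
print(intercept, b_severe, b_female)
```

Because each MRI covariate is coded 0 for moderate and 1 for severe, its coefficient reads directly as the adjusted difference in score between the two groups, which is the quantity reported in the results.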

All analyses were done using STATA version 15.0 (StataCorp LLC, College Station, Texas, USA).

Ethics and trial registration

The Committee for Medical and Health Research Ethics of Central Norway approved the study (study identifier: 2011/2034). The study was registered at ClinicalTrials.gov on November 22nd 2013 under the identifier NCT02007083. All patients provided written informed consent.

Results

Demographic characteristics

A total of 437 patients with preoperative MRI and clinical data were included in this study. The mean patient age was 66.8 years (SD 8.4), and 227 out of 437 patients (52.7%) were men. Mean BMI was 27.8 (SD 4.2). Patient demographics, preoperative ODI, ZCQ and pain scores of the study population are presented in Table 2.

Table 2 Cohort of LSS patients selected for surgical treatment

MRI findings

MRI evaluation showed a high proportion of severe LSS changes when investigating spinal morphology: Schizas 296 of 415 (71.3%), DSCA 360 of 415 (86.8%), disk degeneration: Pfirrmann 241 of 415 (58.1%) and fatty infiltration of the multifidus muscle: 308 of 368 (83.7%). Tropism was detected less frequently: 49 of 415 (11.9%).

Association between MRI findings and symptom severity

The multivariable linear regression model showed that the MRI parameters assessed for severity of LSS had only a weak association with disability and pain as measured by the continuous ODI, ZCQ and NRS scores. Of the investigated MRI parameters, only the difference between moderate and severe changes in the Pfirrmann classification system was associated with a significant change in ODI score. This difference of 3.27 ODI points (95% CI 0.23, 6.31) is lower than the reported thresholds of clinical relevance. Age and weight were not associated with disability and pain scores, while smoking was significantly associated with a higher ODI score. Females reported significantly higher ODI, ZCQ and NRS scores (Table 3).

Table 3 Cohort of LSS patients selected for surgical treatment

Discussion

In this study, a large proportion of patients eligible for surgery for LSS had severe degrees of various MRI findings of the lumbar spine. The severity of these MRI findings showed no or only weak association with disability and pain.

The multivariable linear regression analysis indicated minimal change in disability and pain scores when comparing moderate MRI findings to severe MRI findings. This trend was similar for all analyzed MRI parameters. When adjusting for selected patient characteristics, the regression analysis suggests that gender influences disability and pain scores to a larger extent than the degree of MRI findings. For example, being female had a nearly threefold larger impact on the ODI score than the difference between severe and moderate changes in a radiological parameter such as the Pfirrmann score.

MRI findings

To our knowledge, this is the first study to investigate several MRI parameters among LSS patients. There are several studies with comparable patient cohorts exploring isolated MRI parameters. Bhalla et al. conducted an MRI comparison between LSS patients selected for surgery in Trondheim (Norway) and Boston (USA) regarding the Schizas score. They found a similar proportion of Schizas C or D in the Boston cohort (68%) and in the Trondheim cohort (78%) [23]. Moojen et al. [24] evaluated 155 LSS patients with MRI and found 77% of the patients to be classified with Schizas C or D. Both studies report a similar proportion of Schizas C or D as the NORDSTEN-SST cohort.

Sigmundsson et al. [25] investigated a cohort of 109 LSS patients eligible for spinal surgery, and 105 patients (96%) had DSCA of 70 mm² or less. This is considerably higher than the result of 87% in the NORDSTEN SST cohort. The participants in the Swedish study were older than in the NORDSTEN study (mean 71 years vs mean 66.8 years). Since age is a key factor in the development of spinal stenosis, the age difference may explain the difference in DSCA between the populations. The lower threshold for a small DSCA in the Swedish study (70 mm² vs 75 mm² in our study) might also influence the result.

In a retrospective study of 43 patients who underwent surgery for LSS, Hwang et al. [26] classified 79% of the patients to Pfirrmann 4 or 5 based on MRIs at baseline, compared to our 58% in the present study. Mean age in the cohort by Hwang et al. was 69 years. The slightly older cohort could explain the larger number of patients with severe disk degeneration compared to the NORDSTEN cohort.

Akar et al. studied the prevalence of severe tropism in a cohort of 100 spinal stenosis patients eligible for surgery and found that 14% had a facet angle difference of 16 degrees or more [27]. The result is similar to the 11% result in the NORDSTEN SST cohort.

In a study that investigated fatty infiltration of the paraspinal muscles, Chen et al. [28] enrolled 62 patients with spinal stenosis. Using the same classification system as in the present study, 17 (27%) of the subjects scored 3 or 4 when the multifidus muscle was examined at the affected lumbar level. Our result of 84% is not directly comparable because of the different dichotomization.

Association between MRI findings and symptom severity

Several studies focus on the relationship between MRI findings and clinical manifestations in patients with LSS. Previous studies have concentrated on DSCA and the Schizas classification system and report weak associations. Weber et al. [29] investigated preoperative MRIs of 208 patients with LSS in a retrospective study and found a weak association between Schizas score and ODI as well as NRS score at baseline. Mannion et al. [11] investigated the association between MRI findings at baseline (DSCA and Schizas) and disability/pain score at baseline in 157 patients planned for spinal stenosis surgery. The group could not establish a significant correlation. Kuittinen et al. [30] investigated the association between DSCA and ODI without finding a significant correlation. The patient cohorts in these studies are comparable to the present study.

Limitations and strengths

The present study has a cross-sectional study design and a highly selected group of patients. Consequently, causal inference about the association between MRI parameters and symptom severity in patients with spinal stenosis cannot be made. With all measurements collected preoperatively, the findings in this study cannot predict future clinical consequences of the MRI findings.

The quality of MR images collected in the NORDSTEN study varies between the institutions and is a possible source of bias when collecting and interpreting the data. Of the 437 collected MRIs, 415 were of adequate quality to be included in this study when investigating Schizas, DSCA, Pfirrmann and tropism. When investigating fatty infiltration of the multifidus, 368 MRIs could be included. Consequently, this study was sufficiently powered to investigate a larger number of MRI parameters than earlier studies with a similar aim. This strengthens our findings. The selection process in this study provides a subject cohort with a large burden of disability, pain and MRI changes. Our findings cannot be generalized to a population not eligible for spinal stenosis surgery.

The dichotomization of the scores in the different classification systems made it easier to differentiate between patients with moderate and severe MRI changes. In addition, it best reflects the challenges physicians face in everyday practice when interpreting MRI findings of patients with LSS. The cut-off values were based on earlier studies where appropriate, but were also chosen pragmatically to ensure an adequate number of subjects in each group for statistical analysis. To test the robustness of the statistical analysis, a trichotomization of DSCA values (< 75 mm², 75–100 mm², > 100 mm²) and Pfirrmann values (1–2, 3–4 and 5) was investigated. This did not alter the conclusion of the primary analysis.

Clinical overall value

When discussing the high degree of degenerative changes in the observed cohort, it must be acknowledged that asymptomatic persons in the same age group also present a high degree of radiological lumbar degeneration [31]. The present paper shows a weak association between MRI findings and patient-reported pain and disability among patients selected for decompression surgery.

Overall, the findings suggest that physicians should not overly emphasize the radiological signs of degenerative changes when giving medical advice to patients with LSS.

Conclusion

The NORDSTEN SST cohort presents a high prevalence of degenerative MRI findings at baseline. The prevalence is similar to observations in former studies. In this cross-sectional study, only weak associations could be detected between investigated MRI parameters and disability/pain before surgery.