Background

The surgical treatment of degenerative disk disease of the lumbar spine (DDL) is characterized by the heterogeneity of indications, techniques and results [1]. At the same time, this type of surgery is accompanied by the greatest failure rate among the main surgeries of the locomotor system [2]. In the process of shared decision-making, doctors have the obligation to inform patients of this fact [3], and patients have the right to receive this information. The content of this information should be the clinical pictures of success and failure (S&F) as well as their relative incidences.

The problem in contemporary medicine is that the concept of S&F has never been precisely defined [4,5,6]. However, an enormous number of decisions are made every day with the objective of avoiding failure or seeking success. These decisions are made not only by patients and doctors but also by payers, hospitals, and industry. To improve the interaction between all of these stakeholders, one requirement is obvious: when one says “failure” or “success”, everybody should understand the same thing.

This scenario offers an opportunity for the utilization of an operational definition. Operational definitions may not be perfect, but they allow an honest and predictable interaction among everybody involved in a process. A good operational definition must balance precision (that is, be based on relevant and well-measured data) and communicability (that is, be expressed in terms that can be understood and make sense to all). The a priori hypothesis of this study is that an operational definition of S&F after lumbar spine surgery that attends to these requirements may be formulated by the combination of satisfaction, pain, and disability measures.

Methods

Population and data collection

This study was performed at Hospital Moinhos de Vento (HMV), a private institution with 485 beds and limited medical staff. HMV has a clinical database for many diseases following the ICHOM criteria [7].

Cases of DDL, such as lumbar disk herniation, lumbar stenosis, degenerative spondylolisthesis and back pain due to disk degeneration treated surgically from May 2019 to February 2021 and followed up to 12 months are included in the database. Patient-reported outcome measures (PROMs) and data relative to the number of levels and surgical technique were also registered.

Ethics and consent

The study was approved by the Ethics Committee of the Moinhos de Vento Hospital (number 4.543.282, CAAE 41454920.3.0000.5330), and only patients who provided a completed informed consent form were included. Data from medical records and PROMs were collected after informed consent was obtained from the participants. All methods in the study were performed in accordance with relevant institutional and national guidelines.

Patient-Reported Outcome Measures (PROMs)

PROMs questionnaires were administered by trained interviewers from the hospital's value management office before and after the procedure (by telephone and/or electronic forms), and satisfaction was measured postoperatively using a four-level Likert scale [8]. Likert levels 1 and 2 (very satisfied or satisfied) classified patients as “satisfied”, and Likert levels 3 and 4 (dissatisfied or very dissatisfied) classified patients as “unsatisfied”.
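The dichotomization of the Likert responses can be sketched as follows. This is only an illustration, not the code used in the study (the analyses were run in R); the function name `is_satisfied` and the integer coding 1 to 4 are assumptions made for the example.

```python
def is_satisfied(likert: int) -> bool:
    """Map a four-level Likert response to the binary satisfaction label.

    Assumed coding (hypothetical): 1 = very satisfied, 2 = satisfied,
    3 = dissatisfied, 4 = very dissatisfied.
    """
    if likert not in (1, 2, 3, 4):
        raise ValueError(f"Likert response must be 1-4, got {likert!r}")
    # Levels 1 and 2 are grouped as "satisfied", levels 3 and 4 as "unsatisfied".
    return likert <= 2
```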

Functional disability related to the lumbar spine was reported by the Oswestry Disability Index (ODI) [9]. Back and leg pain were measured by the Numerical Pain Rating Scale (NPRS, 0 to 10) [10] and are identified as NPRS back and NPRS leg. Quality of life was analyzed by the Euro Quality of Life 5-Dimension Scale (EQ5D) [11]. Inability to work and analgesic use were investigated preoperatively and at 6 and 12 months.

Statistical analysis

Categorical variables were summarized using absolute frequencies and percentages, while continuous variables were analyzed using means, standard deviations, medians and interquartile ranges. To compare proportions, the chi-squared test and Fisher’s exact test were used when appropriate, and the Mann–Whitney U test was used to compare continuous variables.
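As an illustration of the rank-based comparison mentioned above, the Mann–Whitney U statistic can be computed by direct pair counting. This is a minimal sketch: the function name `mann_whitney_u` and the toy scores are hypothetical, no p value is computed, and in practice a statistical package (the study used R) would be used.

```python
def mann_whitney_u(x, y):
    """U statistic of sample x against sample y.

    Counts the pairs (xi, yj) with xi > yj; tied pairs count as one half.
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical ODI-like scores, NOT taken from the study.
group_a = [12, 4, 26, 8]
group_b = [38, 24, 52]
u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
```

The two statistics always add up to the number of pairs, `len(group_a) * len(group_b)`, which is a convenient sanity check.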

Analyses were stratified into pre- and postoperative periods. Subsequently, the postoperative group was divided into satisfied and unsatisfied groups, and comparisons between groups were performed. Furthermore, the postoperative group was also divided into success, incomplete success, incomplete failure and failure, and comparisons between group pairs were developed.

The optimal cutoff values of disability and pain were estimated from the receiver operating characteristic (ROC) curve by minimizing the Euclidean distance between the curve and the point (0, 1) in the ROC space. The ROC curve of pain was built considering the highest value between NPRS back and NPRS leg. Areas under the curve (AUCs) and respective 95% confidence intervals (95% CIs) were also estimated. Sensitivity (Sen), specificity (Spe) and correct classification rate were also calculated for both measures. All analyses were performed using R software, version 4.0.3. Statistical significance was defined as a p value < 0.05.
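The closest-to-(0, 1) criterion can be sketched as below. This is an illustrative re-implementation under stated assumptions, not the R code used in the study: the function name `optimal_cutoff`, the decision rule "score ≥ cutoff predicts unsatisfied", and the toy data are all hypothetical.

```python
from math import hypot


def optimal_cutoff(scores, unsatisfied):
    """Return (cutoff, sensitivity, specificity) for the rule
    'score >= cutoff predicts unsatisfied' whose ROC point lies
    closest to the ideal corner (0, 1)."""
    positives = sum(1 for u in unsatisfied if u)
    negatives = len(unsatisfied) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both satisfied and unsatisfied patients")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, u in zip(scores, unsatisfied) if u and s >= cutoff)
        tn = sum(1 for s, u in zip(scores, unsatisfied) if not u and s < cutoff)
        sen = tp / positives
        spe = tn / negatives
        # The ROC point is (1 - spe, sen); measure its distance to (0, 1).
        dist = hypot(1 - spe, 1 - sen)
        if best is None or dist < best[0]:
            best = (dist, cutoff, sen, spe)
    return best[1], best[2], best[3]


# Hypothetical toy data, NOT taken from the study.
nprs_back = [1, 0, 3, 7, 2, 8, 6, 4, 7]
nprs_leg = [0, 2, 5, 3, 1, 9, 7, 2, 1]
unsat = [False, False, False, True, False, True, True, False, False]

# Pain is taken as the highest value between back and leg pain.
pain = [max(b, l) for b, l in zip(nprs_back, nprs_leg)]
cutoff, sen, spe = optimal_cutoff(pain, unsat)
```

On this toy sample the selected cutoff is 7; the value has no clinical meaning and only shows how the criterion trades sensitivity against specificity.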

Results

During the study period, 486 patients underwent surgery, but 80 (16.4%) of the initial cohort did not respond to follow-up. After exclusion, the clinical cohort included 406 patients. Responders and nonresponders had similar background information (Table 1). Fifty-one surgeons participated in the study, but the majority of patients (343 pts – 84.5%) were operated on by 21 surgeons. The median age was 49.2 years [40.1–60.4], and 50.9% were male with a mean body mass index of 27.4 points. The education level was relatively high for the Brazilian population, with 56.1% having a university degree. The main comorbidities were hypertension (50.9%) and depression (10.8%), and 7.9% were smokers. The surgical techniques used were decompression with fusion (36.6%), simple decompression (31.1%) and automated percutaneous discectomy (26.9%) [12, 13]. Approximately one-quarter (24.4%) of the patients had a history of previous back surgery. All outcome measures showed improvement after the surgery.

Table 1 Baseline characteristics of the patients

Global pre- and postoperative outcomes are presented in Table 2. The ODI improved from 42.0 points [32.0–54.0] preoperatively to 16.0 points [4.4–34.0] postoperatively. Back pain (NPRS back) and leg pain (NPRS leg) decreased from 7.0 [5.2–8.0] and 7.0 [5.0–8.0] points to 3.0 [0.0–6.0] and 3.0 [0.0–6.0], respectively. The regular utilization of opioids decreased from 40.0% in the preoperative period to 19.2% in the postoperative period (p < 0.01). The reduction in the percentage of patients unable to work due to back pain (22.6% to 19.4%, p = 0.26) was not significant.

Table 2 Pre- and postprocedure groups evaluated for pain, disability and quality of life

Satisfied and unsatisfied patients

The outcomes of satisfied (80.7%) and unsatisfied (19.3%) patients are presented in Table 3; the postoperative clinical profiles of the two subgroups differed considerably. The satisfied group presented mean values of NPRS back = 2.0 [0.0–5.0], NPRS leg = 0.0 [0.0–4.0] and mean ODI = 12.0 [4.0–26.0] points. Unsatisfied patients presented mean values of NPRS back = 7.0 [0.0–8.0], NPRS leg = 4.0 [0.0–8.0] and mean ODI = 38.0 [24.0–52.0]. In the satisfied group, practically all parameters improved significantly between the preprocedure and postprocedure values. In the unsatisfied group, on the other hand, there was almost no difference between the preprocedure and postprocedure values.

Table 3 Satisfied and unsatisfied groups evaluated for pain, disability and quality of life

Cutoff values of disability and pain according to satisfaction/unsatisfaction

The sensitivity (Sen) and specificity (Spe) of the disability and pain values used to discriminate between satisfied and unsatisfied patients were studied with ROC curves (Fig. 1); both fell within a narrow range of approximately 75.0% [72.0–77.0]. Both ROC curves [ODI (AUC 0.79) and pain (AUC 0.79)] presented an “ACCEPTABLE” performance (AUC between 0.70 and 0.80) with values close to “GOOD” [14]. The cutoff values for ODI and back/leg pain were 28 and 6, respectively.

Fig. 1 Sensitivity (Sen) and specificity (Spe) of ODI and pain values for S&F patients. *The correct classification rate is the sum of the numbers on the diagonal divided by the sample size in the test data

The data show that approximately 75.0% of satisfied patients presented pain ≤ 5, and 75.0% of unsatisfied patients presented pain ≥ 6 points. At the same time, ~ 75.0% of satisfied patients presented an ODI ≤ 27, and ~ 75.0% of unsatisfied patients presented an ODI ≥ 28 points.
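The correct classification rate defined in the note to Fig. 1 (sum of the diagonal of the confusion matrix divided by the sample size) can be sketched as follows. The function name `correct_classification_rate` and the counts are hypothetical and do not reproduce the study's data.

```python
def correct_classification_rate(confusion):
    """confusion[i][j] = number of patients of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("empty confusion matrix")
    # Correctly classified patients sit on the main diagonal.
    diagonal = sum(confusion[i][i] for i in range(len(confusion)))
    return diagonal / total


# Hypothetical counts, NOT the study's data.
# Rows: truly satisfied, truly unsatisfied.
# Columns: predicted satisfied, predicted unsatisfied.
toy = [[75, 25],
       [5, 15]]
rate = correct_classification_rate(toy)  # (75 + 15) / 120
```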

Success, incomplete success, incomplete failure and failure

The satisfied and unsatisfied groups were further subdivided based on concordance or nonconcordance with the discrimination cutoff values:

  1. Success (59.6%): satisfied with pain and disability levels concordant (NPRS ≤ 5 AND ODI ≤ 27);

  2. Incomplete success (20.4%): satisfied with pain and disability levels nonconcordant (NPRS ≥ 6 AND/OR ODI ≥ 28);

  3. Incomplete failure (7.1%): unsatisfied with pain and disability levels nonconcordant (NPRS ≤ 5 AND/OR ODI ≤ 27);

  4. Failure (12.4%): unsatisfied with pain and disability levels concordant (NPRS ≥ 6 AND ODI ≥ 28).
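The four-category rule can be written as a small decision function. This is a minimal sketch, not the authors' code: the name `classify_outcome` is hypothetical, pain is taken as the highest value between NPRS back and NPRS leg as described in the Methods, and non-integer ODI values between 27 and 28 (not covered by the definitions) are assumed to fall below the cutoff.

```python
PAIN_CUTOFF = 6   # NPRS >= 6 is discordant with satisfaction
ODI_CUTOFF = 28   # ODI >= 28 is discordant with satisfaction


def classify_outcome(satisfied, nprs_back, nprs_leg, odi):
    """Return one of the four outcome categories for a single patient."""
    pain = max(nprs_back, nprs_leg)
    low_pain = pain < PAIN_CUTOFF   # NPRS <= 5
    # Assumption: an ODI strictly below 28 is treated as "<= 27".
    low_odi = odi < ODI_CUTOFF
    if satisfied:
        # Success requires BOTH measures below the cutoffs.
        return "success" if (low_pain and low_odi) else "incomplete success"
    # Failure requires BOTH measures at or above the cutoffs.
    return "failure" if (not low_pain and not low_odi) else "incomplete failure"
```

For example, a satisfied patient with NPRS back 7, NPRS leg 2 and ODI 10 is labeled "incomplete success", because one of the two measures lies above its cutoff.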

The PROMs values of the four categories are presented in Table 4. There was a very significant improvement between preoperative (ODI 42.0 [32.0–54.0], NPRS back 7.0 [5.2–8.0], NPRS leg 7.0 [5.0–8.0]) and postoperative values in the success subgroup (ODI 8.0 [2.0–16.0], NPRS back 1.0 [0.0–3.0], NPRS leg 0.0 [0.0–1.0]), but there was almost no difference between preoperative and postoperative values in the failure subgroup (ODI 44.4 [38.0–54.0], NPRS back 7.0 [6.0–9.0], NPRS leg 7.0 [1.0–9.0]). The mean PROMs values of the incomplete success and incomplete failure subgroups lie in between these two extremes.

Table 4 Failure and success groups evaluated for pain, disability and quality of life

Discussion

We measured satisfaction, pain, and disability in a cohort of 406 patients who underwent surgery for DDL. Based on the combination of PROMs, we created four outcome categories in the following terms: success (59.6%)—satisfied with pain and disability levels concordant (NPRS ≤ 5, AND ODI ≤ 27); incomplete success (20.4%)—satisfied with pain and disability levels nonconcordant (NPRS ≥ 6 AND/OR ODI ≥ 28); incomplete failure (7.1%)—unsatisfied with pain and disability levels nonconcordant (NPRS ≤ 5 AND/OR ODI ≤ 27); and failure (12.4%)—unsatisfied with pain and disability levels concordant (NPRS ≥ 6 AND ODI ≥ 28).

The clinical profile of success (ODI 8.0 [2.0–16.0], NPRS back 1.0 [0.0–3.0], NPRS leg 0.0 [0.0–1.0]) is comparable with the normal healthy population, that is, pain in the range of “no pain” [10] and ODI in the range of the healthy population [9]. At the same time, the clinical profile of failure (ODI 44.4 [38.0–54.0], NPRS back 7.0 [6.0–9.0], NPRS leg 7.0 [1.0–9.0]) demonstrates that these patients remain as sick as they were before surgery. This model seems well adjusted to the common ideas of success (suggestive of normal life) and failure (continuation or worsening of the disease).

It is intuitive that there is not a sharp limit between S&F. Intermediary categories were then created for satisfied patients with pain and disability worse than expected (incomplete success) and for unsatisfied patients with pain and disability better than expected (incomplete failure).

Methodological issues

Our S&F model is based on satisfaction, disability, and pain, with satisfaction as the main criterion. The choice of satisfaction as the primary anchor may be debated. Some authors demonstrate that there is a discrepancy between satisfaction and PROMs [15], while others demonstrate that they correlate well [16]. It is clear that satisfaction correlates better with the final raw scores than with improvement [17]. It was hypothesized that PROMs may not be the best instrument for evaluating satisfaction [18] because satisfaction depends on a complex and wider array of variables, such as physical and mental health, expectations and lifestyle [16, 18]. Some authors chose satisfaction as the main translation of success [16] and were praised for that [19]. Even the concept of minimal clinically important difference (MCID) is based on satisfaction. Satisfaction represents the patient’s most comprehensive evaluation of what occurred [20].

We then chose ODI and NPRS [21] as complementary criteria because they are directly related to the disease. Quality of life is also important in this evaluation, but it is dependent on other social and health factors. EQ5D varies among countries and is difficult to explain in simple words. In the same manner, drug use and work status are also important but were left out of the model because they evaluate the consequences of the disease and not the disease itself.

For the method of this study, we adopted the final raw scores of pain and disability as outcomes. Many authors base their studies on preoperative-to-postoperative variation as well as on MCID [22,23,24]. Previous studies demonstrated that the analysis of S&F based on preoperative-to-postoperative differences or MCID may have some flaws [6]. The results obtained with this strategy are strongly influenced by the severity of preoperative symptoms [25, 26]. Final raw scores correlate better with S&F and are simpler and more objective, and they are not influenced by the intensity of preoperative symptoms [25]. Our model describes “how patients will be at the end of treatment” (final raw scores) and avoids referring to an elusive “minimum clinically significant” improvement.

Another peculiarity of our study was to assess pain considering the highest value between back and leg pain. We assume that the patient’s suffering is better assessed in this manner. Other authors have previously done the same [4].

Translation of numerical values into simple and meaningful terms

The translation of numerical values into simple and meaningful terms is the aim of our study. It is not exactly a “result” because it was not originally extracted from our data. A summary of the available literature will be presented in this section to support our rationale.

Satisfaction was linked to back/leg pain ≤ 5 in our cohort as well as in previous similar studies [23, 27]. Pain scales can be numerical, visual or verbal, and the equivalence among these three forms has already been studied [10, 28]. For a numeric scale, no pain is represented by pain 0 to 2; 3 to 4 is described as mild pain; 6 to 8 is moderate pain and 9 to 10 is severe pain. From the verbal standpoint, pain = 5 is located exactly in the midpoint between mild and moderate pain. However, what is the best word to describe pain = 5?

Zelman and coworkers [29] studied the interference of pain in the life of chronic low back pain patients (sensation of controlled pain, ability to participate in productive activities, decreased irritability, low analgesic intake and willingness to socialize). In this analysis, it was demonstrated that pain = 5 represented the limit between tolerable and intolerable pain. The cutoff value of 5 for back/leg pain was found by us and by other authors. Our data as well as those of the literature support the idea that “tolerable” is an appropriate term to describe pain = 5. According to this information, patients with pain ≤ 5 can be described as having no or only mild to tolerable pain.

In Japan [17], the mean ODI value of patients who were disabled due to spine problems was 26 and 28 points at the ages of 50 and 70 years, respectively. In other studies, the criteria were stricter, and the mean ODI was 21 points for success [25] and 25 points for failure [27]. Most studies based on final raw scores found cutoff values for failure between 22 and 30 points [23, 24, 30].

In short, the pertinent literature indicates the existence of a borderline zone between the ODI values of disabled and nondisabled patients, ranging from 21 to 31 points. In our cohort, an ODI ≤ 27 points was linked to satisfaction, and this value lies within this borderline zone. Therefore, patients with an ODI ≤ 27 points can be described as individuals with no disability or borderline disability.
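The translation of numerical scores into words discussed in this section can be sketched as two small lookup functions. The function names are hypothetical. The wording for values below the cutoffs follows the text above, whereas the wording for values above the cutoffs ("moderate to severe pain", "disability") is borrowed from the definition of failure used in this paper and is an assumption of this sketch.

```python
def describe_pain(nprs):
    """Translate an NPRS score (0 to 10) into plain words."""
    if not 0 <= nprs <= 10:
        raise ValueError("NPRS must be between 0 and 10")
    if nprs <= 5:
        return "no or only mild to tolerable pain"
    return "moderate to severe pain"  # assumed wording for NPRS >= 6


def describe_disability(odi):
    """Translate an ODI score (0 to 100) into plain words."""
    if not 0 <= odi <= 100:
        raise ValueError("ODI must be between 0 and 100")
    # Assumption: an ODI strictly below 28 is treated as "<= 27".
    if odi < 28:
        return "no or only borderline disability"
    return "disability"  # assumed wording for ODI >= 28
```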

Operational definitions

Our results support the description of four operational definitions:

  1. Success: all patients are satisfied and present no or only mild to tolerable pain and no or only borderline disability.

  2. Incomplete success: all patients are satisfied despite levels of pain and/or disability worse than ideal for success.

  3. Incomplete failure: all patients are unsatisfied despite levels of pain and/or disability better than expected for failure.

  4. Failure: all patients are unsatisfied, and all present moderate to severe pain and disability.

The option for an operational definition of S&F

The precise concept (or diagnostic criteria) of S&F after low back surgery has never been and will probably never be defined [4,5,6]. Nonetheless, S&F happen and are widely studied. A PubMed search with the terms “lumbar spine surgery AND failure” returned 3,268 results. Another with “lumbar spine surgery AND success” returned 2,882 results. Concepts or definitions of S&F are based on many PROMs that measure different constructs, so their results are almost never coincident [31]. As a result, patients face a myriad of numbers that are difficult to understand. According to some authors, even doctors have difficulty fully understanding the meaning of these numbers [27].

A process of shared decision-making based on concepts such as “33.0% improvement in ODI” or “to reach MCID in leg pain” is almost impossible. This difficulty is more visible in people with low literacy but can also occur in more educated people [32]. According to Werner et al. [27], patients have a greater ability to understand the percentages of definite types of outcomes than continuous variables. Our method responds to these problems in two ways: a) it divides the possible outcomes into four intuitive types (success, incomplete success, incomplete failure and failure), and b) it replaces the myriad of numeric variables with simple equivalent words.

We emphasized the importance of our operational definitions for communication among all stakeholders of spine surgery. However, there is one specific scenario where this type of definition reaches its most relevant moment: this is the preoperative discussion between patient and doctor concerning the indication of surgery [33]. Patients have the right to be informed, and doctors must be in charge of giving the information concerning all possible outcomes, that is, their relative incidences and clinical characteristics. This information must be as precise as possible and be presented in simple and meaningful terms. This is a prerequisite to ensure that patients can exert their freedom of choice [34, 35].

Possible deficiencies of the study

This study was based on a single institution, so our results need to be replicated and tested to obtain better validation. Our cohort included different diseases (disc herniation, stenosis, etc.), surgical techniques, approaches, and surgeons in one single group. This is in line with a recent tendency of the surgical literature, the so-called science of practice [36, 37]. With this approach, it was already demonstrated, for example, that return to work [38], improvement of pain, disability and quality of life depend more on the patients’ characteristics than on the type of approach, number of levels, use of fusion, surgeon’s experience and other factors [39].

Other criticisms can be made on the lack of attention to relevant clinical aspects such as the relatively short follow-up in patients who underwent fusion and the influence of educational level or previous surgery on the results. The objective of this study, however, is not to describe the rates of success and failure (which truly may depend on timing or many other variables) but rather to describe a manner (simple and communicable) of reporting the basic endpoints of success and failure. The rates of S&F may change, but the way they are described may not.

Finally, there is the problem of reducing all possible outcomes into only 4 categories. The complexity of degenerative disc disease and the heterogeneity of treatments and results deserve a very granular subdivision of possible outcomes. Such a “perfect” definition, on the other hand, would be cumbersome during the process of decision-making. It must be recognized that the broader aim of developing a completely truthful and sophisticated definition of S&F has not proven feasible in the context of lumbar spine surgery. The more complex and sophisticated the definition, the more difficult it is to be understood and communicated, and vice versa. This tradeoff is inevitable. It is the opinion of the authors that the simplicity and communicability of our operational definitions were obtained without compromising precision.

Conclusion

It is possible to report S&F after surgery for DDL with operational definitions based on satisfaction, disability, and pain that are precise, simple, and meaningful to all people involved in the process. Our operational definitions of success, incomplete success, incomplete failure, and failure may improve the process of shared decisions focused on the experience of the patient.