Introduction

Marker-by-treatment analyses are promising new methods in internal medicine [8], but have not yet been implemented in orthopaedics. Results from randomized clinical trials (RCTs) do not account for heterogeneity in treatment effect and, therefore, RCT-based treatment recommendations are not always applicable to the individual patient [13, 18, 23]. The more conventional prognostic models identify the association between a prognostic marker and a good or poor treatment response. However, to select the best treatment for an individual patient, it is important to quantify the benefit of one treatment over the other [8]. A previous marker-by-treatment analysis provided clinicians with an evidence-based method to select the best treatment for ovulatory infertile women [30]. In middle-aged and older patients with a meniscal tear, the results from RCTs show that meniscal surgery has no clinical advantage over non-surgical treatment (such as physical therapy) or placebo surgery. Moreover, meniscal surgery is associated with higher societal and healthcare costs, and a higher risk of serious adverse events [2, 21, 28]. The number of surgeries is slowly decreasing, but surgery is still regularly performed for degenerative meniscal tears [20]. This is partly explained by the belief among some orthopaedic surgeons and patients that surgery is necessary to regain normal knee function in a subgroup of patients [4]. This view is based on a subgroup of non-responders to conservative treatment in RCTs [1, 10]. Several studies recommend exploring the heterogeneity of treatment outcome to better understand the underlying factors that influence individual treatment effects [13, 18, 23]. Previous studies tried to define these subgroups [19, 25]. However, neither multivariable prognostic models [16, 19] nor surgeons’ personal predictions were able to accurately predict treatment outcome [25].

With a marker-by-treatment analysis, the influence of baseline information on the treatment effect can be determined [8, 13]. These predictive factors, or treatment selection markers, represent baseline information regarding patient characteristics, physical and radiological examination findings or patient-reported outcomes. The relevant interactions between the baseline characteristics (markers) and treatment outcome can be plotted in a marker-by-treatment predictiveness curve [8]. The analysis provides specific cut-off points that can potentially identify the superior of two interventions. The baseline characteristics that can accurately differentiate between the outcomes of the interventions are considered relevant treatment selection markers. These treatment selection markers can guide personalized treatment choices, based on a patient’s individual characteristics [8].

Previous prognostic models were unable to accurately predict treatment outcome. Therefore, this study aimed to introduce this novel approach in orthopaedic research and to identify relevant treatment selection markers that affect treatment outcome following meniscal surgery or physical therapy in patients with degenerative meniscal tears. Analysing patients’ baseline characteristics using this method can help clinicians to select the treatment that is potentially the most beneficial for an individual patient.

Materials and methods

The Medical Research Ethics Committees United (MEC-U; NL44188.100.13) approved the ESCAPE trial and the trial was registered (clinicaltrials.gov: NCT01850719 and The Netherlands Trial Register: NTR3908). All patients provided written informed consent before randomization.

To identify potential treatment selection markers, the data from the ESCAPE trial were used. The ESCAPE trial is a multi-center RCT comparing meniscal surgery with physical therapy in patients over 45 years old with a symptomatic degenerative meniscal tear who do not experience locking of the knee [26]. Patients were randomly allocated to meniscal surgery or physical therapy. Exclusion criteria were severe osteoarthritis (Kellgren and Lawrence score of 4, presenting significant osteophytes, joint-space narrowing, sclerosis, and bone ends abnormality) [11], a body mass index (BMI) > 35 kg/m², prior surgery to the index knee (with the exception of diagnostic arthroscopic surgery), or clinically relevant anterior or posterior cruciate ligament insufficiency. Meniscal surgery, in which the damaged part of the meniscus was removed, was performed within 6 weeks after randomization. Physical therapy consisted of a predefined incremental exercise protocol of 16 sessions over 8 weeks (Supplement 1) [27]. For patients with persistent knee symptoms after the intervention, additional physical therapy sessions could be attended or a delayed meniscectomy could be planned, depending on a shared decision after consultation with the orthopaedic surgeon. Further details of the interventions can be found in the study protocol of the ESCAPE trial [26, 27].

Selection of baseline characteristics for treatment selection

Baseline characteristics were preselected as possible treatment selection markers from an extensive list of baseline variables that were available (Supplement 2). First, a literature search was conducted to identify factors associated with the treatment outcome in patients with a meniscal tear. The search strategy can be found in Supplement 3. Second, an electronic survey was sent to an expert panel of orthopaedic surgeons (N = 24), physical therapists (N = 22) and patients (N = 10) who were involved in the ESCAPE trial to identify the treatment selection factors they considered most relevant. The final selection of baseline characteristics that were analysed as potential treatment selection markers consisted of variables with a continuous outcome that were identified by the literature and/or chosen by the expert panel as variables associated with the treatment outcome [12].

Potential treatment selection markers included patients’ demographics (age, education level, BMI), patient-reported outcome measures (the International Knee Documentation Committee Subjective Knee Form (IKDC) for knee function, pain intensity during activities on a Visual Analogue Scale (VAS), the RAND-36 Physical Component Scale (PCS) for general physical health, and the patient’s expectation for pain relief following treatment), and radiographic information (the Kellgren–Lawrence score for osteoarthritis [11], determined on a weight-bearing radiograph in a posterior-anterior direction).

Patient involvement

Ten patients who were involved in the ESCAPE trial were surveyed. They were asked to select, from the list of variables measured within the ESCAPE trial, the treatment selection factors they considered most relevant.

Treatment outcome

The treatment outcome of interest was a clinically relevant improvement in patient-reported knee function. For short-term effects, 3 months was identified by the patients and clinicians as an important time point. Long-term effects were analysed at 12 and 24 months. Patient-reported knee function was measured with the IKDC questionnaire [7]. The IKDC is a validated and reliable questionnaire with good responsiveness in patients with degenerative meniscal tears [17, 29]. Clinically relevant improvement was defined as an improvement of at least the minimal important change (MIC) of the IKDC, which is 11 points for this patient population [17]. For this study, patients were divided into two groups at 3, 12, and 24 months follow-up: (1) patients who experienced a clinically relevant improvement in knee function (an improvement ≥ 11 IKDC points; i.e. good outcome) and (2) patients who did not experience a clinically relevant improvement in knee function (a deterioration or an improvement < 11 IKDC points; i.e. poor outcome).
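The dichotomization described above can be illustrated with a minimal sketch. The function name and the example scores are hypothetical and serve only to show how the 11-point MIC separates a good from a poor outcome; they are not trial data.

```python
MIC_IKDC = 11.0  # minimal important change in IKDC points, as stated in the text


def classify_outcome(baseline_ikdc, followup_ikdc, mic=MIC_IKDC):
    """Return 'good' if the IKDC score improved by at least `mic` points, else 'poor'."""
    change = followup_ikdc - baseline_ikdc
    return "good" if change >= mic else "poor"


# Hypothetical patients (baseline IKDC, follow-up IKDC); not trial data.
examples = [(45.0, 60.0), (45.0, 55.9), (50.0, 42.0), (40.0, 51.0)]
labels = [classify_outcome(baseline, followup) for baseline, followup in examples]
print(labels)
```

A deterioration and an improvement below 11 points both fall in the poor outcome group, which mirrors the two groups defined above.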

Data processing and statistical analysis

For each preselected baseline characteristic with a continuous outcome, a separate logistic regression model (treatment selection model) was developed to predict the outcome using (1) the baseline characteristic (marker), (2) the allocated treatment (meniscal surgery or physical therapy), and (3) a marker-by-treatment interaction term. Baseline characteristics whose interaction term had a p value < 0.1 were considered potential treatment selection markers [3, 22].
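To make the structure of such a treatment selection model concrete, the sketch below writes out a logistic model with a marker, a treatment indicator and their interaction, and derives the marker value at which both treatments carry the same predicted risk. It assumes the modelled event is a poor outcome and that surgery is coded 1 and physical therapy 0. The coefficients are hypothetical and chosen for illustration only; they are not estimates from the trial, and the sketch does not perform the model fitting or the p value screening.

```python
import math


def risk_poor_outcome(marker, surgery, b0, b_marker, b_treat, b_inter):
    """Predicted risk of a poor outcome from a logistic model with interaction.

    surgery is 1 for meniscal surgery and 0 for physical therapy (assumed coding).
    """
    linear = b0 + b_marker * marker + b_treat * surgery + b_inter * marker * surgery
    return 1.0 / (1.0 + math.exp(-linear))


def crossing_threshold(b_treat, b_inter):
    """Marker value where both treatments give equal risk: b_treat + b_inter * m = 0."""
    if b_inter == 0:
        return None  # without interaction the two curves never cross
    return -b_treat / b_inter


# Hypothetical coefficients (illustration only, not estimates from the trial).
coef = dict(b0=1.0, b_marker=-0.03, b_treat=-0.8, b_inter=0.02)
threshold = crossing_threshold(coef["b_treat"], coef["b_inter"])

for marker in (30.0, 50.0):
    r_pt = risk_poor_outcome(marker, 0, **coef)
    r_surgery = risk_poor_outcome(marker, 1, **coef)
    print(marker, round(r_pt, 3), round(r_surgery, 3))
```

With these illustrative coefficients the predicted risk of a poor outcome is lower with surgery below the crossing point and lower with physical therapy above it; the direction reverses if the signs of the treatment and interaction coefficients are swapped.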

These potential treatment selection markers were further explored using predictiveness curves. These predictiveness curves present the risk of a poor outcome (no clinically relevant improvement, or deteriorated knee function) for both treatments. Furthermore, they provide information on the performance of the potential treatment selection markers in guiding treatment decisions, the so-called summary measures [8]. A detailed explanation of a predictiveness curve is provided in Supplement 4. The performance of the potential treatment selection markers was analysed under the assumption that physical therapy is the standard treatment, as suggested by the current guidelines [21, 24].
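As a rough illustration of what a predictiveness curve plots, the sketch below computes curve coordinates for a handful of marker values. It assumes, following the figure legend, that the horizontal axis is the proportion of patients scoring below a given marker value and that the vertical axis is the model-based risk of a poor outcome under each treatment. The model coefficients and marker values are hypothetical, and this is not the implementation used by the software cited in this paper.

```python
import math

# Hypothetical coefficients of a treatment selection model (illustration only).
B0, B_MARKER, B_TREAT, B_INTER = 1.0, -0.03, -0.8, 0.02


def risk(marker, surgery):
    """Model-based risk of a poor outcome; surgery is 1 (surgery) or 0 (physical therapy)."""
    linear = B0 + B_MARKER * marker + B_TREAT * surgery + B_INTER * marker * surgery
    return 1.0 / (1.0 + math.exp(-linear))


def predictiveness_curve(markers):
    """Return (proportion below, marker value, risk with PT, risk with surgery) per patient."""
    ordered = sorted(markers)
    n = len(ordered)
    points = []
    for value in ordered:
        proportion_below = sum(1 for m in ordered if m < value) / n
        points.append((proportion_below, value, risk(value, 0), risk(value, 1)))
    return points


# Hypothetical baseline marker scores; not trial data.
curve = predictiveness_curve([60, 20, 50, 30, 40])
for proportion, value, risk_pt, risk_surgery in curve:
    print(f"{proportion:.1f}  marker={value}  PT={risk_pt:.3f}  surgery={risk_surgery:.3f}")
```

Plotting the two risk columns against the proportion column gives one curve per treatment; the point where the curves cross corresponds to the marker positivity threshold.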

The summary measures provide information on:

  1. Marker positivity threshold: the threshold value of the baseline score of the potential treatment selection marker. Above this value patients would receive a recommendation for physical therapy, below this value a recommendation for meniscal surgery;

  2. Marker positivity rate: the proportion of patients with a marker value greater than the marker positivity threshold. For this proportion of the population, physical therapy has an advantage over meniscal surgery;

  3. Marker negativity rate: the proportion of patients with a marker value smaller than the marker positivity threshold. For this proportion of the population, meniscal surgery has an advantage over physical therapy, and a change from standard care (i.e. physical therapy) to meniscal surgery would be recommended;

  4. Average benefit physical therapy: the average benefit of physical therapy in patients with a marker value above the marker positivity threshold. This measure evaluates the effect of physical therapy compared to meniscal surgery on the outcome;

  5. Average benefit meniscal surgery: the average benefit of meniscal surgery in patients with a marker value below the marker positivity threshold. This measure evaluates the effect of meniscal surgery, if the treatment decision was guided by the model, in terms of the expected decrease in patients with a poor outcome;

  6. Decrease in rate of poor outcome: the estimated change in the outcome in our population if treatment decisions are guided by the model, compared to the outcome when all patients are treated according to standard care (i.e. physical therapy). This measure provides information on the decrease in the percentage of patients with a poor outcome when treatment is decided on the basis of the treatment selection models.
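The summary measures listed above can be sketched from per-patient predicted risks. The example below is a simplified, model-based illustration under stated assumptions: each patient has a predicted risk of a poor outcome with physical therapy and with surgery, a patient counts as marker-positive when physical therapy gives the lower (or equal) risk, and the function name and input values are hypothetical. It is not the implementation of the software used in this study and it omits confidence intervals.

```python
def summary_measures(risks):
    """Summary measures from (risk_pt, risk_surgery) pairs of predicted poor-outcome risk."""
    n = len(risks)
    # Benefit of physical therapy: how much lower the risk is than with surgery.
    deltas = [risk_surgery - risk_pt for risk_pt, risk_surgery in risks]
    benefit_pt = [d for d in deltas if d >= 0]        # marker-positive patients
    benefit_surgery = [-d for d in deltas if d < 0]   # marker-negative patients

    positivity_rate = len(benefit_pt) / n
    negativity_rate = len(benefit_surgery) / n
    avg_benefit_pt = sum(benefit_pt) / len(benefit_pt) if benefit_pt else 0.0
    avg_benefit_surgery = (
        sum(benefit_surgery) / len(benefit_surgery) if benefit_surgery else 0.0
    )
    # Compared with treating everyone with physical therapy (standard care), only
    # marker-negative patients change treatment, so the population-level decrease
    # is their share of the population times their average benefit from surgery.
    decrease_poor_outcome = negativity_rate * avg_benefit_surgery

    return {
        "positivity_rate": positivity_rate,
        "negativity_rate": negativity_rate,
        "avg_benefit_pt": avg_benefit_pt,
        "avg_benefit_surgery": avg_benefit_surgery,
        "decrease_poor_outcome": decrease_poor_outcome,
    }


# Hypothetical predicted risks (physical therapy, surgery); not trial data.
example_risks = [(0.60, 0.40), (0.50, 0.45), (0.40, 0.50), (0.30, 0.50)]
measures = summary_measures(example_risks)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

The last measure shows why a sizeable average benefit of surgery in a small marker-negative subgroup can still translate into only a small decrease in poor outcomes across the whole population.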

All analyses were performed on the intention-to-treat data. The data analyses for the predictiveness curves were performed using RStudio, version 1.2.1335 (RStudio Inc., Boston, MA, USA) and the R package ‘TreatmentSelection’ [8].

The sample size was determined and calculated for the RCT in which the patients’ data were collected. The details on the sample size calculations can be found in previous publications [26, 27].

Results

Participants

Three hundred and twenty-one patients were included in the study. The mean (SD) age was 57.5 (6.6) years, and 161 (50.5%) participants were female. A total of 158 patients were allocated to surgery and 161 to physical therapy. Both groups showed comparable baseline characteristics for the potential treatment selection markers (Table 1). Main results and a detailed flow chart of the ESCAPE trial were previously published [26].

Table 1 Patients’ baseline characteristics

At 3 months follow-up, 57.0% (meniscal surgery) and 52.2% (physical therapy) of the patients had a clinically relevant improvement in knee function (≥ 11 IKDC points). At 12 months, this was 70.3% (surgery) and 54.7% (physical therapy), and at 24 months, this was 70.9% (surgery) and 65.8% (physical therapy). This shows that over time more patients achieved a clinically important improvement in knee function. In the physical therapy arm, 43 patients (27%) received delayed meniscectomy within 24 months.

Treatment selection markers

Potential treatment selection markers at baseline were general physical health (p = 0.01), pain during activities (p = 0.02) and knee function (p = 0.07) for the outcome at 3 months; BMI (p = 0.05) and age (p = 0.06) for the outcome at 12 months; and age (p = 0.05) for the outcome at 24 months (Table 2).

Table 2 Logistic regression analyses for interaction between the baseline characteristics and treatment at 3, 12, and 24 months

Prediction curves for potential treatment selection markers

These potential treatment selection markers were further explored with predictiveness curves. Figures 1, 2, 3, 4, 5 show the predictiveness curves at 3, 12, and 24 months for the following markers: general physical health, knee function, pain intensity during activities, age, and BMI.

Fig. 1

Patients with a score above the threshold would improve more from physical therapy. The marker-by-treatment interaction at 3 months is significant (p = 0.01) with a corresponding marker positivity threshold of 40.7 points. At 12 and 24 months follow-up the marker-by-treatment interactions are no longer significant. Therefore, general physical health is not useful for treatment selection in the longer term.

Fig. 2

Patients with a score above the threshold would improve more from physical therapy. The marker-by-treatment interaction at 3 months is significant (p = 0.07) with a corresponding marker positivity threshold of 50.6 points. At 12 and 24 months follow-up the marker-by-treatment interactions are no longer significant. Therefore, knee function is not useful for treatment selection in the longer term.

Fig. 3

Patients with a score above the threshold would improve more from physical therapy. The marker-by-treatment interaction at 3 months is significant (p = 0.07) with a corresponding marker positivity threshold of 53.9 points. At 12 and 24 months follow-up the marker-by-treatment interactions are no longer significant. Therefore, pain intensity during activities is not useful for treatment selection in the longer term.

Fig. 4

Patients with a score above the threshold would improve more from meniscal surgery. The marker-by-treatment interaction for the marker age is not significant at 3 months. However, at 12 and 24 months follow-up the marker-by-treatment interactions are significant (12 months p = 0.06; 24 months p = 0.05). The corresponding marker positivity threshold is 49 years at 12 months follow-up and 53 years at 24 months follow-up.

Fig. 5

Patients with a score above the threshold would improve more from meniscal surgery. The marker-by-treatment interaction for body mass index is not significant at 3 months. At 12 months follow-up the marker-by-treatment interaction is significant (p = 0.05) with a corresponding marker positivity threshold of 22.3 kg/m². However, at 24 months follow-up the marker-by-treatment interaction is no longer significant. Therefore, body mass index is not useful for treatment selection in the short and long term.

Marker-by-treatment predictiveness curves for the outcome at 3, 12, and 24 months. Abbreviations: PT physical therapy, APM arthroscopic partial meniscectomy (surgery). Predictiveness curves present, for individual patients with a given marker score, the risk of the outcome of interest under each treatment. The X-axis displays the proportion of patients who score below the corresponding marker value; the corresponding marker value is the raw marker score. The Y-axis displays the individual patient’s risk of the outcome with physical therapy and with arthroscopic partial meniscectomy.

The sloping lines in Fig. 1 indicate that the risk of poor outcome increases with higher levels of baseline general physical health. This applies to both treatments. The intersection in Fig. 1a indicates that at 3 months, patients with baseline values below 40.7/100 were more likely to benefit from surgery and patients with baseline values above 40.7/100 were more likely to benefit from physical therapy. The curves at 12 and 24 months follow-up sloped less, indicating a similar risk of poor outcome across baseline values. Figure 2 shows a similar pattern for the marker knee function. The curves at 12 and 24 months run largely parallel, indicating that the effect of this baseline marker was similar for both treatments. The downward-sloping lines in Fig. 3 indicate that the risk of poor outcome decreased with higher levels of pain at baseline. The intersection in Fig. 3a indicates that at 3 months, patients with baseline VAS scores below 53.9/100 were more likely to benefit from surgery and patients with baseline VAS scores above 53.9/100 were more likely to benefit from physical therapy. No such intersection indicating a potentially relevant cut-off for baseline pain was observed at 12 and 24 months. For age and BMI, all curves were rather horizontal but diverged with increasing baseline marker values, with half of them reaching statistical significance. This indicates that the benefit of surgery compared to physical therapy was largest for patients with the highest age and BMI. For all markers, the predictiveness curves were inconsistent over time.

The summary measures of the predictiveness curves are presented in Table 3. This provides information on marker positivity threshold, average benefit physical therapy, average benefit surgery, marker positivity rate, marker negativity rate, and decrease in rate of poor outcome.

Table 3 Summary measures of predictiveness curves for pain during activities

Discussion

The most important finding of the present study was that the identification of potential treatment selection markers did not result in clear clinical subgroups of patients who are substantially more likely to benefit from either surgery or physical therapy. Therefore, treatment decisions for patients with a degenerative meniscal tear cannot be based on the treatment selection markers evaluated in the current study.

The published randomized clinical trials that compared surgical with conservative treatment for degenerative meniscal tears revealed small and clinically non-meaningful benefits of meniscal surgery over physical therapy for patient-reported knee function [5, 6, 9, 15, 26, 32]. However, due to potential heterogeneity in treatment effects, this does not necessarily imply that individual patients cannot have a clinically relevant improvement from meniscal surgery compared to physical therapy [23].

The present study revealed that the average benefit that individual patients would experience from meniscal surgery is small (ranging from 5.2 to 14.8%) if treatment were based on these markers. Similar to the results from the RCTs, the increased benefit that some patients may experience from meniscal surgery compared to physical therapy is not convincing, since these benefits were small and not consistently present at all follow-up time points.

No studies were found that analysed the variation in treatment effect for a musculoskeletal disorder by performing a marker-by-treatment analysis on RCT baseline data. One cohort study on prognostic factors was identified; it found worse outcomes at 1 and 2 years after surgery in case of complex tears, larger extrusion, cartilage injuries, and larger meniscal excision, but without comparison to physical therapy [14]. Another study in a similar population investigated computer-based multivariable prognostic models to identify a subgroup of patients who might benefit from meniscectomy [19]. The multivariable prognostic models did not accurately predict treatment outcome 1 year after surgery, and the study did not consider specific cut-off points that can potentially differentiate between the outcomes of the two treatments. In another study, the orthopaedic surgeons’ ability to predict treatment outcome in patients with degenerative meniscal tears was analysed for both physical therapy and meniscectomy [25]. Similar to the current findings, neither of these prediction studies was able to identify any subgroup of patients who might benefit from a meniscectomy or physical therapy in the longer term [25].

To our knowledge, this is the first RCT-based marker-by-treatment analysis that assessed the differential treatment effect of potentially relevant baseline variables for predicting clinically relevant improvement of knee function in patients with a degenerative meniscal tear. For treatment decision making, this type of prediction study may be preferable to the more common multivariable prediction studies [8]. Marker-by-treatment analyses focus on predicting the difference in outcomes between the two treatments, rather than only predicting the outcome for one treatment. Therefore, the analyses help clinicians and patients select the best treatment to optimize the outcomes. The selection markers deemed relevant by patients in this study aim to direct the choice of treatment based on specific baseline characteristics and the corresponding marker cut-off values.

Several limitations should be mentioned. First, the observed treatment thresholds did not account for potential adverse events resulting from surgery as an alternative to physical therapy. In other words, the treatment benefit from surgery is overestimated because the risk of surgical complications is neglected [2]. Second, due to the trial-based approach, the available baseline characteristics were restricted. Some potentially important predictors, such as objective knee function, muscle strength, and the duration of symptoms [10], could not be included in our analyses. Although these factors may be viewed as relevant for treatment outcome, prior prognostic models that did include these variables could also not accurately predict treatment outcome in this population [10, 19]. Trials that included these prognostic variables are recommended to perform a marker-by-treatment analysis including these variables. Also, the trial-based approach might have resulted in insufficient power for marker-by-treatment analysis due to the size of the RCT cohort. Third, the primary interest concerned the cut-off point on a predictiveness curve that distinguishes between a better outcome after surgery or after physical therapy based on a patient’s baseline score. Therefore, only continuous variables could be included, and dichotomous and categorical variables such as sex, joint line tenderness and tear type were not addressed.

Clinical implications

In general, marker-by-treatment analyses determine whether baseline characteristics can be used in making treatment decisions for individual patients. The predictiveness curves and their performance measures show the amount of benefit that individual patients will have from a treatment if the treatment decision for that patient is based upon the treatment selection marker [8]. A threshold value is derived that differentiates between a favourable outcome for either of the compared treatments. Although such thresholds are rarely used as treatment decision tools in clinical practice, this information can be of high value to clinicians and policy makers who are seeking evidence-based decision tools to weigh treatment benefit for individual patients against the risk of adverse events and healthcare costs [13]. So, instead of using mean outcomes of RCTs to make treatment decisions, patients and clinicians can potentially base their treatment decision for an individual patient upon the treatment selection marker.

In patients with degenerative meniscal tears, our marker-by-treatment analyses revealed only baseline characteristics associated with a small increase in the likelihood of a better treatment outcome after meniscal surgery at each follow-up time point. As the combination of characteristics varies among patients, combining these potential selection markers may be more accurate. Future research, combining the individual data from the published RCTs in an individual patient data meta-analysis [31], may be able to identify such subgroups and could steer towards an even more individualized approach.

The opinion that surgery is necessary to regain normal knee function in selected patients is not supported by our study or by previous scientific evidence, in which no such subgroups could be identified [4, 19, 25]. Therefore, physical therapy is recommended as initial treatment for all patients with degenerative meniscal tears.

Conclusion

A marker-by-treatment analysis was successfully conducted in orthopaedic research. No subgroups were found in this study that benefit more from surgery throughout the follow-up period. Physical therapy should be considered the first-choice treatment in all patients over 45 years old with degenerative meniscal tears who do not experience locking of the knee. Although the treatment selection markers had clear thresholds, none of the markers maintained a predictive effect over time. Therefore, treatment decisions for patients with a degenerative meniscal tear cannot be based on the treatment selection markers studied in this trial.