An individualized decision between physical therapy or surgery for patients with degenerative meniscal tears cannot be based on continuous treatment selection markers: a marker-by-treatment analysis of the ESCAPE study

Purpose Marker-by-treatment analyses are promising new methods in internal medicine, but have not yet been implemented in orthopaedics. With this analysis, specific cut-off points may be obtained that can potentially identify whether meniscal surgery or physical therapy is the superior intervention for an individual patient. This study aimed to introduce a novel approach in orthopaedic research to identify relevant treatment selection markers that affect treatment outcome following meniscal surgery or physical therapy in patients with degenerative meniscal tears.
Methods Data were analysed from the ESCAPE trial, which assessed the treatment of patients over 45 years old with a degenerative meniscal tear. The treatment outcome of interest was a clinically relevant improvement on the International Knee Documentation Committee Subjective Knee Form at 3, 12, and 24 months follow-up. Logistic regression models were developed to predict the outcome using baseline characteristics (markers), the treatment (meniscal surgery or physical therapy), and a marker-by-treatment interaction term. Interactions with p < 0.10 were considered potential treatment selection markers and were used to develop predictiveness curves, which provide thresholds to identify marker-based differences in clinical outcomes between the two treatments.
Results Potential treatment selection markers included general physical health, pain during activities, knee function, BMI, and age. While some marker-based thresholds could be identified at 3, 12, and 24 months follow-up, none of the baseline characteristics were consistent markers at all three follow-up times.
Conclusion This novel in-depth analysis did not result in clear clinical subgroups of patients who are substantially more likely to benefit from either surgery or physical therapy. However, this study may serve as an exemplar for other orthopaedic trials to investigate the heterogeneity in treatment effect. It will help clinicians to quantify the additional benefit of one treatment over another at an individual level, based on the patient's baseline characteristics.
Level of evidence II.
Supplementary Information The online version contains supplementary material available at 10.1007/s00167-021-06851-x.


Introduction
Marker-by-treatment analyses are promising new methods in internal medicine [8], but have not yet been implemented in orthopaedics. Results from randomized clinical trials (RCTs) do not account for the heterogeneity in treatment effect and, therefore, RCT-based treatment recommendations are not always applicable to the individual patient [13,18,23]. The more conventional prognostic models identify the association between a prognostic marker and a good or poor treatment response. However, to select the best treatment for an individual patient, it is important to quantify the benefit of one treatment over the other [8]. A previous marker-by-treatment analysis provided clinicians with an evidence-based method to select the best treatment for ovulatory infertile women [30]. In middle-aged and older patients with a meniscal tear, the results from RCTs show that meniscal surgery has no clinical advantage over non-surgical treatment (such as physical therapy) or placebo surgery. Moreover, meniscal surgery is associated with higher societal and healthcare costs, and a higher risk of serious adverse events [2,21,28]. The number of surgeries is slowly decreasing, but surgery is still regularly performed for degenerative meniscal tears [20]. This is partly explained by the belief among some orthopaedic surgeons and patients that surgery is necessary to regain normal knee function in a subgroup of patients [4]. This view is based on a subgroup of non-responders to conservative treatment in RCTs [1,10]. Several studies recommend exploring the heterogeneity of treatment outcome to better understand the underlying factors that influence individual treatment effects [13,18,23]. Previous studies tried to define these subgroups [19,25]. However, neither multivariable prognostic models [16,19] nor surgeons' personal predictions were able to accurately predict treatment outcome [25].
With a marker-by-treatment analysis, the influence of baseline information on the treatment effect can be determined [8,13]. These predictive factors, or treatment selection markers, represent baseline information regarding patient characteristics, physical and radiological examination findings, or patient-reported outcomes. The relevant interactions between the baseline characteristics (markers) and treatment outcome can be plotted in a marker-by-treatment predictiveness curve [8]. The analysis provides specific cut-off points that can potentially identify the superior of two interventions. The baseline characteristics that can accurately differentiate between the outcomes of the interventions are considered relevant treatment selection markers. These treatment selection markers can guide personalized treatment choices, based on a patient's individual characteristics [8].
Previous prognostic models were unable to accurately predict treatment outcome. Therefore, this study aimed to introduce this novel approach in orthopaedic research and to identify relevant treatment selection markers that affect treatment outcome following meniscal surgery or physical therapy in patients with degenerative meniscal tears. Analysing patients' baseline characteristics using this method can help clinicians to select the treatment that is potentially the most beneficial for an individual patient.

Materials and methods
The Medical Research Ethics Committees United (MEC-U; NL44188.100.13) approved the ESCAPE trial and the trial was registered (ClinicalTrials.gov: NCT01850719 and The Netherlands Trial Register: NTR3908). All patients provided written informed consent before randomization.
To identify potential treatment selection markers, the data from the ESCAPE trial were used. The ESCAPE trial is a multi-center RCT comparing meniscal surgery with physical therapy in patients over 45 years old with a symptomatic degenerative meniscal tear who do not experience locking of the knee [26]. Patients were randomly allocated to meniscal surgery or physical therapy. Exclusion criteria were severe osteoarthritis (Kellgren and Lawrence score of 4, presenting significant osteophytes, joint-space narrowing, sclerosis, and bone ends abnormality) [11], a body mass index (BMI) > 35 kg/m², prior surgery to the index knee (with the exception of diagnostic arthroscopic surgery), or clinically relevant anterior or posterior cruciate ligament insufficiency. Meniscal surgery, in which the damaged part of the meniscus was removed, was performed within 6 weeks after randomization. Physical therapy consisted of a predefined incremental exercise protocol of 16 sessions over 8 weeks (Supplement 1) [27]. For patients with persistent knee symptoms after the intervention, additional physical therapy sessions could be attended or a delayed meniscectomy could be planned, depending on a shared decision after consultation with the orthopaedic surgeon. Further details of the interventions can be found in the study protocol of the ESCAPE trial [26,27].

Selection of baseline characteristics for treatment selection
Baseline characteristics were preselected as possible treatment selection markers from an extensive list of baseline variables that were available (Supplement 2). First, a literature search was conducted to identify factors associated with the treatment outcome in patients with a meniscal tear. The search strategy can be found in Supplement 3. Second, an electronic survey was sent to an expert panel of orthopaedic surgeons (N = 24), physical therapists (N = 22) and patients (N = 10) who were involved in the ESCAPE trial to identify the most relevant treatment selection factors according to their opinion. The final selection of baseline characteristics that were analysed as potential treatment selection markers consisted of variables with a continuous outcome that were identified by the literature and/or chosen by the expert panel as variables associated with the treatment outcome [12].
Potential treatment selection markers included patients' demographics (age, education level, BMI), patient-reported outcome measures (the International Knee Documentation Committee Subjective Knee Form (IKDC) for knee function, pain intensity during activities on a Visual Analogue Scale (VAS), the RAND-36 Physical Component Scale (PCS) for general physical health, and the patient's expectation for pain relief following treatment), and radiographic information (the Kellgren-Lawrence score for osteoarthritis [11], determined on a weight-bearing radiograph in a posteroanterior direction).

Patient involvement
Ten patients who were involved in the ESCAPE trial were surveyed. They were asked to select, from the list of variables measured within the ESCAPE trial, the treatment selection factors that they considered relevant.

Treatment outcome
The treatment outcome of interest was a clinically relevant improvement in patient-reported knee function. For short-term effects, 3 months was identified by the patients and clinicians as an important time point. Long-term effects were analysed at 12 and 24 months. Patient-reported knee function was measured with the IKDC questionnaire [7]. The IKDC is a validated and reliable questionnaire with good responsiveness in patients with degenerative meniscal tears [17,29]. Clinically relevant improvement was defined as an improvement of at least the minimal important change (MIC) of the IKDC, which is 11 points for this patient population [17]. For this study, patients were divided into two groups at 3, 12, and 24 months follow-up: (1) patients who experienced a clinically relevant improvement in knee function (an improvement ≥ 11 IKDC points; i.e. good outcome) and (2) patients who did not experience a clinically relevant improvement in knee function (a deterioration or an improvement < 11 IKDC points; i.e. poor outcome).
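As a minimal illustration of the dichotomization described above, the following Python sketch labels an outcome from a baseline and a follow-up IKDC score. The function and variable names are hypothetical and are not taken from the ESCAPE analysis; only the 11-point MIC and the ≥ 11 rule come from the text.

```python
# Minimal sketch (hypothetical names): dichotomize the change in IKDC score
# into a good or poor outcome using the MIC of 11 points stated in the text.
MIC_IKDC = 11


def classify_outcome(baseline_ikdc, followup_ikdc, mic=MIC_IKDC):
    """Return 'good' if the improvement is at least `mic` points, else 'poor'.

    A deterioration (negative change) is also classified as 'poor'.
    """
    change = followup_ikdc - baseline_ikdc
    return "good" if change >= mic else "poor"


if __name__ == "__main__":
    # Illustrative scores only, not trial data.
    print(classify_outcome(45, 56))  # improvement of exactly 11 points
    print(classify_outcome(45, 50))  # improvement below the MIC
    print(classify_outcome(60, 50))  # deterioration
```

The same rule would be applied separately at each follow-up time point (3, 12, and 24 months).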

Data processing and statistical analysis
For the preselected baseline characteristics with a continuous outcome, separate logistic regression models were developed (treatment selection models) to predict the outcome using (1) the baseline characteristic (marker), (2) the allocated treatment (meniscal surgery or physical therapy), and (3) a marker-by-treatment interaction term. Interactions with a p value for association < 0.1 were considered as potential treatment selection markers [3,22].
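To make the structure of such a treatment selection model concrete, the sketch below writes out a logistic model with a marker-by-treatment interaction and the marker value at which the two predicted risk curves cross. It assumes that the model is expressed for the risk of a poor outcome and that treatment is coded 1 for meniscal surgery and 0 for physical therapy; the coefficients are made up for illustration and are not estimates from the ESCAPE data.

```python
import math


def risk_poor_outcome(marker, surgery, b0, b_marker, b_treat, b_inter):
    """Predicted risk of a poor outcome from a logistic model with interaction.

    logit(risk) = b0 + b_marker*marker + b_treat*surgery + b_inter*marker*surgery
    `surgery` is 1 for meniscal surgery and 0 for physical therapy (assumed coding).
    """
    logit = b0 + b_marker * marker + b_treat * surgery + b_inter * marker * surgery
    return 1.0 / (1.0 + math.exp(-logit))


def crossing_threshold(b_treat, b_inter):
    """Marker value at which both treatments have the same predicted risk.

    The treatment difference on the logit scale is b_treat + b_inter*marker,
    which equals zero at marker = -b_treat / b_inter. Returns None when there
    is no interaction, because the curves then never cross.
    """
    if b_inter == 0:
        return None
    return -b_treat / b_inter


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only so that the curves cross at 40.
    coef = dict(b0=-2.0, b_marker=0.05, b_treat=-2.0, b_inter=0.05)
    print("threshold:", crossing_threshold(coef["b_treat"], coef["b_inter"]))
    for marker in (30, 50):
        pt = risk_poor_outcome(marker, 0, **coef)
        surgery = risk_poor_outcome(marker, 1, **coef)
        print(marker, round(pt, 3), round(surgery, 3))
```

A crossing point is only clinically meaningful if it lies within the observed range of the marker; a threshold outside that range corresponds to curves that do not cross in the data.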
These potential treatment selection markers were further explored using predictiveness curves. These predictiveness curves present the risk of a poor outcome (no clinically relevant improvement or a deterioration in knee function) for both treatments. Furthermore, they provide information on the performance of the potential treatment selection markers to guide treatment decisions, the so-called summary measures [8]. A detailed explanation of a predictiveness curve is provided in Supplement 4. The performance of the potential treatment selection markers was analysed under the assumption that physical therapy is the standard treatment, as suggested by the current guidelines [21,24].
The summary measures provide information on:
1. Marker positivity threshold: the threshold value of the baseline score of the potential treatment selection marker. Above this value patients would receive a recommendation for physical therapy, below this value a recommendation for meniscal surgery.
2. Marker positivity rate: the proportion of patients with a marker value greater than the marker positivity threshold. For this proportion of the population, physical therapy has an advantage over meniscal surgery.
3. Marker negativity rate: the proportion of patients with a marker value smaller than the marker positivity threshold. For this proportion of the population, meniscal surgery has an advantage over physical therapy, and for this group a change from standard care (i.e. physical therapy) to meniscal surgery would be recommended.
4. Average benefit physical therapy: the average benefit of physical therapy in patients with a marker value above the marker positivity threshold. This measure evaluates the effect of physical therapy compared to meniscal surgery on the outcome in these patients.
5. Average benefit meniscal surgery: the average benefit of meniscal surgery in patients with a marker value below the marker positivity threshold. This measure evaluates the effect of meniscal surgery, if the treatment decision was guided by the model, in terms of the expected decrease in patients with a poor outcome.
6. Decrease in rate of poor outcome: the estimated change in the outcome in our population if treatment decisions are guided by the model, compared to the outcome when treated according to standard care (i.e. physical therapy). This measure provides information on the decrease in the percentage of patients with a poor outcome when treatment is decided on the basis of the treatment selection models.
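The summary measures listed above can be sketched from a fitted model as follows. This is an illustrative, model-based calculation under stated assumptions (risk of a poor outcome modelled by logistic regression, surgery coded 1 and physical therapy coded 0, made-up coefficients and marker values); it is not the implementation used in the R package cited in this paper, and the function names are hypothetical. Patients are grouped by the sign of their predicted treatment difference, which coincides with being above or below the threshold when the curves cross once.

```python
import math


def _risk(marker, surgery, coef):
    """Predicted risk of a poor outcome; coef = (b0, b_marker, b_treat, b_inter)."""
    b0, b_marker, b_treat, b_inter = coef
    logit = b0 + b_marker * marker + b_treat * surgery + b_inter * marker * surgery
    return 1.0 / (1.0 + math.exp(-logit))


def summary_measures(markers, coef):
    """Model-based summary measures with physical therapy as standard care."""
    n = len(markers)
    # Predicted reduction in risk of a poor outcome if a patient receives
    # surgery instead of physical therapy (positive means surgery is better).
    surgery_benefit = [_risk(m, 0, coef) - _risk(m, 1, coef) for m in markers]
    pt_group = [-d for d in surgery_benefit if d <= 0]       # marker positive
    surgery_group = [d for d in surgery_benefit if d > 0]    # marker negative
    return {
        "marker_positivity_rate": len(pt_group) / n,
        "marker_negativity_rate": len(surgery_group) / n,
        "avg_benefit_pt": sum(pt_group) / len(pt_group) if pt_group else 0.0,
        "avg_benefit_surgery": (
            sum(surgery_group) / len(surgery_group) if surgery_group else 0.0
        ),
        # Only marker-negative patients switch away from standard care, so the
        # population-level decrease is their summed benefit divided by n.
        "decrease_in_poor_outcome": sum(surgery_group) / n,
    }


if __name__ == "__main__":
    coef = (-2.0, 0.05, -2.0, 0.05)                      # hypothetical coefficients
    markers = [20, 25, 30, 35, 45, 50, 55, 60, 70, 80]   # hypothetical baseline scores
    for name, value in summary_measures(markers, coef).items():
        print(name, round(value, 3))
```

Under this definition, the decrease in rate of poor outcome equals the marker negativity rate multiplied by the average benefit of surgery. The worked example reported with Table 3 is consistent with that relation (63.2% × 13.1% ≈ 8.3%).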

All analyses were performed based on the intention-to-treat data. The data analyses for the predictiveness curves were performed using RStudio, version 1.2.1335, and the package 'TreatmentSelection' (RStudio Inc., Boston, MA, USA) [8].
The sample size was determined and calculated for the RCT in which the patients' data were collected. The details on the sample size calculations can be found in previous publications [26,27].

Results

Participants
Three hundred and twenty-one patients were included in the study. The mean (SD) age was 57.5 (6.6) years, and 161 (50.5%) participants were female. A total of 158 patients were allocated to surgery and 161 to physical therapy. Both groups showed comparable baseline characteristics for the potential treatment selection markers (Table 1). Main results and a detailed flow chart of the ESCAPE trial were previously published [26].
At 3 months follow-up, 57.0% (meniscal surgery) and 52.2% (physical therapy) of the patients showed a clinically relevant improvement in knee function (≥ 11 IKDC points). At 12 months, this was 70.3% (surgery) and 54.7% (physical therapy), and at 24 months, this was 70.9% (surgery) and 65.8% (physical therapy). This shows that over time more patients achieved a clinically important improvement in knee function. In the physical therapy arm, 43 patients (27%) received delayed meniscectomy within 24 months.

Treatment selection markers
Potential treatment selection markers at baseline were general physical health (p = 0.01), pain during activities (p = 0.02) and knee function (p = 0.07) for the outcome at 3 months; BMI (p = 0.05) and age (p = 0.06) for the outcome at 12 months; and age (p = 0.05) for the outcome at 24 months (Table 2).

Prediction curves for potential treatment selection markers
Figure legend: Marker-by-treatment predictiveness curves for the outcome at 3, 12, and 24 months. Abbreviations: PT physical therapy, APM arthroscopic partial meniscectomy (surgery). Predictiveness curves present the risk for individual patients with a certain marker score of the outcome of interest under the given treatment. The X-axis displays the proportion of patients that score below the corresponding marker value. The corresponding marker value is the raw marker score. The Y-axis displays the risk of the outcome for the individual patient with physical therapy and with arthroscopic partial meniscectomy.
Fig. 4 Patients with a score above the threshold would improve more from meniscal surgery. The marker-by-treatment interaction for the marker age is not significant at 3 months. However, at 12 and 24 months follow-up the marker-by-treatment interactions are significant (12 months p = 0.06; 24 months p = 0.05). The corresponding marker positivity threshold is 49 years at 12 months follow-up and 53 years at 24 months follow-up.
Table 2 Logistic regression analyses for interaction between the baseline characteristics and treatment at 3, 12, and 24 months. Marker-by-treatment interactions per follow-up moment are shown. Abbreviations: MIC minimally important change, OR odds ratio, CI confidence interval, IKDC International Knee Documentation Committee Subjective Knee Form, RAND-36 PCS Physical Component Scale for general physical health, BMI body mass index, K-L Kellgren-Lawrence scale. a (n = < MIC vs. n = ≥ MIC) For each follow-up moment the distribution of patients who experienced a MIC in knee function (improvement ≥ 11 IKDC points) and patients who did not experience a MIC in knee function (change in IKDC score < 11 points) is reported. The reference treatment is physical therapy. Data were available for 313 patients at 3 months, 279 patients at 12 months, and 289 patients at 24 months. b For each marker-by-treatment interaction, the OR shows the relative change per unit increase in the marker, and the 95% CI of the OR is reported. An OR ≥ 1 indicates the value is in favour of physical therapy. The p values express whether the marker-by-treatment interaction is significant (p ≤ 0.1). c Educational level, expectation of pain relief and K-L score were analysed as continuous variables in the logistic regression analyses. * Indicates the baseline characteristics that are potential treatment selection markers.
These potential treatment selection markers were further explored with predictiveness curves. Figures 1, 2, 3, 4 and 5 show the predictiveness curves at 3, 12, and 24 months for the following markers: general physical health, knee function, pain intensity during activities, age, and BMI. The sloping lines in Fig. 1 indicate that the risk of poor outcome increases with higher levels of baseline general physical health. This applies to both treatments. The intersection in Fig. 1a indicates that at 3 months, patients with baseline values below 40.7/100 were more likely to benefit from surgery and patients with baseline values above 40.7/100 were more likely to benefit from physical therapy. The curves at 12 and 24 months follow-up sloped less, indicating a similar risk of poor outcome across baseline values. Figure 2 shows a similar pattern for the marker knee function.
The curves at 12 and 24 months run largely parallel, indicating that the effect of this baseline marker was similar for both treatments. The downward-sloping lines in Fig. 3 indicate that the risk of poor outcome decreased with higher levels of pain at baseline. The intersection in Fig. 3a indicates that at 3 months, patients with baseline VAS scores below 53.9/100 were more likely to benefit from physical therapy and patients with baseline VAS scores above 53.9/100 were more likely to benefit from surgery. No such intersection indicating a potentially relevant cut-off for baseline pain was observed at 12 and 24 months. For age and BMI, all curves were rather horizontal, but diverged with increasing baseline marker values, with half of them reaching statistical significance. This indicates that the benefit of surgery compared to physical therapy was largest for patients with the highest age and BMI. For all markers the predictiveness curves were inconsistent over time.
The summary measures of the predictiveness curves are presented in Table 3. This provides information on marker positivity threshold, average benefit physical therapy, average benefit surgery, marker positivity rate, marker negativity rate, and decrease in rate of poor outcome.

Discussion
The most important finding of the present study was that the identification of potential treatment selection markers did not result in clear clinical subgroups of patients who are substantially more likely to benefit from either surgery or physical therapy. Therefore, treatment decisions for patients with a degenerative meniscal tear cannot be based on the treatment selection markers evaluated in the current study.
The published randomized clinical trials that compared surgical with conservative treatment for degenerative meniscal tears revealed small and clinically non-meaningful benefits of meniscal surgery over physical therapy for patient-reported knee function [5,6,9,15,26,32]. However, due to potential heterogeneity in treatment effects, this does not necessarily imply that individual patients cannot have a clinically relevant improvement from meniscal surgery compared to physical therapy [23].
The present study revealed that the average benefit that individual patients would experience from meniscal surgery is small (ranging from 5.2 to 14.8%) if treatment were based on these markers. Similar to the results from the RCTs, the increased benefit that some patients may experience from meniscal surgery compared to physical therapy is not convincing, since these benefits were small and not consistently present at all follow-up time points.
No studies were found that have analysed the variation in treatment effect for a musculoskeletal disorder based on RCT baseline data of their patients by performing a marker-by-treatment analysis. One cohort study on prognostic factors was identified and found worse outcomes at 1 and 2 years after surgery in case of complex tears, larger extrusion, cartilage injuries, and larger meniscal excision, but without comparison to physical therapy [14]. In another, computer-based, prediction study in a similar population, multivariable prognostic models were investigated to identify a subgroup of patients who might benefit from meniscectomy [19]. The multivariable prognostic models did not accurately predict treatment outcome 1 year after surgery, and the study did not consider specific cut-off points that can potentially differentiate between the outcomes from the two treatments. In another study, the orthopaedic surgeons' ability to predict treatment outcome in patients with degenerative meniscal tears was analysed for both physical therapy and meniscectomy [25]. Similar to the current findings, neither of these prediction studies was able to identify any subgroup of patients who might benefit from a meniscectomy or physical therapy in the longer term [25].
Table 3 Summary measures of predictiveness curves for pain during activities. The proportions are given in percentages (95% confidence interval). Abbreviation: NA not available (no marker positivity threshold, as the lines do not cross each other). The score ranges from 0 to 100, with 100 representing the best possible knee function. Interpretation (example): For general physical health at 3 months, 36.8% (95% CI 5.4-66.1) of the patients scored higher than the threshold value of 40.7 points (marker positivity rate), representing the cut-off point for a better outcome from physical therapy. Patients with a score above this threshold had an average 10.1% (95% CI 1.1-20.6) better outcome from physical therapy as compared to those treated with surgery (average benefit physical therapy). A total of 63.2% (95% CI 33.5-94.6) of the patients scored lower than the threshold (marker negativity rate). These patients had an average 13.1% (95% CI 4.1-22.9) better outcome from surgery as compared to those treated with physical therapy (average benefit surgery). If treatment were based on general health, there would be an 8.3% (95% CI 1.5-16.4) reduction in poor outcomes at 3 months if all patients with a RAND-36 score below 40.7 received surgery (decrease in rate of poor outcome).
To our knowledge, this is the first RCT-based marker-by-treatment analysis that assessed the differential treatment effect of potentially relevant baseline variables for predicting clinically relevant improvement of knee function in patients with a degenerative meniscal tear. For treatment decision making, this type of prediction study may be favourable over more common multivariable prediction studies [8]. Marker-by-treatment analyses focus on predicting the difference in outcomes between the two treatments, rather than only predicting the outcome for one treatment. Therefore, the analyses help clinicians and patients select the best treatment to optimize the outcomes. The selection markers deemed relevant by patients in this study aim to direct the choice of treatment based on specific baseline characteristics and the corresponding marker cut-off values.
Several limitations should be mentioned. First, the observed treatment thresholds did not account for potential adverse events resulting from surgery as an alternative to physical therapy. In other words, the treatment benefit from surgery is overestimated because the risk of surgical complications is neglected [2]. Second, due to the trial-based approach, the available baseline characteristics were restricted. Some potentially important predictors, such as objective knee function, muscle strength, and the duration of symptoms [10], could not be included in our analyses. Although these factors may be viewed as relevant for treatment outcome, prior prognostic models that did include these variables could also not accurately predict treatment outcome in this population [10,19]. Trials that included these prognostic variables are recommended to perform a marker-by-treatment analysis including these variables. Also, the trial-based approach might have resulted in insufficient power for the marker-by-treatment analysis due to the size of the RCT cohort. Third, the primary interest concerned the cut-off point on a predictiveness curve that distinguishes between a better outcome after surgery or after physical therapy based on a patient's baseline score. Therefore, only continuous variables could be included, and dichotomous and categorical variables such as sex, joint line tenderness and tear type were not addressed.

Clinical implications
In general, marker-by-treatment analyses determine whether baseline characteristics can be used in making treatment decisions for individual patients. The predictiveness curves and their performance measures show the amount of benefit that individual patients will have from a treatment if the treatment decision for that patient is based upon the treatment selection marker [8]. A threshold value is derived that differentiates between a favourable outcome for either of the compared treatments. Although such thresholds are rarely used as treatment decision tools in clinical practice, this information can be of high value to clinicians and policy makers who are seeking evidence-based decision tools to weigh treatment benefit for individual patients against the risk of adverse events and healthcare costs [13]. So, instead of using mean outcomes of RCTs to make treatment decisions, patients and clinicians can potentially base their treatment decision for an individual patient upon the treatment selection marker.
In patients with degenerative meniscal tears, our marker-by-treatment analyses only revealed specific baseline characteristics that were associated with a small increase in the likelihood of a better treatment outcome after meniscal surgery at individual follow-up time points. As the combination of characteristics varies among patients, combining these potential selection markers may be more accurate. Future research, combining the individual data from the published RCTs in an individual patient data meta-analysis [31], may be able to identify such subgroups and could steer towards an even more individualized approach.
The opinion that surgery is necessary to regain normal knee function in selected patients is not supported by our study or by previous scientific evidence, in which such subgroups could not be identified [4,19,25]. Therefore, physical therapy is recommended as initial treatment for all patients with degenerative meniscal tears.

Conclusion
A marker-by-treatment analysis was successfully conducted in orthopaedic research. No subgroups were found in this study that benefit more from surgery throughout the follow-up period. Physical therapy should be considered the first-choice treatment in all patients over 45 years old with degenerative meniscal tears who do not experience locking of the knee. Although the treatment selection markers had clear thresholds, none of the markers maintained a predictive effect over time. Therefore, treatment decisions for patients with a degenerative meniscal tear cannot be based on the treatment selection markers studied in this trial.
Funding This study was funded by the Dutch Arthritis Society (in Dutch: ReumaNederland; grant number 18-2-201), the Netherlands Organization for Health Research and Development (in Dutch: ZonMw; grant number 837002009), Zilverenkruis Health Insurance (grant number Z436) and the foundation of medical research of OLVG, Amsterdam (grant number 15u.025). The funders had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. All authors had full access to all of the data in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.

Data availability
Requests for access to data should be addressed to the corresponding author.

Declarations
Conflict of interest All authors completed the Unified Competing Interest form (available on request from the corresponding author) and declare: VG, NW, JN, and RP received financial support from The Netherlands Organization for Health Research and Development (in Dutch: ZonMw) for the submitted work; the Achmea Healthcare Foundation (in Dutch: Stichting Achmea Gezondheidszorg fonds), Dutch Arthritis Society (in Dutch: ReumaNederland) and the foundation of medical research at OLVG, Amsterdam, the Netherlands; no financial relationships with any organization that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work. Transparency The corresponding author affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.