Clinical Orthopaedics and Related Research®, Volume 475, Issue 3, pp 853–860

Has the Level of Evidence of Podium Presentations at the Musculoskeletal Tumor Society Annual Meeting Changed Over Time?

  • Daniel M. Lerman
  • Matthew G. Cable
  • Patrick Thornley
  • Nathan Evaniew
  • Gerard P. Slobogean
  • Mohit Bhandari
  • John H. Healey
  • R. Lor Randall
  • Michelle Ghert
Symposium: 2015 Meetings of the Musculoskeletal Tumor Society and the International Society of Limb Salvage

DOI: 10.1007/s11999-016-4763-x

Cite this article as:
Lerman, D.M., Cable, M.G., Thornley, P. et al. Clin Orthop Relat Res (2017) 475: 853. doi:10.1007/s11999-016-4763-x

Abstract

Background

The level of evidence (LOE) framework is a tool with which to categorize clinical studies based on their likelihood of being influenced by bias. Improvements in LOE have been demonstrated throughout orthopaedics, prompting our evaluation of orthopaedic oncology research LOE to determine whether it has changed in kind.

Questions/purposes

(1) Has the LOE presented at the Musculoskeletal Tumor Society (MSTS) annual meeting improved over time? (2) Over the past decade, how do the MSTS and Orthopaedic Trauma Association (OTA) annual meetings compare regarding LOE overall and for the subset of therapeutic studies?

Methods

We reviewed abstracts from MSTS and OTA annual meeting podium presentations from 2005 to 2014. Three independent reviewers evaluated a total of 1222 abstracts for study type and LOE; there were 577 abstracts from MSTS and 645 from OTA. Changes in the distributions of study type and LOE over time were evaluated by Pearson chi-square test.

Results

There was no change over time in MSTS LOE for all study types combined (p = 0.13) or for therapeutic studies alone (p = 0.36) during the reviewed decade. In contrast, OTA LOE increased over this time for all study types (p < 0.01). The proportion of Level I therapeutic studies was higher at the OTA than the MSTS (3% [14 of 413] versus 0.5% [two of 387], respectively), whereas the proportion of Level IV studies was lower at the OTA than the MSTS (32% [134 of 413] versus 75% [292 of 387], respectively) during the reviewed decade. The proportion of controlled therapeutic studies (LOE I through III) versus uncontrolled studies (LOE IV) increased over time at OTA (p = 0.021), but not at MSTS (p = 0.10).

Conclusions

Uncontrolled case series continue to dominate the MSTS scientific program, limiting progress in evidence-based clinical care. Techniques used by the OTA to improve LOE may be emulated by the MSTS. These techniques focus on broad participation in multicenter collaborations that are designed in a comprehensive manner and answer a pragmatic clinical question.

Introduction

The goal of evidence-based medicine (EBM) is to improve patient care by integrating the best available scientific evidence with individual patient preferences and physician expertise [37, 38]. The efficacy of EBM to assist in clinical care is dependent on the quality and applicability of the available clinical research. The University of Oxford’s Centre for Evidence-Based Medicine established a hierarchy of clinical studies, defined by Levels of Evidence (LOE), in which studies that are less likely to be affected by bias such as randomized controlled trials (RCTs) are ranked above studies with less rigorous scientific methods such as uncontrolled case series.

Compared with medical specialties, surgical fields have been slow to incorporate EBM into clinical care and education [3]. This delay may be attributed to the numerous challenges to surgical RCTs [42, 45]. The prospective comparison of surgical treatments requires the demonstration of equipoise to the institutional review board, participating physicians, and prospective patients [14]. Study implementation is further complicated by surgeons’ variable experience and procedure familiarity, which introduce a potential source of bias when comparing surgical treatments, and by the difficulties associated with blinding [13, 14, 24]. Despite these challenges, impactful high-quality surgical RCTs have been accomplished and are necessary to advance clinical care [27, 44].

In 2008, the Study to Prospectively Evaluate Reamed Intramedullary Nails in Patients with Tibial Fractures (SPRINT) RCT was published by a collaborative orthopaedic trauma study group, demonstrating that high-level collaborative surgical studies were feasible and clinically meaningful [44]. In recent years, the LOE presented at the Orthopaedic Trauma Association (OTA) annual meeting has improved with a 30% increase in Level I studies [41]. These improvements in the OTA LOE have occurred despite the obstacles to recruitment, compliance, and followup inherent to the trauma population [39]. Although some high-level trauma studies benefit from the frequency of common injuries, RCTs have also been performed for infrequent fracture patterns with limited patient enrollment [8].

Challenges to high-level orthopaedic oncology research are often cited to be the infrequency and heterogeneity of the disease processes. Despite this, other fields have overcome these obstacles to generate high-quality sarcoma-focused RCTs [32, 47, 48]. High-level orthopaedic oncology studies are necessary to advance clinical care. Although many orthopaedic subspecialties have demonstrated improvement in their literature’s LOE [10, 18, 21, 41], the current state of orthopaedic oncology research has yet to be evaluated.

The objectives of this study were to answer the following questions: (1) Has the distribution of LOE presented at Musculoskeletal Tumor Society (MSTS) annual meetings improved over time? (2) Over the past decade, how do the MSTS and OTA annual meetings compare regarding LOE presented overall and for the subset of therapeutic studies?

Materials and Methods

A complete collection of podium abstracts from 2005 through 2014 was obtained from both the OTA and MSTS annual meetings and organized into a single database. The past decade of annual meetings was selected for review to provide insight into the current state of the annual meetings and a sufficiently large breadth of sample to minimize statistical error from annual aberrations. Presentation abstracts from the OTA were obtained from the OTA Annual Meeting Archives website [35]. Abstracts for MSTS presentations were retrieved from the MSTS website when available, and the outstanding abstracts were obtained in hard copy by request from the MSTS [28].

Three independent reviewers (DML, MGC, PT) evaluated a total of 1403 abstracts. Basic science and/or biomechanical studies were excluded. Each of the remaining clinical study podium presentation abstracts was evaluated for study type and LOE in keeping with the American Academy of Orthopaedic Surgeons (AAOS) guidelines (Table 1) [49], which were adapted from a rubric originally designed by the Centre for Evidence-Based Medicine [46]. A single study type, defined as therapeutic, prognostic, diagnostic, or economic, was assigned to each presentation based on the abstract’s stated primary objective. After determination of study type, LOE I through V was assigned to each abstract. An Internet-based algorithm was used to assist the reviewers in their determination of study type and/or LOE [34]. A pilot series of 40 abstracts, two from each of the reviewed annual meetings, was adjudicated independently by the three reviewers and then discussed. This initial audit allowed the reviewers to familiarize themselves with the process, educate each other, and identify any systemic issues before a complete review of all 1403 abstracts.
Table 1

Levels of evidence criteria for different types of studies

Level I
  Therapeutic
    • High-quality RCT with statistically significant difference or no statistically significant difference but narrow confidence intervals
    • Systematic review of Level I RCTs (and study results were homogeneous)
  Prognostic
    • High-quality prospective trial (all patients were enrolled at the same point in their disease with > 80% followup of enrolled patients)
    • Systematic review of Level I studies
  Diagnostic
    • Testing of previously developed diagnostic criteria in series of consecutive patients (with universally applied reference “gold” standard)
    • Systematic review of Level I studies
  Economic and decision analyses
    • Sensible cost and alternatives; values obtained from many studies; multiway sensitivity analyses
    • Systematic review of Level I studies

Level II
  Therapeutic
    • Lesser quality RCT (< 80% followup, no blinding, or improper randomization)
    • Prospective comparative study
    • Systematic review of Level II studies or Level I studies with inconsistent results
  Prognostic
    • Retrospective study
    • Untreated controls from an RCT
    • Lesser quality prospective study (patients enrolled at different points in their disease or < 80% followup)
    • Systematic review of Level II studies
  Diagnostic
    • Development of diagnostic criteria on basis of consecutive patients (with universally applied reference “gold” standard)
    • Systematic review of Level II studies
  Economic and decision analyses
    • Sensible cost and alternatives; values obtained from limited studies; multiway sensitivity analyses
    • Systematic review of Level II studies

Level III
  Therapeutic
    • Case-control study
    • Retrospective comparative study
    • Systematic review of Level III studies
  Prognostic
    • Case-control study
  Diagnostic
    • Study of nonconsecutive patients (without consistently applied reference “gold” standard)
    • Systematic review of Level III studies
  Economic and decision analyses
    • Analyses based on limited alternatives and costs; poor estimates
    • Systematic review of Level III studies

Level IV
  Therapeutic
    • Case series
  Prognostic
    • Case series
  Diagnostic
    • Case-control study
    • Poor reference standard
  Economic and decision analyses
    • No sensitivity analyses

Level V
  Therapeutic
    • Expert opinion
  Prognostic
    • Expert opinion
  Diagnostic
    • Expert opinion
  Economic and decision analyses
    • Expert opinion

Adapted with permission from Slobogean GP, Dielwart C, Johal HS, Shantz JA, Mulpuri K. Levels of evidence at the Orthopaedic Trauma Association annual meetings. J Orthop Trauma. 2013;27:e208–212; RCT = randomized controlled trial.

The highest potential LOE was assigned based solely on the information available in the published abstract. For example, RCTs were assigned a Level II unless evaluator blinding was explicitly stated. Therefore, as a result of the lack of blinding or unclear reporting, the majority of randomized surgical trials presented at the OTA and MSTS did not satisfy the criteria for Level I evidence and were subsequently assigned Level II. Registry studies are observational and were appropriately categorized within the hierarchy of evidence [11, 22].
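The assignment rule described above for therapeutic studies can be sketched in a few lines of Python. This is an illustrative simplification, not the reviewers' actual procedure or the Internet-based algorithm they used: the names `Design` and `therapeutic_loe` and the two boolean flags are hypothetical, systematic reviews are omitted, and the other Level I requirements in Table 1 (proper randomization, narrow confidence intervals) are assumed to be met.

```python
from enum import Enum


class Design(Enum):
    """Hypothetical labels for the therapeutic study designs listed in Table 1."""
    RCT = "randomized controlled trial"
    PROSPECTIVE_COMPARATIVE = "prospective comparative study"
    RETROSPECTIVE_COMPARATIVE = "retrospective comparative study"
    CASE_CONTROL = "case-control study"
    CASE_SERIES = "case series"
    EXPERT_OPINION = "expert opinion"


def therapeutic_loe(design, blinding_stated=False, followup_at_least_80pct=True):
    """Return the level of evidence (1 to 5) for a therapeutic abstract.

    Assumption: an RCT reaches Level I only when the abstract explicitly
    states blinding and followup is at least 80%; otherwise it is Level II.
    """
    if design is Design.RCT:
        if blinding_stated and followup_at_least_80pct:
            return 1
        return 2
    if design is Design.PROSPECTIVE_COMPARATIVE:
        return 2
    if design in (Design.RETROSPECTIVE_COMPARATIVE, Design.CASE_CONTROL):
        return 3
    if design is Design.CASE_SERIES:
        return 4
    return 5  # expert opinion


# An RCT whose abstract does not mention blinding is assigned Level II.
print(therapeutic_loe(Design.RCT))
print(therapeutic_loe(Design.RCT, blinding_stated=True))
```

Defaulting `blinding_stated` to `False` mirrors the conservative choice described in the text: when reporting is unclear, the abstract receives the lower level.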

Interobserver agreement was calculated for determination of study type and LOE using Fleiss’ kappa [20]. Kappa (κ) values were interpreted according to Landis and Koch as: 0 poor, 0.01 to 0.20 slight, 0.21 to 0.40 fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 almost perfect [40]. When the independent reviewers disagreed, discrepancies were reviewed and resolved by consensus discussion among the three reviewers. Level V studies were identified and then excluded from further analysis because they did not present original clinical data [33, 41]. Of the initial 1403 abstracts reviewed, 1222 represented clinical studies and were included in this analysis: 577 from MSTS and 645 from OTA. Agreement between the independent reviewers was almost perfect for study type (κ = 0.856) and substantial for LOE (κ = 0.776) [20].
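For readers unfamiliar with the statistic, the following is a minimal sketch of Fleiss' kappa for a fixed number of raters, together with the Landis and Koch bands quoted above. The function names and the five-abstract rating matrix are hypothetical illustrations, not the study data; each row holds the number of reviewers who assigned that abstract to LOE I, II, III, and IV.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a matrix of per-item category counts.

    Each row must sum to the same number of raters. Undefined (division by
    zero) when every rating falls in a single category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Share of all assignments that fell in each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item, then averaged over items.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch labels used in the text."""
    if kappa <= 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings: 5 abstracts, 3 reviewers, columns = LOE I..IV.
example = [
    [0, 0, 0, 3],
    [0, 0, 3, 0],
    [0, 3, 0, 0],
    [0, 0, 1, 2],
    [3, 0, 0, 0],
]
kappa = fleiss_kappa(example)
print(round(kappa, 3), landis_koch(kappa))

# The two kappa values reported in this study, labeled with the same bands.
for reported in (0.856, 0.776):
    print(reported, landis_koch(reported))
```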

Pearson chi-square tests were used to determine the significance of changes in the distribution of study type and LOE over time for MSTS and OTA annual meetings. Although the Pearson chi-square test does not indicate the direction of a trend, it does indicate whether the distribution of LOE changed over time. The changes in LOE over time were assessed for all study types in combination and in a subgroup analysis of therapeutic studies only. The change in proportion of controlled therapeutic studies (LOE I through III) versus uncontrolled therapeutic studies (LOE IV) over time was also determined. This statistical subgroup analysis was planned a priori. Post hoc subgroup analyses were not performed because of concerns for identifying spurious results [17]. All tests of significance were two-tailed and p values < 0.05 were considered significant. All analyses were performed using Microsoft Excel (Redmond, WA, USA) and IBM SPSS Version 21 (Chicago, IL, USA).
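As a rough illustration of the test, the sketch below computes the Pearson chi-square statistic and its degrees of freedom for a year-by-category contingency table. The function name and the small example table are hypothetical; the authors used SPSS, and this is not their code. The degrees of freedom follow (rows − 1) × (columns − 1), so 10 meetings by 4 LOE categories gives 27, 10 meetings by 5 study types gives 36, and 10 meetings by 2 categories (controlled versus uncontrolled) gives 9, which matches the values reported in the Results.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. Every row and column
    total must be nonzero, otherwise an expected count of zero would
    cause a division by zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2 x 2 table, for illustration only.
stat, dof = pearson_chi_square([[10, 20], [20, 10]])
print(round(stat, 3), dof)

# Table shapes corresponding to the analyses in this study (dummy counts).
for label, n_columns in (("LOE I-IV", 4), ("study type", 5), ("controlled vs uncontrolled", 2)):
    _, dof = pearson_chi_square([[1] * n_columns for _ in range(10)])
    print(label, dof)
```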

Results

There was no change observed in the overall LOE presented at MSTS annual meetings over the reviewed decade (Fig. 1A; χ2 = 35.5, degrees of freedom 27, p = 0.13) (Table 2). The overall LOE distribution for MSTS abstracts was 4% (22 of 577) Level I, 20% (114 of 577) Level II, 16% (94 of 577) Level III, and 60% (347 of 577) Level IV.
Fig. 1A−D

The graphs illustrate the distribution of LOE, as a percentage of all clinical research podium presentations, at a given meeting. The LOE distribution for all study types at (A) MSTS and (B) OTA. The LOE distribution for therapeutic studies at (C) MSTS and (D) OTA.

Table 2

Change in the distribution of abstracts’ study type and LOE over time for MSTS and OTA

Organization   Change over time   Pearson chi square   Degrees of freedom   p value
MSTS           Study type         87.4                 36                   < 0.01
               Overall LOE        35.5                 27                   0.128
               Therapeutic LOE    29.1                 27                   0.356
OTA            Study type         123                  36                   < 0.01
               Overall LOE        46.9                 27                   < 0.01
               Therapeutic LOE    36.6                 27                   0.103

LOE = level of evidence; MSTS = Musculoskeletal Tumor Society; OTA = Orthopaedic Trauma Association.
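The p values in Table 2 can be approximately recovered from the reported chi-square statistics and degrees of freedom. The sketch below uses only the Python standard library and the recurrence Q(a + 1, x) = Q(a, x) + x^a e^(−x) / Γ(a + 1) for the upper regularized incomplete gamma function; the function name `chi_square_sf` is a hypothetical helper, not part of the authors' analysis. Because the published statistics are rounded to one decimal place, recomputed values may differ slightly from those in the table, especially near a threshold.

```python
import math


def chi_square_sf(statistic, dof):
    """Upper-tail probability of a chi-square distribution with integer dof."""
    if statistic <= 0:
        return 1.0
    x = statistic / 2.0
    if dof % 2 == 0:
        a = 1.0
        q = math.exp(-x)              # Q(1, x)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(x))   # Q(1/2, x)
    # Step the shape parameter up by 1 until it reaches dof / 2.
    while a < dof / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# (organization, comparison, chi-square, degrees of freedom) from Table 2.
TABLE_2 = [
    ("MSTS", "Study type", 87.4, 36),
    ("MSTS", "Overall LOE", 35.5, 27),
    ("MSTS", "Therapeutic LOE", 29.1, 27),
    ("OTA", "Study type", 123.0, 36),
    ("OTA", "Overall LOE", 46.9, 27),
    ("OTA", "Therapeutic LOE", 36.6, 27),
]

for organization, comparison, statistic, dof in TABLE_2:
    p = chi_square_sf(statistic, dof)
    print(f"{organization:5s} {comparison:16s} chi2={statistic:6.1f} df={dof:2d} p={p:.3f}")
```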

In contrast to MSTS, the OTA overall LOE increased over the observed decade (Fig. 1B; χ2 = 46.9, degrees of freedom 27, p < 0.01) with an overall LOE distribution of 8% (52 of 645) Level I, 30% (195 of 645) Level II, 29% (188 of 645) Level III, and 33% (210 of 645) Level IV. Subgroup analysis of MSTS therapeutic studies demonstrated a static LOE distribution (Fig. 1C; χ2 = 29.1, degrees of freedom 27, p = 0.36) with a predominance of retrospective Level III and IV studies—0.5% (two of 387) Level I, 2% (nine of 387) Level II, 22% (84 of 387) Level III, and 75% (292 of 387) Level IV. Similarly, OTA therapeutic studies did not increase their LOE over time, although there was a greater proportion of Level I studies than MSTS (3% [14 of 413] versus 0.5% [two of 387], respectively). The distribution of OTA therapeutic studies LOE was 3% (14 of 413) Level I, 23% (93 of 413) Level II, 42% (172 of 413) Level III, and 32% (134 of 413) Level IV (Fig. 1D; χ2 = 36.6, degrees of freedom 27, p = 0.10).
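The percentages above follow directly from the reported counts. The short sketch below recomputes each distribution and the share of controlled studies (LOE I through III) among therapeutic abstracts; the dictionary keys and function names are hypothetical labels, while the counts are those given in the Results.

```python
# Counts of Level I, II, III, and IV abstracts, as reported in the Results.
LOE_COUNTS = {
    "MSTS all study types": [22, 114, 94, 347],
    "OTA all study types": [52, 195, 188, 210],
    "MSTS therapeutic": [2, 9, 84, 292],
    "OTA therapeutic": [14, 93, 172, 134],
}


def distribution(counts):
    """Percentage of abstracts at each level."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


def controlled_share(counts):
    """Percentage of abstracts that are controlled studies (LOE I to III)."""
    return 100.0 * sum(counts[:3]) / sum(counts)


for label, counts in LOE_COUNTS.items():
    percentages = ", ".join(f"{p:.1f}%" for p in distribution(counts))
    print(f"{label} (n={sum(counts)}): {percentages}")

for label in ("MSTS therapeutic", "OTA therapeutic"):
    print(f"{label}: {controlled_share(LOE_COUNTS[label]):.1f}% controlled")
```

Pooled over the decade, roughly a quarter of MSTS therapeutic abstracts were controlled studies, compared with roughly two thirds at the OTA.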

The overall distribution of study types for the decade of reviewed MSTS abstracts was 55% (387 of 707) therapeutic, 22% (156 of 707) prognostic, 3% (24 of 707) diagnostic, 1% (10 of 707) economic, and 18% (130 of 707) basic science studies (Fig. 2A; χ2 = 87.4, degrees of freedom 36, p < 0.01). An increase in the proportion of prognostic and economic studies was observed descriptively. The distribution of OTA abstracts by study type was 59% (413 of 696) therapeutic, 26% (179 of 696) prognostic, 5% (36 of 696) diagnostic, 2% (17 of 696) economic, and 7% (51 of 696) basic science studies (Fig. 2B; χ2 = 123, degrees of freedom 36, p < 0.01). Similar to MSTS, the distribution of study types changed over time toward a higher proportion of prognostic studies. Among MSTS abstracts, no change was observed in the distribution of controlled therapeutic studies (LOE I through III) versus case series (LOE IV) over the observed decade (χ2 = 14.7, degrees of freedom 9, p = 0.10). In contrast, the proportion of controlled OTA studies did increase in comparison to uncontrolled case series over the same time period (χ2 = 19.5, degrees of freedom 9, p = 0.021).
Fig. 2A–B

The graphs illustrate the percentage of different study types presented at the MSTS (A) and OTA (B) annual meetings from 2005 through 2014.

Discussion

The LOE framework is a crude, yet reliable tool with which to rapidly categorize clinical studies based on their likelihood to be influenced by bias [4]. Since 2001, the LOE of clinical research presented at the AAOS and OTA annual meetings has increased [41, 46]. Over this same time period, multiple orthopaedic subspecialty journals have demonstrated an increased emphasis on high-level study design and have subsequently increased the average LOE of articles published [10, 18, 21, 29, 33, 50]. It was not known whether the research presented at the leading North American orthopaedic oncology meeting had kept pace with the changes observed in other orthopaedic subspecialties. Therefore, we sought to determine if the LOE of clinical research presented at the MSTS annual meeting had improved over the last decade and how the change, if any, compared with that of the OTA.

There are numerous limitations to this study. We used an established and frequently referenced LOE hierarchical scheme (Table 1), although there are many others in the literature [41, 49]. However, regardless of the rubric used, LOE determination merely allows for the categorization of a clinical study based on its likelihood of being influenced by bias and is not an indicator of research quality. The simplicity of LOE adjudication is responsible for its popularity and its limitations. A more comprehensive framework with which to evaluate research quality is GRADE [1, 19]. The GRADE schematic integrates (1) study design, (2) risk of bias, (3) indirectness, (4) imprecision, (5) inconsistency, and (6) publication bias. GRADE assessments require more information than that provided by podium abstracts and therefore it could not be used for this study. Because study quality is independent of LOE, there are poor-quality RCTs and high-quality, practice-informing case series [2, 12, 26]. Therefore, our LOE review is not intended to dismiss the impact of Level IV case series but to identify the paucity of Level I data in hopes of promoting greater balance among the literature.

Although we acknowledge that the MSTS is not the only venue for orthopaedic oncology clinical research and that a minority of podium presentations are ultimately published [23, 43], we do believe it is a representative sample of surgical academia in North America. In contrast to other orthopaedic subspecialties, oncology lacks a dedicated journal, eliminating subspecialty-specific publications as a potential resource for LOE evaluations [18, 29, 50]. Although Clinical Orthopaedics and Related Research® is the official journal of the MSTS, its publication of orthopaedic oncology research is neither comprehensive nor exclusive.

An additional limitation of this study was that only podium abstracts were reviewed. Therefore, determinations of study type and LOE were derived from information provided within the abstract itself. As a result of abstract brevity and incomplete descriptions of study design, errors in study adjudication are possible. However, both MSTS and OTA abstracts were equally susceptible to this source of error and therefore the comparison between the two annual meetings should not have been affected. There was a wide range in the number of podium presentations between different years at both the MSTS (31 to 198 podium presentations) and OTA (54 to 128 podium presentations), representing variability in annual meeting organization and program committees’ biases. Furthermore, we were unable to account for the discrepancy in the number of abstracts submitted to the meetings annually. The OTA annual meeting receives many more submissions than the MSTS and therefore has a larger population of studies from which to select the podium presentations. It is possible that the distribution of LOE submissions is equivalent among the two organizations, but as a result of the greater volume of OTA submissions, lower level studies are excluded. Although any comparison between MSTS and OTA will be affected by the differential in organization size, our serial evaluation of MSTS meeting abstracts should be independent of this confounding variable. By comparing the societies for LOE change over time, we hoped to minimize the influence of the difference in organization size.

Our review of MSTS annual meeting podium presentations found that the LOE of clinical research has been static over the past decade and case series continue to predominate. This is in contrast to LOE improvement demonstrated among orthopaedic publications [12, 21, 33] and other subspecialties [18, 41, 50].

Over the same time period, OTA abstracts demonstrated increased overall LOE and a greater proportion of controlled to uncontrolled therapeutic studies. The OTA was selected as a control group in this study as a result of the trauma community’s ability to generate multiple high-quality clinical trials that have had an immediate influence on clinical management and patient outcomes [7, 9, 25, 36, 41, 44]. Conversely, the inability to produce high-level orthopaedic oncology studies may be stunting progress in clinical care [15]. There are multiple plausible explanations for the dearth of high-level orthopaedic oncology research. Primarily, sarcoma is a rare disease for which clinical trials are challenging [31]. Study design parameters are difficult to optimize when the heterogeneity of clinical care obfuscates historical data for prognostic risk factors and clinical outcomes. In addition, with limited patient enrollment, statistically significant endpoints are difficult to obtain [6]. Despite these obstacles, multicenter (Children’s Oncology Group) and international (European and American Osteosarcoma Study Group) pediatric and medical oncology collaborations have completed sarcoma RCTs, the results of which have improved clinical practice [5, 47, 48].

Despite the obstacles to high-level orthopaedic oncology clinical studies, early examples of success can be found in the prospective evaluation of CT rigidity analysis for metastatic bone disease and the randomized controlled Prophylactic Antibiotic Regimens in Tumor Surgery (PARITY) trial’s evaluation of perioperative antibiotic utilization [16, 30]. These ongoing studies, designed to generate Level I evidence, demonstrate progress and serve as examples that high-level clinical research can be accomplished through multicenter collaboration. The OTA’s success may serve as a roadmap for improvements in MSTS clinical research. The critical advance is a robust multicenter collaboration with extensive membership participation. Such collaboration is facilitated by a pragmatic practice-changing research question and a trial protocol whose efficacy has been established by a feasibility study [42]. The unified support of a single, centralized infrastructure and systematic approach to obtaining multiple sources of funding should help support these expensive and resource-demanding projects. As a result of the modest size of MSTS membership and its research endowment, consideration could be made for collaboration with larger organizations, with which the MSTS mission overlaps.

Copyright information

© The Association of Bone and Joint Surgeons® 2016

Authors and Affiliations

  • Daniel M. Lerman (1)
  • Matthew G. Cable (2, 3)
  • Patrick Thornley (4)
  • Nathan Evaniew (4)
  • Gerard P. Slobogean (1)
  • Mohit Bhandari (4)
  • John H. Healey (5)
  • R. Lor Randall (2, 3)
  • Michelle Ghert (4)

  1. Department of Orthopaedics, University of Maryland School of Medicine, Baltimore, USA
  2. Sarcoma Services, Primary Children’s Hospital & Huntsman Cancer Institute, University of Utah, Salt Lake City, USA
  3. Faculty of Health Sciences, Michael G. DeGroote School of Medicine, McMaster University, Hamilton, Canada
  4. Division of Orthopaedic Surgery, Department of Surgery, McMaster University, HHS Hamilton General Hospital, Hamilton, Canada
  5. Orthopaedic Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, USA