Editor’s Spotlight/Take 5: What Are the Levels of Evidence on Which We Base Decisions For Surgical Management of Lower Extremity Bone Tumors?
- Cite this article as:
- Leopold, S.S. Clin Orthop Relat Res (2014) 472: 3. doi:10.1007/s11999-013-3332-9
Benign bone tumors are uncommon and malignant tumors of bone—other than metastases from nonskeletal sources—are rare. But these conditions cause harm disproportionate to their frequency. Malignant bone tumors are among the few diagnoses orthopaedic surgeons treat that can kill patients. And benign bone tumors that appear in inopportune locations, like the spine or the subtrochanteric area of the femur, may result in pain, pathological fractures that are hard to treat, and lingering disability.
As much as we would like to wish these conditions off to subspecialists, malignant bone tumors occur just often enough that it is likely most of us will make (or worse, miss) the diagnosis a few times in our careers. We will also see many benign bone tumors along the way, but because they do not come conveniently labeled as such, we will need to know what we are looking at.
Articles in peer-reviewed journals can help us do a good job at this important task. So the quality of those articles—and how to make the best use of them—is everyone’s business, not just the business of tumor specialists.
The evidence-based orthopaedics group at McMaster University, led by Dr. Michelle Ghert, just completed an analysis of 10 years’ worth of those papers in their report, “What Are the Levels of Evidence Upon Which We Base Decisions For Surgical Management of Lower Extremity Bone Tumors?”. Perhaps unsurprisingly, given the rarity of bone tumors, the large majority of studies (92%; 558 of 607) were either retrospective case series or case reports. More importantly, the quality of the scientific reporting in the papers they evaluated was extremely variable. Almost all the studies missed what I would call the “fine points” for studies of this sort; for example, only 2% justified their choice of sample size. That does not trouble me terribly, nor does the near-absence of randomized controlled trials; these conditions are simply too rare to expect many of them.
What I found disappointing—not as an editor, but as a consumer of peer-reviewed research—was the frequency with which studies failed to report key elements like eligibility criteria (missing in 45% of studies), followup (missing in 34%), and limitations of the study’s design (missing in 47%). Every study can, and should, provide its readers with those basic elements. Authors, reviewers, and editors must do better.
Dr. Ghert’s team examined research through 2012, and so I am afraid that it is still caveat lector (let the reader beware), even today. As we read case series, which remain the most common kind of research we are likely to read for the foreseeable future, we need to keep in mind the three most important kinds of bias that affect those studies, as well as some guiding questions on how to think about those kinds of bias:
Selection bias: Who was included? How were these patients chosen? Were they the “easy” cases or the “hard” ones? What percentage of the authors’ practice did this represent? Are the authors clear both on what their surgical indications were and on their inclusion criteria for the study?
Detection bias: Were the endpoints of the study defined clearly? Were suitable, validated outcomes instruments chosen, where appropriate? Who did the assessments? Was (s)he also involved in the care of the patients? Is it likely that any outside interests played a role in the endpoints chosen or in how they were assessed?
Transfer bias: Was the followup sufficiently long and complete? Watch out for studies that report a 97% success rate with 55% of patients lost to followup.
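To see why that combination should worry a reader, a back-of-the-envelope worst-case bound (my illustration, not part of the study under discussion) is useful: assume every patient lost to followup actually failed, and the reported success rate shrinks dramatically.

```python
def worst_case_success(reported_success, followup_fraction):
    """Lower bound on the true success rate, assuming every patient
    lost to followup actually had a failed outcome."""
    return reported_success * followup_fraction

# The example from the text: 97% reported success, but 55% lost to
# followup, so only 45% of patients were actually assessed.
bound = worst_case_success(0.97, 0.45)
print(f"True success rate could be as low as {bound:.0%}")
```

Under that pessimistic assumption, the “97% success rate” is compatible with a true rate below 50%, which is why complete followup matters as much as the headline result.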
There are other kinds of bias, and external validity (“does the research done at the study site apply to my practice?”) is still critically important. But those three sources of bias seem to be the ones most likely to influence results in retrospective clinical research, and the ones we need to watch out for most carefully.
Take 5 Interview with Dr. Michelle Ghert and Dr. Nathan Evaniew, senior and lead authors of “What Are the Levels of Evidence Upon Which We Base Decisions For Surgical Management of Lower Extremity Bone Tumors?”
Seth S. Leopold, MD: When you began the project, you could not have expected many randomized trials, and no doubt you knew up front that most of these papers would be case series or case reports. That being the case, what made you decide to take on this project?
Michelle Ghert, MD: Tumor surgeons are challenged by the rarity of sarcomas, and the resulting difficulty in conducting adequately powered studies. We sought to quantitate what our subspecialty already suspected: that we are basing our clinical decisions on low-level evidence. We consider this an important initial step as our subspecialty moves towards more methodologically rigorous research. Although retrospective observational studies (case series) still dominate the orthopaedic surgery literature, orthopaedic research, including sarcoma research, is undergoing a paradigm shift towards high-quality evidence. In fact, the first large international collaborative clinical trial in sarcoma surgery, the Prophylactic Antibiotics in Tumor Surgery (PARITY) trial, is now underway in its pilot phase.
Dr. Leopold: What would you say the main take-home message of your paper should be for “consumers” of clinical research in orthopaedic journals—readers?
Nathan Evaniew, MD: Nonrandomized retrospective studies are currently the dominant form of evidence in the orthopaedic literature. In these types of studies, numerous deficiencies in reporting raise the possibility of systematic forms of bias, such as recall bias, selection bias, and outcomes-assessment bias. Surgeons should be aware of these limitations when discussing management options with their patients. Studies often draw strong conclusions from weak and sometimes highly biased evidence. Our take-home message would be for consumers to use the resources available to them [1, 2], and to arm themselves with basic skills in understanding research methodology so that they can decide for themselves whether the conclusions of a given study warrant a change in clinical practice.
Dr. Leopold: What do you think the most important message of your research is for orthopaedic researchers—people who write articles for publication in peer-reviewed journals?
Dr. Ghert: As orthopaedic clinician-scientists, the bulk of the evidence we produce will remain Level III and IV. This type of evidence does in fact have significant value, but that value will be determined by the adequacy of reporting. I suspect that, although our STROBE checklist identified large-scale lapses in reporting, the authors were simply unaware of the importance of reporting items such as missing data, loss to followup, and methodology limitations. Therefore, our message to researchers is to consider the STROBE checklist when composing their research reports. This will strengthen their studies and provide a clearer and more accurate message to the consumer.
Dr. Leopold: Is there anything you would like to tell orthopaedic editors, based on what you learned? Do not sugar-coat things; I can take it.
Dr. Ghert: Over the last decade, orthopaedic editors have done an excellent job of recognizing the need to move to higher-level evidence and have been very active in promoting the publication of methodologically well-designed studies. In the review and selection of studies for publication, it is useful for journals to enlist peer-reviewers with training and/or an interest in research methodology. Content experts provide valuable insight into the clinical relevance of a study, but do not always possess expertise in critical appraisal. A peer-review team consisting of clinical experts and methodological experts provides the ideal mechanism for publishing unbiased and impactful studies.
Dr. Leopold: I have a somewhat technical question, so I have left it for last. I noticed you evaluated the quality of scientific reporting using the STROBE checklist; I also saw that you recognized the shortcomings of using this tool for a purpose for which it was not designed. We have reasonably good tools for evaluating the quality of randomized trials (and you used one, the Detsky scale), but we really have no good, validated tool for studying the most common kind of research we are likely to see: the case series. Any ideas on how we might solve this issue?
Dr. Evaniew: The STROBE checklist is particularly useful for evaluating the transparency of reporting for observational studies, and transparency is critical because it allows readers to detect and understand bias. STROBE has been adopted by many top journals, and authors and readers should consider it when performing or appraising observational research. Nonetheless, actually quantifying bias in observational studies remains extremely challenging, particularly for systematic reviews and meta-analyses. More than 100 instruments have been developed, but none has demonstrated excellent validity and reliability and there is no consensus on which is best [3, 6]. Education and awareness are the keys to solving this issue. It is our shared responsibility to inform users of clinical research about bias and provide the critical appraisal tools required to carefully apply current best evidence.