Editorial: Large Database Studies—What They Can Do, What They Cannot Do, and Which Ones We Will Publish
Most clinicians have come to imagine that scientific evidence has levels, and at the top of the pyramid are randomized clinical trials (RCTs) and meta-analyses [12]. But we all have read some poorly designed randomized trials, and sometimes a humble case series can substantially influence thought about a technique or implant, particularly if it describes failures of treatment. Nevertheless, as the Centre for Evidence-Based Medicine analogizes [6], if clinical evidence were stored in boxes marked “randomized trial,” “cohort study,” “case series,” and the like, and we needed to answer an important question, we might first open the box stamped “randomized trial” to see if there is anything inside.
All the same, there are some important kinds of questions that RCTs and meta-analyses will never answer. Surgical RCTs generally are small to medium-sized, and as such, are tooled to assess efficacy more than safety. Randomized trials have almost no ability to detect rare events, evaluate resource utilization in normal practice, or compare any but the most common complications. Likewise, premature failures of commonly used interventions probably will not be identified first by RCTs, but rather by studies of other designs. To assess real-world and important endpoints like these, we will need to look elsewhere for evidence. Registry studies and analyses drawn from large-system databases represent two robust approaches to answering these kinds of questions.
Clinical Orthopaedics and Related Research ® is proud to support the excellent, thought-provoking work being done using the world’s registries. We publish it frequently [10, 11], and we have promoted it in Spotlight commentaries and interviews on these pages. CORR ® will publish selected proceedings from the most recent meeting of the International Society of Arthroplasty Registries this summer, and we will cover the important role registries play in clinical research in an editorial in the coming months.
We also believe that important, pressing questions can be answered efficiently and accurately using large databases, including the Nationwide Inpatient Sample (NIS), National Surgical Quality Improvement Program (NSQIP), National Hospital Discharge Survey (NHDS), National Trauma Databank (NTDB), and Medicare administrative databases, among others. In fact, there has been something of a bloom of orthopaedic papers from these kinds of sources lately [2, 3, 4]. With sample sizes that run from thousands to millions of patients, these studies allow us to tackle questions that likely will never be answered any other way.
To make best use of this material, readers need to know just what kinds of questions can be asked of databases, how the same question may elicit different answers from different databases, and how slipshod or carelessly presented work from databases can mislead us. Finally, authors should know how editors for CORR ® evaluate this relatively new type of research.
We believe studies drawn from these databases best serve clinicians when they focus on what they are uniquely suited to do: Correlate less-common adverse events with modifiable and previously unidentified risk factors [1], identify adjustable provider- or hospital-level variables associated with readmissions or complications, and compare resource utilization for common interventions across diverse geographic regions or practice settings. As with any other study design, the more novel the questions (and the more unexpected the answers), the more important these studies are. They do not make such good reading when they tell us what we already know.
Readers also should realize that the same question asked of different databases may result in different answers [2, 4]. The most obvious reason for this is that each database surveys a specific population and may gather different data in different ways. Some databases, like the NIS, are inpatient-only samples: The moment a patient leaves the hospital, (s)he no longer is followed. As a result, studies drawn from inpatient-only samples that ask about complications that occur both before and after discharge—such as surgical site infection—will badly underestimate event rates. By contrast, prospective surgical registry-type databases (such as NSQIP) follow patients for 30 days or longer after surgery, and so are better tooled to find problems that occur after discharge. What do they lose in the exchange? Sample size. Prospective registries are smaller. There are other differences as well. Some registries focus on particular types of patients. The NTDB, for instance, includes patients with higher-energy injuries. Clearly, matching the study population to the question being asked is crucial.
Additionally, each database uses different data elements. Many databases, including NIS, are built from ICD-9 codes. Since these codes’ primary use is billing, they have lower sensitivity to answer the clinical questions in which most readers are interested [8, 9]. By contrast, other databases—including NSQIP and parts of the NTDB—use chart-abstracted data, which offer more of the details clinicians seek. Readers should also recognize that missing data limits all databases to some degree, and any study drawn from them should comment on the degree to which this might influence its findings. We believe the most important failing of orthopaedic studies from these sources is that these databases generally do not capture enough patient-level diagnostic information or patient-reported outcomes data. Efforts are underway to address many of these limitations. But at present, questions calling for that level of detail probably are better studied using single or multicenter designs that capture more complete patient-level data of this sort.
Notwithstanding the shortcomings of large-database studies, we at CORR ® believe that they have an important role to play. But since these studies are not always easy reading, there must be a payoff for the reader who toughs it out—at the very least, a finding that changes the way the reader thinks about an important problem or treats his or her patients. Going forward, CORR ® will consider a study that draws from a large database if it either presents a genuinely counterintuitive descriptive finding, or provides a specific suggestion to improve clinical care, practice management, or public policy. We will publish papers meeting one or both of those criteria when they are methodologically robust. As with all of the important studies we publish in CORR ®, we will continue to make every attempt to accompany these studies with CORR ® Insights commentaries or Editor’s Spotlight features.
The authors thank Matthew B. Dobbs MD, Mark C. Gebhardt MD, Terence J. Gioe MD, Paul A. Manner MD, Clare M. Rimnac PhD, and Montri D. Wongworawat MD for their guidance, which determined the Journal’s editorial approach as described at the end of this essay.
- 1. Belmont PJ Jr, Goodman GP, Kusnezov NA, Magee C, Bader JO, Waterman BR, Schoenfeld AJ. Postoperative myocardial infarction and cardiac arrest following primary total knee and hip arthroplasty: Rates, risk factors, and time of occurrence. J Bone Joint Surg Am. 2014;96:2025–2031.
- 4. Bohl DD, Russo GS, Basques BA, Golinvaux NS, Fu MC, Long WD 3rd, Grauer JN. Variations in data collection methods between national databases affect study results: a comparison of the Nationwide Inpatient Sample and National Surgical Quality Improvement Program databases for lumbar spine fusion procedures. J Bone Joint Surg Am. 2014;96:e193.
- 6. Centre for Evidence Based Medicine. The 2011 Oxford CEBM levels of evidence: Introductory document. Available at: http://www.cebm.net/2011-oxford-cebm-levels-evidence-introductory-document/. Accessed February 16, 2015.
- 9. Golinvaux NS, Bohl DD, Basques BA, Grauer JN. Administrative database concerns: accuracy of International Classification of Diseases, Ninth Revision coding is poor for preoperative anemia in patients undergoing spinal fusion. Spine (Phila Pa 1976). 2014;39:2019–2023.
- 12. OCEBM Levels of Evidence Working Group. The Oxford 2011 levels of evidence. Available at: http://www.cebm.net/wp-content/uploads/2014/06/CEBM-Levels-of-Evidence-2.1.pdf. Accessed February 16, 2015.