Of the more than $3.3 trillion spent annually on health care in the USA, less than 0.1% currently goes towards research designed to improve how we deliver health care.1, 2 Increasing the effectiveness and efficiency of health care delivery has implications not only for our overall economy and health care spending, but also for the longevity and quality of life of our citizens.

Improving the way health care is delivered can sometimes be perceived as a haphazard endeavor with awkward overlap between quality improvement (QI) and traditional research. QI approaches typically seek to implement previously established, evidence-based practices using learning cycles and other process change techniques originally developed for business management (e.g., Plan-Do-Study-Act, Lean).3, 4 Traditional research applies rigorous but time-consuming methods to create new knowledge through hypothesis-driven discovery. While there is frequent overlap between these two approaches when it comes to new strategies for care delivery, they often differ in their sources of funding, their generalizability to other health care settings and patients, and their authority to change current health care system workflows.

A study published in this issue of JGIM underscores some of the challenges faced by researchers working within this current health care delivery landscape.5 Ryskina et al. attempted to reduce over-ordering of routine lab tests (e.g., CBC, BMP) among hospitalized patients, a ubiquitous problem with significant implications for both hospitalization costs and patients’ hospital experiences.6 This problem is an excellent example of the need to conduct innovative research in how we deliver health care: over-ordering of routine labs (and the cascade of subsequent work-ups that can follow spurious results) has multimillion-dollar implications for our health system, and addressing it is one of the Society of Hospital Medicine’s five Choosing Wisely campaign recommendations.6

Ryskina et al. hypothesized that providing physicians with social comparison feedback (personalized feedback on how their routine lab ordering behaviors compared to those of their peers) could help curb over-ordering. Drawing on compelling evidence from the published literature,7 they conducted a single-blinded, two-arm randomized controlled trial testing whether a single “dose” of social comparison feedback would improve lab ordering practices in the hospital setting. Physicians on general medicine service teams (each composed of two interns and one resident) were cluster-randomized by team to receive an emailed summary of their routine lab ordering behaviors during the prior week, along with information on the average ordering behaviors of other resident physicians on the general medicine service. This email also contained a link to a continuously updated, personalized dashboard with additional patient-level details pulled from the electronic medical record.

The 12-week intervention period was divided into six two-week blocks. During each block, three of the six general medicine teams were randomized to the intervention arm. Resident physicians in the intervention group received feedback at the beginning of the second week of the block. The primary outcome of interest was the number of routine laboratory test orders placed by a physician per patient-day.

Unfortunately, Ryskina et al. did not observe a statistically significant impact from this intervention: lab ordering rates were similar in the intervention and control groups (0.14 fewer orders per patient-day in the intervention arm [95% CI −0.56 to 0.27], p = 0.50). Despite employing a gold-standard RCT design, this laudable effort to change clinical practice was hindered by several key limitations noted by the authors, some of which reflect the challenges of effectively answering research questions in the current health care delivery environment. First, there was significant crossover between arms: the authors report that 36% of physicians were exposed to both the control and intervention arms, with half of these individuals experiencing the intervention first. Second, attending physicians were not randomized and did not receive any feedback. While the choice to exclude attending physicians from the intervention may reflect the fact that labs are ordered by residents and interns, it fails to account for the influence of differing attending physician expectations on resident physician ordering habits. Third, the feedback was delivered via a research email separate from the electronic medical record; many emails were left unopened, and few physicians clicked on the embedded link to the more detailed dashboard. Fourth, the intervention was a single “dose” for most participants (i.e., those who were in the intervention arm for only one block). Given this design, physicians with higher-than-average ordering during the one examined week may have dismissed their higher ordering as an outlier rather than a reflection of their typical practice. Finally, the appropriateness of lab ordering was not assessed; the feedback may have lost meaning if it was discordant with physicians’ perceptions of test necessity.

These limitations reflect some of the overarching challenges researchers face in changing clinical workflows and care structures, such as altering provider schedules and modifying the electronic medical record interface. Ideally, Ryskina et al. would have been able to cluster-randomize at the team level (with each team including the attending, resident, and interns), involve all medical services, and rearrange the call schedules and team structures to eliminate any crossover or differences in intervention “doses.” Further, with additional financial investment, they could have increased the study size, perhaps added a second intervention arm to test an alternate (and more intensive) strategy, and integrated the intervention directly into the electronic medical record to ensure that all intervention physicians were exposed to the feedback. But these changes require buy-in at all levels of the delivery system, from clinicians and patients to health care system leadership.

Multiple challenges currently stand in the way of implementing innovative experiments to answer important delivery science research questions. First, the priorities of the researcher and the health system do not always align, with researchers needing to pursue funding opportunities from outside agencies like NIH and AHRQ that have their own research goals. Second, traditional research and QI initiatives typically follow different project cycles: the time required to set up and implement a rigorous, IRB-approved research protocol is usually longer than the rapid-cycle implementation preferred by health care systems seeking to improve quality. Finally, the rapid-fire and complex work environment of clinical medicine is often not pliable enough to accommodate the changes and adaptations needed to test research hypotheses in a timely manner. It is not easy to ask overworked clinicians and staff to change or add to their workload or to revise long-set schedules.

Scientists working to improve health care delivery can apply several strategies to help create new knowledge while also changing how care is delivered. A key skill is developing long-term partnerships with health system leaders. These collaborations are bidirectional: researchers make the case for the value of a proposed research project, while leaders provide insight into the most pressing and unresolved questions facing the care system. Delivery scientists must be both advocates and listeners. In turn, health system leaders have the authority to approve significant changes in care processes. Flexibility is another key trait, in that delivery scientists must find ways to bring rigor to the question at hand while adapting to operational needs. Extended success in a delivery science career will often require a hybrid funding portfolio of traditional external funding and support from internal sources.8

These strategies form the core competencies of delivery science. Successfully translating well-honed hypotheses into real-world delivery system interventions is a daunting task that currently requires substantial investment of time and effort. Making significant changes in how our health care system is organized and how care is delivered will require all involved stakeholders (researchers, clinicians, patients, operational leaders, funders, policy makers) to work together to lower the many barriers to conducting the highest quality delivery science research. Otherwise, researchers are left trying to answer important questions and transform health care delivery from the margins.