Background

Over the past two decades, opioid overdoses have claimed hundreds of thousands of lives, with millions grappling with opioid use disorder [1, 2]. Analyses of drug monitoring systems have revealed high rates of new opioid prescriptions among postoperative patients and within family medicine [3,4,5,6,7,8,9]. While the US opioid crisis is largely fueled by illicit opioid use (e.g., fentanyl), it is the result of an ongoing epidemic rooted in high rates of prescription opioid use [2].

Europe now witnesses a similar surge in prescription opioids [10,11,12,13,14,15,16,17,18,19,20], resulting in an increased incidence of opioid-related harms associated with opioid overconsumption, defined as prolonged use or higher doses for noncancer pain [21,22,23,24]. Notably, prolonged use may develop rapidly among opioid-naïve users [25,26,27]. Despite lower rates of opioid-related deaths in Europe than in the US, early intervention is crucial to prevent a shift from prescription to illicit opioids, as health policies alone may not suffice [28, 29].

Opioid stewardship programs have emerged in North America as a response to the prescription opioid crisis, employing strategies to decrease and track opioid prescriptions [30, 31]. These have been effective in reducing the number of opioid prescriptions or tablets without compromising patient well-being [32,33,34]. At their core, these programs incorporate opioid exit plans (OEPs), consisting of specific strategies that promote drug safety for improved outcomes, closing an important prevention gap.

While some countries are developing guidelines for opioid analgesic deprescribing [35,36,37,38], a recent guideline summary identified a need for greater evidence on the effectiveness of current strategies to inform clinical practice [35]. Therefore, this systematic review aimed to identify and summarize published hospital-based OEPs, detailing their design, main components, and reported evidence of their effectiveness.

Methods

A systematic review was performed according to the PRISMA 2020 guidelines [39], using the SPICE (setting, population, intervention, comparison, evaluation) [40] and PCC (population, concept, context) [41] frameworks to define the study environment. The search was conducted in PubMed and Embase using a distinct keyword search string developed with an information specialist (LB). Articles published from January 1, 2000 to June 3, 2024, that explored the discharge management of postoperative patients receiving opioid analgesics were considered eligible. For homogeneous interventional exposure, articles needed to focus on patients 18 years of age or older at discharge, excluding patients with special needs or implications for routine outpatient opioid use after surgery, such as cancer, end-of-life care, and substance use disorders. The articles also needed to include an accessible tapering protocol. The full search strategy and list of eligibility criteria for the literature are detailed in Supplement Tables 1, 2, and 3.

Two searches were conducted (SO, MR), one on April 27, 2023, and an update on June 4, 2024. The results were imported into Rayyan.ai for screening [42] and duplicates were removed. Two researchers (SO, MR) independently screened the abstracts and obtained full-text articles if the predefined eligibility criteria were met. Conflicts in screening were resolved through in-person discussions. If necessary, a third author (DS) was consulted. The Cochrane Effective Practice and Organization of Care Group [43] template was used for consistent and comprehensive data collection on study characteristics and measured intervention efficacy (SO, MR), reported as a percentage reduction in opioid dosage as morphine milligram equivalents (MME) when applicable.

Three reviewers (SO, DS, MR) appraised the quality of evidence of the included studies using the LEGEND (let evidence guide every new decision) evidence evaluation tool [44]. In LEGEND, a numerical rating based on the study design determines the basic grade. The indicators "a" and "b" then differentiate the quality of evidence: "a" indicates high quality, while "b" indicates inconsistencies or insufficient quality of design [44]. Disagreements in grading were resolved during in-person discussions. If a reported study design was suspected to be incorrect, the three reviewers (SO, DS, MR) collectively reclassified the study.

When applicable, two reviewers (MR, DS) independently applied the Revised Risk of Bias tool (RoB2) for randomized controlled trials (RCTs) [45] and the Risk Of Bias In Non-randomized Studies—of Interventions (ROBINS-I) tool for non-randomized studies [46] to identify potential biases and confounders, assessing the level of risk.

Results

Article selection

Figure 1 illustrates the screening and inclusion process [39, 47]. The initial systematic literature search identified 2,483 articles, and the updated search identified 102 articles (n = 2,585). The respective abstracts were screened, and 26 articles were deemed eligible for full-text screening. Eventually, eight articles from the full-text screening were included in the final analysis [48,49,50,51,52,53,54,55].

Fig. 1 PRISMA flow diagram of screening and inclusion process [39, 47]

Study characteristics

Table 1 provides an overview of the characteristics of the eight included studies. All the articles described studies conducted in North America, with 25% (N = 2) in Canada [48, 50] and 75% (N = 6) in the US [49, 51,52,53,54,55]. Half of the studies (N = 4) were quality improvement studies [51, 53,54,55] that were either uncontrolled and retrospective [53,54,55] or controlled and prospective [51]. Three (37.5%) were RCTs [48,49,50], and one was a proposed OEP for patient services targeting postoperative pain [52], to which no conventional study design could be assigned. While the procedures varied, the studies predominantly investigated interventions within orthopedic departments, with total hip arthroplasty (THA) and total knee arthroplasty (TKA) being the most prevalent procedures (75%; N = 6) after which patients were enrolled in OEPs [48,49,50,51, 54, 55], followed by neurosurgery (12.5%; N = 1) [53]. The proposed OEP framework by Genord et al. [52] was considered applicable to orthopedic, neurosurgical, and colorectal surgery.

Table 1 Study characteristics of the included studies using the template provided by the Cochrane Effective Practice and Organization of Care Group [43] and quality assessment of different study types using the LEGEND (let evidence guide every new decision) evidence evaluation tool [44]. The numbers in the QA column represent the study design, and the letters indicate good quality (a) or lesser quality (b)

Patient demographics and the items reported varied widely across the study populations because of differences in study design (Table 1). Across the studies, patients had a mean age between the mid-fifties and mid-sixties, with the lowest mean age being 40.2 years [48] and the highest being 67.0 years [53,54,55]. The gender distribution was roughly balanced in three studies [48, 49, 51], whereas studies conducted in Veterans Affairs facilities [53,54,55] predominantly included male patients, and the study by Singh et al. [50] predominantly included female patients. A history of substance abuse, financial stability, mood disorders, preoperative pain, or prior opioid use was reported by 75% of the studies [48, 49, 51, 53,54,55]. Most studies reported psychiatric comorbidities (62.5%; N = 5) [48, 49, 53,54,55]. These were either identified by screening for anxiety and depressive disorders (25%; N = 2) [48, 49], or the screening method and the exact entity were not specified [53,54,55]. Kukushliev et al. [54] were the only authors to report further comorbidities, such as cardiovascular, renal, or hepatic diseases or impairments.

Quality of the included studies

Table 1 reports the quality of evidence for each study. Three studies [49, 51, 55] were found to be of good quality. The studies by Hah et al. [49], Chen et al. [51], and Tamboli et al. [55] selected an appropriate study method for the research question, reported statistically significant results, and clearly described the intervention, patient allocation, variables, and outcomes. The remainder received lower quality ratings, mostly due to underreporting of important details such as intervention delivery and the randomization process.

Risk of bias

Figure 2 [56] visualizes the bias judgments. All studies had a moderate to high risk of bias. The RCT by Hah et al. [49] was the only RCT with good quality evidence and moderate bias. However, there were some concerns regarding deviations from the intended protocol intervention. Among the non-RCT studies, the studies by Chen et al. [51] and Tamboli et al. [55] were both high quality. However, there were moderate to serious concerns regarding confounding, participant selection, outcome measurement, and protocol deviations.

Fig. 2 Visualization of the risk of bias assessments in the respective domains (D) [56] using the Revised Risk of Bias tool [45] for randomized controlled trials (RCTs) and the Risk Of Bias In Non-randomized Studies - of Interventions tool [46] for non-randomized studies (non-RCTs)

Overview of interventions and outcome assessment

Table 2 provides the details of the intervention strategies. The most common (75%, N = 6) feature was an individualized tapering approach [50,51,52,53,54,55]. Tamboli et al., Joo et al., and Kukushliev et al. [53,54,55] used patients’ 24-h predischarge opioid utilization to generate a patient-specific tapering plan. In the pre-post design study by Chen et al. [51], the intervention was a model that converted 24-h predischarge opioid utilization to the preferred opioid analgesic for discharge and to the preferred tapering duration in days (0, 7, or 14 days) depending on the type of surgery. Singh et al. [50] assigned patients to risk groups for postoperative pain with risk group-specific tapers based on procedure type, which focused on postoperative patient satisfaction rather than on reducing the amount of opioids prescribed at discharge. Contrary to individualizing tapering regimens, Hah et al. [49] employed postoperative motivational interviewing to promote patients’ efforts toward medication adherence, opioid tapering, and pain management while closely monitoring pain outcomes and opioid-related adverse events.

Table 2 Summary of the interventions and details of the outcome assessments among the included articles using the template provided by the Cochrane Effective Practice and Organization of Care Group [43]

The articles by Bérubé et al. [48, 57] and Genord et al. [52] describe combined interventions that extended beyond a tapering protocol (Table 2). Bérubé et al. [48, 57] emphasized educational interventions. Patients participated in face-to-face educational sessions before and after discharge, focusing on multimodal pain management and guidance on opioid tapering. Pain levels and interference with daily life were closely assessed after hospital discharge and complemented with generic tapering recommendations. These efforts aimed to improve patients’ self-management. At discharge, patients received an educational pamphlet with the aforementioned information. Genord et al. [52] proposed a yet-to-be-trialed three-phase OEP to support opioid cessation. The first phase, prior to discharge, includes interdisciplinary rounds to assess analgesic needs and discharge eligibility. In the second phase, patients receive discharge counseling and an individualized pain management plan. In the third and final phase, after discharge, patients undergo medication evaluations based on progress with the prescribed pain regimen, opioid discontinuation status, and opioid-related adverse events.

All the published OEPs were developed for standard opioid analgesics (Table 2) using various dose-reduction approaches. Most studies did not restrict inclusion based on opioid type. Chen et al. [51] provided opioid conversion factors to taper the preferred opioid, and Singh et al. [50] included a predefined set of opioids (hydromorphone, oxycodone/acetaminophen, tramadol/acetaminophen). Hah et al. [49] and Bérubé et al. [48] did not specify the opioid type. Genord et al. [52] proposed an untrialed tapering regimen intended to be applicable to any opioid analgesic. The studies based on the tapering regimen by Tamboli et al. [53,54,55] (Table 2) focused specifically on oxycodone. The OEP regimens followed a linear [50, 53,54,55], exponential [48, 49, 52], or logarithmic [51] tapering approach. The duration was either fixed for the investigated patient population [52,53,54,55] or adapted to the type of procedure [50, 51], while Hah et al. [49] and Bérubé et al. [48] did not predetermine a day of opioid or tapering cessation.
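To make the distinction between taper shapes concrete, the following minimal sketch contrasts a linear with an exponential schedule. It is an illustration under stated assumptions, not a reconstruction of any included protocol: the function names, the starting doses, the taper lengths, and the daily reduction fraction are all hypothetical, and the logarithmic variant is omitted because its exact form is not described here.

```python
def linear_taper(start_dose: float, days: int) -> list[float]:
    """Daily doses that fall by a constant amount, reaching zero after `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    step = start_dose / days
    return [round(start_dose - step * d, 1) for d in range(days)]


def exponential_taper(start_dose: float, days: int, fraction: float) -> list[float]:
    """Daily doses that fall by a constant fraction of the previous day's dose."""
    if days < 1:
        raise ValueError("days must be at least 1")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return [round(start_dose * (1 - fraction) ** d, 1) for d in range(days)]


# Hypothetical inputs in MME per day, chosen only to give round numbers.
print(linear_taper(60, 6))            # [60.0, 50.0, 40.0, 30.0, 20.0, 10.0]
print(exponential_taper(64, 4, 0.5))  # [64.0, 32.0, 16.0, 8.0]
```

In this sketch the linear schedule removes the same absolute amount each day, whereas the exponential schedule removes the largest amounts early and slows as the dose falls.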

Table 2 also provides an overview of the primary endpoints. Overall, six of the eight studies assessed the efficacy of OEPs on opioid reduction or pain [48, 49, 51, 53,54,55], of which four reported statistical significance [49, 51, 53, 55]. Tamboli et al., Joo et al., and Kukushliev et al. [53,54,55] demonstrated decreases in the dosage of opioids as MME of 56% (630 vs 280 MME, p < 0.01) and 63% (900 vs 295 MME, p < 0.01) within six weeks of postoperative discharge (preintervention vs postintervention period). Similarly, compared to the preintervention period, the approach by Chen et al. [51] resulted in a 24% reduction in the quantity of opioids consumed at discharge (427 vs 326 MME, p < 0.001). After discharge, the authors reported the mean number of opioid refills within 30 days (1.58 vs 1.71, p = 0.082) rather than reductions in MME. An RCT by Hah et al. [49] found that patients receiving motivational interviewing and opioid taper support were 62% more likely to return to baseline opioid use than patients in the standard care group (hazard ratio 1.62, 95% confidence interval 1.06–2.44). Detailed information on the intervention content and provider delivery is provided in Supplement Table 4.
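The percentage reductions quoted above follow from the standard relative-change formula, (before − after) / before × 100. The short sketch below applies it to two of the reported MME pairs; the function name is an assumption introduced for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# MME pairs taken from the text (preintervention vs postintervention).
print(round(percent_reduction(630, 280)))  # 56
print(round(percent_reduction(427, 326)))  # 24
```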

Discussion

This systematic review identified and summarized eight published OEPs [48,49,50,51,52,53,54,55] from hospital settings, providing concepts for the development of novel OEPs in tertiary care settings. Despite the heterogeneity of the approaches investigated, all articles that reported hypothesis testing of their primary outcomes [48,49,50,51, 53,54,55] achieved a reduction in opioids either at or after discharge. While none of the studies had a low risk of bias, three were of high quality according to the LEGEND quality assessment tool. All good-quality studies [49, 51, 55] yielded statistically significant results, suggesting that the use of OEPs could effectively reduce the quantity of opioids used at or after discharge. This review therefore highlights that the application of OEPs in clinical practice could be an important means of reducing discharge opioid consumption.

In this review, no standard OEP approach was identified, as individualization of the intervention and tapering appeared to be integral to meeting a patient’s individual analgesic need during deprescribing. This finding is in line with current evidence-based guidelines [35, 58, 59], as factors such as preoperative opioid use, preexisting pain conditions, social status, psychological comorbidities, and procedure types greatly influence pain and the risk of prolonged opioid use [60,61,62,63]. Among the identified OEPs in this review, implemented strategies included procedure-specific risk groups [50], total 24-h predischarge opioid consumption [51,52,53,54,55], and common pain and withdrawal assessments combined with taper counseling [48, 49]. Using 24-h predischarge opioid consumption was the most common approach and is a time-saving and practical way to individualize tapering, as the need for analgesia typically decreases as patients recover from surgery. This method has limitations, notably its inapplicability to patients with an inpatient stay shorter than 24 h. Additionally, a shorter postoperative stay can affect pain assessments, as the residual effects of anesthesia may not have fully dissipated [63, 64]. In contrast, Hah et al. [49] and Bérubé et al. [48] employed standardized tapering rates but still individualized the tapering through continuous and close patient contact during follow-ups. The repeated assessment of pain and withdrawal symptoms during follow-up sessions facilitated adjusting the tapering to the patients’ needs. As a result, this method appears to be suitable even for complex cases and may support sustained positive patient outcomes. Finally, Hah et al. [49] halved the time to baseline opioid use, reflecting the success of such an approach. This approach is also promoted in the US Centers for Disease Control and Prevention guidelines, which suggest that patients with acute pain who receive opioids for a longer time should be evaluated every two weeks [59].

Although Singh et al. [50] did not assess the statistical significance of their intervention, the OEP included an interesting element of risk stratification in opioid tapering. They allocated patients to one of three risk groups according to procedure type and anticipated postoperative opioid use to determine the total number of opioid tablets prescribed. A large meta-analysis including 37 studies with 1,969,953 surgery and trauma patients showed that patient-specific opioid requirements were the risk factors with the strongest association with developing chronic opioid use [62]. The US Centers for Disease Control and Prevention proposed a 6- to 15-day opioid prescription for musculoskeletal procedures [65]. While stratification by procedure type may facilitate the estimation of the ideal number of opioid tablets to be prescribed at discharge, it does not address individual analgesic needs such as patient-specific opioid requirements, which are captured by reviewing 24-h opioid use prior to discharge. It may be promising to combine elements of stratification according to procedure and risk by creating risk groups based on key risk factors for chronic pain and prolonged opioid use. Opioid quantities can be minimized using 24-h inpatient opioid consumption and further individualized by dividing patients into different risk groups: if two patients have the same 24-h inpatient opioid use but one patient is in a higher risk group, the higher risk patient would have a slower tapering rate and more intensive follow-up.
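The combination proposed above, 24-h inpatient opioid use as the starting point and the risk group as the determinant of tapering speed, can be sketched as follows. Everything numeric here is a placeholder: the risk group labels, the taper lengths per group, the linear shape, and the function name are assumptions introduced for illustration and are not taken from any included study.

```python
# Hypothetical taper lengths in days per risk group (placeholders only).
TAPER_DAYS = {"low": 5, "moderate": 7, "high": 10}


def stratified_taper(mme_24h: float, risk_group: str) -> list[float]:
    """Linear taper starting at the 24-h inpatient MME; higher risk means a slower taper."""
    if risk_group not in TAPER_DAYS:
        raise ValueError(f"unknown risk group: {risk_group}")
    days = TAPER_DAYS[risk_group]
    step = mme_24h / days
    return [round(mme_24h - step * d, 1) for d in range(days)]


# Two hypothetical patients with the same 24-h inpatient use (60 MME).
low = stratified_taper(60, "low")
high = stratified_taper(60, "high")
print(low)   # [60.0, 48.0, 36.0, 24.0, 12.0]
print(high)  # [60.0, 54.0, 48.0, 42.0, 36.0, 30.0, 24.0, 18.0, 12.0, 6.0]
```

Both schedules start from the same dose, but the higher risk patient steps down in smaller daily decrements over a longer period, which is the behavior the paragraph describes.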

Notably, this review focused on the application of OEPs in postoperative patients. In addition to patients with chronic primary pain, noncancer postoperative patients are frequently started on prescription opioid analgesics or discharged on a higher dose than before admission [4,5,6,7,8,9]. Karmali et al. [66] showed that postoperative pain management is a key driver of long-term opioid use. Relevant predictors [60, 66, 67] for long-term opioid therapy, such as history of substance abuse, financial stability, mood disorders, preoperative pain, or preoperative opioid usage, were reported in most studies (75%; N = 6) [48, 49, 51, 53,54,55]. The studies showed efficacy for surgical specialties associated with high invasiveness, such as orthopedic and spine surgery. For example, in orthopedic surgery, recommendations for the number of tablets range from 0 to 40 tablets of 5 mg oxycodone [68, 69]. This is equivalent to 0 to 300 MME. In the described studies that measured efficacy in MME [51, 53,54,55], postdischarge doses were approximately within the recommended range after the implementation of the tapering interventions. This suggests that tapering protocols have a positive influence on prescribing behavior toward guideline-recommended doses and that psychosocial aspects should be assessed. Thus, OEPs should be considered for implementation in “Enhanced Surgical Recovery” protocols as a valuable addition to patient safety, similar to opioid-free anesthesia [70]. These efforts may have a synergistic effect on opioid-sparing, as they have been demonstrated in RCTs to reduce the requirement for postoperative analgesia [71,72,73].
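The equivalence between 40 tablets of 5 mg oxycodone and 300 MME follows from multiplying the total milligrams by a conversion factor. The sketch below assumes the commonly used oral oxycodone factor of 1.5, which is consistent with the figures in the text; the function and constant names are hypothetical.

```python
# Assumed oral oxycodone-to-morphine conversion factor (consistent with 40 x 5 mg = 300 MME).
OXYCODONE_MME_FACTOR = 1.5


def tablets_to_mme(tablets: int, mg_per_tablet: float, factor: float) -> float:
    """Total morphine milligram equivalents for a number of tablets of a given strength."""
    if tablets < 0 or mg_per_tablet < 0:
        raise ValueError("tablets and mg_per_tablet must be non-negative")
    return tablets * mg_per_tablet * factor


print(tablets_to_mme(40, 5, OXYCODONE_MME_FACTOR))  # 300.0
print(tablets_to_mme(0, 5, OXYCODONE_MME_FACTOR))   # 0.0
```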

There was a lack of high-quality studies, and none of the included OEPs were deemed to have a low risk of bias. Most articles lacked detailed information on the process, the rationale behind developing the tapering interventions, and consistent reporting of study endpoints. Bias concerns in RCTs mainly stemmed from randomization and intervention adherence. Some studies had predictable allocation [48] or lacked sequence information [50], while others poorly documented deviations from interventions [49, 50]. Adherence to tapering protocols was measured in only one non-RCT study [51]. It is inconclusive whether the steep logarithmic tapering method developed by Chen et al. [51] is superior to the slower linear tapering method developed by Tamboli et al. [55], or vice versa, in reducing opioid dose and improving rehabilitation outcomes. Future trials need to address these limitations and enhance the quality of the data by blinding outcome assessors. Further studies with a more rigorous study design are needed to validate the effectiveness of OEPs. The identified articles focused on the efficacy of their novel tools for assessing opioid-related outcomes, such as the number of opioid tablets taken, rather than on rehospitalization, or on extending the findings to a wider population.

The strengths of this review include the use of a robust keyword search string to screen two major medical publication platforms (PubMed and Embase). All identified articles were evaluated for quality of design and risk of bias to assess the validity of the findings. Ultimately, these findings help to inform clinical practice and provide resources for the development of OEPs, allowing institutions to tailor tapering approaches to meet the needs of their patients. Limitations include the omission of articles published before 2000 and those not indexed in PubMed or Embase, including gray literature such as internal hospital guidelines and predischarge opioid-sparing protocols (e.g., enhanced recovery programs). Articles written in languages other than English or German were also excluded, as were those with inaccessible tapering protocols. Due to the eligibility criteria, the findings have limited applicability to patients with chronic opioid use and psychiatric disorders and provide no evidence for use in pediatrics.

Conclusions

Despite differences in the patient populations, the studies that evaluated efficacy found that the use of OEPs with tapering plans consistently reduced opioid consumption. The 24-h predischarge method provides a robust estimate of outpatient analgesic requirements, which can be complemented by risk group stratification for tapering speed. More rigorous studies are needed to assess the effectiveness of these tapering approaches on a larger scale.