Enhanced Recovery After Surgery (ERAS) pathways are multidisciplinary, coordinated, standardized care plans that integrate evidence-based interventions addressing multiple aspects and phases of the patient’s perioperative trajectory. One of the key paradigm shifts with the ERAS approach is a move from a largely provider-centric system, where surgeons, anesthesiologists, and nurses function within expertise silos (characterized by significant variability between practitioners and institutions), to a patient-centric approach with integration of each step of perioperative care into a cohesive pathway. An ongoing audit of adherence to the steps in the pathway and the clinical outcomes, e.g., length of stay and complications, is considered an essential component of ERAS programs.1 It is also recognized, however, that these measures incompletely reflect patient experience and functional recovery.1

What outcomes are important to evaluate the overall effectiveness of ERAS pathways? Although these programs were initially designed to target immediate in-hospital recovery by focusing on pain, ileus, mobilization, and early discharge,2 recovery is a complex construct encompassing many other dimensions of physical, emotional, economic, and social health – recovery means different things to different stakeholders.3 In ERAS studies, clinicians and researchers have been most interested in early and intermediate (in-hospital) recovery, usually using duration of hospital stay, complications, and organ dysfunction as outcomes.4 The ERAS Society guidelines also emphasize the importance of auditing adherence to the components of the program.1 Yet patients equate recovery with return to their normal activities,5 a process that occurs usually after discharge and requires a time frame of weeks to months.6,7 It is not clear if interventions impacting short-term biological changes will also have a downstream effect on longer-term recovery.8

Our intention in this review is to provide a framework for measuring outcomes of ERAS programs. We begin by defining outcomes in general and the construct of surgical recovery, with the various stakeholders (including patients and families, clinicians, and payers) emphasizing different outcomes at different times. We also review available measures for quantifying recovery in hospital and after discharge and summarize the outcomes that are currently being reported in ERAS studies. We then propose a set of outcomes that includes traditional clinical outcomes as well as patient-reported outcomes and could be used as a starting point in future studies. Finally, we discuss evidence gaps where future research would be helpful.

What are outcomes?

An outcome is a measure of the impact of an intervention on clinical or functional status, and it is used to assess the effectiveness of the intervention.9 Outcomes are inherently multidimensional, and there is no single outcome that fully captures the results of care for any condition.10 Clinical outcomes are endpoints used in clinical practice, such as complications, and are assessed by an observer. Patient-reported outcomes (PROs) are determined directly by the patient using scales or health profiles that may be specific for the disease state of interest or encompass general health. In such cases, endpoints may include symptoms (pain, nausea, fatigue), functional health status (return to activities, physical activity), or health-related quality of life (HRQOL). A core set of outcomes is an agreed set of outcomes (usually fewer than ten) to be measured for a particular health condition or treatment, prioritized by the relevant stakeholders.9 These include clinical outcomes and PROs. Measurement of the processes of care, such as adherence to elements of an ERAS pathway, is important for understanding how a system is functioning and where improvements should be made, but it does not replace outcomes measurement and should be considered separately.11

What does postoperative recovery mean to different stakeholders?

Postoperative recovery is a complex multidimensional construct that often means different things to different stakeholders. The lack of consensus over the true definition of recovery has hampered efforts to develop validated instruments. For instance, some anesthesiologists may consider recovery to be complete once the patient is awakened from anesthesia and able to be discharged from the postanesthesia care unit (PACU), whereas surgeons may consider recovery as the absence of complications. For patients, recovery is not complete until they fully return to their premorbid state of health and activity – free from pain – usually long after they have been discharged from hospital. Family members may consider the emotional and economic consequences of providing support and care for a loved one after surgery. Administrators and payers may consider the cost of providing care. In truth, there need not be a single all-encompassing definition.

Postoperative recovery is defined as a multidimensional construct that follows a particular trajectory.3 First and foremost, postoperative recovery affects multiple domains, including physical, physiological, psychological, social, and economic factors. Undue focus on any one particular domain while ignoring the remainder would paint an incomplete picture. Postoperative recovery follows a specific trajectory that starts with an abrupt deterioration from baseline function in the immediate postoperative period, followed by gradual rehabilitation back to or beyond the preoperative baseline. The trajectory that serves to outline the overall recovery process is graduated such that less invasive procedures (e.g., laparoscopic cholecystectomy) will have a smaller decline and a quicker rehabilitation period, whereas highly invasive procedures (e.g., a pancreaticoduodenectomy) will have a more dramatic decline and a lengthy recovery period. Other factors, such as postoperative complications or adjuvant chemotherapy and radiation therapy, may affect the recovery trajectory. It is important that any instrument used to measure recovery is able to detect these changes as well as the differences in the severity of procedures. The term recovery implies a comparison with the patient’s baseline functioning or with population norms, and it is quantified in relation to these standards. The logo of the ERAS Society alludes to this trajectory, given that enhancing recovery will shift the trajectory upwards by lessening the deterioration and accelerating the rehabilitation process (Figure).

Figure

Hypothesized trajectory of postoperative conventional and ‘enhanced’ recovery. The X axis depicts time, and measurement begins at baseline, before surgery. The Y axis depicts a hypothetical measure of recovery. This could be a symptom like pain or fatigue, a functional status measure like physical activity, or a measure of quality of life. Postoperative recovery follows a specific trajectory with an abrupt deterioration from baseline in the immediate postoperative period, followed by a gradual rehabilitation back to or surpassing the preoperative baseline. Patients cared for using ERAS pathways are hypothesized to have less deterioration and faster return to baseline or “normal”. However, whether the slopes of the recovery lines are different is not known.

Recovery can be divided into early, intermediate, and late phases. The early phase denotes the period immediately after surgery until discharge from the PACU. The intermediate phase signifies the time from PACU discharge until hospital discharge, and the late phase signifies the time from hospital discharge until return to normal (or baseline) function. The division of postoperative recovery into phases serves two main functions: it provides a standard terminology, and it allows for the measurement of specific phase-dependent outcomes. For example, both biologic and physiologic outcomes are important in the early phase of recovery; basic activities and physical symptoms are important in the intermediate phase, and higher-level functional outcomes are important in the late phase (Table 1). Others suggest somewhat different time frames for early, intermediate, and late phases of recovery. In a recent narrative review of the measurement of the quality of recovery, Bowyer et al. define early recovery as encompassing factors important for hospital discharge (physiologic stability, pain, nausea, gastrointestinal function), intermediate recovery as the first few weeks after surgery (nociceptive, emotional, functional, and cognitive recovery), and late recovery as more than six weeks after surgery (focusing on poor functional recovery, persisting pain, nausea, and cognitive decline).12

Table 1 Suggested phases of recovery

Qualitative work by our group and others supports these suggestions. Urbach et al. interviewed inpatients within two weeks of major abdominal surgery and identified the major themes: basic physical limitations (“transitioning from a lying position to sitting or standing”), basic activities of daily living (bathing, dressing, and grooming), and physical and psychological symptoms (pain, visceral function, sleep, and mood).13 Once discharged from hospital, however, patients are more concerned about regaining their preoperative baseline function. Kleinbeck and Hoffart found that patients “do not define recovery as being healed physically; instead, they define recovery as being able to perform activities as they performed before surgery.”5 We have also identified similar themes in a qualitative study in which we interviewed discharged patients who underwent abdominal surgery at least one month prior (unpublished). Major themes identified from this work include patients experiencing a lack of physical endurance or energy postoperatively and an inability to perform or sustain their regular activities.

Are there recovery-specific QOL instruments?

There are multiple relevant outcomes during postoperative recovery, including pain, gastrointestinal functioning, mobilization, self care, cognitive functioning, fatigue, and others. There is significant interest in trying to capture these various domains in a single measure. This work focuses on the in-hospital and immediate post-discharge phases of recovery. A systematic review performed in 2006 identified 12 separate instruments to measure “general postsurgical recovery”, excluding instruments limited to the recovery room alone.14 In evaluating measurement properties, they concluded that none were fully validated for the construct of recovery. In particular, studies were lacking to assess responsiveness (ability to detect differences over time) and minimal clinically important change (the smallest difference that patients perceive as important). Nevertheless, the review concluded that the Post-discharge Surgical Recovery (PSR) scale and the Quality of Recovery 40 (QoR-40) score should undergo further investigation. The QoR-40 encompasses five dimensions of recovery (emotional state, physical comfort, psychological support, physical independence, and pain).15 A subsequent evaluation of the psychometric properties of the QoR-40 – from 17 studies in which it was used – concluded that it was a suitable measure.16 Then again, the QoR-40 was designed to reflect early recovery and normalizes within days to weeks.14 The PSR is a 15-item scale developed to measure recovery after ambulatory surgery. Items include health status, activity, fatigue, work readiness, and expectations.17 There are limited data evaluating the responsiveness of the instrument, although it was close to the ceiling score two weeks after surgery.18 A more recent review12 identified 11 individual instruments, only four of which were included in the previous review. The Postoperative Quality Recovery Scale19 was mentioned as a newer tool that emphasized cognitive functioning. 
We are aware of other measures, e.g., the Abdominal Surgery Impact Scale,13 that were not included in either review. This highlights a difficulty in this area: the lack of standard definitions of recovery hampers both the identification of scales in reviews and data synthesis. In addition, rigorous assessment of the level of validation for each measure could guide investigators as to the optimal use of each tool.

Clearly, the concepts measured by an instrument will reflect its psychometric properties. An instrument that does not focus on outcomes of importance to patients during their recovery process will not be psychometrically valid. It is also important to consider the timeline for assessments, because the relevant outcomes may differ depending on the phase of recovery being assessed. Most of the recovery tools identified in the systematic reviews assess recovery over short-term intervals; therefore, the items in the instruments reflect outcomes important during early and intermediate recovery. They focus on safety for discharge (physiologic stability, pain, nausea, gastrointestinal function) and basic activities of daily living. Nevertheless, constructs important immediately after surgery may not be as important in the late phase of recovery where functional status and factors inhibiting full recovery (e.g., cognitive decline, persisting pain) are the major concerns.12

There is some evidence that measures of physical function and performance20,21 as well as some domains of self-reported health status or HRQOL22,23 may be sensitive to changes occurring in the postoperative period and useful in estimating longer-term recovery. The six-minute walk test (6MWT) was developed originally to test exercise tolerance but is now used clinically and in research to test functional exercise capacity, defined as “the ability to undertake physically demanding activities of daily living”.24 There is preliminary evidence supporting the validity of the 6MWT as a measure of recovery six to nine weeks after colorectal surgery.21 The minimal clinically important difference (MCID), or the smallest change that a patient perceives as important, for the 6MWT has been estimated as 14 m in the context of surgical recovery, which is smaller than that reported for other conditions.25 Advantages of measures of physical performance include the lack of a ceiling effect, no need to rely on patient self-report, the ability to measure at multiple time points, and favourable statistical properties.
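As a minimal sketch of how an MCID might be used to interpret a change in walking distance, the following compares a follow-up 6MWT result with the preoperative baseline. The 14 m threshold is the estimate quoted above; the function name, the symmetric treatment of improvement and deterioration, and the application of a group-level MCID to an individual patient are illustrative assumptions rather than an established scoring rule.

```python
# Estimate quoted in the text for the 6MWT in the context of surgical recovery.
MCID_6MWT_METRES = 14.0


def classify_6mwt_change(baseline_m, follow_up_m, mcid_m=MCID_6MWT_METRES):
    """Label a change in six-minute walk distance relative to the MCID.

    Assumption: the same threshold is applied in both directions.
    """
    change = follow_up_m - baseline_m
    if change >= mcid_m:
        return "improved"
    if change <= -mcid_m:
        return "deteriorated"
    return "no important change"


# Hypothetical patient: 450 m before surgery, 430 m six weeks after surgery.
print(classify_6mwt_change(450.0, 430.0))
```

A patient walking 20 m less than at baseline would be flagged as not yet recovered under this rule, whereas a 10 m difference would fall within the range considered unimportant.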

Self-reported measures of physical activity also show promise in quantifying post-discharge recovery. The Community Healthy Activities Model Program for Seniors (CHAMPS) instrument is a 41-item questionnaire originally created to evaluate the effectiveness of interventions aimed at increasing the level of physical activity in elderly adults.26 Subjects report the frequency and total time spent performing a range of physical and social activities during the past week. This is weighted according to the metabolic value of each activity, and total caloric expenditure per kilogram per week is estimated. Preliminary evidence supports the validity of CHAMPS to estimate recovery after cholecystectomy,20 with the MCID estimated at 8 kcal·kg−1·wk−1.25 Measures of activities of daily living (ADL) and instrumental activities of daily living (IADL) also require weeks or months to return to baseline and may be useful in estimating recovery.6
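The weighting step described above can be sketched as follows. This is not the official CHAMPS scoring algorithm: it assumes the common approximation that 1 MET corresponds to roughly 1 kcal per kilogram per hour, and the activity names, MET values, and weekly hours are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    met: float             # metabolic value of the activity (hypothetical values below)
    hours_per_week: float  # self-reported time spent during the past week


def weekly_expenditure_kcal_per_kg(activities):
    """Estimate caloric expenditure per kilogram per week.

    Assumption: 1 MET is approximately 1 kcal per kg per hour, so the
    estimate is the sum of MET value times hours over all activities.
    """
    return sum(a.met * a.hours_per_week for a in activities)


# Hypothetical report from one patient four weeks after surgery.
report = [
    Activity("walking leisurely", met=3.5, hours_per_week=2.0),
    Activity("light gardening", met=4.0, hours_per_week=1.5),
]
print(weekly_expenditure_kcal_per_kg(report))
```

Comparing this weekly estimate with the same patient's preoperative value gives a change score that can then be judged against the MCID.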

Measures of generic HRQOL may also be useful in measuring recovery. The Short Form 36 (SF-36®) is a widely used health profile which can be used for reporting surgical outcomes.27 It includes 36 items that can be divided into eight domains assessing the physical (physical functioning, role physical, bodily pain), psychological (vitality, role emotional, mental health), and social (social functioning) domains, as well as overall health (general health). Each domain is scored on a scale from 0-100, with higher scores being representative of better functioning and well-being.28 There is evidence that six of the eight domains of the SF-36 and the physical component summary score may be valid measures of recovery after colorectal surgery, although they did not differentiate between laparoscopic and open surgery.22 Nevertheless, health profile measures have limitations. There is no information about the relative importance of each domain. For example, if the intervention resulted in an improvement in mental health but a decline in physical functioning, there is no way to assign a relative importance to the domains and trade-offs.29 Also, all HRQOL measures may be affected by response shift, which is defined as the change in perception or understanding of a construct as a result of shifts in internal standards (“recalibration”), values (“reprioritization”), and conceptualization (“reconceptualization”).30 In essence, response shift may alter the interpretation of changes in HRQOL scores over time if the person adapts to the treatment or disease. Response shift is best exemplified by considering the experience of two hypothetical people, one with lung cancer and one in good health. Both rate their overall health as “good”.31 Over the next year, the healthy subject does not change their rating of health and, by all outward signs, has not changed health status.
Conversely, the person with lung cancer had a pneumonectomy and shows marked deterioration in health status. Despite everything, this individual is surprised to be alive and not to feel even worse, so rates their health as “very good” – showing an improvement. In essence, this person has made a response shift. In the presence of a life-altering event, like major surgery, a response shift should be considered to have occurred, and statistical or qualitative methods should be in place to detect and consider response shift in the analysis.32

In economic studies, effectiveness is measured using quality-adjusted life years, commonly with indirect utility instruments, such as the Short Form 6D (SF-6D)33 or the EuroQol 5D (EQ-5D™).34 It is important to consider the available evidence that supports the validity of the chosen measure within the specific context of recovery and in the time frame of interest. An instrument valid in one context (e.g., treatment of asthma) may not be valid in another (e.g., recovery after colorectal surgery).22 Some commonly used health status instruments may be less sensitive to the changes occurring during recovery or may normalize very quickly. For example, the SF-6D was more responsive to expected postoperative changes at four and eight weeks after colorectal surgery when compared with the EQ-5D.35
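To illustrate how repeated utility measurements translate into quality-adjusted life years, the sketch below integrates utility over time using the trapezoidal rule. The assessment schedule and utility values are hypothetical, and the linear interpolation between assessments and the 52-week year are simplifying assumptions, not the method of any cited study.

```python
WEEKS_PER_YEAR = 52.0  # simplifying assumption


def qalys(times_weeks, utilities):
    """Area under the utility curve, in quality-adjusted life years.

    Assumption: utility changes linearly between consecutive assessments.
    """
    if len(times_weeks) != len(utilities) or len(times_weeks) < 2:
        raise ValueError("need at least two matched time points and utilities")
    total = 0.0
    for i in range(1, len(times_weeks)):
        interval_years = (times_weeks[i] - times_weeks[i - 1]) / WEEKS_PER_YEAR
        mean_utility = (utilities[i] + utilities[i - 1]) / 2.0
        total += interval_years * mean_utility
    return total


# Hypothetical utilities at baseline, four weeks, and eight weeks after surgery.
print(qalys([0, 4, 8], [0.80, 0.65, 0.78]))
```

Because the result depends entirely on the utilities fed into it, an instrument that normalizes too quickly after surgery will understate differences between care pathways.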

What outcomes have been used to evaluate ERAS pathways?

In order to understand the current status of outcomes reporting in ERAS studies, our group previously performed a systematic review of 38 prospective comparative studies of ERAS programs vs traditional care in abdominal surgery – 25 studies were randomized controlled trials (RCTs).4 Length of hospital stay was reported in all but one study and was the primary outcome for 18 of 23 studies where this was explicitly included. Excluding complications, other biologic and physiologic outcomes were reported in 80% of studies, with gastrointestinal function being the most common endpoint. Half of the studies included at least one patient-reported symptom, most commonly pain. Another half of the studies reported a functional status outcome, almost exclusively focused on mobilization in hospital. Only eight studies included functional status outcomes after discharge, with only two measuring any outcome beyond one month. Cognitive function was reported in only one study. Measures of HRQOL were reported in seven studies with eight different instruments used, and none reported HRQOL after 30 days. Only one study36 used a recovery-specific HRQOL instrument (the Surgical Recovery Scale).

The review highlights where outcomes evaluation of ERAS programs has been incomplete. While clinical outcomes occurring in hospital have been emphasized – perhaps appropriately based on the overall goals of ERAS programs – the use of validated measures of in-hospital recovery was negligible. Furthermore, follow-up has generally been too short for appropriate capture of post-discharge functional recovery. Finally, a wide variety of instruments and definitions hampers attempts at comparisons between studies and synthesis based on the results.

Several meta-analyses of ERAS programs vs traditional care have also been published, concluding that ERAS programs are effective in reducing length of hospital stay and rates of some complications across multiple surgical subgroups.37,38 Most ERAS programs also report faster return of bowel function after colorectal surgery by about one day.38 Nevertheless, it proves more difficult to synthesize other outcomes between studies because multiple definitions and instruments are being used. In the 25 RCTs identified in our systematic review,4 early mobilization was reported variably as “time spent out of bed”, “time ambulating”, pedometer readings, time to reach “independent mobility”, or the proportion of patients walking on a given day. Despite this range of definitions in the studies reporting in-hospital mobilization outcomes, ERAS programs were associated with better outcomes in all ten studies. On the other hand, ERAS programs have not been associated consistently with improved HRQOL. Four studies assessed HRQOL two weeks post-discharge, with two finding benefits for ERAS (using EQ-5D and subscales of the European Organization for Research and Treatment of Cancer instrument)39,40 and two finding no benefits (using SF-36, Gastrointestinal Quality of Life Index [GIQLI], and the Cleveland Global Quality of Life instrument).41,42 Table 2 presents a summary of patient-centred outcomes reported after hospital discharge from these studies.

Table 2 Summary of patient-centred outcomes reported after discharge4

The most frequently reported outcome in ERAS studies is length of stay. This likely reflects the use of this measure as a proxy for intermediate recovery, as discharge from hospital assumes pain control with oral analgesia, adequate oral intake, absence of significant complications, and the ability to perform activities of daily living. Nevertheless, discharge criteria may not be explicit and may vary significantly between institutions and geographic regions.43 Length of stay is also affected by many non-clinical factors, such as social situation, cultural expectations, and caregiver availability, as well as distance from the hospital, traditions, and availability of hospital beds. Even within ERAS programs, there is a discrepancy between in-hospital recovery (defined as pain control with oral analgesia, adequate oral intake, ADLs at preoperative level) and length of stay. While these benchmarks were achieved at a median of three days after colorectal surgery,44,45 patients remained in hospital for an additional two to three days. Fewer than one-third of patients were discharged on the day of recovery, but this situation occurs more frequently with increasing experience with ERAS pathways, highlighting the difference between length of stay and recovery from organ dysfunction.45 Thus, while length of stay remains a relevant outcome, particularly when following trends within an institution or benchmarking between institutions, its limitations should be understood. An alternative measure of intermediate recovery may be obtained by assessing the time to achieve specific discharge criteria (“time to readiness for discharge”).46 The main advantage of this measure is that only factors related to physiological recovery are taken into account without the influence of organizational and personal factors that affect length of stay.
Research in colorectal surgery supported the validity and reliability of this measure when readiness for discharge was defined using consensus-based discharge criteria.46,47
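The distinction between time to readiness for discharge and length of stay can be made concrete with a small sketch. The three criteria are those named above for in-hospital recovery; the function names, the per-patient data, and the simplification that a criterion stays met once achieved are assumptions for illustration only, not the consensus-based definition used in the cited studies.

```python
def time_to_readiness(day_criterion_met):
    """Postoperative day on which the last discharge criterion was achieved.

    Assumption: once a criterion is met it remains met, so readiness is
    the latest of the individual days.
    """
    return max(day_criterion_met.values())


def discharge_delay(length_of_stay_days, day_criterion_met):
    """Days spent in hospital after all discharge criteria were met."""
    return length_of_stay_days - time_to_readiness(day_criterion_met)


# Hypothetical patient after colorectal surgery.
patient = {
    "pain_controlled_with_oral_analgesia": 2,
    "adequate_oral_intake": 3,
    "adl_at_preoperative_level": 3,
}
print(time_to_readiness(patient), discharge_delay(5, patient))
```

Reporting both numbers separates physiological recovery from the organizational and personal factors that prolong the stay.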

A way forward: recommendations for outcomes measurement in ERAS research

A core set of outcomes for ERAS programs should reflect the different perspectives of stakeholders (patient, surgeon, anesthesiologist, nurse, etc.) and the stage of recovery (in-hospital or post-hospital). While the emphasis of in-hospital outcomes is on processes of care, symptoms, and adverse events, the focus of post-discharge outcomes is on patient-reported functional recovery. An example of a set of outcomes for research on ERAS effectiveness is provided in Table 3. This set of outcomes is derived from the systematic review of outcomes reported in ERAS studies,4 available measures of in-hospital recovery,14 and our interviews with patients and providers (unpublished data). Where a specific instrument is suggested, evidence was available for its validity, specifically in the context of recovery after abdominal surgery. Nevertheless, this is simply an example of what such a set of outcomes could look like. Development of a core set of outcomes for clinical trials involves a standardized and rigorous consensus-building process.48

Table 3 Example of an outcome set for evaluating ERAS pathways for abdominal surgery

Understanding the adherence to the individual elements of an ERAS pathway is important information to consider in judging the reported outcomes and providing information about process improvement; however, it should not replace direct measurement of outcomes. Patient satisfaction with care should also be considered as a reflection of the processes of care rather than as a specific outcome. On the other hand, satisfaction with health status (i.e., HRQOL) should be included as an outcome.10

In order to assess the value of ERAS programs, information about the financial implications as they relate to outcomes must also be available. Although cost is beyond the scope of this review, ideally, it should be representative of the entire trajectory of care.10 Additional resources may be required to implement and manage ERAS pathways (e.g., program coordinator, audit, patient education), but ultimately, these costs can be recouped by decreasing hospital stay and complications. A systematic review of ten economic evaluations of ERAS pathways for colorectal surgery suggested that the ERAS approach was less costly than traditional care, although significant limitations in the data were identified.49

Future directions

In light of the limitations of the existing methods to measure postoperative recovery, several recommendations can be made for future research. At the present time, there is no perfect measure of postoperative recovery, but many new instruments have been introduced since the systematic review by Kluivers et al.,14 and some measures were missed in a subsequent review.12 The psychometric properties of the newer instruments should be identified and evaluated. Ideally, validation studies should be performed for specific settings, patient populations, and time points. The drawback of this approach, however, is the likelihood that specific instruments may be valid only for specific conditions, leading to a wide range of measures from which to choose and a lack of comparability between studies.

A further complication is the fact that many of these instruments are not true measures in the sense that a ruler measures length or a scale measures weight. While item selection and reduction methods may be appropriate, items on most instruments are scaled with ordinal numbers, yet the scores are treated as if they had interval properties. For instance, consider an item with a three-level response, “How tired are you today?” Level 1 = “a lot”, Level 2 = “a little”, or Level 3 = “not at all”. It is unlikely that the difference from Level 1 to Level 2 is the same as the difference from Level 2 to Level 3. In this case, using a mean score would be inappropriate and may potentially confound any real difference.50
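The problem with averaging ordinal codes can be shown with a small numerical sketch using the three-level tiredness item above. The two patient groups and the alternative numeric codings are hypothetical; every coding preserves the order of the response levels, yet the comparison of group means changes with the arbitrary spacing between levels.

```python
from statistics import mean


def mean_score(responses, coding):
    """Average the numeric codes assigned to a list of ordinal responses."""
    return mean(coding[r] for r in responses)


# Hypothetical responses from two small groups of patients.
group_a = ["a lot", "not at all"]
group_b = ["a little", "a little"]

# Three codings that all preserve the order: a lot < a little < not at all.
equal_steps = {"a lot": 1, "a little": 2, "not at all": 3}
wide_top_step = {"a lot": 1, "a little": 2, "not at all": 6}
wide_bottom_step = {"a lot": 1, "a little": 5, "not at all": 6}

for label, coding in [
    ("equal steps", equal_steps),
    ("wide top step", wide_top_step),
    ("wide bottom step", wide_bottom_step),
]:
    print(label, mean_score(group_a, coding), mean_score(group_b, coding))
```

With equal steps the two groups have the same mean; with a wide top step group A scores higher, and with a wide bottom step group B scores higher. Since nothing in the item justifies one spacing over another, a difference in means cannot be interpreted without first establishing interval properties.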

This important limitation of existing instruments may be addressed by modern psychometric methods, such as item-response and Rasch measurement theory.50 A complete description of these techniques is beyond the scope of this review, but in brief, these techniques assess whether items on a questionnaire form a linear (i.e., interval property) scale and whether items can be hierarchically ordered by simultaneously evaluating item difficulty and person ability. Easy items will be completed by all but the most impaired, whereas only those at the high end of the ability spectrum will be able to complete the hardest items. For example, transitioning from a lying to a standing position may be an easy item, but running five kilometers will be a hard item. In this way, a true scale can be created that can be generalized across settings and populations. Indeed, these psychometric theories have already been widely used to evaluate and create instruments in the literature on multiple sclerosis and physical rehabilitation29 as well as to create the item banks of the large-scale Patient-Reported Outcomes Measurement Information System (PROMIS).51
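The relationship between item difficulty and person ability can be illustrated with the dichotomous Rasch model, in which the probability of completing an item depends only on the difference between the two, both expressed on the same logit scale. The sketch below is a minimal illustration; the difficulty and ability values are hypothetical and are not estimates from any recovery instrument.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of completing an item under the dichotomous Rasch model.

    Both arguments are on the same logit scale; the probability is
    exp(ability - difficulty) / (1 + exp(ability - difficulty)).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulties in logits.
items = {
    "move from lying to standing": -2.0,  # easy item
    "run five kilometers": 3.0,           # hard item
}

# Hypothetical patient early in recovery, with ability of 0 logits.
for name, difficulty in items.items():
    print(name, round(rasch_probability(0.0, difficulty), 2))
```

A person whose ability equals an item's difficulty has an even chance of completing it, which is what allows items and people to be placed on one common linear scale.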

Finally, the development of any new recovery instrument should rigorously take into account outcomes that matter to all stakeholders involved in the recovery process. As outcomes may differ depending on the specific phase of recovery, the exact target time frame of recovery should be clearly defined. Furthermore, the development of any new instrument should adhere to the regulatory agency guidelines for the creation of patient-reported outcomes.52,53 Another strategy may be to combine multiple existing measures (as outlined in Table 3) into a single instrument, rather than developing an entirely new one. Modern psychometric techniques again prove useful in this strategy, as they can distinguish the spectrum of relevant items from the multiple pre-existing instruments that fit within the construct of postoperative recovery and integrate them into a valid linear scale.

Conclusions

ERAS programs were originally designed to improve in-hospital outcomes, and evidence suggests they decrease length of stay and some complications, without increasing costs, for a variety of procedures. Nevertheless, it is less clear how they impact patient-reported outcomes, including symptoms, functional status, and overall HRQOL throughout the trajectory of recovery. We suggest a set of outcomes, including different stakeholder perspectives and outcomes from early and later stages of recovery, using instruments where evidence supports their validity to estimate recovery. The focus of future research should be to identify a consensus-based standardized core set of outcomes, to concentrate on domains of importance to the various stakeholders, and then to map these to existing PROs or create new instruments using modern psychometric methods. This approach would improve the quality of clinical trials in this area and allow for benchmarking between providers.

Key points

  • Studies evaluating the effectiveness of ERAS pathways have largely focused on the in-hospital phase with little information about post-discharge outcomes important to patients.

  • There is no single outcome to measure recovery, and studies should incorporate outcomes from a variety of stakeholders.

  • Standard definitions for outcomes and validated measurement instruments should be used when possible.

  • A consensus-based standardized core set of outcomes for ERAS studies would facilitate research.