Ten million Americans have inpatient surgery every year.1 Approximately 40% experience an adverse event during the perioperative period,2 including 10–15% who have a serious complication,3 and 1–2% who die within 30 days of surgery.4,5 Nevertheless, the risk of adverse events varies substantially between individual patients, and is largely predicted by personal factors.6,7 Patients consistently report a desire to receive more information prior to surgery.8,9 Despite this, personalized risks are rarely communicated to patients.8,10

Studies have shown that preoperative patient knowledge and anxiety are inversely related.11,12 Technical details of the operative period are well described by clinicians, but patients express an unmet desire for personalized information regarding recovery, complications, and expected survival.13 Medicolegal reviews suggest that failure to explain the risks and possible adverse effects of surgery is the most common allegation brought forward by plaintiffs.14 Despite the availability of validated risk prediction models15,16,17 and technology solutions to facilitate risk calculation and communication, personalized quantitative risk assessment is rarely performed in perioperative medicine. A recent scoping review identified only seven studies that directly calculated and communicated personalized risks to patients.18

Available examples of personalized risk communication in perioperative medicine have many limitations.18,19 In the only prospective evaluation of the National Surgical Quality Improvement Program (NSQIP) risk calculator in a preoperative clinic, the process relied upon a single anesthesiologist manually completing the online NSQIP calculator for each participant. Furthermore, this approach was not compared with standard care, precluding any causal inference.19 Therefore, we designed and developed the Personalized Risk Evaluation and Decision Making in Preoperative Clinical Assessment (PREDICT) app, a tablet application that was calibrated to local data, followed best practices for risk communication, and leveraged patients’ ability to provide and receive their own health information. We hypothesized that use of the PREDICT app would improve patients’ knowledge of their risks of morbidity and mortality and expected length of stay compared with standard care. We further hypothesized that anxiety levels would not increase, and that patient satisfaction would improve with using the app.

Methods

Study design and setting

Following protocol registration (ClinicalTrials.gov; NCT03422133) and research ethics board approval (Ottawa Health Sciences Network Research Ethics Board Protocol #20170737-01H; 27 February 2018), we conducted a prospective before-after study at two geographically distinct campuses of an academic health sciences network (The Ottawa Hospital). The Ottawa Hospital network (900 beds) provides elective inpatient surgical care for general, vascular, neurologic, thoracic, major orthopedic, major gynecologic, and urologic surgeries. Each inpatient campus has a preoperative assessment unit and all patients are assessed before surgery by a registered nurse and/or physician anesthesiologist. This study is reported in accordance with the Consolidated Standards of Reporting Trials (CONSORT)-eHealth20 and Template for Intervention Description and Replication (TIDieR)21 checklists.

Study population

Eligible participants were 18 yr or older, were scheduled to undergo elective, major non-cardiac inpatient surgery, could communicate in English or French, were seen by a physician anesthesiologist for preoperative consultation, and provided written informed consent. Patients who were unable to provide consent on their own and bariatric surgery patients (because one hospital is a bariatric centre of excellence with care and discharge patterns distinct from typical perioperative care) were excluded. Throughout the manuscript, the before phase will be referred to as the “standard care phase”; these participants were enrolled between April 2018 and September 2018. The after phase will be referred to as the “PREDICT app phase”; these participants were enrolled between December 2018 and May 2019.

Before (standard care) and after (PREDICT app) conditions

Patients undergoing surgery at The Ottawa Hospital are triaged prior to coming to the preoperative assessment unit using standardized criteria (see criteria for consultation in the Electronic Supplementary Material [ESM], eAppendix 1) to receive a telephone assessment by a registered nurse, an in-person visit with a registered nurse, or an in-person visit with a registered nurse plus a physician anesthesiology consultation. These assessments occur one month to one week before surgery.

Self-reported patient health data were collected identically in both phases of the study using a tablet. Compared with physician history, patient-entered health histories in the preoperative clinic have substantial to near-perfect agreement,22,23 and previous research showed similar predictive accuracy based on patient-reported data vs data collected using NSQIP processes.24 In the standard care phase, there was no further interaction with the tablet, and calculated risks were simply stored in the study database. In the PREDICT app phase, participants continued through the PREDICT app process to receive and share their personalized predicted risks. Study processes and data collection in each phase are displayed in Fig. 1.

Fig. 1

Study processes by phase. This figure illustrates the clinical and data-collection processes and timelines in the standard care and PREDICT app phases. PREDICT = Personalized Risk Evaluation and Decision Making in Preoperative Clinical Assessment

Standard care for before phase

Patients seen by a physician anesthesiologist undergo a systems-based history, physical examination and discussion of options, care processes, and risks at the discretion of the anesthesiologist. Formal or personalized risk calculation and/or communication is not a standard part of the preoperative anesthesiology consultation process and the electronic health record did not contain built-in risk scores or calculators. Anesthesiologists may use online risk calculators at their discretion, but their use is not formally documented. Patients receive procedure-specific documentation about preparation for surgery and day of surgery instructions.

Care processes in the PREDICT app phase

Participants were provided an iPad with the PREDICT app open in the waiting room of the preoperative assessment unit. Participants first entered answers to the health condition questions required to populate the risk calculator. Surgical procedure codes were entered by the research assistant. Next, participants were prompted to provide (using free text) up to three benefits that they hoped to achieve from having surgery. The app then generated personalized risk predictions, and a paper printout of anticipated benefits and personalized risk estimates was provided to the patient and their clinician. The printout also included three evidence-based questions used to encourage shared decision-making: 1) what are my options; 2) what are the pros and cons of each option; 3) how do I get the support to make the decision that is right for me.25 Patients could use these questions to prompt discussion of their care with their anesthesiologist (see Fig. 2). See eAppendix 2 (ESM) for the development of the PREDICT app.

Fig. 2

PREDICT app output. Personalized risk output, benefits, and shared decision-making cues from the PREDICT app as provided to participants and clinicians. PREDICT = Personalized Risk Evaluation and Decision Making in Preoperative Clinical Assessment

Participants then engaged in their anesthesiology consultation. Anesthesiologists were not trained in use of the application or any study outcomes. Anesthesiologists were asked to provide informed consent and were made aware that patients may present personalized risk estimates in the clinic via the PREDICT app.

Outcomes

All outcome data were collected using a Research Electronic Data Capture (REDCap; Nashville, TN, USA) system and were entered by participants on a tablet (iPad). The primary outcome was the patient’s knowledge of their personalized risk profile after their anesthesiology consultation. This was measured using a knowledge questionnaire developed for this study based on recommended standards for knowledge questionnaires,26 designed to capture factual items specifically related to the patient’s personalized risk profile.27 The questionnaire was applied at the time of enrollment (baseline) and after anesthesiology consultation by asking patients to estimate their risk of morbidity, their risk of mortality, and their expected length of stay within one of five ranges (eAppendix 3; ESM). A question was marked as correct if the patient’s self-estimated risk and their model estimated risk were within the same range category. The pre- and post-appointment knowledge scores were normalized on a 100-point scale ([questions correct/total questions]·100).
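The scoring rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the five response ranges are defined in eAppendix 3 and are not reproduced here, so each range is represented by a hypothetical category index (0 to 4), and the function and variable names are illustrative and not part of the study software.

```python
def knowledge_score(self_estimates, model_estimates):
    """Return the percentage of questions answered correctly (0-100).

    Each argument maps a question name to a range-category index.
    A question counts as correct when the patient's self-estimated
    category equals the category containing the model-estimated value.
    """
    if self_estimates.keys() != model_estimates.keys():
        raise ValueError("both inputs must cover the same questions")
    if not model_estimates:
        raise ValueError("at least one question is required")
    correct = sum(
        1 for q in model_estimates if self_estimates[q] == model_estimates[q]
    )
    return 100.0 * correct / len(model_estimates)


# Hypothetical patient: category indices 0-4 stand in for the five ranges.
patient = {"morbidity": 2, "mortality": 0, "length_of_stay": 3}
model = {"morbidity": 2, "mortality": 0, "length_of_stay": 1}
score = knowledge_score(patient, model)  # two of three questions correct
```

With three questions, the normalized score can only take the values 0, 33.3, 66.7, or 100, which is worth keeping in mind when interpreting mean differences between phases.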

Patient-centred secondary outcomes that were compared between the study phases included patient satisfaction of the process used to communicate their risk of surgery (assessed using a likelihood to recommend ten-point Likert scale)28 and patient anxiety levels (at baseline and after the consultation using the short form State-Trait Anxiety Inventory,29 which is normalized to a 100-point scale; higher scores mean greater anxiety, with a score ≥ 40 being clinically relevant).30

For PREDICT app phase participants, we also measured patient acceptability and how willing they would be to use the app again before a future surgery using a modified version of a validated five-point Likert scale.31 Clinician-centred secondary outcomes in the PREDICT app phase included acceptability questions (using five-point Likert scales)31 of the intervention (eAppendix 4; ESM) and the likelihood for clinicians to change management based on the risks generated by the PREDICT app (using a five-point scale with five indicating that a change would definitely be made). Feasibility reflected the proportion of patients for whom a risk score could be calculated and the proportion of missing data from the patient-entered health questionnaire.

Sample size

Minimal data are available to estimate expected knowledge changes with a personalized preoperative risk communication tool. Therefore, we based our effect size estimate on the distinct, but related, decision aid literature, which suggests a standard deviation (SD) for the change in knowledge test of 10.32 We estimated that standard care phase participants would likely experience a small improvement in knowledge (5%) compared with those in the PREDICT app phase (10%). Although we could not identify a minimally important difference for patient knowledge tests, we prespecified a net improvement of 5% to be clinically important. Using a two-sided t test at the 5% level of significance and assuming a common SD of 10%, 86 patients per phase achieved 90% power to detect an important difference. Ninety-five patients per group were targeted to allow for 10% dropout. Use of analysis of covariance was expected to further increase the statistical power.33 Secondary outcome analyses were considered exploratory, so no adjustment for multiple testing was made.
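The arithmetic behind this calculation can be approximated as follows. This sketch uses the standard normal-approximation formula for comparing two means, which is an assumption on our part; the software and exact method used for the reported calculation are not stated. The function name `n_per_group` is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Normal-approximation sample size per group for a two-sided
    comparison of two means with a common standard deviation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2


raw_n = n_per_group(delta=5, sd=10)  # net improvement of 5 points, SD of 10
approx_n = math.ceil(raw_n)          # normal approximation, rounded up
# Inflate the reported 86 per phase by 10% to allow for dropout.
target_n = math.ceil(86 * 1.10)
```

The normal approximation gives 85 per group, one fewer than the 86 reported; a gap of this size is what would be expected if the reported figure was based on the t distribution. Inflating 86 by 10% gives the stated target of 95 per group.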

Statistical analysis

Descriptive statistics were calculated for each group using means and SD for continuous variables and proportions for binary and categorical variables. Patients in the standard care and PREDICT app phases were compared using chi-square tests for binary and categorical variables, and t tests for continuous variables.

As recommended, because our primary outcome (patient knowledge score) was a continuous variable measured at baseline and post-consultation follow-up, we prespecified use of an analysis of covariance approach in a linear regression model.33,34 This means that the post-consultation score was our dependent variable, but each participant’s knowledge score at baseline was entered into the regression model as a covariate. We could then compare the knowledge score at the end of the anesthesiology consult between the study phases, conditional on the baseline scores. We conducted unadjusted analyses first, followed by primary analyses adjusted for patient characteristics that were imbalanced (sex, age [as a cubic term based on fractional polynomial transformation]) between phases. As a first sensitivity analysis, we also compared the change in knowledge score from baseline to post-consultation between groups. The change in score analysis is an alternative approach to analyzing continuous outcomes measured at baseline and follow-up, but is thought to be more prone to bias. Nevertheless, when baseline scores differ between groups, some authors recommend comparing the different analytic approaches to ensure consistent results, especially in non-randomized studies.35 As a final sensitivity analysis of the primary outcome, we modeled the dependent variable as a count of correct responses in a repeated measures Poisson regression model using generalized estimating equations to account for repeated measures (baseline and post-consultation) within subjects.
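To make the distinction between the analysis of covariance and the change in score approach concrete, the following minimal sketch fits post = b0 + b1·phase + b2·baseline by ordinary least squares and compares b1 with the simple difference in mean change scores. It is illustrative only: the data are made up (not study data), the additional adjustment for sex and age is omitted, no confidence intervals are computed, and the function names are our own; the statistical software used in the study is not stated.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ancova_fit(phase, baseline, post):
    """Ordinary least squares fit of post = b0 + b1*phase + b2*baseline.

    Returns [b0, b1, b2]; b1 is the between-phase difference in the
    post-consultation score, conditional on the baseline score.
    """
    rows = [[1.0, float(p), float(x)] for p, x in zip(phase, baseline)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data for illustration only: 0 = standard care, 1 = PREDICT app.
phase = [0, 0, 0, 0, 1, 1, 1, 1]
baseline = [0.0, 33.3, 33.3, 66.7, 0.0, 33.3, 66.7, 66.7]
# Scores generated exactly from post = 20 + 15*phase + 0.5*baseline.
post = [20 + 15 * p + 0.5 * x for p, x in zip(phase, baseline)]

b0, b1, b2 = ancova_fit(phase, baseline, post)

# Change-score alternative: difference in mean (post - baseline) by phase.
changes = [(p, y - x) for p, x, y in zip(phase, baseline, post)]
mean_change_app = sum(c for p, c in changes if p == 1) / 4
mean_change_std = sum(c for p, c in changes if p == 0) / 4
change_diff = mean_change_app - mean_change_std
```

In this constructed example the baseline means differ between phases, so the change in score estimate (`change_diff`) departs from the conditional estimate (`b1`), which is why comparing the two approaches is informative when baseline scores are imbalanced.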

For PREDICT app phase participants only, patient and clinician acceptability outcomes were analyzed descriptively to generate proportions responding with strong or moderately strong agreement, or strong or moderately strong disagreement. No data for study phase, covariates, or primary outcome measures were missing.

Results

We enrolled 201 participants (104 standard care, 97 PREDICT app). The final analytic cohort included 183 participants (90 before, 93 after) who were exposed to full study procedures (see Fig. 3; flow diagram, including post-enrollment exclusions). No surgeries were cancelled as a result of exposure to the PREDICT app. Participants in the standard care phase were younger and more often female; no other baseline characteristics were significantly different between the groups (Table 1).

Fig. 3

Study population flow. This figure illustrates identification and accrual of eligible patients in the standard care phase (left side) and PREDICT app phase (right side). PREDICT = Personalized Risk Evaluation and Decision Making in Preoperative Clinical Assessment

Table 1 Baseline characteristics of study population

Change in knowledge scores

The mean knowledge score at baseline was lower in the standard care phase than in the PREDICT app phase (Table 2), but scores improved after anesthesiology consultation in both conditions. Controlling for baseline scores, our analysis of covariance showed that participants in the PREDICT app phase had significantly better knowledge of their personal risk profile after their anesthesiology consultation than patients in the standard care phase did (an adjusted average increase of 14.3%; 95% confidence interval [CI], 6.5 to 22.0; P < 0.001; see Table 2). White’s test for heteroskedasticity was not significant (P = 0.09), suggesting that the residual variance was approximately constant (homoskedasticity is an important assumption of linear regression). A significant improvement in knowledge continued to be present in the PREDICT app phase when analyzing changes in score (adjusted difference, 9.5%; 95% CI, 1.1 to 17.9; P = 0.03) and the number of correct responses (rate ratio, 1.43; 95% CI, 1.27 to 1.61; P < 0.001).

Table 2 Outcomes by study phase

Secondary outcomes

Table 2 provides summary statistics and effect estimates for secondary outcomes. Participants in the PREDICT app phase reported lower anxiety scores after their clinic visit than standard care participants, as well as higher satisfaction scores. After adjustment for age and sex, satisfaction scores were significantly higher; anxiety scores were not significantly different after adjustment.

Feasibility and acceptability of the PREDICT app

The PREDICT app was feasible for clinical use. All participants were able to provide the data required to calculate their personalized risk scores and risk scores were calculated successfully for all participants. We only instituted formal notation of how many people needed help with the tablet in the PREDICT app phase. There were 26 (27.8%) participants who required assistance with data entry. Our experience was that these were typically older people who were not accompanied by a younger support person (e.g., an adult child). Eighty-seven (93.6%) participants found the PREDICT app to be easy or very easy to use. Ninety-one (98.2%) reported that they would be willing or very willing to use the app before a future surgery. The majority of participants (67; 72%) reported actively discussing their risk estimates with their anesthesiologist.

Fifty-six clinicians rated the PREDICT app; 52 (96%) strongly or moderately agreed that the personalized risk profiles were clear and unambiguous, and 51 (91%) agreed or strongly agreed that the information provided was easy to use. Only 37 anesthesiologists provided their opinion on whether the PREDICT app would benefit patients, but 26 (70%) moderately or strongly agreed that the app would be of benefit. Clinician feedback on other domains is provided in eAppendix 4 (ESM). When asked after each consultation whether the information provided by the PREDICT app would lead them to change anesthetic management, only two (2%) anesthesiologists strongly or moderately agreed, while 31 (33%) slightly agreed. The majority (65%) did not feel the information provided would lead to changes in anesthetic management.

Discussion

In this prospective before-after study, we found that exposure to a tablet-based, patient-facing personalized risk calculation and communication application (the PREDICT app) significantly improved patients’ knowledge of their personalized expected morbidity and mortality risks and length of stay. This increase in knowledge of risks was not accompanied by an increase in patients’ levels of anxiety. The application also led to a significant increase in patient satisfaction, with a higher likelihood of recommending a risk communication approach that included the PREDICT app compared with the standard approach in a clinical encounter. Finally, the PREDICT app was found to be acceptable to patients and clinicians. Future evaluation of this process (ideally in a cluster randomized trial) is warranted to show generalizability and increase the evidence base for the application’s effectiveness.

Despite studies documenting strong preferences from surgical patients to receive more personalized risk information before surgery,8,9,19 few examples of systematic preoperative approaches to personalized risk calculation and communication exist.18 Where these data do exist, significant limitations have been identified, including lack of comparator groups19 and high risk of bias study designs.18 Therefore, understanding whether engaging patients in a systematic process to calculate and communicate personalized risks and expected outcomes improves patient-important outcomes is a major gap in the perioperative literature. Based on our findings, it appears that personalized estimates of risk and expected outcomes can be feasibly and acceptably (to patients and clinicians) calculated and communicated to preoperative patients without altering clinic patient flows or other processes. By using the PREDICT app, a significant increase in patient knowledge may be achievable, and this change (over 10% higher with a 95% CI excluding the prespecified minimally important difference) is similar to what is shown by patient decision aids (which, it must be noted, have purpose beyond risk communication).32 Furthermore, our findings are consistent with those of Raymond et al. that demonstrated no increase in anxiety when preoperative personalized risks are communicated to patients.19

While our findings are promising and build directly on the ever-increasing availability of mobile technology and validated perioperative risk models,15,16,17 future design or implementation of related processes should be informed by evidence-based considerations. First, risk estimates should be based on valid and accurate models. In our study, we harnessed the well-validated NSQIP universal risk calculator, combined with our hospital network’s NSQIP outcome data, to create locally calibrated models to predict risks and expected outcomes. Next, we actively engaged patient partners in designing our application and piloting our process. We ensured that risk information was communicated to patients using best practices, including absolute risk estimates displayed using pictographs and explained in terms of personalized but population-informed probabilities.36 We made sure to elicit patients’ perceived benefits of having surgery to support consideration of the risk-benefit balance. Finally, we provided evidence-based shared decision-making cues25 to support patients in actively discussing their risks and care with their anesthesiologist.

Further research will be required to fully understand the role and impact of systematic personalized risk calculation and communication on patient experience, outcomes, and health system performance. Further development will be required to allow direct interfacing with a variety of electronic health record systems, which should facilitate input of variables (such as surgical codes) typically unknown by patients and allow direct upload of results into the medical record. Questions related to timing and setting should also be considered. While risk assessment and stratification are well within the scope of practice of anesthesiologists, it is reasonable to consider the implementation of a personalized risk calculation and communication process at the time of surgical consultation (as this is typically where a patient is making the decision about proceeding with surgery). Nevertheless, formal acceptability data from surgeons have not yet been collected. Formal integration with electronic health records could allow for automatic input of surgical codes and the possibility for patients to access this tool at their convenience, even from home. Furthermore, directly linking this process to formal evidence-based shared decision-making resources (such as decision aids or decision coaching)32,37 could also support better quality decision-making for individual patients. Finally, linkage of personalized risk estimates with actionable processes of care also has the potential to improve patients’ health outcomes and health system performance but will require robust evaluation.

Strengths and limitations

Our study should be appraised in consideration of its strengths and limitations. First, although a randomized trial would provide stronger causal evidence, we could not perform an individual patient randomized trial as the risk of contamination was deemed to be very high (we were concerned that exposure to a patient randomized to the PREDICT app would bias the clinician to access an online risk calculator [e.g., NSQIP online calculator] when caring for subsequent patients randomized to the control condition). An alternative design would be an interrupted time series; however, we estimated that 960 participants would have been required. Instead, we focused on robust conduct of a prespecified and registered prospective before-after design. Nevertheless, this design is liable to certain biases including confounding bias, temporal bias, and inadequate allocation concealment. Fortunately, we were able to prespecify, register, and complete our study over a short time frame. This helped to ensure that both phases were identical in terms of data collection processes and temporally proximate; importantly, no other changes in practice at our hospital occurred. Furthermore, baseline characteristics of both groups were similar and primary results were consistent after adjustment for unbalanced covariates. While we could not blind clinicians in the PREDICT app phase, they were not aware of the standard care phase and participants were not aware of being in one phase of the study vs the other. Additionally, all participant data and outcome collection were performed prospectively and in the same fashion in both groups. Outcomes were based on validated approaches where possible26; however, to our knowledge, a standardized and validated preoperative risk knowledge questionnaire does not exist. Our approach was to follow methods used in the decision aid literature, but as the knowledge assessed comes from the app, this could bias our results in favour of the PREDICT app.
As a single-centre study, we cannot comment on the generalizability of our findings to other hospitals or jurisdictions with differing approaches to preoperative assessment. Future studies should also follow patients after surgery to record outcomes and satisfaction with the preoperative process after surgery has occurred.

Conclusions

In a prospective before-after study, we found that exposure to a tablet-based, patient-facing, personalized risk communication application improved patients’ knowledge of their personalized risks and expected outcomes compared with standard preoperative assessment by a physician anesthesiologist. This process was feasible and acceptable, and led to improved patient satisfaction without increasing anxiety levels.