Postoperative recovery is a multifaceted process, influenced by numerous factors such as patient characteristics, type and duration of surgery, and anesthetic protocol. Within the emergency surgery context, the perioperative period corresponds with increased morbidity, potentially indicating compromised quality of recovery.1,2 Impaired quality of recovery in the postoperative phase has been associated with mid-term complications,3 and appears to correspond to diminished long-term quality of life.4 Therefore, identifying suboptimal quality of recovery might facilitate earlier interventions to enhance long-term quality of life.

Most emergency surgery studies have concentrated on reducing perioperative morbidity and mortality, with few evaluating patient recovery holistically. Nonetheless, scales for measuring postoperative recovery quality, including the Quality of Recovery (QoR)-40 questionnaire and its condensed 15-item version, the QoR-15 questionnaire, have been developed.5,6 Each item on the QoR-15 scale is scored from 0 (unfavourable) to 10 (favourable), resulting in an aggregate score from 0 (no recovery) to 150 (total recovery). This reliable, sensitive questionnaire is easy to administer in clinical practice and provides a comprehensive view of postoperative quality of recovery as evaluated by the patient themselves. The QoR-15 questionnaire has been recommended as a measure for patient comfort in clinical trials per the Standardized Endpoints in Perioperative Medicine initiative,7 or even as a monitoring component by the American Society for Enhanced Recovery.8 Consequently, the QoR-15 questionnaire has been validated in multiple languages.
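The scoring rule described above (15 items, each rated 0 to 10, summed to a 0 to 150 total) can be sketched in a few lines. This is only an illustration: the function name `qor15_total` is hypothetical, and the sketch assumes every item has already been oriented so that 10 is the favourable response, as stated in the text.

```python
from typing import Sequence

N_ITEMS = 15                 # number of QoR-15 items
ITEM_MIN, ITEM_MAX = 0, 10   # 0 = unfavourable, 10 = favourable


def qor15_total(item_scores: Sequence[int]) -> int:
    """Sum 15 item scores (each 0-10) into a total between 0 and 150.

    Assumes all items are already oriented so that higher is better.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} items, got {len(item_scores)}")
    for score in item_scores:
        if not ITEM_MIN <= score <= ITEM_MAX:
            raise ValueError(f"item score {score} is outside {ITEM_MIN}-{ITEM_MAX}")
    return sum(item_scores)


# Hypothetical patient who rates every item 7 out of 10.
print(qor15_total([7] * 15))  # 105
```

Rejecting incomplete questionnaires, rather than imputing missing items, is a design choice of this sketch and not a rule taken from the study.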

Nonetheless, none of these validation studies have sought to confirm the QoR-15 questionnaire’s use in an emergency surgical population, whether traumatic or otherwise. Furthermore, the handful of studies that have applied this score within an emergency context did so without affirming its psychometric validity and reliability for this population.9,10 Even less research has evaluated its association with quality of life or the number of days at home at three months.

Our objective was to confirm the validity and reliability of the QoR-15 questionnaire for use with the emergency surgical population and to examine the early QoR-15 score’s associations with both quality of life and the number of days spent at home.

Materials and methods

Data source

We conducted this single-centre, prospective cohort study at the University Hospital of Angers, France from 15 August 2021 to 13 April 2022. Written consent was not requested; however, all patients were informed and accepted information collection as mandated by French law.11 This study was reviewed by a French ethics committee (Comité de Protection des Personnes Ile de France VI; registration ID: 21.02487.003521) and was registered with ClinicalTrials.gov (NCT04845763; first submitted 11 April 2021). The study is reported according to the Strengthening the Reporting of Observational Studies in Epidemiology guidelines.

Study population

Eligible patients were identified during the presurgery anesthesia consultation. Participants met the following inclusion criteria: 18 yr or older, admitted for urgent surgery (desired surgical procedure time < 72 hr), capable of completing the questionnaire independently or with assistance, French-speaking, and willing to participate in the study. We excluded patients with significant psychiatric or neurologic disorders impeding cooperation in questionnaire completion, patients under legal wardship or guardianship, patients admitted for cardiac or obstetric surgery (Cesarean delivery), patients admitted for revision surgery, and patients previously included in the study during prior admissions.

Available data

A validated French version of the QoR-15 score (FQoR-15) was used.12 We relied on the FQoR-15 to ensure thoroughness in gathering data, as this version was already commonly used at our hospital centre. Participants completed the questionnaire at three timepoints: before surgery (baseline, H0) and at 24 (H24) and 48 (H48) hr after surgery. Patients completed the questionnaire independently, or with an assessor’s assistance if required. For postoperative evaluations, phone interviews were conducted if the patient had been discharged home or to another facility (follow-up care and rehabilitation department). We collected several characteristics at inclusion, such as demographic information (age, weight, height, sex), the American Society of Anesthesiologists Physical Status score, comorbidities, trauma status, surgery type, and emergency level according to the Timing of Acute Care Surgery (TACS) classification.13 The time taken to complete the questionnaire was noted by the patient or a medical staff member. Perioperative information at H24, such as the executed surgical procedure and its duration, the surgical outcome risk tool (SORT) score indicating the procedure’s severity,14 and anesthesia type, was also recorded.

Opioid use in the last 24 hr (including consumption in the postanesthesia care unit for the H24 evaluation), significant complications as per the postoperative morbidity survey (POMS) classification (evaluating nine domains: pulmonary, infectious, renal, gastrointestinal, cardiovascular, neurologic, hematologic, wound, and pain),15 and hospitalization status were recorded at H24 and H48. Total length of stay and postoperative complications were documented at discharge or three months after surgery if the patient remained hospitalized. We also conducted a three-month assessment of quality of life via phone interview, using the EQ-5D-3L questionnaire and EQ-VAS (vertical visual analog scale scored from 0 to 100, 0 being “worst” and 100 being “best imaginable” health). Data on the number of days spent at home during the first 30 and 90 days after surgery (DAH30 and DAH90, respectively) were also collected. The patient’s vital status was confirmed, or the date of death was recorded.

The psychometric assessment

We conducted a psychometric study to confirm the validity and reliability of the FQoR-15 questionnaire within the context of emergency surgery.16,17

  • Content validity examines whether the questionnaire items effectively encapsulate the Quality of Recovery concept. The QoR-15 questionnaire items have already been validated in a postoperative context. We applied them to a distinct target population, namely patients undergoing emergency surgery. As we did not modify these items, we did not specifically re-evaluate this validity in our study.

  • Internal consistency refers to the degree to which the items capture the Quality of Recovery construct. We also assessed the unidimensionality of the questionnaire.

  • Convergent validity is the association between the questionnaire and a “gold standard.” We compared the QoR-15 with a general state visual analogue scale, which yields a score between 0 (“very impaired health”) and 10 (“excellent health”). The question used was: “How would you rate your overall health over the past 24 hours?” We assessed this at H0, H24, and H48.

  • Construct validity indicates whether the score behaves as expected from theory about quality of recovery in the context of emergency surgery. To test this, we formulated several hypotheses, of which over 75% had to be confirmed. We hypothesized that there would be a difference by gender (higher scores for men), a negative correlation with age (higher scores in younger individuals), a negative correlation with the emergency level of surgery (according to TACS classification), a negative correlation with surgical risk of the procedure (according to the SORT score), a negative correlation with the occurrence of postoperative complications (according to the POMS classification), a negative correlation with morphine use, and a negative correlation with length of stay.

  • Reproducibility is determined through a “test-retest” comparison. It suggests that repeated tests on stable individuals yield similar results. Two measurements, conducted 24 hr after surgery and separated by 30 min to one hour, were performed to assess response consistency. Agreement pertains to absolute measurement error, while reliability indicates the extent to which patients can be differentiated from one another, despite potential measurement error.

  • Responsiveness reflects the ability of a questionnaire to detect clinically relevant changes over time.

  • We considered floor or ceiling effects if more than 15% of respondents achieved the lowest or highest possible score.

  • Acceptability and feasibility are measures of user-friendliness, such as the patient recruitment rate, the total participation rate in the three time frames (H0, H24, and H48), and the time taken to complete the questionnaire.

  • The minimum clinical difference (MCD) and clinically significant difference (CSD) denote the smallest differences that need to be perceived in the total QoR-15 score to identify a minimal or substantial change in a patient's recovery status. At H24 and H48, patients evaluate their recovery over the past 24 hr on a seven-item Likert scale ranging from “much worse” to “much better.” The mean difference in QoR-15 score between “same” and “slightly better” was used to ascertain the MCD, based on the anchor-based method.18 To determine the CSD, patients were asked, “Do you believe you have had a good recovery?” at H24 and H48.19
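As an illustration of one criterion from the list above, the floor and ceiling rule (more than 15% of respondents at the lowest or highest possible score) can be sketched as follows. The function name and the example data are hypothetical, and the 0 to 150 range assumes the total QoR-15 score.

```python
from typing import Dict, Sequence, Union


def floor_ceiling_effects(
    totals: Sequence[int],
    min_score: int = 0,
    max_score: int = 150,
    threshold: float = 0.15,
) -> Dict[str, Union[float, bool]]:
    """Return the share of respondents at each extreme and whether it exceeds the threshold."""
    n = len(totals)
    if n == 0:
        raise ValueError("no scores supplied")
    floor_share = sum(1 for t in totals if t == min_score) / n
    ceiling_share = sum(1 for t in totals if t == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical sample: 2 of 10 respondents reach the maximum score.
print(floor_ceiling_effects([150, 150, 100, 95, 110, 120, 88, 130, 101, 99]))
```

In this made-up sample, 20% of respondents sit at the maximum, so a ceiling effect would be flagged; no respondent is at the minimum.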

Sample size

At present, there are no established guidelines for calculating sample size in the context of a psychometric assessment study.16 An acceptable limit was identified as 300 patients.20 Accounting for a potential 20% loss to follow-up or instances of missing data, we targeted a sample size of 375 patients for the assessment of QoR-15 in the context of emergency surgery.
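The inflation step above amounts to dividing the required number of complete cases by the expected retention rate. A minimal sketch, with a hypothetical helper name and integer arithmetic to avoid floating-point rounding:

```python
def inflate_for_attrition(n_required: int, attrition_percent: int) -> int:
    """Smallest enrolment n such that n * (1 - attrition) >= n_required."""
    if not 0 <= attrition_percent < 100:
        raise ValueError("attrition_percent must be in [0, 100)")
    retained_percent = 100 - attrition_percent
    # Ceiling division: round up to the next whole patient.
    return (n_required * 100 + retained_percent - 1) // retained_percent


print(inflate_for_attrition(300, 20))  # 375
```

With 300 evaluable patients required and 20% expected attrition, 300 / 0.80 = 375, which matches the target stated above.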

Statistical analysis

We present data as mean (standard deviation [SD]) or median [interquartile range] for quantitative variables. Qualitative variables are represented by the number of patients and the percentage (%). To compare continuous variables, Student’s t tests were used for normally distributed variables and Wilcoxon tests were used for non-normally distributed variables. Associations between quantitative variables were measured using Spearman correlation coefficients.21 We also computed an interitem correlation matrix of Spearman correlation coefficients. Internal consistency was measured using the Cronbach α coefficient. The objective was to obtain a value between 0.70 and 0.90.22 We explored the number of dimensions of the questionnaire using the percentage of total variance explained by the first factor. Test-retest reliability was measured using the agreement intraclass correlation coefficient. A value of 0.70 is usually recommended as a minimum standard for reliability.22 The test-retest provided an estimate of the standard error of measurement (SEM), including systematic differences (i.e., SEM agreement). Responsiveness was quantified using the Cohen effect size (average change score divided by the SD at baseline) and standardized response mean (average change score divided by the SD of the change scores). The final MCD corresponded to the average of the MCD obtained by the distribution method (1.96 × SEM, i.e., a variation larger than random variation at the 5% uncertainty level) and by the anchoring method (the difference in mean score values between the “same” and “slightly better” status). The 95% confidence intervals (95% CIs) were obtained by bootstrapping (adjusted bootstrap percentile method, with 1,000 bootstrap replicates, using the R “boot” library).23,24 We rejected the null hypothesis if the P value was < 0.05. To control the false discovery rate, we adjusted the P values obtained during the analysis using the Benjamini–Hochberg method.
We performed all statistical analyses using R software version 4.1.3 (R Foundation for Statistical Computing, Vienna, Austria).
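The study's analyses were run in R; the Python sketch below only illustrates how the Benjamini–Hochberg adjustment mentioned above works. The function name and the example P values are hypothetical. Each sorted P value is multiplied by m/rank, and a running minimum taken from the largest P value downward keeps the adjusted values monotone.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("P values must lie in [0, 1]")
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest P value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A hypothesis would then be rejected when its adjusted P value falls below 0.05.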

Results

Description of the population

Within the study duration, we included 375 patients. Out of these, 352 (93.9%) completed the H24 questionnaire, 350 (93.3%) completed the H48 questionnaire, and 338 (90.1%) completed the three-month assessment. The patient flow chart is provided in Fig. 1, and patient characteristics are summarized in Table 1. The represented surgeries comprised orthopedic (51%), gastrointestinal (27%), urologic (13%), vascular/thoracic (4%), neurosurgical (3%), and others (2%). Electronic Supplementary Material (ESM) eTable 1 gives more details on the types of surgery. For 31% of the cases (117 procedures), patients were discharged on the same day. Hospitalization status and POMS complications are available in ESM eTable 2. The median hospital length of stay was 2.0 [1.0–7.0] days. Patient-reported recovery statuses are compiled in ESM eTable 3.

Fig. 1 Study flow chart

Table 1 Characteristics of patients

Psychometric assessment

The mean (SD) QoR-15 score was 100.3 (22.9) at H0, 106.7 (22.6) at H24, and 115.9 (22.2) at H48, with no ceiling or floor effects observed at any of these three timepoints. Figure 2 illustrates score distributions across the three timepoints. The increase in score from H0 to H24 and H48 reflects the dynamics of postoperative recovery and confirms responsiveness following emergency surgery. The Cohen’s effect size for the QoR-15 score was 0.29 at H24 and 0.68 at H48 compared with preoperative scores. Electronic Supplementary Materials eTables 4 and 5 present the responsiveness from baseline to H24 and H48, respectively.
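The two responsiveness indices defined in the statistical analysis section can be sketched as follows, assuming paired scores for each patient. The function names and the three-patient data set are hypothetical and serve only to make the formulas concrete.

```python
from statistics import mean, stdev
from typing import List, Sequence


def _changes(baseline: Sequence[float], follow_up: Sequence[float]) -> List[float]:
    """Per-patient change scores (follow-up minus baseline)."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    return [f - b for b, f in zip(baseline, follow_up)]


def cohen_effect_size(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Mean change divided by the SD of the baseline scores."""
    return mean(_changes(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Mean change divided by the SD of the change scores."""
    changes = _changes(baseline, follow_up)
    return mean(changes) / stdev(changes)


# Hypothetical paired totals for three patients.
h0 = [90, 100, 110]
h48 = [100, 105, 125]
print(cohen_effect_size(h0, h48), standardized_response_mean(h0, h48))
```

As a rough plausibility check using only the summary statistics reported above, (115.9 − 100.3) / 22.9 ≈ 0.68, in line with the H48 effect size.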

Fig. 2 Representation of QoR-15 scores preoperatively (H0) and at 24 hr (H24) and 48 hr (H48) after surgery

For internal consistency, Cronbach’s alpha coefficients were 0.84 (95% CI, 0.81 to 0.86) at H24 and 0.84 (95% CI, 0.81 to 0.86) at H48. The overall mean interitem correlation was 0.26 (95% CI, 0.23 to 0.29) at H24 and 0.33 (95% CI, 0.29 to 0.36) at H48. Electronic Supplementary Material eFigs 1 and 2 offer heatmap visualizations of the QoR-15 interitem correlations at H24 and H48. The unidimensional nature of the QoR-15 score was corroborated, with the first dimension explaining 36.1% of total variance at H24 and 42.7% at H48 (scree plots provided in ESM eFigs 3 and 4). The correlation coefficients between the 10-point general state measurement and the QoR-15 score were 0.57 (95% CI, 0.49 to 0.64) at H0, 0.64 (95% CI, 0.57 to 0.71) at H24, and 0.71 (95% CI, 0.63 to 0.76) at H48.
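For readers unfamiliar with the internal consistency statistic reported above, the following sketch computes Cronbach's alpha from a patients-by-items table using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of totals). The function name and the toy data are hypothetical; the study itself computed this in R.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for rows of item scores (one row per patient)."""
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least 2 items and rows of equal length")
    # Sample variance of each item across patients.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the per-patient total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three patients, two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

When all items move together perfectly, alpha reaches 1; values between 0.70 and 0.90 were the target range stated in the methods.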

For questionnaires completed with a 30–60-min interval at H24, the intraclass correlation coefficient was 0.94 (95% CI, 0.91 to 0.97), and the SEM was 5.2 (95% CI, 4.8 to 5.6).

Regarding construct validity, Table 2 sums up the results of testing assumptions on the QoR-15 score. Of the 14 hypotheses, 11 (79%) were confirmed.

Table 2 Summary of tested assumptions (construct validity) on the global Quality of Recovery-15 score at 24 and 48 hr after emergency surgery

The participation rates shown in the flow chart attest to the questionnaire’s acceptability to patients: 100% completed it before surgery, 94% at H24, and 93% at H48. The average completion time was 4.5 min, with durations ranging between one and 15 min.

For the MCD estimation at H24, we found 10.2 with the statistical method and 5.9 with the anchoring method (subjective patient assessment), averaging an MCD of 8.0 out of the QoR-15 scale’s 150 points. The average QoR-15 score at H24 in the group of patients who felt they had recovered well was 110.1 vs 79.0 for those who did not feel they had recovered well. Thus, the CSD was 31.1 points at H24.
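The arithmetic behind these figures can be retraced from the reported values alone: the distribution-based MCD is 1.96 × SEM (with the SEM of 5.2 from the test-retest analysis), the final MCD is the average of the distribution-based and anchor-based estimates, and the CSD is the difference between the two group means. The function names below are hypothetical.

```python
def mcd_distribution(sem: float, z: float = 1.96) -> float:
    """Distribution-based MCD: z times the standard error of measurement."""
    return z * sem


def mcd_combined(sem: float, anchor_difference: float) -> float:
    """Average of the distribution-based and anchor-based MCD estimates."""
    return (mcd_distribution(sem) + anchor_difference) / 2


def csd(mean_good_recovery: float, mean_poor_recovery: float) -> float:
    """Difference in mean score between good and poor self-rated recovery."""
    return mean_good_recovery - mean_poor_recovery


print(round(mcd_distribution(5.2), 1))   # 10.2
print(round(mcd_combined(5.2, 5.9), 1))  # 8.0
print(round(csd(110.1, 79.0), 1))        # 31.1
```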

Association between Quality of Recovery-15 score and three-month outcome

The EQ-5D-3L questionnaire responses at the three-month timeframe are presented in ESM eTable 6. The QoR-15 score at H24 was associated with each EQ-5D-3L domain. At three months, the mean (SD) EQ-VAS health status analog score was 77 (20) (out of 100). There was a statistically significant correlation between the QoR-15 at H24 and the EQ-VAS at three months (r = 0.24; 95% CI, 0.14 to 0.34; P < 0.001) (Fig. 3). Median [interquartile range] hospitalization duration was 2.0 [1.0–7.0] days; DAH30 was 28 [21–29], and DAH90 was 88 [81–89]. At three months, 6% of patients were still hospitalized. Statistically significant correlations were found between the QoR-15 score at H24 and the DAH30 (r = 0.33; 95% CI, 0.23 to 0.41; P < 0.001) and DAH90 (r = 0.31; 95% CI, 0.22 to 0.40; P < 0.001).
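The correlations above are Spearman coefficients, that is, Pearson correlations computed on ranks. The sketch below shows that calculation with tied values given their average rank; the function names and example data are hypothetical, and the bootstrap confidence intervals used in the study are not reproduced here.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical QoR-15 totals at H24 and days at home within 30 days.
print(spearman([80, 95, 110, 120, 130], [20, 25, 24, 28, 29]))
```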

Fig. 3 Relation between the Quality of Recovery-15 value at 24 hr after surgery and the quality of life according to the EQ-5D-VAS

QoR-15 = Quality of Recovery-15; VAS = visual analog scale

Discussion

Our study examined the psychometric properties of the QoR-15 questionnaire among patients requiring emergency surgery. Analyzing 352 patients, we confirmed the QoR-15’s relevance in emergency surgical contexts for early quality of recovery assessment. Importantly, an association was noted between the initial QoR-15 score and the quality of life three months postsurgery.

A key strength of our study is the extensive validation of QoR-15’s psychometric qualities in an emergency surgical setting. Echoing the original study,6 our findings reinforced that QoR-15 remains a unidimensional tool. The global score provides insights into early quality of recovery, with excellent internal consistency and reproducibility.6,25 More than 75% of the hypotheses grounded in the Quality of Recovery concept were confirmed in our patient cohort. This is in line with prior QoR-15 score studies, which reported an association with gender and negative correlations with complication incidence and length of stay.6,12 Additionally, we uncovered a previously unevaluated negative correlation with morphine consumption. Nevertheless, we found no association between the degree of urgency and the QoR-15 score, likely because of the underrepresentation of the most severe surgical emergencies (i.e., requiring surgery within one hour from diagnosis).

Mirroring the original study, the QoR-15’s feasibility and acceptability were excellent.6,25 The single-page design of the questionnaire boosts its usability. In our cohort, 31% of participants returned home on the day of surgery and were later reached by phone. This supports the suitability of the QoR-15 for telephone evaluation in both emergency and routine surgical scenarios.6,12,25,26,27 Future adaptations might see the QoR-15 integrated into digital platforms, enhancing accessibility in outpatient settings. From a clinical standpoint, our MCD of 8.0 is consistent with the established literature.28 A recent re-evaluation suggested that the MCD for the QoR-15 score is likely around 6.0,29 a value that is pivotal for designing and interpreting future clinical studies centred on the QoR-15 score.

Analysis of QoR-15 score responsiveness revealed a postoperative increase in the score relative to the preoperative assessment, indicating an apparent improvement in the patients’ clinical status after surgery. We presume that the preoperative assessment in the emergency surgery setting was performed when health status was already compromised, thereby limiting the analysis of the isolated impact of surgery. This stands in contrast to routine surgical settings where presurgery evaluations are more reflective of the patient’s typical health status.9,10 A distinctive element of our QoR-15 assessment in emergency surgery is the diversity of our population, incorporating various surgical procedures spanning over seven surgical specialties. Despite these substantial differences in surgical procedures, the psychometric appraisal of the score remained valid and reliable, endorsing its comprehensive applicability in emergency surgery contexts.

An additional objective of our study was to evaluate the association between the QoR-15 score and longer-term recovery. Prior literature had shown an association of this score with 30-day postoperative complications.3 Nevertheless, there is currently a paucity of data concerning the longer-term outcomes, particularly regarding quality of life.4 Myles et al. highlighted the association between the early postcardiac surgery QoR-40 score and the quality of life three months later.30,31 Our study found the QoR-15 score at H24 correlated with all EQ-5D-3L domains, the EQ-VAS, and the number of postoperative days spent at home. Given that a large proportion of our participants were recontacted at three months, administering the QoR-15 questionnaire in the immediate postoperative phase could pave the way for earlier interventions or closer follow-up to improve long-term quality of life.

Our study has certain limitations. Firstly, it was a single-centre study, and as such, we cannot ascertain that the results can be extrapolated to other centres. Secondly, our population comprised only a few extreme emergency surgeries. In fact, the cohort was mainly made up of patients admitted for urgent surgery expected to be performed within 24–48 hr. Thirdly, even though we have shown an association between the early postoperative QoR-15 score and the quality of life at three months, this relationship remains moderate, with a dispersion of values in our population. Lastly, although our population was diverse, with various surgical specialties represented, we could not include all surgical specialties and cannot guarantee that the questionnaire is suitable for unrepresented surgical procedures.

Conclusion

Our study confirms the validity, reliability, and clinical utility of the QoR-15 in the emergency surgery setting. The tool meets necessary psychometric standards. A statistically significant association exists between the QoR-15 score 24 hr after surgery and quality of life at three months. Its application may enable prompt, effective quality-of-life interventions. We recommend its use for evaluating the quality of recovery after emergency surgery, whether as a standard outcome measure in clinical trials or as a routine clinical evaluation parameter.