Introduction

Delirium is a serious condition of acute neurological dysfunction occurring in up to 38% of older emergency department (ED) patients [1,2,3]. Although older adults (i.e., people 65 years of age and older) are at high risk of developing delirium in the ED [4,5,6,7], it is missed in 57% to 85% of these patients [2, 8]. Lack of detection and treatment in the ED is associated with poorer outcomes such as increased length of hospital stay and mortality [7, 9,10,11,12,13,14]. Improving delirium care for older ED patients is hindered by underlying knowledge gaps and lack of practice standards for care in this setting [15]. Mechanisms to evaluate ED practice performance are needed to identify gaps and variations in quality care to focus delirium care improvement strategies where they are most needed.

Performance measures (PMs) are tools to quantify measurable aspects of practice performance [16, 17]. These are usually classified as structures (i.e., conditions under which care is provided) or processes (i.e., diagnosis, treatment, rehabilitation, and prevention of health conditions) of care as defined in Donabedian’s seminal framework of healthcare quality and measurement [18]. The extent to which PMs are observed in practice provides an indication of the quality of the care provided and the likelihood of attaining optimal patient outcomes (i.e., changes in an individual or population attributable to healthcare) [17, 19, 20].

Numerous researchers and organizations assert that clinical practice guidelines (CPGs) are an essential first step in developing quality statements (i.e., concise statements defining best practice in a specific context), which in turn can be transformed into operationalizable metrics as PMs [16, 17, 19, 21,22,23]. Therefore, quality statements can be used to guide best practice and are an important antecedent to developing PMs that are necessary to evaluate the quality of care provided [16, 17, 19, 21]. In the past decade, work has been done to increase the methodological rigor of developing guideline-based PMs [16, 17, 24, 25]. Nothacker et al. (2016), as part of the Guidelines International Network (G-I-N), developed standards for generating guideline-based PMs [16], which have been used to inform our work.

Based on the results of an umbrella review of current delirium CPGs [26], we developed a preliminary set of quality statements and PMs grouped into four categories of delirium care: screening, diagnosis, risk reduction, and management. Criteria for quality statement and PM development from the umbrella review were: (1) agreement across two or more CPGs that the action or intervention be done, and (2) that the action or intervention was identified as a priority for implementation by at least one CPG group. No high-quality ED-specific delirium CPGs were found during our umbrella review; therefore, we included CPGs from across the care continuum. However, the ED setting was included in the evidence base for many of the recommendations included in our synthesis [26]. To supplement the umbrella review and support the development of the PMs, structured searches of the Scopus and PubMed bibliographic databases were conducted for any recently published research relevant to delirium care for older adults in the ED specifically. Seven evidence syntheses [1,2,3, 27,28,29,30] and three multi-centre observational studies [8, 14, 31] were included as additional evidence to support the PMs.

Results from the umbrella review, supported by additional recent ED-specific research, provided an evidence-based foundation for the creation of a set of quality statements (N = 10), and subsequent PMs (N = 24), for delirium care in the ED. Methods and criteria for conducting the transformation from CPG recommendation synthesis, to quality statement, to PM were incorporated into developing this preliminary set [16, 19, 25, 32]. For example, we ensured that each developed PM addresses an aspect of structure or process that is linked by evidence to improved outcomes, uses specific and unambiguous (i.e., concise) language, and is designed to be measurable [16, 19, 25, 32]. The greater number of PMs versus quality statements reflects the potential need to develop a structure and process PM from the same quality statement, or to develop more than one process PM to address the same quality statement to ensure the PMs are concise and measurable. The next step in establishing a set of PMs for use was to conduct a formal consensus process with a diverse panel of experts to finalize a set of PMs from the transformed recommendations [16]. As the transformed recommendations were from CPGs for the entire care continuum, this next step was vitally important to ensure the final set of PMs is relevant to the ED, as well as to increase their credibility and acceptability in this setting [16, 23,24,25, 32,33,34].

The purpose of this study was to gain consensus on a set of guideline-based quality statements and PMs to guide and evaluate delirium care quality for older ED patients. To achieve this, we conducted a modified e-Delphi study to reach consensus among key clinical experts on a set of ED quality statements and PMs.

Methods

This 3-round modified e-Delphi study was conducted between April and July 2023. The methods of this study have been previously detailed in our open-access protocol [35], and are briefly described here. The design was informed by the Guidance on Conducting and REporting DElphi Studies (CREDES) [36], and other recommended criteria [33, 37]. This study received approval from the University of Manitoba Health Research Ethics Board (ID HS25728 [H2022:340]). Informed consent was received from all participants before completing any questionnaires. The consent process and each round were conducted electronically and anonymously using the Research Electronic Data Capture (REDCap) secure online platform for building and managing online surveys [38] through the University of Manitoba licensing agreement [39]. The study flow and objectives for each round are illustrated in Fig. 1.

Fig. 1 Delphi study flow

Study steering group

We convened a steering group with members consisting of co-authors (with methodological and clinical expertise), two patient/family representatives, and two further clinical experts (one ED physician and one registered nurse with experience as an ED front-line provider and clinical decision-maker). The steering group met at key stages to provide oversight on protocol design; to give feedback on Delphi questionnaire development, structure, and clarity; and to help identify potential Delphi participants. Including patient/family representatives in the steering group helped support the patient-centeredness of the quality statements and PMs by ensuring they contain aspects of care important to patients and families and suggesting alternative terminology that better reflects patient views. Members of the steering group who are not co-authors did not have access to raw study data and were not able to influence the study process. Feedback and changes suggested by the steering group were agreed between the study co-authors before implementation.

Expert panel selection

An expert in this study is defined as one with clinical knowledge in the care of older adults in the ED.

Inclusion criteria:

  • Able to read and write in English;

  • Willing to participate, and;

  • Meet one or more of the following criteria: (1) clinical experience in a relevant field to the ED care of older adults for five or more years post-basic graduation, (2) postgraduate qualifications or credentials relevant to the management of delirium in older adults, or (3) recognition by peers as an expert in the area (e.g., member of a relevant organization or network).

Exclusion criteria:

  • Insufficient clinical knowledge and experience in a relevant area (e.g., system-level decision-maker, patient, or < 5 years post-basic clinical experience), or;

  • Unable to commit to be available for the entire process.

Recruitment

The initial recruitment period lasted 4 weeks to identify Canadian experts through existing professional networks and associations (e.g., Canadian Association of Emergency Physicians [CAEP] and National Emergency Nurses Association [NENA]); email invitations and advertisements; social media calls (LinkedIn, Twitter, and/or Facebook); and snowballing from other experts. Our minimum a priori sample size (N = 17) was not achieved after 4 weeks; therefore, recruitment was extended for an additional 4 weeks and expanded to include eligible international clinical experts.

Survey design and development

The primary questionnaire in our study consisted of closed-ended questions with the opportunity to provide comments for each quality statement and related PMs to justify decisions and suggest edits to increase clarity. Each round was also accompanied by an introduction section to refamiliarize panelists to the study, state the intentions of the round, and provide definitions for key concepts.

Questionnaire development was informed by PM assessment instruments (i.e., AIRE [40] and QUALIFY [41]), criteria used by organizations that develop and implement PMs (i.e., the National Quality Forum [42] and NICE [19]), as well as syntheses of these sources [16, 32, 43]. Panelists were asked to judge each quality statement according to its importance and actionability, and then to judge each related PM according to its necessity (see Table 1). The quality statements and PMs were scored using these criteria on a Likert scale ranging from 1 to 9. Delphi participants were advised to think of each 9-point scale as being made up of three parts (i.e., tertiles), where 1 to 3 could be used to record low ratings (i.e., not at all important, actionable and/or necessary), 4 to 6 to record average ratings (i.e., somewhat important, actionable and/or necessary), and 7 to 9 to record high ratings (i.e., very important, actionable and/or necessary). A response option of ‘I do not know’ was also provided to capture uncertainty. A question was included in the first two rounds of the Delphi to gain consensus on a reasonable timeframe to complete delirium screening upon arrival to the ED. This timeframe was incorporated in the associated quality statement and PM in the final round of the Delphi. A rationale for each quality statement and its PMs (including a summary of the evidence) was provided in each round to enable participants to make informed judgements.

Table 1 Selection criteria used to rate quality statements and PMs

Prior to implementing the Delphi survey, the questionnaire was piloted with clinical expert steering group members. The purpose of this process was to ensure that the PMs are clearly and precisely worded with unambiguous language, and that each set of PMs reflects the quality statement it is meant to measure. For example, as quality statements are defined as “concise statements defining best practice in a specific context”, it was agreed to call the quality statements ‘best practices’ in the Delphi questionnaire to decrease the number of new concepts introduced to participants and to increase comprehensibility and readability of the survey. A copy of the Round 1 questionnaire that was approved for implementation by the steering group, which includes all preliminary ‘best practices’ (i.e., quality statements) and PMs, is provided in Supplemental File 1.

Procedure

In our modified e-Delphi process, a minimum of three rounds was planned a priori to allow participants to receive feedback, revise previous responses, and then stabilize their responses [44, 45].

Defining consensus and stability

We used the RAND criteria for agreement to define consensus [46], which aligns with PM development frameworks [32, 47]. Consensus was defined as 80% of ratings within the 3-point tertile of the overall median. The lower tertile (1–3) represents scores that are ‘not at all’, the middle tertile (4–6) represents scores that are ‘somewhat’, and the upper tertile (7–9) represents scores that are ‘very’ important, actionable, and/or necessary. To be included in the final set, quality statements and PMs needed to reach consensus in the upper tertile (i.e., overall panel median of 7 to 9, with 80% of ratings within the 3-point tertile of the overall median). Those that achieved consensus just below a priori thresholds were considered during qualitative data analysis and interpretation to determine justification for potential inclusion [37].
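The consensus rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's analysis (which was performed in Microsoft Excel): the function names `tertile` and `consensus` are hypothetical, and the handling of fractional medians (e.g., a median of 6.5 is treated as middle tertile) is an assumption that the text does not specify.

```python
from statistics import median


def tertile(value):
    """Map a rating or median on the 1-9 scale to a tertile label.

    Assumption: fractional medians between bands (e.g., 3.5 or 6.5)
    are assigned to the middle tertile.
    """
    if value <= 3:
        return "lower"
    if value >= 7:
        return "upper"
    return "middle"


def consensus(ratings, threshold=0.80):
    """Return (tertile of median, share of ratings in that tertile, consensus flag).

    `ratings` is a list of 1-9 scores; None marks a missing response
    and is left out of the denominator.
    """
    valid = [r for r in ratings if r is not None]
    if not valid:
        raise ValueError("no valid ratings")
    band = tertile(median(valid))
    share = sum(tertile(r) == band for r in valid) / len(valid)
    return band, share, share >= threshold


# Hypothetical ratings from ten panelists for a single item.
band, share, reached = consensus([9, 8, 7, 7, 8, 9, 7, 8, 6, 5])
print(band, share, reached)
```

Under these assumptions, an item would enter the final set only when the returned tertile is "upper" and the consensus flag is true.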

A measurement of stability was used as a stopping criterion for the Delphi process. This was defined as the consistency of responses between successive rounds (i.e., no meaningful change) [44, 45, 48]. Meaningful change was defined as a median change between tertiles and a greater than 15% change in the percentage of participants whose scores changed tertiles [45, 48].
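One possible reading of this stability rule is sketched below in Python. It assumes that both conditions must hold for a change to count as meaningful (the median moves to a different tertile and more than 15% of participants change tertile), and that only participants with a valid rating in both rounds are compared. The names `tertile` and `meaningful_change` are hypothetical, and the treatment of fractional medians as middle tertile is an assumption.

```python
from statistics import median


def tertile(value):
    """Tertile label for a 1-9 rating or median (fractional medians
    between bands are treated as middle; an assumption)."""
    if value <= 3:
        return "lower"
    if value >= 7:
        return "upper"
    return "middle"


def meaningful_change(prev, curr, max_shift=0.15):
    """Compare one item across two successive rounds.

    `prev` and `curr` map participant id -> rating (1-9) or None.
    Returns True only if the median changes tertile AND more than
    `max_shift` of paired participants change tertile.
    """
    paired = [p for p in prev
              if prev[p] is not None and curr.get(p) is not None]
    if not paired:
        raise ValueError("no paired ratings")
    median_moved = (tertile(median(prev[p] for p in paired))
                    != tertile(median(curr[p] for p in paired)))
    shifted = sum(tertile(prev[p]) != tertile(curr[p]) for p in paired)
    return median_moved and shifted / len(paired) > max_shift


# Hypothetical ratings for one item in Rounds 2 and 3.
round2 = {"a": 5, "b": 6, "c": 6, "d": 7}
round3 = {"a": 7, "b": 8, "c": 6, "d": 7}
print(meaningful_change(round2, round3))
```

An item is stable between rounds when this function returns False.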

Stopping and PM removal criteria

For the overall study, the criterion to stop the Delphi process was defined as no meaningful change in scores between the current and preceding round on at least 75% of quality statements and PMs assessed. Additionally, criteria for PM removal were considered after the second round. To be removed from the process, a PM’s scores must have shown no meaningful change from the previous round, and there must have been consensus that the PM is not necessary (i.e., overall panel median of 1 to 3, with 80% of ratings in the lower tertile).
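The stopping and removal rules can be expressed as two small checks. The following Python sketch is illustrative only: the function names and inputs are hypothetical, and it assumes that the per-item stability flag, median, and lower-tertile share have already been computed.

```python
def should_stop(changed_flags, min_stable=0.75):
    """Stop the Delphi when at least `min_stable` of the assessed items
    show no meaningful change from the preceding round.

    `changed_flags` holds one boolean per quality statement or PM
    (True = meaningful change).
    """
    if not changed_flags:
        raise ValueError("no items assessed")
    stable = sum(not changed for changed in changed_flags)
    return stable / len(changed_flags) >= min_stable


def should_remove(median_rating, share_lower, changed, threshold=0.80):
    """A PM is removed only if it is stable AND the panel agrees it is
    not necessary (median 1 to 3 with >= 80% of ratings in the lower tertile)."""
    return (not changed
            and 1 <= median_rating <= 3
            and share_lower >= threshold)


# Hypothetical example: four items, one of which changed meaningfully.
print(should_stop([False, False, False, True]))
print(should_remove(median_rating=2, share_lower=0.85, changed=False))
```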

Delphi rounds

In Round 1, along with the questionnaire containing the preliminary set of quality statements and PMs, participants also completed a participant demographics form. In Round 2, a personalized questionnaire was sent to each participant with: quantitative group results (i.e., median, minimum, and maximum ratings) presented numerically and graphically, qualitative feedback (i.e., summary of participants’ comments), and the participant’s own response to illustrate their position in relation to the group. In Round 3, a new personalized questionnaire was sent to each participant with the revised: quantitative group results presented numerically and graphically, qualitative feedback, and the participant’s own response to illustrate their position in relation to the group. Member checking of themes that emerged from feedback in Rounds 1 and 2 was also completed.

Data analysis

Descriptive statistics (frequency distributions) were calculated to determine if consensus, stability, PM removal and stopping criteria were met, as well as to present quantitative feedback to participants (median, minimum and maximum values) in rounds two and three. If a participant rated a question as ‘I do not know’ the value was counted as missing and the denominator was adjusted for that question to reflect the number of valid responses. Statistical analyses were performed using Microsoft Excel™ for Mac.
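As a minimal sketch of the missing-value handling described above, the Python snippet below drops ‘I do not know’ responses before computing the feedback statistics, so that percentages use the number of valid responses as the denominator. The study itself used Microsoft Excel; the function name `summarize` and the example data are hypothetical.

```python
from statistics import median

DONT_KNOW = "I do not know"


def summarize(responses):
    """Summarize one question's responses for feedback to the panel.

    'I do not know' is treated as missing, so the denominator is the
    number of valid (numeric) responses only.
    """
    valid = [r for r in responses if r != DONT_KNOW]
    if not valid:
        return {"n_valid": 0}
    upper = sum(r >= 7 for r in valid)
    return {
        "n_valid": len(valid),
        "median": median(valid),
        "min": min(valid),
        "max": max(valid),
        "pct_upper": round(100 * upper / len(valid), 1),
    }


# Hypothetical responses from six panelists, one of them uncertain.
print(summarize([8, 9, 7, "I do not know", 6, 8]))
```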

Inductive content analysis was used to code and summarize participants’ comments to be fed back to the group in rounds two and three, as well as to provide context during data interpretation and to support justification for including quality statements and PMs slightly below a priori quantitative thresholds in the final set [44, 49, 50]. Following a period of data familiarization, data were coded and counted iteratively to identify themes within each set of, as well as across all, quality statements and PMs. To be classified as a theme to be included in the comment summary for each quality statement and set of PMs, the code needed to be described by a minimum of two participants. Original wording from one expert that best represented the wording for that theme was used where possible [44, 50]. Across all rounds of the Delphi, overarching themes emerged from the coded data. To explore the trustworthiness of these results, extra questions were included in the final round of the Delphi as a form of participant validation (or member checking) [51].
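The two-participant rule for classifying a code as a theme amounts to counting distinct participants per code rather than counting comments. A brief Python sketch, with hypothetical participant identifiers and code labels, might look like this:

```python
from collections import defaultdict


def themes(coded_comments, min_participants=2):
    """Return the codes described by at least `min_participants`
    distinct participants.

    `coded_comments` is an iterable of (participant_id, code) pairs;
    repeated comments from the same participant count once.
    """
    participants_by_code = defaultdict(set)
    for participant_id, code in coded_comments:
        participants_by_code[code].add(participant_id)
    return sorted(code for code, ids in participants_by_code.items()
                  if len(ids) >= min_participants)


# Hypothetical coded comments for one quality statement.
coded = [
    ("p1", "crowding"),
    ("p2", "crowding"),
    ("p1", "staffing"),
    ("p1", "staffing"),  # same participant twice: still one participant
]
print(themes(coded))
```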

Results

Delphi panel

Fifty-three experts were identified and contacted by the lead author, of whom nine were known to snowball the invitation within their professional networks. Advertising by professional associations (e.g., CAEP, NENA, and iDelirium) through email fan-outs and/or social media reached approximately 8,000 individuals. Twenty-four experts expressed interest and met eligibility requirements. Of those, 22 provided consent and were enrolled into the study.

Over half of panelists were physicians (n = 12, 54.6%), followed by registered nurses (n = 6, 27.3%) (see Table 2). Over two-thirds of panelists’ primary work setting was the ED or urgent care (n = 15, 68.2%), and approximately two-thirds had 10 or more years of clinical experience (n = 14, 63.6%). Self-reported level of experience in older adult care on a 9-point Likert scale ranged from 5 to 9 (median = 7). Almost all panelists were from west or central Canada (n = 21, 95.5%). All panelists were retained in the final round of the Delphi (N = 22, 100%), with one missing response in Round 2.

Table 2 Demographic details of Delphi panel (N = 22)

Quality statement and PM selection

Quantitative results summary

Panelists evaluated 10 quality statements with 24 associated PMs, addressing four areas of delirium care (screening, diagnosis, risk reduction, and management). Criteria to stop the Delphi process were met after Round 3. Results were 98% stable between the second and third rounds. Quantitative results from all rounds are presented in Supplemental File 2. None of the PMs met removal criteria; therefore, all PMs and quality statements were retained in all three rounds to achieve or strengthen consensus on a final set. The experts reached consensus that six of the 10 quality statements were very important and very actionable (see Table 3). A further three quality statements were established as very important, but no consensus was reached as to their actionability; ratings ranged from 63.6% in the upper tertile for Quality Statement 08 (Multicomponent management) to 72.7% for Quality Statements 04 (Multicomponent risk reduction) and 09 (Cautious use of antipsychotic medications). For these nine quality statements, experts agreed that all associated PMs were very necessary, except for PM 06 (Repeat screening every shift), which fell below the a priori threshold (72.7%).

Table 3 Final quantitative quality statement and PM results (N = 22)

Quality Statement 06 (Reduce unnecessary within-ED transfers) and its associated PM 14 did not meet criteria to be included in the final set. Although the quality statement reached consensus for importance (90.9% in upper tertile), only 27.3% rated it to be very actionable, and 50.0% rated the associated PM to be very necessary.

Qualitative results summary

Three overarching themes emerged from the qualitative responses, all related to the current actionability of the quality statements. The overall themes, organizing concepts for each theme presented to panelists in the final round, and examples of associated quotes (i.e., supporting quotes used to develop themes, as well as opposing theme quotes where applicable) are presented in Table 4. There was high agreement (95.5%) with the first theme, ‘System-Level Impacts on the ED’, in which panelists described system-level issues, such as access block and ED crowding, thought to decrease the current actionability of the quality statements although they were perceived to be important. In contrast, panelists explained that the complex nature of bed management and patient flow in the ED made the reduction of within-ED transfers (Quality Statement 06) both non-actionable and unnecessary to evaluate. Instead of focusing on transfers within the ED, there was unanimous agreement with the second theme, ‘Prioritization for Transfer to Care Unit or Home’. Although panelists agreed older adults should be prioritized for transfer out of the ED, they also expressed the importance of improving care within the ED and shared ideas on how some of the actionability issues could be addressed with adequate resources. These ideas are represented in the final theme, ‘Additional Healthcare Provider Supports’, which was also endorsed unanimously.

Table 4 Overarching themes, organizing concepts, and associated quote examples

Qualitative results provide justification for including three quality statements that achieved consensus slightly below a priori thresholds (Quality Statements 04, 08, and 09). Many panelists who rated quality statements lower for actionability rated the associated PMs as very necessary in recognition that the evidence generated from their measurement has the potential to guide improvement efforts (e.g., “This is extremely important and valuable to the care of the patient. For statistical purposes, it would be good to know what proportion of patients are receiving this, but my impression would be that this would be dismal”, “Unlikely to achieve, but it would be nice to have this data to drive change”, and “Data important to guide further change” [Experts rating quality statements lower for actionability and higher for PM necessity]). Lastly, PM 06 (Repeat screening every shift) reached consensus slightly below our a priori inclusion threshold as some participants viewed ongoing delirium screening as being out of the scope of ED care (e.g., “…The fact that these poor patients stay hours if not days in an ED screams system problem. The fact that the ED teams will have to do daily screens for these patients should be the real issue” [Expert rating PM 06 in mid tertile]). Other participants thought repeat screening was necessary, as older adults tend to spend a longer time in the ED (e.g., “I wonder if the frequency of every 24 h is not enough to capture and evaluate ED care. Perhaps every nursing shift (so twice a day)…” and “I think delirium screening should be occurring once per shift (every 12 h), given how quickly it can develop in the ED” [Experts rating PM 06 in high tertile]). Due to known long-standing issues with increased lengths of stay in EDs across Canada [52], and globally [53, 54], as well as increased incidence of delirium associated with ED lengths of stay over 10 h [3], it was decided to retain this PM for preliminary testing under the recognition that as care improves, attainment increases, and variability decreases over time, de-implementation of some of the PMs, such as PM 06, may be warranted [16].

Final set of quality statements and PMs

The final quality statement and PM set consisted of nine quality statements and 23 PMs, including nine structure PMs and 14 process PMs (see Table 5). Wording for some of the quality statements and PMs was modified after Rounds 1 and 2 based on panelist feedback. For example, multiple panelists viewed conducting delirium screening re-assessment once per nursing shift (instead of daily) as more practical. Two additions were also made based on expert consensus. First, a time benchmark of initial screening to be completed within 4 h of arrival to the ED was added, with 85.7% of panelists agreeing on this as a reasonable timeframe after two rounds. Second, prioritizing transfer to more appropriate care spaces was added as part of multicomponent interventions for risk reduction and management (Quality Statements 04 and 08) based on 100% agreement during the participant validation exercise. This addition is also supported by a recent systematic review and meta-analysis in which Oliveira and colleagues found that the odds of developing delirium increased over two-fold in older adults with ED lengths of stay over 10 h (OR, 2.23; 95% CI, 1.13–4.41) [3].

Table 5 Final set of quality statements and PMs by category

Discussion

There have been few attempts to establish PMs to evaluate quality of care for older ED patients in relation to delirium. Existing PMs are reported to be of low methodological quality and predominantly based on pre-existing metrics [57,58,59,60]. PMs are only as good as the evidence and methods used to develop them [32, 43]. Poorly developed PMs can lead to unintended consequences by providing misleading information to guide decision-making, policy development, and quality improvement efforts [43]. There is general agreement in the ED quality of care literature that there is a need to rigorously develop new evidence-based PMs instead of basing work on pre-existing metrics [53, 57, 61]. In our study we developed a set of delirium quality statements and PMs for the care of older adults in the ED setting by conducting a formal consensus process. To our knowledge, this is the first research to develop a de novo set of guideline-based metrics on this topic.

A diverse group of clinical experts reached consensus on a set of quality statements that are important, and associated PMs that are necessary to evaluate the quality of delirium care older adults receive in the ED. All quality statements in the final set reached consensus at or slightly below the a priori criterion for actionability, with a large caveat that much of this care would only be actionable with appropriate tools and resources available. There was overwhelming agreement that the current state of healthcare systems, in which many EDs are dealing with access block (inpatient boarding), crowding, and staffing constraints, decreases the actionability of much of the care in the delirium quality statements, although there is agreement this care is important. Similarly, Eagles and colleagues (2022) found that ED clinicians identified delirium as important but it was not prioritized in the care of older ED patients [62]. Perceived lack of time, competing priorities, and increased demand have been identified as key barriers to delirium assessment and management by Canadian ED clinicians in two recent qualitative studies [62, 63]. These barriers to high-quality care delivery in the ED have only worsened in recent months [64] as the global pandemic has come to an end [65]. System-wide staffing shortages, bed closures, and pent-up demand have exacerbated access block and put increased pressures on EDs across Canada [64, 66, 67], as well as in many other countries such as the United States [68], United Kingdom [69], and Australia [70]. The one participant who disagreed with this theme in our study pointed out that there will always be competing priorities in the ED, and it is critical to make delirium care for older adults a priority or it will always be overlooked.

Despite system-level issues, clinical experts agreed more can be done within the ED to support the actionability of the quality statements and improve quality care. Implementing roles for other providers, such as Nurse Practitioners and Geriatric Emergency Management (GEM) nurses, was perceived to have great potential benefit in the screening, assessment, and management of delirium in the ED. Previous research has demonstrated that advanced practice nurses have a key role in successfully implementing practice change and improving ED quality care for older adults [71, 72]. Beyond the introduction of additional healthcare providers, tools are also needed to support ED clinicians to provide high-quality care.

The quality statements generated by this study can be used to guide practice change and enhance standardized electronic documentation. For example, they can be used to develop and embed risk reduction and management protocols, as well as embed screening tools into an electronic documentation system. Recent studies have reported improved delirium assessment and diagnosis in the ED with similar initiatives [73,74,75]. Further, embedding such tools and protocols will support reliable data capture, which will facilitate the ability to use the developed PMs to monitor and evaluate patient care. The PMs are necessary to provide baseline data to guide improvement efforts where they are most needed. Metrics such as these have been identified as an important component of quality improvement efforts, not only to provide evidence to governments and administrators, but also to increase frontline-provider awareness, enhance staff education, and increase buy-in [29, 73,74,75]. Prior to implementation, the PMs will undergo preliminary testing to ensure they are feasible to use to evaluate ED quality care [16].

This study has some key limitations. Despite our efforts to recruit broadly across Canada and internationally, we encountered difficulties. This, in itself, may speak to the increased burnout being experienced by ED providers worldwide since the pandemic [76]. Similarly, while we attempted to recruit other types of healthcare professionals (e.g., clinical educators and managers), none of these experts were willing to participate in our study. As most of our panelists were from Canada, this may limit the generalizability of our results. ED researchers can use the final set of quality statements and PMs from this study to repeat a similar consensus-building process with a different group of clinical experts to validate or further contextualize our results for different countries. Further, situational and personal biases can influence how panelists make judgements when using the Delphi method [44]. We attempted to constrain these biases by limiting time between rounds, providing detailed background information and clear definitions for all concepts, providing quantitative and qualitative personalized feedback, as well as clearly defining consensus and stopping criteria.

Conclusion

Our results confirm that high-quality delirium care is an important focus in the ED, although the quality statements and PMs were based on evidence from across the care continuum. This is the first known study to develop a set of guideline-based quality statements and PMs to evaluate the quality of care older adults receive in the ED setting. Results will be used in future research to test the feasibility of using the PMs to evaluate delirium care quality and guide improvement efforts.