Background

Missed deterioration is a cause of sub-optimal care in hospital patients, and Track and Trigger Tools (TTT), also known as Early Warning Scores (EWS), are a popular response to this problem. Deterioration is often preceded by a period of physiological instability which, when recognised, provides an opportunity for earlier intervention and improved outcomes. TTTs consist of sequential recording and monitoring of physiological, clinical, and observational data. When a certain score or trigger is reached, this directs a clinical action including, but not limited to, altered frequency of observation, a senior clinical review, or more appropriate treatment or management. In the adult population, TTTs are deployed in several countries (Australia, USA, Netherlands), and in the UK a national early warning score, developed by the Royal College of Physicians and endorsed by NHS England and NHS Improvement, is widely used. The use of TTTs in paediatrics is more challenging, however, because of variation in accepted physiological parameters across the age range.

Paediatric mortality rates in the United Kingdom are amongst the highest in Europe [1]. The PUMA study was commissioned to develop, implement, and evaluate a Paediatric Track and Trigger Tool (PTTT) for national implementation. Three linked evidence reviews were undertaken to inform the intervention. These focused on i) tool validity, ii) effectiveness in reducing mortality and critical events [2], and iii) the impact of the wider clinical microsystem (i.e. work practices and relationships, culture, and socio-technical infrastructure) on TTT use [3]. The two reviews on validity and effectiveness [2] found that several PTTTs have been evaluated, although most are derived from a limited number of original parent tools. Although many PTTTs have been narrowly validated in single centres or specialist units, none has been validated across different settings and populations, and many have only been tested in theory and modelling, rather than through use in practice. There is moderate evidence that paediatric early warning system interventions may reduce unplanned transfers to a higher level of care, but corresponding reductions in hospital-wide or paediatric intensive care unit mortality have not been reported. No studies evaluated a whole systems approach to improving the detection of and response to deterioration. The third review highlighted multiple failure points in paediatric early warning systems: lack of monitoring equipment, inadequate staffing, knowledge deficits, insufficient situational awareness, poor inter-professional communication, uncertain escalation policies, and cultures that deter escalation. Several interventions to address specific system weaknesses have been proposed and some evaluated, but there is limited evidence to recommend their use.
Overall, the findings of the three reviews did not support an exclusive focus on PTTTs to address the problem of missed deterioration and indicated the need for approaches that focus on the wider clinical microsystem. As a result of the findings from the reviews, we revised the study aims from an exclusive focus on a PTTT to the development of a system-wide improvement programme: the PUMA (Paediatric early warning system Utilisation and Morbidity Avoidance) Programme [4].

Methods

Study design

The research was a prospective, mixed-methods, before-and-after study, with two work streams.

Work Stream 1

Development and implementation of an evidence-based paediatric early warning system improvement programme (the PUMA Programme), drawing on three systematic reviews of the literature [2, 3].

Work Stream 2

Prospective mixed methods evaluation of the PUMA Programme in four UK hospitals, with an embedded qualitative formative and summative process evaluation.

A patient and public involvement (PPI) group informed both work streams. An experienced PPI lead (Jenny Preston) co-ordinated parent involvement throughout the study to advise on the tool and implementation package development (Work Stream 1); information leaflets for research ethics purposes; the design of interview schedules and the data generation templates; and qualitative data analysis, particularly parent interviews and dissemination strategies (Work Stream 2).

The study protocol covering the development, implementation and evaluation has been published [4]. Ethics approval was granted on 13 April 2015 by the National Research Ethics Service Committee South West, registration number 15/SW/0084.

Theoretical framework

The study was informed by translational mobilisation theory (TMT) [5] and normalisation process theory [6]. TMT is a sociological theory, which provides a framework for understanding and investigating the organisation of collaborative work practices in institutional contexts. It was used to systematically analyse the socio-material relationships in paediatric early warning systems and the conditioning effects of the local institutional contexts [5]. Normalisation process theory (NPT) is a theory of implementation, which focuses on the actions necessary to embed a new intervention into practice. It informed the development and evaluation of the implementation strategy.

Work stream 1: development and implementation of the PUMA Programme

The PUMA Programme was developed from the findings of the three systematic reviews [2, 3] and founded on OUTCOME, a novel approach to improvement informed by TMT, NPT, and the Model for Improvement [7] (see Additional Material 1 for a summary of the theories that underpin OUTCOME). OUTCOME was developed as part of the study and is intended to overcome some of the weaknesses of orthodox approaches to health-care improvement, namely:

  • Solutions are often identified before problems are properly understood [8,9,10].

  • Interventions are implemented without an understanding of the local systems of work in which they must have their effects [6, 8].

  • The desire for standardisation limits freedom to adapt to local context [11].

  • When an intervention is imposed from outside the organisation, there is little ownership and limited opportunity to capitalise on local expertise [12].

  • Service-led projects that do utilise local expertise often lack adequate evaluation and reportage, which precludes shared learning [13].

  • The form of an intervention is often given more consideration than its function – with a tendency to give precedence to a tool that can be implemented over an adjustment to the system [5].

  • Improvement efforts are often time-limited and not sustained over the longer-term [12].

OUTCOME comprises six principles and is designed to support local teams to bring about the changes necessary to achieve a desired outcome in context specific ways. The OUTCOME principles and their application in the development of the PUMA Programme are described below and summarised in Table 1.

Table 1 The OUTCOME Framework: principles, structures, theory, and application in the PUMA study

Principle 1: outcomes directed

The first principle of OUTCOME is that improvement is driven by an agreed outcome, rather than by predefined interventions. This reflects a growing concern that health-care improvement is often solution driven, rather than focused on improving practice. The emphasis on outcomes in the framework is informed by the concept of ‘projects’, which is the primary unit of analysis in TMT and refers to the network of people and materials oriented to a shared goal. Thinking about improvement in terms of the associated project helps to define the boundaries of the initiative. The literature on the detection of deterioration identifies four integrated components which work together to provide a safety system for at-risk patients: (1) the afferent component, which detects deterioration and triggers timely and appropriate action; (2) the efferent component, which consists of the people and resources providing a response; (3) a process improvement component, which includes system auditing and monitoring; and (4) an administrative component focusing on the organisational leadership and education required to implement and sustain the system [14]. In the PUMA study, the project of interest was the afferent component of a paediatric early warning system; the efferent component was excluded.

Principle 2: functions oriented

The second principle of OUTCOME is that improvement is oriented towards the functions necessary to achieve the goal. This requires specification of the primary mechanisms of action that are necessary in an overall process for the goal to be achieved. In the PUMA study, the core functions of an afferent early warning system were identified through the application of TMT to the systematic review and refined through discussions with clinicians to produce seven functions in total: monitor, record, interpret, review, prepare, escalate, and evaluate [3].

Principle 3: systems focused

The third principle of OUTCOME is that improvement is focused on the socio-material resources, processes and mechanisms needed to enact the essential functions for achieving the goal. This requires specification of the minimum system requirements and draws on the concept of the strategic action field in TMT. Strategic action fields provide the structures, organising logics, technologies and materials, and interpretative repertoires that condition projects of collective action [5].

In the PUMA study, the system standard was specified in a propositional model of minimal conceptual requirements organised around the seven functions of an afferent paediatric early warning system (PUMA Standard). The model drew together two kinds of evidence from the systematic review: evidence of the challenges that must be overcome in detecting and acting on deterioration and evidence on proposed and/or evaluated solutions to challenges. The propositional model was reviewed and refined by parents with experience of a child’s deterioration and by clinical experts on the PUMA study team.

Principle 4: context specific

The fourth principle of OUTCOME is that improvement is focused on the development of context-specific initiatives to achieve the goal. Proponents of change often favour top-down approaches to bring about improvements, yet the list of interventions and improvement efforts that flounder when spread or scaled up continues to grow [11, 12, 15], in part because of failures to normalise and embed interventions into local contexts. Avoiding these pitfalls requires structures to support systematic and rigorous local improvement efforts in relation to a service standard. In addition to specification of the minimum system requirements to support an improvement project, OUTCOME also involves the development of associated assessment tools that can be deployed to improve understanding of the local system and identify areas for improvement.

In the PUMA study, in collaboration with expert clinicians and parents, two complementary assessment tools were developed from the PUMA Standard: a Staff System Assessment Tool (SSAT) and a Family Feedback Tool (FFT). The tools were designed to prompt wider discussion among the improvement team, to reach a shared understanding of the local afferent paediatric warning system and areas that might be targeted for improvement.

Principle 5: locally led

The fifth principle of OUTCOME is that improvement capitalises on the expertise and knowledge of those delivering services. This is intended to encourage local ownership of the improvement initiative. The PUMA Programme included an improvement guide, which drew on the Model for Improvement [7] to support teams in driving their own improvement processes and was designed to operationalise the core constructs of NPT, together with start-up and action planning workshops to support local leadership of the improvement process.

Principle 6: learning systems

The final principle of OUTCOME is the creation of a learning system around the improvement project, with participants attuned to system features with strong feedback loops [12]. Health-care systems are dynamic, and wider changes to the system may be consequential for an area of practice, resulting in ‘drift’ [16] or the need for further adjustments to the system. OUTCOME deploys assessment tools to keep systems under review, and structures for supporting local leadership. In the PUMA Programme, this was reflected in written guidance on how to ‘sustain progress’, which included system assessments every 12–24 months to reflexively monitor performance and to select and review initiatives.

The PUMA Programme was implemented in two tertiary children’s hospitals (with an on-site Paediatric Intensive Care Unit (PICU)) and two general hospitals (with no on-site PICU) in the UK between June 2016 and November 2017. Two sites had a PTTT in place for the duration of the study and two did not (Table 2).

Table 2 Summary of Study Sites

Work stream 2: evaluation of the PUMA Programme

The study deployed an interrupted time series (ITS) design, in conjunction with ethnographic case studies, which combined observations and qualitative interviews, to evaluate changes in practice and outcomes over time. Ethnographic methods were also deployed in a formative and summative evaluation of implementation processes.

Quantitative evaluation

The quantitative evaluation tracked monthly aggregate outcomes across all in-patient wards at each site for a minimum of 40 months (May 2015 – October 2018). The purpose was to evaluate the effect of the intervention on trends in markers of in-patient deterioration over time. Sites were analysed as separate case studies.

Outcome measures

We identified eight outcome measures commonly reported in the literature [2] for assessment of the effectiveness of paediatric early warning systems: mortality, cardiac arrest, respiratory arrest, unplanned admission to PICU, unplanned admission to High Dependency Unit (HDU), PICU reviews, other medical emergencies requiring immediate assistance, and non-ICU patient bed days. Each outcome definition was agreed with sites (Additional Material 2), piloted to ensure the feasibility of data collection, and then applied consistently across all sites.

The primary quantitative outcome was a composite outcome metric (‘adverse events’) representing the total number of children in a given month who experienced at least one of the following events: mortality, cardiac arrest, respiratory arrest, unplanned admission to PICU, or unplanned admission to HDU. The secondary outcomes were analysed separately and comprised the five components of the composite primary outcome together with PICU reviews, other medical emergencies requiring immediate assistance, and non-ICU patient bed days. Monthly patient bed-days (see Additional Material 2 for definition) were collected to calculate the rate.
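The composite definition above counts each child at most once per month, however many qualifying events that child had. The following is a minimal sketch of that counting rule, not the study's own code: the record layout `(month, child_id, event_type)`, the event labels, and the function names are all hypothetical, and the example records are invented for illustration.

```python
from collections import defaultdict

# Event types that make up the composite primary outcome, as listed in the
# text. The string labels themselves are hypothetical.
COMPOSITE_EVENTS = {
    "mortality",
    "cardiac_arrest",
    "respiratory_arrest",
    "unplanned_picu_admission",
    "unplanned_hdu_admission",
}


def monthly_adverse_events(event_records):
    """Count children per month with at least one composite event.

    event_records: iterable of (month, child_id, event_type) tuples
    (an assumed layout). A child with several qualifying events in the
    same month contributes one to that month's count.
    """
    children_by_month = defaultdict(set)
    for month, child_id, event_type in event_records:
        if event_type in COMPOSITE_EVENTS:
            children_by_month[month].add(child_id)
    return {month: len(children) for month, children in children_by_month.items()}


def rate_per_1000_bed_days(event_count, bed_days):
    """Express a monthly count as a rate per 1000 patient bed-days."""
    return 1000.0 * event_count / bed_days


# Invented example records.
records = [
    ("2016-01", "A", "cardiac_arrest"),
    ("2016-01", "A", "unplanned_picu_admission"),  # same child, same month
    ("2016-01", "B", "unplanned_hdu_admission"),
    ("2016-01", "C", "picu_review"),  # secondary outcome, not in the composite
    ("2016-02", "A", "respiratory_arrest"),
]
counts = monthly_adverse_events(records)
january_rate = rate_per_1000_bed_days(counts["2016-01"], bed_days=800)
```

Using a set of child identifiers per month is what enforces the "at least one event" rule; summing raw events instead would overstate the composite whenever a child had, say, an arrest followed by a PICU admission.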

Sample size calculation

A simulation-based approach [17] was used to calculate power, based on the original study aims to develop and evaluate a PTTT. Whilst the primary outcome was a composite measure, there was limited availability of data and therefore we took the conservative option of focussing on unplanned transfers to PICU for our estimations. Utilising historical data from two sites, the prevalence of unplanned transfers to PICU was 1%. Additionally, previous research indicated that implementation of paediatric calling criteria with a rapid response team could result in a risk ratio of 0.65 for total avoidable hospital mortality [18]. We assumed that the PUMA intervention might result in a similar risk ratio [19]. The estimated effect size, mean difference, and common standard deviation were 2.8, 2.0 and 0.7, respectively. We estimated that 24 months of observations (12 pre- and 12 post-intervention) would give 90% power for an effect size of at least 2 [17]. When the research aims were changed from the implementation of a PTTT to the implementation of the PUMA Programme, we retained the focus on collecting 12 months of pre- and 12 months of post-intervention data but allowed 12 months for phase-in of the intervention, giving a total of 36 months. We were able to collect data for up to 6 more months retrospectively for the pre-intervention period. This gave 42 months of data and increased our sample size.

Analysis

All outcomes were expressed as rates per 1000 patient bed-days. In two sites, we received only partial denominator data for certain months (e.g., patient bed days were only recorded for 25 out of 30 days). In these cases, a weighted average relative to the month size was used to impute missing bed numbers and calculate monthly bed-days.
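The imputation step can be illustrated with a short sketch. The text does not give the exact formula, so this shows only one plausible reading of "a weighted average relative to the month size": the mean daily bed-days over the recorded days is scaled up to the full length of the month. The function name and the figures in the example are hypothetical.

```python
import calendar


def impute_monthly_bed_days(observed_bed_days, days_recorded, year, month):
    """Scale partially recorded bed-days up to a full calendar month.

    Assumption: unrecorded days resemble the recorded days on average,
    so the observed daily mean is applied to every day of the month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    if not 0 < days_recorded <= days_in_month:
        raise ValueError("days_recorded must be between 1 and the month length")
    mean_daily_bed_days = observed_bed_days / days_recorded
    return mean_daily_bed_days * days_in_month


# Invented example: 500 bed-days recorded over 25 of the 30 days in June 2016.
imputed = impute_monthly_bed_days(500, 25, 2016, 6)
```

With a complete month (`days_recorded` equal to the month length) the function returns the observed total unchanged, so it can be applied uniformly to every month without special-casing the partial ones.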

An ITS approach [20] was used to analyse data over time. Aggregate monthly rates of mortality and morbidity outcomes were tracked for up to 18 months before, 12 months during, and 12 months after implementation. A segmented linear regression was fitted to data from each site using an autoregressive integrated moving average (ARIMA) [21] method to analyse the primary and secondary outcomes. The assumptions of linear regression were checked by inspecting residual plots. The Durbin-Watson statistic and the autocorrelation and partial autocorrelation functions were used to identify the order of autocorrelation and moving average.

The most common approach to ITS analysis is to compare trends across two separate time periods: a pre- and a post-intervention phase. Typically, the intervention is discrete and time-bounded, such as implementation of a PTTT, and thus might be expected to have an immediate effect on the outcome. In this study we expected that implementation of the PUMA Programme would take longer, but that we might be able to observe gradual changes in measures of in-patient deterioration. Therefore, we decided a priori to investigate both the short-term effect of the PUMA Programme (two phases, taking the start of implementation as the time of change) and the longer-term effect (three phases, incorporating pre, during and post change). We also used impact models [22] that allowed immediate (level) and trend (slope) changes after introducing or completing implementation. Any statistically significant change in either level or trend would imply that the intervention had demonstrated an effect on outcomes.
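The level-and-slope parameterisation used in segmented regression can be made concrete with a small sketch. This is an illustration under stated assumptions, not the study's analysis: the study fitted its models with an ARIMA method in R, whereas the code below uses plain ordinary least squares in Python and ignores autocorrelation. The month indices (implementation starting at month 18 and completing at month 30), the coefficients, and the function names are all hypothetical, and the data are synthetic and noise-free.

```python
def design_row(t, change_points):
    """Return [1, t, level_1, slope_1, ...] for month index t.

    For each change point c, the level term is 1 from month c onwards
    (an immediate step change) and the slope term counts months elapsed
    since c (a change in trend). One change point gives the two-phase
    model; two change points give the three-phase model.
    """
    row = [1.0, float(t)]
    for c in change_points:
        after = t >= c
        row.append(1.0 if after else 0.0)
        row.append(float(t - c) if after else 0.0)
    return row


def solve_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic three-phase series: 18 months pre, 12 during, 12 post.
# Coefficients: intercept, baseline trend, then (level, slope) per change.
true_beta = [5.0, 0.10, -1.0, -0.05, 0.5, 0.02]
X = [design_row(t, [18, 30]) for t in range(42)]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta_hat = solve_ols(X, y)
```

Because the synthetic series contains no noise, `beta_hat` recovers `true_beta` up to floating-point error. With real monthly rates the same design matrix would be passed to a method that models autocorrelated errors, and the level and slope coefficients would be the quantities tested for significance.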

Some of the secondary outcomes, e.g. mortality, were rare either by nature or because of the relatively low number of children being seen at some sites. For such scenarios, it is neither easy to transform the time series into a stationary series nor to detect a trend. Depending on the number of zero count months, we either added an indicator variable into the model to account for the zero months effect or we combined data into two-monthly blocks and where possible the trajectory was modelled. Exploratory and sensitivity analyses were also conducted (for details please see Additional Material 3). All analyses were performed using statistical software (R v 3.5.2).
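The two strategies for sparse outcomes described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the counts and bed-days are invented, and dropping a trailing unpaired month is an assumption of the sketch rather than something the text specifies.

```python
def zero_month_indicator(counts):
    """Return 1 for months with no events and 0 otherwise.

    The resulting vector can be added to a regression model as an
    indicator variable for zero-count months.
    """
    return [1 if c == 0 else 0 for c in counts]


def two_month_block_rates(counts, bed_days):
    """Pool consecutive months in pairs and return rates per 1000 bed-days.

    Events and bed-days are summed within each pair before the rate is
    calculated. A trailing unpaired month is dropped (an assumption).
    """
    rates = []
    for i in range(0, len(counts) - 1, 2):
        events = counts[i] + counts[i + 1]
        days = bed_days[i] + bed_days[i + 1]
        rates.append(1000.0 * events / days)
    return rates


# Invented example: a rare outcome observed over four months.
monthly_counts = [0, 1, 0, 0]
monthly_bed_days = [500, 500, 400, 600]
indicator = zero_month_indicator(monthly_counts)
block_rates = two_month_block_rates(monthly_counts, monthly_bed_days)
```

Summing numerators and denominators before dividing, rather than averaging the two monthly rates, keeps each block's rate weighted by its bed-days.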

Qualitative ethnographic evaluation

Data generation

In each site, single ward case studies were undertaken to evaluate changes in the paediatric early warning systems. In the pre- and post-implementation phases, data were generated through observation of practice and semi-structured interviews with clinicians, managers, and families. Data were collected and analysed from observations (446 h) and interviews (n = 193) across the four sites (Table 3).

Table 3 Qualitative data collection for each case study

Data generation was informed by Translational Mobilisation Theory [5] and Normalisation Process Theory [6], which directed attention to the socio-material network of actors (people, processes, technologies and artefacts) and their relationships in paediatric early warning systems. Observations were conducted at different times of day/night and on different days of the week, including weekends, to ensure that a range of time periods was covered. We focused on what participants did, the tools they used, the concepts they deployed and the factors that facilitated and constrained action [23]. Observations were recorded in low-inference field notes, which documented in concrete terms what was said and what happened without interpretation, and were later word-processed. Interviews were digitally recorded with consent. Field notes, interview transcripts, and documents were uploaded into Computer Supported Qualitative Data Analysis Software (ATLAS.ti) and coded for ease of retrieval and management.

Analysis

Concrete descriptions of pre- and post-implementation paediatric early warning systems were developed for each ward and independently assessed by researchers using the PUMA Programme Staff System Assessment Tool (see page 38 of Additional Material 4). Each component of the system was scored from 0 to 10, with 10 indicating the existence of requirements fully aligned with the PUMA Standard and 0 indicating the absence of requirements. Cross-case analysis was undertaken to understand the relationship between the implementation of the PUMA Programme, local context, mechanisms, and outcomes.

Implementation process evaluation

A parallel process evaluation explored teams’ experiences of implementing the PUMA Programme. The process evaluation focused on the delivery of and response to the PUMA Programme, and on barriers and facilitators to implementation. Data were generated through observation of facilitated sessions and meetings with site Principal Investigators (PIs) (n = 5); semi-structured interviews with PIs (n = 7), clinical staff and improvement teams; records of telephone facilitation discussions (n = 40); and analysis of documents (minutes of improvement team meetings and implementation activity).

Analysis

Concurrent formative process evaluation analysis identified adjustments to the PUMA Programme required to facilitate implementation processes and the necessary modifications undertaken in a process of reciprocal learning between the research team and site PIs. The summative process evaluation analysis was thematic, focusing on delivery and response to the core components of the PUMA Programme, understanding of the OUTCOME approach, barriers to change and implementation, facilitators of change and implementation, and sustainability.

Results

Work stream 1: development and implementation of the PUMA Programme

The PUMA Programme comprised:

  • PUMA Standard: an evidence-based and theoretically informed propositional model of a paediatric early warning system organised around the seven functions of an afferent paediatric early warning system (Fig. 1)

  • PUMA Wheel: A visual schematic of the PUMA Standard (Fig. 2)

  • Paediatric early warning system assessment tools: Staff System Assessment Tool (SSAT) and the Family Feedback Tool (FFT)

  • Manualised implementation guidance to support improvement initiatives based on a five-step process (see Additional Material 4):

  1. Form an improvement team.

  2. Assess the system.

  3. Select and plan improvement initiatives.

  4. Implement and review initiatives.

  5. Sustain progress.

  • Face-to-face structured facilitation and ongoing support (see Table 4).

  • Materials to support implementation

  • Structured worksheets

  • PowerPoint slide pack for local dissemination

Fig. 1 The core components of a paediatric early warning system: the PUMA Standard

Fig. 2 The core components of a paediatric early warning system: the PUMA Wheel

Table 4 Summary of support and resources provided for each of the five improvement steps

The PUMA Programme provided a framework and resources to support local teams to assess their paediatric early warning systems, identify areas for improvement, and decide locally how these would be addressed in each site. It provided a standardised approach across different settings, but still enabled those responsible for implementing interventions to select solutions they believed would work within the local context. The start-up meeting covered OUTCOME principles, the PUMA Standard, the importance of engaging clinical teams in the improvement process, and instruction on how to administer the system assessment tools and collate results. The Action Planning meeting involved a facilitated discussion about initiatives that could be used to address identified areas for improvement. Members of the PUMA study team (1x Consultant Paediatrician and 1x Implementation Scientist) delivered the start-up and action planning sessions and provided on-going support.

All sites formed an improvement team of local clinicians and managers, which oversaw system assessment, the identification of weaknesses in the system, and the selection, implementation, and review of improvement initiatives. Assessment of each paediatric early warning system using the PUMA Staff System Assessment Tool revealed how well each system was functioning against the core system components, outlined in the PUMA Standard. Each site had its own fingerprint of strengths and weaknesses [Fig. 3] and contextual differences (patient populations, technological and physical infrastructures, PICU access) which shaped the selection of initiatives and implementation processes. Once sites had identified areas for improvement, they were guided through a process of selecting appropriate improvement initiatives. Local teams led the improvement process in each site.

Fig. 3 Strengths and weaknesses of paediatric early warning systems pre-implementation

Findings from the concurrent formative process evaluation led to modifications of the PUMA Programme. The PUMA Standard was refined to provide a more easily accessible version of the original, with these changes reflected in adjustments to the PUMA Wheel and Staff System Assessment Tool. The original version of the Family Feedback Tool generated little information of value, with high scores being achieved on all measures. The Family Feedback Tool was subsequently revised and expanded (the new version was co-developed by the PUMA study team and the Patient and Public Involvement Group); several additional free-text questions were included, and the language used was clarified.

The PUMA Programme was designed to be implemented by local improvement teams with minimal external facilitation or support. However, over the lifetime of the study this support was increased in recognition of the fact that the PUMA Programme resources were being refined and developed in parallel with implementation. Support took the form of individual telephone and/or email-mediated support and site-specific in-person meetings. All PIs either attended or contributed to the face-to-face meetings, and two sites chose to use facilitated telephone calls, during which a PUMA study team member provided tailored support, reviewing and explaining the intended aims and improvement steps of the PUMA Programme, and assisting with problem-solving in relation to specific initiatives. In response to site PI feedback, additional information was added to the Implementation Guide.

Qualitative evaluation

Improvement initiatives

All sites selected initiatives and made changes to their paediatric early warning systems aligned with the PUMA Standard [Table 5]. Many of the initiatives identified were intended to address issues for which existing interventions were either unavailable or inappropriate, and often involved multiple small interventions that adjusted and harmonised existing processes. In some cases, the team used the PUMA Programme as a vehicle for implementing changes that had been under consideration for some time, for example the new Standard Operating Process for on-call medical team handover at night and the weekend, selected at Site 1. Sites also selected different initiatives to address similar issues. In Site 2, improving awareness of at-risk children was addressed through nursing-medical safety huddles, and in Site 3 through minor adjustment to nursing-medical communications. Teams also found alternatives if initial plans could not be implemented: Site 2 abandoned the development of a joint medical-nursing handover sheet but introduced a structured approach to nursing handover.

Table 5 Summary of embedded site initiatives against the PUMA Standard

Implementation trajectories

There were different implementation trajectories in each site, reflecting several factors. First, it depended on the specific initiatives selected and whether these were relatively quick fixes or minor adjustments to existing processes, or whether they required more investment in development work, such as agreeing a new escalation policy (Sites 3 & 4). Second, it reflected the scale of work undertaken to embed the interventions, which related to organisational size and complexity. With only one ward, implementation at Site 2 was relatively straightforward. For the larger sites, the process was more difficult and required extensive engagement work and decisions about which initiatives should be implemented across the whole organisation, and which could be left to the local determination of wards. Third, it reflected the capacity of the improvement teams. The single site PI in Site 4 provided strong leadership for implementation, and delegated responsibility for leading on specific initiatives to identified individuals. But an unplanned absence from work led to a loss of momentum during the implementation phase, highlighting the potential risks of investing leadership exclusively in one person. In Site 3, staff turnover made sustaining an improvement team challenging, and most of the initiatives were progressed exclusively by the site PIs. Membership of the improvement team in Site 1 also fluctuated, and, at this site, the energy of PIs was taken up by the requirement to oversee large-scale changes relating to a regulatory requirement. In Site 2, there was a clearly defined implementation/improvement team that took on responsibility for different initiatives, which meant that some of the initiatives were implemented quickly. Fourth, it reflected wider organisational support for the improvement programme. Only Site 1 had a high level of organisational support for their initiatives, as these aligned with regulatory mandated changes arising from a critical incident.

Changes to paediatric early warning systems

All sites brought about improvements in reviewing sick children and planning for action so that there was a shared understanding of children at risk. Two sites (Sites 3 & 4) addressed equipment shortages. All sites implemented initiatives to involve parents more systematically in detecting and acting upon deterioration, but with limited success.

Some initiatives were implemented but never embedded in practice and some initiatives were never implemented (for a summary of initiatives proposed, implemented, and embedded see Additional Material 5). In several cases, initiatives required the negotiation of organisational barriers beyond the sphere of influence of improvement teams. For example, in Sites 2 and 4 interventions to support professional development were not implemented as staff could not be released from clinical areas. Implementing all selected initiatives was not possible within available timescales.

At the close of the study, improvement work was continuing in several sites.

Paediatric early warning system dynamics

The study findings highlighted the dynamic qualities of paediatric early warning systems. Across the sites, improvement initiatives strengthened some components of the system but weakened others. For example, the introduction of an electronic early warning system in Site 2 strengthened medical access to patient data, but disrupted nursing work as there were insufficient computers available to allow nurses to enter vital signs, leading to a delay between monitoring and recording activity. In addition, shifting wider contextual factors affected the functioning of early warning systems in all sites. For example, Site 4 was involved in wider organisational restructuring which affected governance approval processes for a new escalation policy, and a critical incident in Site 1 led to a series of hospital-wide mandated changes aligned with the PUMA Standard, following recommendations in a Care Quality Commission (CQC) report, with a level of organisational sponsorship not apparent in the other sites (for a summary of changes in the study sites, see Table 6).

Table 6 Contextual factors impacting on paediatric early warning systems during implementation

The paediatric early warning system in each site was assessed in the post-implementation period and demonstrated improvements in most components of the system (Fig. 4). Table 7 summarises the positive (+) and negative (−) changes to the paediatric early warning systems in each site.

Fig. 4

Strengths and weaknesses of paediatric early warning systems post-implementation

Table 7 Positive and negative changes to paediatric early warning system - post implementation

Quantitative evaluation

Data were collected on eight outcome measures for 42 months. Modelling the impact of the PUMA Programme on quantitative outcomes was challenging. Although mortality, cardiac arrest, respiratory arrest, and unplanned admission to PICU/HDU have been commonly used in combination to assess paediatric early warning systems, in practice they occur relatively infrequently, and this was apparent in the smaller general hospitals with fewer patients, which are rarely included in this type of study. Figure 5 shows the fitted trend lines for pre-intervention, implementation, and post-intervention rates of adverse events, per 1000 patient bed-days, with estimates and p-values shown in Tables 8, 9, 10 and 11. Overall, they show a mixed picture across the four sites, with wide confidence intervals illustrating the challenges in assessing trends in outcomes with low event rates. For Site 2 the numbers were so low that it was not possible to model all three periods, and a two-stage model was therefore used, with the implementation and post-intervention periods combined.

Fig. 5

Scatter plots for primary outcome in each of the four sites with fitted line from segmented linear regression

Table 8 Estimates from segmented linear regression for adverse events in Site 1
Table 9 Estimates from segmented linear regression for adverse events in Site 2
Table 10 Estimates from segmented linear regression for adverse events in Site 3
Table 11 Estimates from segmented linear regression for adverse events in Site 4
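The segmented linear regression behind Fig. 5 and Tables 8 to 11 can be illustrated with a minimal sketch. This is not the study's analysis code: the phase boundaries (months 18 and 30), the function names, and the synthetic coefficients are assumptions made for illustration only, and the parametrisation shown (a level change and a slope change at the start of each phase, each relative to the preceding phase) is one common way of specifying an interrupted time series model. The sketch uses only the Python standard library and fits by ordinary least squares.

```python
from typing import List, Sequence


def rate_per_1000_bed_days(events: int, bed_days: int) -> float:
    """Adverse events per 1000 patient bed-days for one month."""
    return 1000.0 * events / bed_days


def design_row(month: int, impl_start: int, post_start: int) -> List[float]:
    """One row of a three-phase segmented regression design matrix.

    Columns: intercept, underlying time trend, level change and slope change
    at the start of implementation, level change and slope change at the
    start of post-intervention. Each slope change is relative to the phase
    immediately before it.
    """
    in_impl = 1.0 if month >= impl_start else 0.0
    in_post = 1.0 if month >= post_start else 0.0
    ramp_impl = float(max(0, month - impl_start))
    ramp_post = float(max(0, month - post_start))
    return [1.0, float(month), in_impl, ramp_impl, in_post, ramp_post]


def fit_ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical phase boundaries within a 42-month series (assumed, not from the study).
IMPL_START, POST_START, N_MONTHS = 18, 30, 42

# Synthetic, noise-free monthly rates generated from made-up coefficients.
true_beta = [5.0, 0.10, -0.50, -0.20, 0.30, -0.10]
X = [design_row(m, IMPL_START, POST_START) for m in range(N_MONTHS)]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]

fitted = fit_ols(X, y)  # recovers true_beta up to rounding error
```

Under this coding, the last coefficient is the change in slope in the post-intervention phase relative to the implementation phase, which is the kind of contrast reported for Sites 1 and 3. A two-stage model of the kind described for Site 2 could be obtained by dropping the final two columns, so that implementation and post-intervention share a single level and slope term. With real data containing months of zero events, confidence intervals would be wide, as the tables show.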

ITS and qualitative findings were triangulated for each site. Site 1 implemented multiple organisational level changes aligned with the PUMA Standard, mandated in response to a critical Care Quality Commission (CQC) report, which were associated with significant improvements in adverse event trends in the post-intervention phase relative to the implementation phase (β = −0.09 (95% CI: −0.15, −0.05); p < 0.001) [Fig. 5]. Several other quantitative findings appeared to relate to qualitative data. Site 4 implemented several organisational level system changes at an early stage in the study, which coincided with a decreased slope in adverse event rates during the implementation phase relative to the pre-intervention trend (β = −0.64 (95% CI: −1.15, −0.13); p = 0.02). Site 2 introduced a safety huddle and electronic recording, which strengthened some aspects of the local system and weakened others. There was no significant ‘interruption’ to the adverse event rate after implementing the PUMA Programme (β = 0.02 (95% CI: −0.30, 0.33); p = 0.98), which continued to gently decrease in line with pre-intervention trends. Very early in the pre-intervention period, a new ward manager implemented a strategy to reduce HDU transfers, which may have contributed to declining event rates over the study period. Site 3 made several improvements in certain wards, but no organisational level changes. There was a significant downward slope in the adverse event rate trends observed in the post-intervention phase relative to the implementation period (β = −0.27 (95% CI: −0.47, −0.07); p = 0.01), but the overall event rate did not decrease. This mixed pattern of findings may have been clearer if we had continued to collect data over a longer period.

Implementation process evaluation

Improvement team members embraced the OUTCOME principles underpinning the PUMA Programme to different degrees, but all considered the system assessment process to have value. Discussing results and agreeing how to rank their system against the PUMA Standard was regarded as important. They also proposed that the system assessment made the process of improvement easier, as it allowed them to engage staff groups from an early stage, providing on-the-ground expertise and evidence of areas for improvement:

It wasn’t just [site leads] plucking out what did we want to take forward, this is what everybody on the team has said needs improving.

Yet while teams reported strong ownership of the improvement process, they required encouragement to develop local approaches to system problems rather than reaching for off-the-shelf solutions. Teams did not have specialist quality improvement skills nor dedicated time to undertake improvement work, which impacted on progress and team stability. Implementation was challenging in all sites and highlighted the need for organisational sponsorship for improvement programmes.

Discussion

The PUMA Programme was developed to facilitate local improvements to paediatric early warning systems oriented to a common standard. Cumulative research highlights the need for a systems approach to improve the detection and response to deterioration in hospitalised patients. Hitherto no frameworks have existed to support system level improvement.

Our findings highlight the impacts of the PUMA Programme on clinical outcomes when system level change is organisationally mandated (Site 1) but also the challenges of locally led improvement in the absence of organisational sponsorship (Sites 2, 3 and 4). While the PUMA Programme was designed to support context-appropriate approaches to improving paediatric early warning systems, the findings point to several areas where common standards have value. First, clinical expertise is a component of any paediatric early warning system, and staff turnover has potentially disruptive effects. Sites 2 and 4 identified the need for education and training in their improvement initiatives, yet it was only in Site 1, where training was organisationally mandated, that these initiatives became embedded in practice and staff were released from clinical work to attend. Professional development should be a critical component of all systems, and mandated multidisciplinary training should be considered. Second, in several sites a lack of access to appropriate equipment was identified as impacting negatively on the system; this ranged from appropriate monitoring equipment to access to computers for data entry. A process to ensure the correct equipment is available and functioning is a prerequisite of any paediatric early warning system irrespective of the singular features of local context. Third, all sites recognised the importance of involving parents in detecting and acting on deterioration but had limited success in implementing changes to the system. Parental involvement in the detection of deterioration is difficult to address outside of wider strategies to facilitate parental involvement in children’s care. Fourth, by observing over time, the study highlighted the dynamic qualities of paediatric early warning systems, the impacts of internal and external contextual changes, and the distributed costs and benefits of change for participants. This points to the need for regular assessment of system functioning as part of a continuous improvement culture.

To our knowledge no studies have robustly assessed the impact of interventions to improve paediatric early warning systems. While a large randomised controlled trial of a specific score has recently been reported, this focused on patient outcomes rather than wider system change [24]. Most other studies have examined the feasibility or validation of scores, rather than systems, and have been heterogeneous in their design and reproducibility [2]. Our results are in keeping with other cohort studies [25] which demonstrated improvements over time regardless of interventions. The robust mechanism with which we looked at a variety of outcomes also meant that some of the gains seen in single outcome measure studies were not realised [26].

Determining the impact of the PUMA Programme using quantitative measures of in-patient deterioration was challenging. First, implementation was a process rather than a discrete event, creating challenges for the ITS. The ‘implementation period’ was conceptualised as 12 months for analytic purposes, but in practice this likely varied between sites and was less well defined than in some intervention studies. Second, the commissioning brief related to interventions to reduce mortality and so our primary outcome (‘adverse events’) was a composite measure that included mortality and other related clinical metrics. The decision to use a composite metric for the primary outcome mirrors other single-site effectiveness studies of paediatric early warning system interventions [24]. It was largely a pragmatic decision, reflecting the low event rates of individual clinical outcomes such as mortality and arrests in hospitalised children. Even using this composite outcome, incorporating unplanned HDU and PICU transfers, we observed several months with zero events in our smallest DGH. Low event rates for key outcome metrics in DGHs point to the difficulty in assessing changes over time in smaller hospitals, and are a key reason paediatric early warning systems research is dominated by studies conducted in large specialist centres.

Mortality is significantly lower in children than in adult in-patient settings [27], there is an ongoing decline in child mortality [25], and even in-patient deterioration is a relatively infrequent occurrence [24]. Analytic approaches to rare event modelling, such as Bayesian Belief Networks, could be adapted from other fields to support the focus on preventing these events; however, a clear assessment of their potential is required. The literature on rare events requires clear causal pathways and the complexity of child deterioration and death may not be amenable to such approaches. New methodologies are required.

Including HDU and PICU transfers as markers of in-patient deterioration is common in the literature, but not without difficulty. As we demonstrated in the qualitative work, use varies in response to other system pressures or changes in clinical practices of senior staff. Our findings lend weight to debates about the appropriateness of downstream individual level outcome measures in this field and point to the need to reach agreement on up-stream indicators of paediatric early warning system performance. These may include inter alia measures of process, culture, parental involvement, and staff situational awareness. While these are worthy of future study, at the inception of this study, adequate up-stream indicators of paediatric early warning system performance did not exist. The PUMA Standard offers a valuable framework for progressing the development of alternative metrics, through consensus methods, such as a Delphi Study.

Conclusions

System level change to improve paediatric early warning systems can bring about positive impacts on clinical outcomes. However, in paediatric practice, where the patient population is smaller and clinical outcome event rates are low, alternative outcome measures are required to support research and quality improvement beyond large specialist centres, and methodological work on rare events is indicated.

Paediatric early warning systems are dynamic, and their functioning is influenced by wider contextual changes. The PUMA Programme offers structures to support regular assessment, learning and local improvement.

The PUMA Programme offers a new approach to improving the detection and response to deterioration in the in-patient paediatric context by focusing on the whole system. With appropriate organisational support, the PUMA Programme has value as a framework for continuous improvement of paediatric early warning systems across diverse national and international contexts, including developing healthcare systems. The OUTCOME approach to improvement has the potential to be used more widely.