1 Introduction

Patient physiology can change unpredictably and dynamically over the course of a hospitalization. Every patient who is admitted to the hospital is at risk of experiencing acute physiological deterioration (APD), defined as acute and persistent abnormality in one or multiple physiological measures, potentially resulting in cardiac arrest, unscheduled intensive care unit (ICU) admission, or death. Major challenges are that APD is difficult to identify, response timing is critical, current measures for evaluating patient condition are not consistent, and rules for involvement of critical care expertise are ignoring patient characteristics.

Patient rescue is a complex problem. The acute care of patients experiencing APD relies on early recognition and rapid response to stabilize the patient’s condition. Medical emergency/rapid response systems are composed of periodic monitoring of physiologic status with critical thresholds that trigger the involvement of early critical care expertise during APD with the goal of preserving health and preventing undesired health outcomes. Early Warning Systems (EWSs) are quantitative scoring systems that assign values to selected physiological measures to detect abnormalities and inform clinical decision making in cases of APD [1, 2]. However, there is little standardization with respect to the best use of these scores and current critical thresholds are based primarily on subjective clinical judgement.

Physiological deterioration is characterized by notable changes in vital signs. Routinely collected vital signs can inform recognition of these changes. Increasing complexity of care delivery processes and the fragmented nature of health care delivery are some of the main challenges that lead to lack of or delayed recognition and response to APD. Failure to (or delay in) recognize and respond to APD remains a challenge for health care systems. The National Patient Safety Agency reports that 11 % of serious hospital incidents arise through failure to act on deterioration, with failure to recognize the importance of physiological deterioration as one of the primary reasons [3].

The uncertainty in physiological deterioration and recovery processes during hospitalization can be modeled as a sequential decision problem under uncertainty. EWS-based dynamic decision models can inform acute-care practice at the point of care delivered by medical emergency teams with critical care expertise, such as a Rapid Response Team (RRT) [4, 5]. Clinical guidelines classify physiological condition using EWSs and suggest RRT activation for scores above critical thresholds [1]. Predictive performance of EWSs for undesired incidents during hospitalization is well studied; however, several gaps remain in the literature [611]. The individualized use of EWSs to identify personalized resuscitation interventions, such as RRT activation, is a natural extension of this literature with a focus on the impact of patient-specific risk.

While RRTs have been widely implemented in health care systems, evidence to support their effectiveness is insufficient [12]. Several studies focused primarily on patient outcomes to assess the RRT effectiveness; however, the findings are mixed [1315]. A major multi-center controlled trial (the Medical Early Response Intervention and Therapy (MERIT)) study investigated whether RRT implementation reduces the incidence of cardiac arrests, unplanned admissions to intensive ca re units (ICU), and deaths [13]. The study was unable to provide sufficient evidence to demonstrate effectiveness of RRTs. The results of the study showed that RRT implementation was not associated with a decrease in cardiac arrests, ICU admissions, or deaths. Chan et al. (2010) conducted a meta-analysis of 18 studies (1950 through 2008) to assess the effect of RRTs on reducing cardiopulmonary arrest and hospital mortality rates, and reported mixed results [14]. While some studies showed a reduction in cardiopulmonary arrest rates in adults outside the ICU, the reductions were not associated with lower hospital mortality rates. Jones et al. (2009) reviewed several studies to assess whether RRT dose (i.e., RRT calls per 1,000 patient admissions or discharges) impacts patient outcomes [15]. The findings suggested a greater effect in patient outcomes (e.g., reduction in the rate of cardiac arrests) with a greater dose of care from RRT. Leach and Mayo (2013) conducted a qualitative analysis of RRT effectiveness and identified RRT management challenges including inconsistency of team members from day to day, limited opportunity for RRT members to develop team skills, and greater need for team training compared to clinical teams that work together regularly under less time pressure to perform [12]. In conclusion, measurement of RRT effectiveness and management remains a challenge, and quantitative metrics focusing on earlier recognition of physiological decline and better utilization of limited RRT resources may support capturing the impact of RRT implementation in clinical practice.

An effective EWS-based decision model needs to address the impact of both patient and provider heterogeneity to accurately capture the dynamics of APD and identify optimal RRT-activation policies. APD impacts the heterogeneous patient population with different reasons for admission and different clinical trajectories. Existing EWSs ignore this heterogeneity. For example, existing RRT activation criteria are the same for all patients and do not differentiate the intervention based on patient characteristics. Further, neglecting patient heterogeneity may translate to unnecessary activation of RRT, resulting in suboptimal use of limited personnel resources, and inaccurate estimation of resource needs.

In addition to patient heterogeneity, acute care is delivered by diverse care provider teams including physicians, critical care nurses, and respiratory therapists. The team composition may differ by facility and may be subject to changes over time within the same facility. Heterogeneity in provider teams may result in clinical practice variation, impacting the evaluation of patient resource needs. This variation may impact RRT activation decisions. Understanding the dynamics of APD associated with heterogeneity at the patient and provider levels and the stochastic nature of the health system is necessary to foster personalized medical decision making.

In this article, we develop a semi-Markov decision process (SMDP) model for the management of a patient’s physiological condition. The model allows for stochastically changing health states (the main source of uncertainty) while determining patient subpopulation-specific National Early Warning Score (NEWS)-based RRT activation thresholds. The objective is to minimize the total time associated with patient deterioration and stabilization, including the times associated with clinical distress from the providers’ (decision maker’s) perspective, nursing activities, RRT activities, and stabilization. Further, to incorporate the impact of provider heterogeneity we develop a modified SMDP model that examines the worst-case scenarios in terms of nursing and RRT time (by weighting these times) and using the maximum time (instead of actual times) of nurse and RRT resources to associate a relative value between the two types of resources. The goal of modifying the nursing and RRT times and the relative values between nursing and RRT effort is to represent provider heterogeneity in the response to APD. Using clustering methods, we identify categories of providers with distinct RRT-activation preferences and distinct valuation of resource needs of patients.

1.1 Main contributions of this article

The main contributions of this article are the following: (i) formulation and implementation of a stochastic decision model that explicitly takes into account both patient and provider heterogenity; and (ii) formulation and evaluation of optimal personalized RRT-activation policies and total expected stabilization-related time. We have analyzed large datasets containing over 50,000 patient records to identify clinically relevant patient subpopulations and populate the SMDP model. Considering the heterogeneity and uncertainty in acute-care systems, including patients and providers, we hypothesize that the differences in provider valuation of RRT resource requirements and patient nursing needs influence the RRT activation decision making. However, we hypothesize that clusters of providers may be identified which have common RRT strategies.

An outline of the remainder paper is as follows. First, an overview of the relevant literature is presented. Using findings from our previous work, we present the baseline SMDP model illustrated on a retrospective case study including 55,385 patients hospitalized at a single facility (self-identifying name of the facility omitted) from January 2011 to December 2012. Next, provider heterogeneity is modeled in the modified SMDP model. We discuss the findings and their implications.

1.2 Previous literature on SMDP models in health care

A Markov Decision Process (MDP) model provides a framework for representing multi-stage decision problems in the presence of uncertainty. Markovian models are commonly used in health care to support screening, diagnosis, and treatment decisions for chronic conditions, patient flow, and hospital operations optimization [1619]. For continuous-time MDPs, the Markov property ensures that the time spent in a state before a transition occurs, i.e., the sojourn time, follows an exponential distribution. However, in cases of physiological deterioration and recovery, the sojourn time may not be exponentially distributed. A semi-Markov decision process (SMDP) model is a generalization of an MDP in which the state transitions follow a Markov chain; however, the sojourn time is a random variable following an arbitrary distribution [2022]. SMDP models have been applied in health care since the early 1970s to model the following: (i) resource planning and personnel scheduling within hospitals [23, 24]; (ii) modeling patient condition during hospitalization such as recovery processes of acute leukemia patients [25], coronary patients [26], and end-stage renal disease patients [27]; and (iii) patient flow in a maternity service unit [28]. Unfortunately, this research ignores patient and provider heterogeneity as well as the impact of the sojourn time and its role in clinical decision making. Further, (self-identifying reference omitted) used a National Early Warning Score (NEWS)-based continuous-time SMDP model to develop subpopulation-specific resuscitation thresholds [29]. However, their study did not consider provider heterogeneity, did not assign weights to time-based metrics and their relative values, did not differentiate between different types of stabilization, and did not use clustering methods. Our study bridges this gap in the literature to capture patient heterogeneity by identifying subpopulations based on patient characteristics, and using optimization and clustering methods to incorporate provider heterogeneity.

2 Methods

2.1 Identifying patient subpopulations

Subpopulations within a heterogeneous patient population are identified to enable patient-centered decision making. Our methodology for selecting patient characteristics to identify statistically significantly different subpopulations is described in (self-identifying reference omitted), who identified two patient characteristics for classifying subpopulations: risk of deterioration (ROD) during hospitalization (i.e., low, moderate or high ROD based on the Braden skin score, which is a risk assessment tool commonly used at admission); and admission type (i.e., medical or surgical) [29]. Medical admission refers to the patients admitted for symptoms of discomfort or illness and admitted for a reason other than surgery. Surgical patients are admitted for a surgical procedure. In this article, we use six subpopulations defined by ROD and admission type:

  • subpopulation (A) – medical patients with low ROD;

  • subpopulation (B) – surgical patients with low ROD;

  • subpopulation (C) – medical patients with moderate ROD;

  • subpopulation (D) – surgical patients with moderate ROD;

  • subpopulation (E) – medical patients with high ROD; and

  • subpopulation (F) – surgical patients with high ROD.

Figure 1 summarizes our methodological approach.

Fig. 1
figure 1

Methodological approach for stochastic acute-care decision optimization considering patient and provider heterogeneity. The squares on the left represent each modeling step, and boxes with dashed borders on the right represent corresponding methods

2.2 Baseline SMDP model

The baseline SMDP model, defined by the set of elements (S, T, A, P, r, F), represents the progress of a patient’s physiological condition as a continuous-time stochastic process: a finite set of health states S, an infinite time horizon T, a finite set of actions A, state transition probability P (defining probablistic movement between states conditional on the current state and action), stabilization and resource time r(s, d(s)) associated with performing the action d(s) in state s, and the sojourn time cumulative distribution function, F(t|s, d(s)) for a patient in state s and when action d(s) is taken. For a given stationary policy π in the set of Π of feasible policies, the minimum total expected discounted stabilization and resource time for health state s is given by the value function

$$ {v}_{\alpha, {\pi}^{*}}(s)\equiv { \min}_{\pi \in \varPi \kern0.5em }\left\{{v}_{\alpha, \pi }(s)\right\}, $$

where

$$ {v}_{\alpha, \pi }(s)=r\left(s,d(s)\right)+{\displaystyle {\sum}_{j\in S}P\left(j|s,d(s)\right){\displaystyle {\int}_0^{\infty }{e}^{-\alpha t}F\left(dt|s,\ d(s)\right)\ {v}_{\alpha, \pi }(j)}}, $$

And π * denotes the optimal stationary policy. We use continuous discounting denoted by e − at with α = 0.03 corresponding to a commonly used discrete discounting factor [30].

2.2.1 Decision epochs

The stochastic process starts at the beginning of a hospitalization episode. This can be a general ward admission, or a return to the general ward from a higher-level care unit. A decision epoch is defined as a point in time corresponding to a provider team assessment during regular hospital rounding that identifies a change in the patient’s health condition. Thus, a new decision epoch in the continuous-time stochastic model occurs only when the patient’s health condition differs from the immediately preceding evaluation.

2.2.2 Health states

Health states focus on features of the patient’s condition. The set of states S is the same for all six subpopulations. The core continuous-time stochastic process, {X t t ≥ 0} represents the patient’s current health state measured by the NEWS [1]. Model states X t  = s ∈ S = {1, 2, 3, 4f, 4s, 5, 6} are ordered such that a lower value of NEWS is associated with better health (Table 1). The classification of NEWS values and their clinical interpretation in Table 1 are derived from clinical guidelines [1].

Table 1 Health states in the baseline SMDP and the corresponding clinical interpretation [1]

The states {4f,4s,6} are model extensions of clinical guidelines to capture stabilization and discharge conditions. State 6 includes all outcomes that result in the patient leaving the general ward, including transfer to a higher-level care, discharge alive, admission to hospice, and death. State 5 represents a patient who has not been observed with NEWS >0 since the start of the current hospitalization episode. State 4f represents returning to NEWS =0 following APD. Based on expert medical opinion, we assume that maintaining NEWS =0 for at least 1 h following APD represents stabilization, i.e., remaining in state 4f for at least 1 h results in a movement from 4f to the stabilized state 4s.

2.2.3 Actions

Care provider actions are allowed only at decision epochs (i.e., are prompted by changes in patient condition) and include waiting, i.e., d(s) = 1, postponing the RRT activation until the next clinical assessment, or activating the RRT immediately, i.e., d(s) = 2. Waiting refers to the case when the bedside provider decides to provide necessary acute care for the patient without using critical care (RRT) resources for guidance. Activating RRT refers to the case when the bedside provider decides to obtain critical care guidance by calling the RRT immediately. Waiting is allowed in all states, whereas activating the RRT is allowed only in states {1,2,3} , which are associated with NEWS > 0. The state-dependent action space is chosen for the following reasons: (i) the historical data did not show any RRT activation in states {4f,4s,5}; (ii) the stabilized condition of patients with NEWS =0 typically does not need bedside critical care evaluation; and (iii) activating the RRT in discharged state 6 is not possible because RRT provides critical care only for the patients in the general ward.

2.2.4 Transition probabilities

The state transition probabilities govern changes in health state as a function of the subpopulation. Figure 2 illustrates the possible state transitions.

Fig. 2
figure 2

State transition diagram showing possible movements between model states

In Fig. 2, the patient’s health states are classified into three groups: states {4f,4s,5} corresponding to NEWS =0, states {1,2,3} with NEWS >0 and an absorbing state 6. Dashed arrows are bidirectional state movements, and solid arrows represent unidirectional state movements. For example, given the current state is 1, the next state is in the set {2,3,4f,6}. Because 6 is an absorbing state, it is not possible to leave state 6, so the arrow 1 → 6 is solid, whereas the arrow from state 1 to a state in {2,3,4f} is dashed, i.e., state movements 1 → {2,3,4f} → 1 are allowed.

2.2.5 Sojourn times

We assume that the patient’s physiologic condition is observed retrospectively at decision epochs. Owing to changes in the patient’s physiological condition that are observed via irregular monitoring of patients in the general ward, the time between decision epochs is a random variable. Sojourn times in the SMDP model are estimated from empirical distributions conditional on subpopulation, current state, action, and next state, as derived from electronic medical records (EMRs) (Appendix 1, Tables 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 and 17).

2.2.6 Stabilization and resource time parameters

The stabilization and resource time accumulated between two decision epochs, r(s,d(s)), includes the following components: γ nurse(s); γ RRT(s), Time to Stabilization (TTS) and Failure to Rescue (FTR). The nurse time, γ nurse(s), refers to the average additional time in hours to provide care for a patient in health states as derived from the nursing records provided by (self-identifying name of the facility omitted) compared with average nursing needs in state 5. The term γ RRT(s) is the average RRT resource time in hours from activation until departure from the patient’s bedside, given the patient was in state s at the time of RRT activation. We assume that no RRT and nurse times are accumulated after discharge for the purposes of modeling APD in the general ward.

TTS refers to time from the start of APD until the patient is stabilized. Start of APD is defined by expert medical opinion as the point in time when the deviation from the normal range for any physiological measure included in NEWS exceeds 30 min. Therefore, TTS includes the following:

  • recovery (i.e., the current state is s ∈ {1, 2, 3}, and next state is j = {4f});

  • successful stabilization (i.e., movement out of the current state s = {4f} occurs after 1 h so that next state is j = {4s}); and

  • unsuccessful stabilization (i.e., movement out of the current state s = {4f} occurs in less than 1 h, and next state is j ∈ {1, 2, 3, 6}).

FTR refers to the case when NEWS remains ≥ 7 (i.e., the patient is in distress or model state 1) for at least 1 h without RRT activation. Current NEWS guidelines recommend that: (i) a NEWS value of 5–6 should trigger a medium-level clinical alert, i.e., an urgent clinical review; (ii) a NEWS value of 7 or more should trigger a high-level clinical alert, e.g., an RRT activation, and (iii) an extreme weight in any one physiological parameter should trigger a medium-level alert [1].

The objective of the baseline SMDP model is to minimize the sum of the stabilization time (TTS), time in a critical clinical condition (FTR), and resource use (including nurse and RRT time). Final expressions of the stabilization and resource time functions are presented below. The function, r(s,d(s)), is given by:

$$ r\left(s,d(s)\right)=0\kern1.25em \mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{5,\ 4\mathrm{s}\right\},\mathrm{d}\left(\mathrm{s}\right)=\left\{1\right\}, $$
(1a)
$$ \begin{array}{l}r\left(s,d(s)\right) = {\gamma}_{\mathrm{nurse}}(s)+I\left(d(s)=2\right)\ {\gamma}_{\mathrm{RRT}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1 \cdot P\left(4f|s,d(s)\right)dt}\right]}\ {\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{2,\ 3\right\},\ \mathrm{j}=\left\{4\mathrm{f}\right\},\kern0.5em \mathrm{d}\left(\mathrm{s}\right)\in \left\{1,\ 2\right\},\hfill \end{array} $$
(1b)
$$ \begin{array}{l}r\left(s,d(s)\right) = {\gamma}_{\mathrm{nurse}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1 \cdot P\left(4f|s,d(s)\right)dt}\right]}{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}+{\displaystyle {\int}_1^{\infty }{\displaystyle {\sum}_{j\in S}}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1 \cdot P\left(j|s,d(s)\right)dt}\right]{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du}\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{1\right\},\ \mathrm{j}\in \mathrm{S},\ \mathrm{d}\left(\mathrm{s}\right)=\left\{1\right\},\hfill \end{array} $$
(1c)
$$ \begin{array}{l}r\left(s,d(s)\right) = {\gamma}_{\mathrm{nurse}}(s) + {\gamma}_{\mathrm{RRT}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^U{e}^{-\alpha t}\cdot 1\cdot P\left(4f|s,d(s)\right)dt}\right]{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du}\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{1\right\},\ \mathrm{j}=\left\{4\mathrm{f}\right\},\kern0.5em \mathrm{d}\left(\mathrm{s}\right)=\left\{2\right\},\hfill \end{array} $$
(1d)
$$ \begin{array}{l}r\left(s,d(s)\right) = {\gamma}_{\mathrm{nurse}}(s)\hfill \\ {}+{\displaystyle {\int}_0^1{\displaystyle {\sum}_{j\in \left\{1,2,3,6\right\}}P\left(j|4f,\ d(4f),Z<1\right)}}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot dt}\right]{\lambda}_Z{e}^{-{\lambda}_Zu}du\hfill \\ {} + {e}^{-{\lambda}_Z}{\displaystyle {\int}_0^1{e}^{-\alpha t}\cdot 1\cdot dt}\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{4\mathrm{f}\right\},\ \mathrm{d}\left(\mathrm{s}\right)\in \left\{1\right\}\hfill \end{array} $$
(1e)

where: (i) I(d(s) = 2) is the indicator function for the condition in the outermost parentheses so that I(d(s) = 2) = 1 if the condition d (s) = 2 is true, and I(d(s) = 2)=0 if d(s) ≠ 2; (ii) the term λ s,d(s) is the state- and action-dependent rate parameter for the sojourn time distribution in state s until the next decision epoch; (iii) λ s,d(s) = ∑ j ∈ S,j ≠ s λ s,d(s),j where λ s,d(s),j is the rate corresponding to the exponential time spent in s before a transition to j would occur, given action d(s); and (iv) Z = min {H 4f,d(4f),j  : j = {1, 2, 3, 6}} is exponential with rate parameter λ Z  = ∑ j ∈ (1,2,3,6) λ 4f,d(4f),j where H 4f,d(4f),j represents the potential time to make the transition 4f → j. Derivation of the expressions of the stabilization and resource time are discussed in detail in Appendices 2, 3 and 4.

2.3 Modified SMDP model

Most of the literature on MDPs assumes that the model parameters such as state transition probabilities and costs are known to the decision maker [31]. However, in practice, the model parameters must be estimated from data. For Markovian models, deriving model inputs from data may cause errors in estimation of model parameters [32]. Several studies address uncertainty in state transition probabilities [31, 33, 34]. Fewer studies focus on uncertainty in cost [32, 35]. In our study, because time is critical in the response to APD, time is considered as the “cost”. The acute-care provider team may consist of different providers and the team composition dynamically changes over time. In addition, providers may have different degrees of clinical experience and there may be variations in response to APD from one team to the next. It is particularly important to account for variation in the providers’ valuation of stabilization and resource time associated with health states and actions. The modified SMDP model incorporates the following: (i) the classification of care providers into clusters having similar characteristics; and (ii) the determination of the impact of provider heterogeneity on optimal RRT-activation policy as well as stabilization and resource time.

2.3.1 Provider heterogeneity parameters

Provider heterogeniety is captured by identifying groups of providers with similar time perception characteristics, defined as clusters. We introduce two state-dependent model parameters: ρ(s) and ξ(s). In the modified SMDP model, providers assign weights to average and maximum nurse and RRT times, with the maximum nurse and RRT times representing the worst-case scenario. The parameter ρ(s) takes values in [0, 1] and the stabilization and resource time function r′(s, d(s)), given state s and action d(s), is:

$$ {r}^{\prime}\left(s,d(s)\right)=\rho (s)r\left(s,d(s)\right) + \left(1-\rho (s)\right)\overline{r}\left(s,d(s)\right). $$

The term \( \overline{r}\left(s,d(s)\right) \) is the “worst-case” stabilization and resource-time function in which the average additional nurse time and average RRT resource time are replaced with the corresponding maximum values for each model state as derived from EMRs. The term r(s, d(s)) is as defined in Eq. (1ae). The final expressions for \( \overline{r}\left(s,d(s)\right) \) are presented below . The function \( \overline{r}\left(s,d(s)\right) \) is given by:

$$ \overline{r}\left(s,d(s)\right)={\overline{\gamma}}_{\mathrm{nurse}}(s)\kern1.5em \mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{5,\ 4\mathrm{s}\right\},\mathrm{d}\left(\mathrm{s}\right)=\left\{1\right\}, $$
$$ \begin{array}{l}\overline{r}\left(s,d(s)\right)={\overline{\gamma}}_{\mathrm{nurse}}(s)+I\left(d(s)=2\right){\overline{\gamma}}_{\mathrm{RRT}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot P\left(4f|s,d(s)\right)dt}\right]}{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{2,\ 3\right\},\ \mathrm{j}=\left\{4\mathrm{f}\right\},\ \mathrm{d}\left(\mathrm{s}\right)\in \left\{1,\ 2\right\},\hfill \end{array} $$
$$ \begin{array}{l}\overline{r}\left(s,d(s)\right)={\overline{\gamma}}_{\mathrm{nurse}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot P\left(4f|s,d(s)\right)dt}\right]}{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}+{\displaystyle {\int}_1^{\infty }{\displaystyle {\sum}_{j\in S}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot P\left(j|s,d(s)\right)dt}\right]}}{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{1\right\},\ \mathrm{j}\in \mathrm{S},\ \mathrm{d}\left(\mathrm{s}\right)=\left\{1\right\},\hfill \end{array} $$
$$ \begin{array}{l}\overline{r}\left(s,d(s)\right)={\overline{\gamma}}_{\mathrm{nurse}}(s)+{\overline{\gamma}}_{\mathrm{RRT}}(s)\hfill \\ {}+{\displaystyle {\int}_0^{\infty}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot P\left(4f|s,d(s)\right)dt}\right]}{\lambda}_{s,d(s)}{e}^{-{\lambda}_{s,d(s)}u}du\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{1\right\},\ \mathrm{j}=\left\{4\mathrm{f}\right\},\ \mathrm{d}\left(\mathrm{s}\right)=\left\{2\right\},\hfill \end{array} $$
$$ \begin{array}{l}\overline{r}\left(s,d(s)\right)={\overline{\gamma}}_{\mathrm{nurse}}(s)\hfill \\ {}+{\displaystyle {\int}_0^1{\displaystyle {\sum}_{j\in \left\{1,2,3,6\right\}}P\left(j|4f,\ d(4f),Z<1\right)}\left[{\displaystyle {\int}_0^u{e}^{-\alpha t}\cdot 1\cdot d}t\right]{\lambda}_Z{e}^{-{\lambda}_Zu}du}\hfill \\ {} + {e}^{-{\lambda}_Z}{\displaystyle {\int}_0^1{e}^{-\alpha t}\cdot 1\cdot dt}\hfill \\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ \mathrm{s}\in \left\{4\mathrm{f}\right\},\ \mathrm{d}\left(\mathrm{s}\right)\in \left\{1\right\}.\hfill \end{array} $$

where \( {\overline{\gamma}}_{nurse}(s)= \max \left\{{\gamma}_{nurse}(s):s\in S\right\} \) and \( {\overline{\gamma}}_{RRT}(s)= \max \left\{{\gamma}_{RRT}(s):s\in S\right\} \) as observed in the EMRs. We assume that a care provider who is more conservative regarding nurse and RRT resource needs in state s assigns a higher weight to \( \overline{r}\left(s,d(s)\right) \), i.e., a lower value to p(s). For states corresponding to NEWS = 0 or discharge, we assume that providers do not assign a positive weight to \( \overline{r}\left(s,d(s)\right) \), i.e., p(s) = 1 for s ∈ {4f, 4s, 5, 6}. This is supported by the assumption that for a stable, stabilized, or discharged patient, nurse- and RRT-related average time parameters derived from EMRs are considered a good approximation of the true patient needs according to clinical expert opinion. Derivation of the expressions of \( \overline{r}\left(s,d(s)\right) \) are in Appendix 2, Eq. (2a2e) and (3b3f).

The term ξ(s) is a ratio representing the care provider’s perception of the workload associated with RRT time relative to nurse time for a patient in health states s ∈ S:

$$ \xi (s)=\frac{I\left(d(s)=2\right) \cdot {\mathrm{time}}_{\mathrm{RRT}}(s)}{\gamma_{\mathrm{nurse}}(s)} $$

The term timeRRT(s) represents the actual RRT time observed from activation until departure from the patient’s bedside given the patient was in health state s at the time of RRT activation. In other words, the higher ξ(s), the more the provider values RRT-related resource time compared with the additional nurse time for stabilizing a patient in state s. Because RRT calls are not allowed in health states {4f, 4s, 5, 6}, ξ(s), takes the value zero in these states.

2.3.2 Modeling care provider heterogeneity

We define a care provider profile as a set of weights {ρ(s), s ∈ S} representing the care provider’s resource time-based perception of resource requirements and ratios {ξ(s), s ∈ S} representing the care provider’s valuation of RRT evaluation and assessment time relative to nurse time for a patient in a given health state s. The inverse transformation method is used to simulate 100 care provider profiles. The simulation refers to sampling time-perception measures p(s) and ξ(s) from probabilistic distributions. The parameter p(s) is generated from a Uniform [a(s)], b(s) distribution by selecting a state-dependent range [a(s), b(s)], determined by clinical expert opinion, from which the provider is equally likely to choose weights p(s) for s ∈ {1,2,3}. Based on clinical expert opinion, the range [a(s), b(s)] is greater in states 2 and 3 compared with the range in state 1. The health states 2 and 3 represent slightly concerning and concerning conditions due to the elevated NEWS; however, in many cases a patient’s condition can move in either direction (i.e., the patient can recover on their own, stabilize, or further deteriorate). This uncertainty is reflected by a wider range for the weights. Further, a Weibull [θ(s), φ(s)] distribution with state-dependent shape parameter θ(s) and scale parameter φ(s) is used to generate ξ(s). The Weibull distribution is identified as a good fit for the simulated data from (self-identifying name of the facility omitted). The input parameters for the distributions are summarized in Table 2.

Table 2 Distribution of time-perception measures for the modified SMDP model

Once the input parameters are identified, we generated sets of p(s) and ξ(s) for s ∈ S {1,2,3} to represent care provider profiles which are divided into groups using cluster analysis.

2.3.3 Classifying care providers

Cluster analysis is a method for partitioning data into groups of objects such that objects in the same cluster are more similar to each other than to objects residing in other clusters [36]. In this study, the objects are the provider profiles defined as a set of random variables [p(s), ξ(s)] with s ∈ S. A cluster is a homogenous group of provider profiles with similar resource time perceptions. The clusters are identified with hierarchical clustering algorithms using the distance between objects to identify the clusters. A modified SMDP model is developed for each cluster using the corresponding cluster average values for p(s) and ξ(s). Three Baseline Scenarios and five Modified Scenarios were analyzed for the cluster analyses (Table 3).

Table 3 Clustering variables, algorithms for the baseline and modified scenarios

The Baseline Scenarios explore how the cluster structure and content differ depending on the clustering algorithm. Modified Scenarios were developed by changing Baseline Scenario 1 by: (i) using wider or narrower bounds on the Uniform distribution for p(s) for s ∈ {1,2,3} (Modified Scenarios 1–3); and (ii) using a subset of the clustering variables (Modified Scenarios 4–5).

2.3.4 Modified SMDP model formulation

The modified SMDP model is solved using the primal and dual Linear Programs (LP). As a numerical example, we present the results for medical patients with moderate ROD, which can be easily applied to other subpopulations. Let {β(s), sS} denote arbitrarily selected positive scalars satisfying ∑ s ∈ S β(s) = 1, which allows the interpretation of this set as a probability distribution on the state space S. The associated dual LP identifies the optimal policy for each state, while the primal LP calculates the total expected discounted stabilization and resource time for each state [20]. The dual LP formulation is

$$ \mathrm{Minimize}\ {\displaystyle {\sum}_{s\ \in\ S}{\displaystyle {\sum}_{d\ \in A}{r}^{\hbox{'}}\left(s,d\right)x\left(s,d\right)}} $$

subject to∑ d ∈ A x(j, d) − ∑ s ∈ S d ∈ A  [∫ 0 e − αt P(j|s, d)F(dt|s, d)] x(s, d) = β(s) for all j ∈ S x(s, d) ≥ 0  for s ∈ S, d ∈ A.

The term x(s, d) represents the total discounted joint probability that the patient is in state s and the provider chooses d under the distribution {β(j)} [20]. If {x(s, d) : s ∈ Sd ∈ A} is a feasible solution to the dual LP, then the stationary policy is given by

$$ P\left\{d(s)=a\right\}=\frac{x\left(s,a\right)}{{\displaystyle {\sum}_{d\in A}}x\left(s,d\right)}\;\mathrm{f}\mathrm{o}\mathrm{r}\;a\in \mathrm{A} $$

where P{d(s) = a} is the probability of selecting the action a in state s. The primal LP formulation is

$$ \mathrm{Maximize}{\displaystyle {\sum}_{s\in S}\beta (s){v}_{\alpha, \pi }(s)} $$

subject tov α,π (s) − ∑ j ∈ S [∫ 0 e − αt P(j|s, d)F(dt|s, d)] v α,π (j) ≤ r '(s, d) for dA and s ∈ S v α,π (s) unconstrained for s ∈ S.

The primal and dual LP formulations are solved for each Baseline and Modified Scenario and each cluster for the selected subpopulation. For a cluster containing one profile, the corresponding time perception measures, p(s) and, ξ(s) are used in the LP models. If a cluster contains multiple profiles, we compute cluster averages of p(s) and ξ(s), and solve the LP models using these average values.

3 Results

3.1 Case study setting

This is a retrospective study using patient-level data extracted from the EMRs provided by (self-identifying name of the facility omitted). The study cohort includes 55,385 adult general ward patients hospitalized at the facility from January 2011 to December 2012. The inclusion criteria are age at admission (≥18 years) and care location (only data collected during a stay in the general ward are used).

3.2 Baseline SMDP model results

For each subpopulation, the optimal total stabilization-related time and the optimal RRT-activation policy were computed using the Policy Iteration Algorithm [20]. State transition probabilities and sojourn time distributions were derived from the EMRs using Maximum Likelihood Estimates (MLEs) [37]. State-dependent time parameters were derived from EMRs. Empirical analysis suggested that the exponential distribution is a good fit for the sojourn time distribution for all model states, except state 4f where the sojourn time is restricted to be in [0,1] by the definition of stabilization. Sojourn time distribution in state 4f is presented in the Appendices 2 and 3. Table 4 shows the optimal NEWS-based RRT-activation thresholds by subpopulation.

Table 4 Optimal policy by subpopulation where A: medical with low ROD; B: surgical with low ROD; C: medical with moderate ROD; D: surgical with moderate ROD; E: medical with high ROD; and F: surgical with high ROD

Optimal policies suggest RRT activation for NEWS values exceeding a critical threshold which differs by subpopulation. Activating RRT immediately is optimal for patients with a “slightly concerning” or worse health state (NEWS values above 0) for all subpopulations, except surgical patients with low ROD (denoted by B) for whom it is optimal to wait until the patient is in a “concerning” health state (i.e., threshold NEWS >4). This result suggests there are two critical patient populations: surgical patient with low ROD and all other patients, with different RRT activation criteria. This translates to a simple rule for activating RRT with two thresholds, NEWS >4 for subpopulation B, and NEWS >0 for the other considered subpopulations. These results imply that surgical patients with low ROD may be healthier and better able to recover from deterioration on their own. From a clinical perspective, surgical patients can be hospitalized for a non-urgent elective surgery whereas medical patients are admitted due to discomfort and possibly symptoms of acute or chronic conditions. Thus, surgical patients with low ROD may recover from a NEWS value below 4 without additional RRT intervention, whereas medical patients with low ROD benefit from an RRT intervention at a lower NEWS threshold. Further, if activating RRT is the optimal action for a given health state, then it is the optimal action for all worse health states, i.e., there is a state-dependent control-limit structure. This result categorizes RRT activation rules into two sets of policies, and simplifies clinical implementation.

For all subpopulations A–E, total expected stabilization and resource time increases as the patient’s condition deteriorates. Further, the total expected stabilization and resource times in health states {4f, 4s, 5} are similar for some subpopulations (A, C, and D) and differ for others (B, E, and F). Specifically, the total expected cost for high ROD patients depends on the stabilization condition, i.e., if a patient’s NEWS has not exceeded 0 (i.e., never reached the “slightly concerning” level) and the patient is in health state 5, or the patient’s NEWS exceeded 0 (has reached or exceeded the “slightly concerning” level), which resulted in moving to health states 4f or 4s. This difference may occur because high ROD patients may not present in health state 5 at the beginning of a hospitalization episode due to their physiological condition, or they may not remain in this health state, which impacts the resource time accumulated in state 5. In addition, surgical patients are more resource intensive (in terms of time required for their care) than medical patients for any given ROD category. Surgical patients may stay longer in the general ward before and after the procedure, which may result in higher resource times, and possible TTS and FTR. In addition, surgical patients may require more intense care even when they are in better health states.

With the goal of demonstrating that the proposed model brings benefits to a clinical setting, we used a time-based metric focusing on patients who experienced an RRT activation during the study period: average time to activate RRT, defined as the average time from the beginning of a hospitalization episode until RRT activation criteria are met. Specifically, for a selected subpopulation we compared the average time to activate RRT using the current policy (as indicated by the observed RRT activation times in the dataset) with the hypothetical average time to activate RRT if the proposed model was used during the same time period. The comparison was applied to one subpopulation over the 2-year time period (January 2011–December 2012), which can be easily applied to other subpopulations. Medical patients with moderate ROD was selected for comparison because they represent the largest subpopulation with a high RRT incidence (18,803 unique patients, mean NEWS 1.99, standard deviation of NEWS 2.14, minimum NEWS 0, maximum NEWS 17, and 827 RRT activations). Compared to the current policy, the optimal RRT policy resulted on average in 5.2 h earlier RRT activation. Based on clinical expert opinion and relevant literature, serious undesired events followed by APD expose patients to an increased risk of death [38]. Many of these events result from insufficient or delayed medical care [38]. For example, it has been shown that majority of cardiopulmonary arrests are preceded by dramatic changes in vital signs or other clinical decline during a 6–8 h period before the arrest [6]. Earlier recognition of the need for RRT activation by the proposed model can improve the stabilization of patients, and potentially reduce the incidence of undesired outcomes, including death.

Overall, results from this study highlight the importance of a personalized approach when treating APD, and provide optimal subpopulation-specific RRT-activation rules with insight into stabilization and resource time requirements.

3.3 Modified SMDP results

The clustering analysis was applied to subpopulation C (medical patients with moderate ROD) as an example, which can be easily expanded to other subpopulations. The results showed that different clustering algorithms impact the size and content of clusters. All scenarios identified a main cluster and multiple “outlier” clusters consisting of one profile. The main cluster was characterized by increasing time-perception measure p(s) as the health condition worsens, i.e., a provider in the main cluster assigned a higher weight to the maximum nurse and RRT times, assuming the physiological deterioration in these states will require high RRT and nurse resource use. The outlier clusters tended to have higher valuation of the RRT time relative to the nurse time compared with the main cluster. Baseline Scenario 1 (Single Linkage Method) identified four clusters, including one main cluster and three outlier clusters (Appendix 5, Table 20). Baseline Scenario 2 (Ward’s Method) and Baseline Scenario 3 (Centroid Method) identified the same main cluster and outliers as in Baseline Scenario 1, and additional outlier profiles were separated from the main cluster (Appendix 5, Tables 21 and 22). Further, the Modified Scenarios showed that changing the Uniform distribution input parameters, such as increasing the upper bound or decreasing the lower bound, further partitioned the main cluster and separated additional outliers from the main cluster (Appendix 5, Tables 23, 24, 25, 26 and 27).

Table 5 presents the LP results including the total expected stabilization and resource time in hours and optimal RRT activation policy for the Baseline Scenario 1 in comparison with the baseline SMDP model results for medical patients with moderate ROD.

Table 5 LP Results for baseline and modified SMDP models for medical patients with moderate ROD

Table 5 shows that for all clusters under the Baseline Scenario 1, the total expected stabilization-related times are non-increasing in state \( s \). Further, the total expected stabilization and resource time for each state is higher in the modified SMDP model compared with the baseline SMDP (Appendix 5, Table 19). In the modified SMDP model under all scenarios, Cluster 2 includes the same outlier profile characterized by a high perception of RRT time relative to nurse time in clinical distress state (i.e., state 1). For this cluster, the optimal policy in state 1 is to wait (i.e., not use the RRT resources and the decision about the patient’s care is made by the bedside provider) rather than activating RRT immediately (Appendix 5, Tables 20, 21, 22, 23, 24, 25, 26 and 27). In clinical practice, activating RRT is a decision made by the bedside provider. Based on clinical expert opinion, if a bedside provider has a high valuation of RRT time compared with nurse time, he/she may consider activating RRT in slightly concerning and concerning states (states 2 and 3) to obtain additional critical care guidance because of the uncertainty associated with the patient’s clinical path. However, in clinical distress state (state 1), the bedside provider may prefer waiting to avoid wasting any critical care resources because the decision about the patient’s care is clear due to severity of the physiological condition. This is different than the baseline SMDP optimal policy in health state 1, which is to call RRT immediately. This result highlights the impact of a provider’s perception on the optimal RRT activation policy. The remainder of the LP results is presented in Appendix 5.

4 Discussion

As opposed to focusing on disease-specific interventions, APD requires a system-wide approach with an emphasis on patient characteristics because it can affect patients across diseases. Further, an acute-care delivery team may consist of providers with different time perceptions of resource requirements and estimates of patient needs. Thus, the recognition of, and response to, APD, e.g., RRT activation, may be affected by patient and provider heterogeneity. Measurement of RRT effectiveness and management in clinical practice remains a challenge. Although several studies focused on patient outcomes or team-based qualitative metrics, their findings highlight the need for enhanced metrics to assess the impact of RRT on patients’ clinical trajectories and care processes [1215].

We hypothesized that the heterogeneity and uncertainty in acute-care systems may result in different RRT activation policies and further there are provider clusters, defined by providers’ perception of time requirements and relative valuation of stabilization-related resource needs, which differ in their RRT-activation policies. The baseline SMDP model identified subpopulation-specific RRT-activation thresholds and established a framework for incorporating patient heterogeneity in acute-care delivery. The modified SMDP model incorporated provider heterogeneity through variation in the value of time associated with stabilization. Results showed that the stabilization and resource time was higher in the modified model compared with the baseline model for all health states. This increase may be explained by the weighted average formulation for the stabilization and resource time functions, which allows the values of the nurse- and RRT-related time parameters to vary by provider, and by sampling the ratios ξ(s) from Weibull distributions, which allows the relative RRT time values to take larger values than those estimated by observed data. The LP results highlighted the impact of the selection of the clustering algorithm and the distributions for time-perception measures on RRT-activation policies. The optimal RRT policy was different for one outlier cluster with a high ξ(s) for which waiting (instead of activating RRT immediately) was optimal in the clinical distress state (state 1). A care provider in this profile might argue that for a patient in a clinical distress state (state 1), acute-care requirements might be clear due to the severity of the physiological condition, and therefore the provider may feel comfortable with making care decisions independent of RRT. Therefore, the increase in resource time due to RRT activation is not considered optimal in this health state.

By sampling time-perception measures from probablistic distributions and clustering provider profiles, the modified SMDP model identified cluster-specific optimal RRT policies. The results suggested that differences in stabilization and resource time perceptions may impact the resuscitation preferences of different providers in the clinically distressed state. This finding supports our hypothesis that the provider clusters differ in their RRT-activation preferences as well as their estimation of stabilization-related resource needs.

Informing RRT activation decisions and capturing the underlying stochastic deterioration process using a quantified score offers several opportunities to improve patient care. RRT is a scarce and valuable resource. Optimizing the RRT activation may decrease wasted time and resources due to the RRT activation for the wrong patient at the wrong time. In addition, in clinical practice, RRTs can occur simultaneously. Developing a patient-centered approach to RRT activation can help prioritize RRT calls.

Our study is motivated by the challenge of addressing the fact that patients were dying because providers have been failing to recognize acute changes in their clinical status. Traditional educational interventions were failing. A much more strategic, systems engineering and analytic approach was required. This study contributes to recognition and response to APD by analyzing key components of physiological deterioration and performing statistical analyses to identify patient subpopulations using clinical parameters, such as patients’ admission type and risk of deterioration. Using clustering methods, we identified distinct provider clusters with regard to their RRT activation preferences and estimation of resource needs to address provider heterogeneity in critical care settings. This work provided quantitative evidence to support the clinical research that “one-size-fits-all” criteria for alerting critical care teams in response to patient deterioration has not been effective and needs to be personalized to fit the needs of individual patients.

We acknowledge that there is some subjectivity in the current RRT activation criteria at the study hospital, e.g., the criteria regarding “concern about the patient”, and the optimal RRT rules derived from the SMDP models do not incorporate these subjective elements. In clinical practice, subjective clinical judgement of frontline providers contains valuable information that may not be captured by objective criteria. This is particularly critical for nursing assessment and the concern about a patient. Nursing assessment provides information necessary to support identification of problems and symptoms that are relevant for patient care [39]. In addition to vital signs and laboratory tests, nursing assessment provides better understanding of the patients’ physiological condition. In this context, subjectivity in the current RRT system is important in detecting signals of APD. While our SMDP models provide simple optimal RRT activation rules as clinical decision guidance, they do not aim to replace the subjective clinical judgement.

Future research could expand our work in several ways. A limitation of the current study is that the absorbing state represents all types of discharge from the general ward (i.e., transfer to higher-level care, discharge alive, discharge to hospice, or death). Future work could modify the terminal stabilization and resource time associated with different absorbing states based on discharge dispositions. Further, we recognize that different health systems may operate differently, and the results drawn from our study based on data from one hospital may not be generalizable. Therefore, future work would be to expand this framework for various facilities to study the dynamics of their specific patient subpopulations. Another limitation is that the relative RRT time perception measures ξ(1), ξ(2), ξ(3), were generated independently of the time perception measures ρ(1), ρ(2), ρ(3), and then combined to generate the simulated provider profiles. However, these measures may be dependent. For example, a provider who assigns a higher weight to the worse-case stabilization and resource time function in state s may select a higher relative RRT measure in this state, i.e., ρ(s) and ξ(s) may be correlated. An area for future research could explore the dependency structure between time perception measures by generating dependent random variables.

In conclusion, APD during hospitalization, while commonly signaled by abnormal vital signs hours before critical events, can be difficult to discern. EWS-based dynamic and stochastic models can aid data-driven clinical decision making by enhancing the ability to capture changes in patient condition over time in a patient-centered manner. This is particularly applicable for providers in an environment where patients are monitored at irregular intervals, e.g., the general ward. Using EWS-based approaches, standardized and structured communication between the provider team members can help to mitigate communication errors arising from fragmented nature of health care delivery and frequent handoffs.

The methods described in this study extend the existing literature by analyzing the stochastic nature of APD, evaluating health-state dependent resource needs, and incorporating both patient and provider heterogeneity in a unique framework. While our model is developed for adult patients, our approach can be easily extended to different patient subpopulations. The identified optimal RRT policies utilize physiological measures that are readily available during routine hospital rounding. In addition, the modified SMDP model provides insight regarding stabilization and resource time as a function of the patient type and the providers’ estimates of stabilization-related patient needs. These methods can be used by hospitals to guide RRT-activation policies in clinical practice for patients who present with APD or develop APD during a hospitalization. Moreover, these methods provide a better understanding for the impact of provider heterogenity on resuscitation actions and time-based stabilization-related resource use.