Measuring the price of anarchy in critical care unit interactions
Abstract
Hospital throughput is often studied and optimised in isolation, ignoring the interactions between hospitals. In this paper, critical care unit (CCU) interaction is placed within a game theoretic framework. The methodology involves the use of a normal form game underpinned by a two-dimensional continuous Markov chain. A theorem is given that proves that a Nash Equilibrium exists in pure strategies for the games considered. In the United Kingdom, a variety of utilisation targets are often discussed, aiming to ensure that wards/hospitals operate at a given utilisation value. The effect of these target policies is investigated, justifying their use to align the interests of individual hospitals and social welfare. In particular, we identify the lowest value of a utilisation target that aligns these.
Keywords
Game theory, queueing theory, healthcare
1 Introduction
The effect of state dependent service rates in healthcare has been well studied in isolated hospitals. In Batt and Terwiesch (2012), an empirical study is undertaken and service rate slow down is identified. Furthermore, it is shown that modelling whilst ignoring these state dependent rates leads to errors. This effect is further identified in Chan et al (2014) and Kc and Terwiesch (2012), where, for example, the negative effect on patient outcome is revealed. In Kim et al (2013) and Shmueli et al (2003), this is expanded to consider a variety of admission policies in two CCUs; in particular, the effect of rejection (due to too high occupancy) is measured. These policies are studied in the setting of a single hospital, and thus, from a game theoretic point of view, correspond to rational behaviour (a good game theory reference text is Maschler et al (2013)). In practice, a simplification of rational strategies is managed by policy makers by setting utilisation targets that ensure that hospitals do not run at a level likely to have a high rejection rate (examples of these can be seen in Bevan and Hood (2006) and Kesavan (2013)).
The aim of this paper is to further investigate the effect of rational policies employed by hospitals. In particular, the goal is to place this in a game theoretic setting so as to identify whether or not target policies result in uncoordinated behaviour that is damaging for patients.
A critical care unit (CCU), also sometimes referred to as an Intensive Therapy Unit or Intensive Therapy Department, is a special ward that is found in most acute hospitals. It provides intensive care (treatment and monitoring) for people who are critically ill or are in an unstable condition. CCUs face major challenges: on average, 8% of patients are refused admission to a CCU because the unit is full (Audit Commission, 1999). The CCU occupancy rates for some hospitals are reportedly very high (Mitchell and Grounds, 1995; Smith et al, 1995) and a shortage of beds has been identified throughout the United Kingdom.
Many previous researchers have developed queueing models to help manage bed capacities in hospitals: Cooper and Corcoran (1974), Dumas (1984), Gallivan and Utley (2011), Gorunescu et al (2002), Griffiths et al (2013), Harper and Shahani (2002). Also, a vast amount of literature has been devoted to the simulation of CCUs; Cahill and Render (1999), Costa et al (2003), Griffiths et al (2005), Kim et al (1999), Litvak et al (2008), Shahani et al (2008).
This paper describes part of a project undertaken with managers from the Aneurin Bevan University Health Board (ABUHB), which is an NHS Wales organisation in South Wales, that serves 21% of the total population of Wales (Board, 2014). Critical care is delivered on two sites, at the Nevill Hall hospital in Abergavenny and at the Royal Gwent hospital in Newport. For the remainder of this paper, the Nevill Hall hospital will be referred to as NH and the Royal Gwent hospital as RG.
Research applying game theory to healthcare has mainly concentrated on emergency departments (EDs) and how to deal with diversions of patients and ambulances. In Hagtvedt and Ferguson (2009), cooperative strategies for hospitals are considered, in order to reduce occurrences when ambulances are turned away due to the ED being full. In Deo and Gurvich (2011), a queueing network model of two EDs is proposed to study the network effect of ambulance diversion. Each ED aims to minimise the expected waiting time of its patients (walk-ins and ambulances) and chooses its diversion threshold based on the number of patients at its location. Decentralised decision making in the network is modelled as a non-cooperative game.
Other work that applies game theory to healthcare outside of EDs includes Knight and Harper (2013), where results concerning the congestion related implications of decisions made by patients when choosing between healthcare facilities are presented, and Howard (2002), where a model of the accept/reject decision for transplant organs is developed.
In the wider intersection of game theory and queueing theory (where this work lies), papers that consider price and/or capacity include Allon and Federgruen (2007), Cachon and Harker (2002), Cachon and Zhang (2007), Kalai et al (1992), Levhari and Luski (1978). In these models, the choice of price/or capacity determines the arrival rate for each firm; this is similar to the approach taken in this paper.
The work presented in this paper contributes to the growing body of literature by applying state-dependent queueing models in a game theoretical context to CCU interaction. In particular, this consideration allows for the investigation of targets imposed by central control (Bevan and Hood, 2006). The findings of this work justify and identify a choice of targets that align the interests of the individual hospitals with social welfare.
Parameter values used in the model
Parameter | Parameter description | Parameter value |
---|---|---|
\(c_{\text {NH}}\) | The bed capacity at NH | 8 |
\(c_{\text {RG}}\) | The bed capacity at RG | 16 |
\(\lambda _{\text {NH}}\) | The arrival rate at NH (per day) | 1.50 |
\(\lambda _{\text {RG}}\) | The arrival rate at RG (per day) | 2.24 |
\(\mu _{\text {NH}}\) | The service rate at NH (per day) | 0.262 |
\(\mu _{\text {RG}}\) | The service rate at RG (per day) | 0.198 |
t | Bed utilisation target | 0.8 |
Note that the arrival and service rates in this table are not state dependent. They serve as a base level for the state dependent rates used throughout the game theoretic models.
The paper is organised as follows: Section 2 presents the general methodology as well as a theoretical existence condition for Nash Equilibrium. Section 3 presents the findings from two models. Finally, conclusions and further ideas for progression are given in Section 4.
2 Queueing and game theoretic models
To formally investigate the impact of decentralised decision making, the interaction between two CCUs is placed within a non-cooperative game framework. The interaction will be modelled through a two dimensional continuous Markov chain that will now be described.
2.1 Queueing model
In total there are \((c_{\text {NH}}+1)\times (c_{\text {RG}}+1)\) states and they are indexed lexicographically: \((0,0), (0,1), (0,2), \dots\).
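The state space described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that the first coordinate of a state is the occupancy at NH and the second the occupancy at RG, and the helper name `state_index` is hypothetical.

```python
from itertools import product

c_NH, c_RG = 8, 16  # bed capacities taken from the parameter table

# Lexicographic enumeration: (0, 0), (0, 1), ..., (0, 16), (1, 0), ...
# Assumption: a state is (occupancy at NH, occupancy at RG).
states = list(product(range(c_NH + 1), range(c_RG + 1)))


def state_index(i, j, c_rg=c_RG):
    """Position of state (i, j) in the lexicographic ordering."""
    return i * (c_rg + 1) + j


n_states = len(states)  # (c_NH + 1) * (c_RG + 1)
```

With the capacities of the parameter table this gives \(9\times 17=153\) states, which is also the dimension of the transition rate matrix of the chain.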
For the parameters of Figure 4, the utilisation rates are 59% at NH and 62% at RG, with throughputs of 1.23 at NH and 1.98 at RG (patients per day).
2.2 Game theoretic model
Based on the discussion above, the game theoretic model is presented as the following synchronous optimisation problem.
Optimisation problem.
The following result is a sufficient condition for the existence of an equilibrium:
Theorem
Let \(f_{H}(k):[1,c_{\bar{H}}]\rightarrow [1,c_H]\) be the best response of player \(H\in \{\text {NH}, \text {RG}\}\) to the diversion threshold of \(\bar{H}\ne H\) (\(\bar{H}\in \{\text {NH}, \text {RG}\}\)).
If \(f_{H}(k)\) is a non-decreasing function in \(k\) then the game of (3) has at least one Nash Equilibrium.
Proof
The function \(f_H\) is well defined as it minimises a continuous function over a finite discrete set. In case of multiple values that minimise \(U_H\), it is assumed that \(f_H\) returns the lowest such value; this is consistent with the Price of Anarchy (PoA) being a theoretical upper bound of the effect of uncoordinated behaviour (Roughgarden, 2005).
This point of intersection corresponds to a Nash Equilibrium of (3). \(\square\)
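The intersection of best responses used in the proof can be illustrated with a minimal Python sketch. It is not the code used for this work: the two cost functions are toy placeholders (they are not the utilisation based objective of (3)), and all function names are hypothetical. Ties are broken towards the lowest threshold, as assumed in the proof, so equilibria that rely on a different tie-break are not reported.

```python
def best_response(cost, own_strategies, other_k):
    """Lowest own threshold that minimises cost(own_k, other_k)."""
    return min(own_strategies, key=lambda k: (cost(k, other_k), k))


def pure_nash_equilibria(cost_a, cost_b, strategies_a, strategies_b):
    """Fixed points of the two best response functions."""
    equilibria = []
    for k_b in strategies_b:
        k_a = best_response(cost_a, strategies_a, k_b)
        # (k_a, k_b) is an equilibrium if k_b is in turn a best response to k_a.
        if best_response(cost_b, strategies_b, k_a) == k_b:
            equilibria.append((k_a, k_b))
    return equilibria


# Toy costs (hypothetical) with non-decreasing best responses:
# player A wants to match min(k_b, 4), player B wants k_a + 1.
def toy_cost_a(k_a, k_b):
    return (k_a - min(k_b, 4)) ** 2


def toy_cost_b(k_b, k_a):
    return (k_b - (k_a + 1)) ** 2


toy_equilibria = pure_nash_equilibria(
    toy_cost_a, toy_cost_b, range(1, 5), range(1, 7)
)
```

For these toy costs the best responses are \(\min (k,4)\) and \(k+1\), which cross only at the pair (4, 5).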
This Theorem is in itself not that useful, as the properties of \(f_{H}\) are difficult to ascertain, although the methodology alluded to (exhaustive investigation of best response functions) is how the equilibria are found for the work presented here. The following Lemma will however be of more use in Section 3.
Lemma
Using the convention of Figure 2:
If \(\lambda _{\text {NH}}^{(h,l)}\le \lambda _{\text {NH}}^{(h,h)},\lambda _{\text {NH}}^{(l,l)}\le \lambda _{\text {NH}}^{(l,h)}\) then \(f_{\text {NH}}(k)\) is a non-decreasing function in \(k\).
If \(\lambda _{\text {RG}}^{(l,h)}\le \lambda _{\text {RG}}^{(h,h)},\lambda _{\text {RG}}^{(l,l)}\le \lambda _{\text {RG}}^{(h,l)}\) then \(f_{\text {RG}}(k)\) is a non-decreasing function in \(k\).
Observation
The utilisation \(U_H=U_H(\lambda )\) is an increasing function in \(\lambda\). As the traffic intensity at H increases: H gets busier.
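The Observation can be illustrated numerically for a single unit considered in isolation. The sketch below assumes an Erlang loss (M/M/c/c) queue with constant rates, which is a simplification made for this example only and not the two-dimensional chain of this paper; the function names are hypothetical.

```python
def erlang_b(c, a):
    """Blocking probability of an M/M/c/c queue with offered load a."""
    b = 1.0
    for n in range(1, c + 1):
        b = a * b / (n + a * b)  # standard Erlang B recursion
    return b


def utilisation(lam, mu, c):
    """Mean fraction of busy beds: carried load divided by capacity."""
    a = lam / mu
    return a * (1.0 - erlang_b(c, a)) / c


# Utilisation at NH (mu = 0.262, c = 8) for a range of arrival rates.
arrival_rates = [0.5, 1.0, 1.5, 2.0, 3.0]
utilisations = [utilisation(lam, 0.262, 8) for lam in arrival_rates]
```

Under this assumption the values in `utilisations` increase with the arrival rate while remaining below 1, in line with the Observation.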
Proof
A proof for the first part of the Lemma is given (the proof for the second part is analogous).
Let \(\bar{\lambda }_{\text {NH}} = \bar{\lambda }_{\text {NH}}(K_{\text {RG}})\) be the effective arrival rate at NH. If \(\lambda _{\text {NH}}^{(h,l)}\le \lambda _{\text {NH}}^{(h,h)},\lambda _{\text {NH}}^{(l,l)}\le \lambda _{\text {NH}}^{(l,h)}\) then this implies that \(\bar{\lambda }_{\text {NH}}(K_{\text {RG}}) \ge \bar{\lambda }_{\text {NH}}(K_{\text {RG}} + 1)\) as shown in Figure 7.
1. \(S(K_{\text {RG}})\ne S(K_{\text {RG}}+1)\)
2. \(S(K_{\text {RG}})= S(K_{\text {RG}}+1)\)
If \(S(K_{\text {RG}})\ne S(K_{\text {RG}}+1)\), this implies that \(f^{+}(K_{\text {RG}})\le f^{-}(K_{\text {RG}}+1)\), which gives
$$\begin{aligned} f(K_{\text {RG}}) \le f^{+}(K_{\text {RG}}) \le f^{-}(K_{\text {RG}}+1) \le f(K_{\text {RG}}+1) \end{aligned} \qquad (11)$$
which is the required result.
The aim of the work presented is to measure the inefficiency created by the removal of central control between CCUs.
We let \(\widetilde{T}\) denote the sum of throughputs at the Nash Equilibrium obtained by solving the game of (3) (in case of multiple equilibria we take \(\widetilde{T}\) to be the lowest throughput) and let \(T^*=\max _{K_{\text {NH}}, K_{\text {RG}}}\left( T_{\text {NH}}+T_{\text {RG}}\right)\). This optimal throughput \(T^*\) is independent of \(t\).
Note that intuitively it could be thought that \(T^*\) is obtained when \(K_{\text {NH}}=c_{\text {NH}}\) and \(K_{\text {RG}}=c_{\text {RG}};\) however, this is not always the case (numerical experiments have been carried out to verify this).
3 Results
The game theoretic model of (3) is solved using exhaustive consideration of best responses whilst taking advantage of the structure identified by the Lemma of Section 2. For any given pair of threshold strategies, the matrix equation \(\pi Q=0\) is solved by obtaining a basis for the kernel of Q. For the purpose of this paper, this is implemented in Sagemath (Stein et al, 2013).
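For readers without access to Sagemath, the steady state computation can be sketched in plain Python. Instead of computing a basis for the kernel of \(Q\), this sketch swaps in an equivalent approach: one balance equation is replaced by the normalisation \(\sum _i \pi _i = 1\) and the resulting linear system is solved exactly by Gauss-Jordan elimination. It assumes an irreducible chain (so that the solution is unique); the function name and the three-state example are hypothetical.

```python
from fractions import Fraction


def stationary_distribution(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for a square rate matrix Q."""
    n = len(Q)
    # Row j of A holds column j of Q, i.e. the j-th balance equation.
    A = [[Fraction(Q[i][j]) for i in range(n)] for j in range(n)]
    b = [Fraction(0)] * n
    # Replace the last (redundant) balance equation by the normalisation.
    A[-1] = [Fraction(1)] * n
    b[-1] = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [v * inv for v in A[col]]
        b[col] *= inv
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return b


# Toy example: single bed pair with arrival rate 1, service rate 2, capacity 2.
toy_Q = [[-1, 1, 0], [2, -3, 1], [0, 2, -2]]
toy_pi = stationary_distribution(toy_Q)
```

For the toy matrix the result is \(\pi = (4/7, 2/7, 1/7)\). Exact rational arithmetic is used here for clarity; for the 153-state chain of this paper a floating point linear algebra library would be the more natural choice.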
3.1 Model 1: Strict diversion
This model assumes that if the bed occupancy level at both CCUs exceeds a predetermined threshold, then the admission to the CCU is cancelled. This cancellation could correspond to sending the patient to a completely different CCU (outside of the model), moving one of the current CCU patients (ready to be discharged) to another ward and/or using the post-anaesthesia care unit as a temporary overflow measure. This model corresponds to the first possibility: the patient is lost (from the point of view of this model).
If either CCU chooses their threshold at zero, patients are not admitted at all and, consequently, both Units are closed.
For this model, the Nash Equilibrium is at (8, 16), which gives \(\widetilde{T}=3.65\) and we obtain \(T^*=3.65\). Importantly, a PoA of 1 is not guaranteed for this problem. For example in Figure 10b similar best response behaviour is shown for \(t=0.6\) for which the \(\text {PoA}=1.18\) (the optimal throughput is again at (8, 16)).
Numerical values of PoA for different target and demand rates
x | \(t=0.15\) | \(t=0.27\) | \(t=0.4\) | \(t=0.52\) | \(t=0.64\) | \(t=0.76\) | \(t=0.88\) | \(t=1\) |
---|---|---|---|---|---|---|---|---|
−0.9 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.61 | 1.64 | 1.04 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.33 | 3.27 | 1.66 | 1.22 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.03 | 4.43 | 2.55 | 1.64 | 1.34 | 1.06 | 1.0 | 1.0 | 1.0 |
0.26 | 5.23 | 2.99 | 2.1 | 1.51 | 1.28 | 1.09 | 1.0 | 1.0 |
0.55 | 7.34 | 3.21 | 2.25 | 1.74 | 1.34 | 1.16 | 1.0 | 1.0 |
0.84 | 7.6 | 3.81 | 2.32 | 1.79 | 1.46 | 1.17 | 1.03 | 1.0 |
1.13 | 7.73 | 3.88 | 2.36 | 1.82 | 1.48 | 1.25 | 1.04 | 1.0 |
1.41 | 7.81 | 3.91 | 2.61 | 1.97 | 1.58 | 1.26 | 1.09 | 1.0 |
1.7 | 7.86 | 3.94 | 2.63 | 1.98 | 1.58 | 1.32 | 1.14 | 1.0 |
1.99 | 7.89 | 3.95 | 2.64 | 1.98 | 1.59 | 1.33 | 1.14 | 1.0 |
We see that an extremely large PoA is obtained for \(t<0.2\). For values of \(t>0.5\) the PoA can still be high: a PoA of 2 means that the optimal throughput is twice the throughput at equilibrium, so that half of the achievable patient throughput is lost. These findings seem to give some backing to the targets implemented throughout the NHS (Bevan and Hood, 2006).
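To help read the table, the PoA can be converted into a relative loss of throughput. The relation below assumes that the PoA is the ratio \(T^*/\widetilde{T}\), which is consistent with the values reported in this section:

$$\begin{aligned} \text {PoA} = \frac{T^*}{\widetilde{T}} \quad \Longrightarrow \quad \frac{T^*-\widetilde{T}}{T^*} = 1-\frac{1}{\text {PoA}} \end{aligned}$$

Under this reading, a PoA of 2 means that the optimum is 100% higher than the equilibrium throughput or, equivalently, that the equilibrium delivers 50% less than the optimum; a PoA of 1.18 corresponds to a loss of roughly 15%.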
In this model, there is the potential for both CCUs to divert patients at the same time, and so patients are lost to the entire system. The model of the next section will investigate the effect of not allowing total rejections.
3.2 Model 2: Soft diversion
This means that if bed occupancy levels at both Units exceed a pre-determined threshold, then diversions do not occur and each CCU has to accommodate their own patients. In effect we are modelling a certain level of cooperation in this case where CCUs only divert if the other CCU is not busy.
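The difference between the two diversion rules can be summarised as a routing decision for an arriving patient. The following is a hedged sketch rather than the implementation used in this work: the function name and its arguments are hypothetical, a CCU is assumed to be in diversion when its occupancy is at or above its threshold, and bed capacity limits are deliberately ignored.

```python
def route(own_occ, other_occ, k_own, k_other, model):
    """Destination of a patient arriving at their 'own' CCU.

    Returns 'own', 'other' or 'lost'. Assumption: a CCU diverts when
    its occupancy is at or above its threshold.
    """
    if model not in ("strict", "soft"):
        raise ValueError("model must be 'strict' or 'soft'")
    own_diverting = own_occ >= k_own
    other_diverting = other_occ >= k_other
    if not own_diverting:
        return "own"
    if not other_diverting:
        return "other"
    # Both CCUs are in diversion: the two models differ only here.
    return "lost" if model == "strict" else "own"
```

With this convention a threshold pair of (0, 0) under soft diversion means that both CCUs are always in diversion, so every patient stays at the CCU at which they arrive.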
Numerical values of PoA for different target and demand rates
x | \(t=0.15\) | \(t=0.27\) | \(t=0.4\) | \(t=0.52\) | \(t=0.64\) | \(t=0.76\) | \(t=0.88\) | \(t=1\) |
---|---|---|---|---|---|---|---|---|
−0.9 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.61 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.32 | 1.01 | 1.01 | 1.01 | 1.01 | 1.0 | 1.0 | 1.0 | 1.0 |
−0.03 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.0 | 1.0 |
0.26 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.0 |
0.55 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.06 | 1.01 |
0.84 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.04 |
1.13 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.04 |
1.42 | 1.04 | 1.04 | 1.04 | 1.04 | 1.04 | 1.04 | 1.04 | 1.04 |
1.71 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 |
1.99 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 |
We immediately note that the underlying cooperation that is now being forced on our players (divert only if the other player can accommodate the patients) has reduced the PoA. Note that PoA \(=1.02\) still implies a reduced throughput of 2% which has very large cost implications for a national health service. A tipping point is now visible as demand increases; this is similar to the profiles shown in Knight and Harper (2013) and can be explained as follows:
Also, for very low values of demand, cooperation can be obtained with no target. When the demand is low, there is no scope for uncoordinated behaviour to be damaging. When the demand is very high, the system is saturated and once again uncoordinated behaviour has no negative effect in comparison to optimal behaviour. There is however a region of demand for which there is a high PoA.
Soft diversion results for \(t=0.8\)
x | Nash equilibrium | \(\tilde{T}\) | Nash \(T_{NH}\) | Nash \(T_{RG}\) | \(T^*\) | PoA |
---|---|---|---|---|---|---|
0.1 | (8,16) | 3.92 | 1.50 | 2.42 | 3.92 | 1 |
0.2 | (8,16) | 3.92 | 1.50 | 2.42 | 3.92 | 1 |
0.3 | (6,12) | 4.19 | 1.65 | 2.54 | 4.33 | 1.03 |
0.4 | (4,0) | 4.22 | 1.65 | 2.57 | 4.48 | 1.06 |
0.5 | (4,0) | 4.22 | 1.65 | 2.57 | 4.48 | 1.06 |
0.6 | (0,0) | 4.42 | 1.69 | 2.72 | 4.68 | 1.06 |
Clearly, as the demand change increases, the Nash Equilibrium thresholds decrease. This is due to the fact that both CCUs are attempting to divert their patients in less busy states as these states become rarer. If one CCU diverts early, the other will follow suit (both CCUs incrementally reacting to each other). As a result the Nash Equilibrium for \(x=0.6\) is at (0, 0), meaning that each CCU takes care of their own patients. As the demand increases even further the Nash Equilibrium remains at (0, 0) and the PoA decreases.
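As a consistency check on the table of soft diversion results for \(t=0.8\), the PoA column can be recomputed from the two throughput columns. The sketch assumes that the PoA is the ratio \(T^*/\widetilde{T}\) rounded to two decimal places; the variable names are illustrative.

```python
# x: (Nash throughput, optimal throughput), copied from the table above.
rows = {
    0.1: (3.92, 3.92),
    0.2: (3.92, 3.92),
    0.3: (4.19, 4.33),
    0.4: (4.22, 4.48),
    0.5: (4.22, 4.48),
    0.6: (4.42, 4.68),
}

# Price of Anarchy as optimal throughput over equilibrium throughput.
poa = {x: round(t_star / t_nash, 2) for x, (t_nash, t_star) in rows.items()}
```

The recomputed values agree with the reported PoA column (1, 1, 1.03, 1.06, 1.06, 1.06).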
4 Conclusions
In this work, a generic game theoretical model has been presented that accounts for the rational actions of two CCUs. This game theoretic model is underpinned by a queueing model that takes into account the stochastic nature of queueing systems. This work extends the application of game theoretic models already present in the literature to healthcare (Li et al, 2002; Xie and Ai, 2006).
A result is proved that allows for the assertion of existence of a Nash Equilibrium. This result is then applied to two particular models that are influenced by discussions with a local health board. Strict diversion: patients can be lost to the system if both CCUs declare being in diversion. Soft diversion: if both CCUs are in diversion then they cannot divert their own patients.
An analysis of the effect of rational behaviour is given for both of these models in the form of PoA calculations. The PoA is calculated so as to measure the effect of rational behaviour on overall patient throughput. The PoA represents a theoretical upper bound for the potential damages caused by uncoordinated behaviour. High PoAs are found in the case of strict diversion, which is to be expected as soft diversion implies a certain level of cooperation. Importantly, a non-negligible effect of rational behaviour is calculated for certain policy target values. A recommendation of setting \(t=0.72\) is found across both models. This gives some evidence to a particular target value of maximal utilisation in a two CCU ward setting.
The assumptions as to the strategy space of our players are restrictive: a single threshold policy might not be optimal (although it is present in various pieces of literature on optimal control of queueing systems: Naor, 1969; Shone et al, 2013). Indeed, in reality critical care managers could have far more complex boundaries for their heuristic decision making.
This model only assumes the presence of two players; however in reality the system has a variety of stakeholders. Multiplayer systems could be worth considering. This would reflect health boards/hospitals in a concentrated area so that interactions are not just between two hospitals but between many.
The restriction to pure strategies is influenced by discussions with ABUHB and also does not detract from the results presented thanks to the Theorem and Lemma of Section 2. However, allowing for mixed strategies could also be of interest, corresponding to decision making that is not constant over time. Managers could alternate between a variety of behaviours: at times accepting patients despite being busy and at other times not.
Patient length of stay is assumed to be dependent on the CCU at which they receive service. A further extension of the work would be to use the service rate from original CCU (prior to diversion). This would require a Markov chain with a higher dimensional state space. In practice this corresponds to patient morbidity corresponding to the original CCU.
As discussed above, reducing the decision making of critical care managers to rational reactions to capacity targets is not without limitations. However, this quantitative model of behaviour was described as insightful and informative by ABUHB. In practice, stakeholders describe a target of 80% capacity. Not only is this at times impossible to achieve, it is also not evidence based and, in particular, does not take into account interactions between CCUs. This is a common theme in practice and the literature which this manuscript aims to address.
Finally, congestion and throughput are not the only concern of a healthcare system. Further work could involve the investigation of patient survival instead of throughput as utility. This would be similar to work such as Erkut et al (2008), Knight et al (2012).
The code used in this work can be found here: https://gist.github.com/anonymous/81effc06eea70a9e4e2f. The graphics for this paper were obtained using software described in Hunter (2007), Stein et al (2013); a worksheet with the data and code used for the plots can be found here: https://cloud.sagemath.com/projects/c293aefd-1fdf-4b9c-95f4-75bb77035e42/files/MeasuringThePriceOfAnarchyInCCUInteractions.sagews.