Hospital preparedness during epidemics using simulation: the case of COVID-19

This paper presents a discrete event simulation model to support decision-making for the short-term planning of hospital resource needs, especially Intensive Care Unit (ICU) beds, to cope with outbreaks, such as the COVID-19 pandemic. Given its purpose as a short-term forecasting tool, the simulation model requires an accurate representation of the current system state and high fidelity in mimicking the system dynamics from that state. The two main components of the simulation model are the stochastic modeling of patient admission and patient flow processes. The patient arrival process is modelled using a Gompertz growth model, which enables the representation of the exponential growth caused by the initial spread of the virus, followed by a period of maximum arrival rate and then a decreasing phase until the wave subsides. We conducted an empirical study concluding that the Gompertz model provides a better fit to pandemic-related data (positive cases and hospitalization numbers) and has superior prediction capacity than other sigmoid models based on Richards, Logistic, and Stannard functions. Patient flow modelling considers different pathways and dynamic length of stay estimation in several healthcare stages using patient-level data. We report on the application of the simulation model in two Autonomous Regions of Spain (Navarre and La Rioja) during the two COVID-19 waves experienced in 2020. The simulation model was employed on a daily basis to inform the regional logistic health care planning team, who programmed the ward and ICU beds based on the resulting predictions.


Introduction
The COVID-19 pandemic presents a major global health threat.Since the outbreak in China in early December 2019, more than 180 million confirmed cases, and close to four million deaths from COVID-19 infection (up to the end of June 2021 https:// coronavirus.jhu.edu/map.html)have been recorded.Regularly updated information on the COVID-19 outbreak is also available on the websites of the European Centre for Disease Prevention and Control (ECDC), the European Commission (EC), and the World Health Organization (WHO).This outbreak has brought changes in health care delivery, and in hospital systems stretched by the sudden increase in demand.The treatment of COVID-19 patients requires dedicated resources, material, and personnel.The pandemic has had a particularly intense impact on Intensive Care Units (ICU), which require highly specialized personnel and costly technical apparatus.Accurate planning requires accurate prediction of resource needs.In addition, hospitalized COVID-19 patients need to be isolated from other types of patients, which makes advance preparation of wards necessary.Therefore, the management of both ICU and ward beds for COVID-19 patients benefits from accurate short-term demand forecasting.Other resource needs, such as personnel requirements, can be calculated from bed demand numbers.Usually, the hospitalization bed is still widely used as a hospital (ICU) management parameter both at the strategic and operational levels.
Hospitals are complex systems evolving in a stochastic environment with a level of uncertainty which intensifies during pandemics due to lack of knowledge about the spread of the disease and its consequences for those infected.In this unsettled context, simulation emerges as a suitable tool of analysis, since it is able to reproduce both the complexity of the system and the variability and uncertainty of the environment, as well as being eligible for use in combination with other analytical techniques.The literature contains numerous bibliographical references relating to the use of simulation models for decision making in the healthcare context.Most applications use simulation to support strategic decisions, usually for resource sizing, scheduling, or management.All these cases require the design of a simulation model to reproduce stationary state healthcare system performance and evaluate resource levels, patient flow management policies, and the long-term decision making process.The recommendations obtained from the simulation analysis are intended for a pre-determined implementation period.
However, a simulation model designed to enable tactical decisions for the provision of specialized health resources during the current outbreak has to focus on the transition period, if it is to generate a short-term projection of the current state of the hospital.To achieve this goal, the simulation model needs to account for non-stationary and non-periodic patient input to the hospital, a complex hospital situation at the Simulation Starting Point (SSP), variation in the patient hospital length of stay (LoS) pattern and censored data.This paper presents a Discrete Event Simulation (DES) model combining dynamic forecasting to predict (simulate) new patient arrivals and the reproduction of patient flow patterns and designed to address all the issues just raised.The simulation process yields future resource-use scenarios to inform the health authorities of future needs and give them time to plan.In fact, the results of the simulation model were used on a daily basis during the successive waves of the pandemic (from March 2020 to May 2021) by local governments of two Spanish regions and by the Spanish Ministry of Health as a health resources planning instrument.Therefore, the main feature of the simulation model presented here is its capacity to reproduce the evolution of the health system from its current state, in a non-stationary and changing environment, thus providing a useful forecasting tool.
The main contribution of this paper is our proposal for a new simulation framework enabling short-term (from days to a few weeks) prediction of critical resource needs for the care of COVID-19 patients and our account of its use by health authorities during the COVID-19 pandemic waves.The simulation framework can be adapted for application in potential future outbreaks.To achieve this main contribution our research includes: • A method to simulate patient arrival times based on Population Growth (PG) models.
These are better suited for the prediction of hospitalization (and positive cases) series than other mathematical alternatives such as SIR-type models, which require detailed knowledge of the spread of the disease throughout the population and the estimation of many parameters.PG models produce S-shape curves able to represent the evolution of pandemic variables, such as positive cases and hospitalizations, from beginning to end of the outbreak.• A statistical analysis of the accuracy of four different PG models in simulating and forecasting the spread of the pandemic.• The representation of the current state of the health system based on a set of state variables and a dynamic and adaptive statistical analysis of patient flow and hospital LoS.
• The combination of all elements in a DES model flexible enough to recreate scenarios based on stochastic models fitted to data (data-driven prediction), scenarios defined by expert judgment, and a mixture of both.• In practical terms, this paper also shows how operations research can contribute to a rapid respond to a healthcare crisis by reporting on a successful real-world application of the simulation model to support a decision-making process of crucial importance to the health of patients in two Autonomous Regions of Spain (Navarre and La Rioja).
The rest of the paper is organized as follows.Section 2 offers a review of related literature dealing with the use of quantitative methods for the prediction and efficient management of health care system requirements.Section 3 studies the adequacy of PG models to predict the spread trend of a pandemic.The modeling of patient flow through the hospital is presented in Sect. 4. The structure of the DES model and the methodology used to set up the simulation are included in Sect. 5. Results of the application in the Autonomous Regions of Navarre and La Rioja (Spain) are included in Sect.6.Finally, Sect.7 ends the paper with the conclusions of this work.

Related literature
Simulation is one of the most suitable analytical tools for the analysis of complex systems, such as healthcare systems, as reflected in numerous specialist articles describing the use of simulation models for decision-making in the healthcare context.DES has been used to model and analyze all aspects of logistics management in healthcare, particularly the improvement of patient flow management, bed-planning, waiting list management, health service design, medical staff scheduling, etc.For reviews of the use of simulation models in healthcare, see Brailsford et al. (2009); Günal and Pidd (2010); Katsaliaki and Mustafee (2011);and Mielczarek and Uziałko-Mydlikowska (2012).These simulation models usually focus on studying the stationary state of the health system to support strategic decisions for resource sizing or management policy design purposes.
The ultimate goal of these models is to match resource availability with demand in order to provide high-quality patient care while maintaining adequate human and technological resource provision.Some of the problems analyzed in this framework are patient flow (Shahani et al. 2008;Kolker 2009), bed planning (Ridge et al. 1998;Zhu et al. 2012;Rodrigues et al. 2018), health service design (Mallor et al. 2016), and medical staff scheduling (Erhard et al. 2018), among others.Despite reports in the medical literature of discrepancies between assumptions in mathematical simulation models and the behavior of real healthcare systems (Azcarate et al. 2020), there is no doubt about the usefulness of simulation models for the analysis of relevant problems in complex healthcare systems.
However, simulation not only helps to ensure the highest quality healthcare in terms of staff and facilities, it also improves the delivery of best practice.Since the pandemic began, all national governments and the WHO have extensively used simulation modelling to identify the best strategies for reducing the impact of COVID-19.Currie et al. (2020) identify challenges from this disease and discuss how simulation modelling can help decision-makers to make the best informed decisions.
The accuracy of a simulation model for the prediction of resource needs during a pandemic is dependent upon the design of an accurate model to forecast patient arrivals at the health facility.Most infectious disease prediction methods rely on differential equation models based on population dynamics (Grassly and Fraser 2008;Brauer and Castillo-Chavez 2012).These mathematical models are essential for understanding the course of the epidemic and planning effective control strategies (Anderson and May 1991;Diekmann and Heesterbeek 2000;Hethcote 2000).One of the most widely used models of human-to-human transmission is the SIR model (Kermack and McKendrick 1927).Members of the population are sorted into different status categories: S (Susceptible), I (Infected), and R (Remove).The portion of population in each state is calculated over time by estimating the rate of transition from one state to another.With more complex model specifications, it is possible to recreate the spread of a specific epidemic.Extensions of the classical SIR model (Anastassopoulou et al. 2020;Giordano et al. 2020;Lin et al. 2020;Zhou et al. 2020;Casella 2021), as well as stochastic transmission models (Hellewell et al. 2020;Kucharski et al. 2020) have indeed been developed for the COVID-19 pandemic.However, such models are complicated and need strong assumptions and simplifications, because they are based on a set of differential equations with initial conditions and a number of adaptive parameters (Xia et al. 2009;Li et al. 2014;Magal et al. 2016;Li and Zhang 2017).Reliable values of those parameters only become available at the end of the pandemic and they depend on non-pharmaceutical interventions dictated by political decisions.There is also a need for other mathematical models that can be adapted to daily pandemic data.
PG models provide a simpler alternative for modelling the number of cumulative positive cases, hospitalizations and other pandemic variables.Growth curves are used in a wide range of research areas, such as fishery research (Oliveira Zardin et al. 2019;Oribe-Pérez et al. 2020), biology (Sun et al. 2020), or other infectious disease outbreaks (Horimoto et al. 1997;Roberts and Saha 1999;Viboud et al. 2016;Ghazvini et al. 2019).Specifically, Logistic, Gompertz, Rosenzweig, and Richards models have been used to model the spread of outbreaks such as A/H1N1 and Ebola in (Liu et al. 2015).The COVID-19 research has produced several papers describing the development of a growth model to predict new cases in countries such as China (Shen 2020), India (Malavika et al. 2021), Spain (Sánchez-Villegas and Daponte Codina 2020), and other European countries (Cássaro and Pires 2020).These mathematical models present a set of mathematical equations including adaptive parameters that can be determined numerically based on available real data (Panovska-Griffiths 2020).The model can be used daily (by updating the number of positive cases) and automatically adapted to individual parameter trends.
If all the mathematical models mentioned in the previous paragraphs could be fitted to real data, it would be possible to obtain an accurate prediction of what might happen in the future (e.g., emergency planning, resource allocation) (He et al. 2020;Poston et al. 2020;Steinberg et al. 2020).This is very important; especially for typically scarce hospital resources, such as ICU beds.Manca et al. (2020) present and discuss a few regression models built on historical ICU admissions and patient death data during the COVID-19 pandemic.They are capable of reproducing the bed occupancy curve using regression models with great potential for decision-making and emergency planning in future pandemics.
In recent decades, moreover, healthcare simulation models using advanced technology have become a new experience-based learning support (Almagooshi 2015;Persson 2017) enabling healthcare professionals to acquire new cognitive, technical, and behavioural skills.Before working in real-world patient treatment scenarios, both professionals and students can benefit from this experience-based form of learning in a risk-free decision-making environment (Palominos et al. 2019).Simulation models of the type presented in this paper also enable training in the management of health care services during emergencies.When resources are in short supply, one of the most critical decisions is how to allocate them to patients, especially when they can mean the difference between survival and death, as is the case with ICU patients.This triage becomes even more difficult during pandemics, when resources are stretched even further.Different ICU triage protocols for use in pandemics have been suggested in Cheung et al. (2012); Christian et al. (2014); and Zhang et al. (2020).Forecasting bed demands is essential to avoid ethical dilemmas (Azcarate et al. 2020;Garcia-Vicuña et al. 2020).According to Utley et al. (2011), "the impact of triage is dependent on the level of demand and on the scale of achievable differences between included and excluded groups in terms of anticipated LoS and critical care survival".A simulation model can improve critical resource planning during a pandemic; and can be used as an off-line learning tool to test new triage protocols, which are not always as effective as might be desired, and other hard-to-anticipate factors must be considered.

Modelling the patient arrival pattern
In this section, we discuss the adequacy of PG models for case prediction purposes.First, we perform a statistical comparison of four different models for their suitability.We then describe the use of the Gompertz PG model to simulate daily hospitalization series.

Population growth models
The simulation model needs to generate the daily patient arrivals to the hospital(s), which is a non-stationary process highly dependent on the number of positive (active) cases in the population.A compartmentalized epidemiological model, such as the SEIR model, enables analysis of the spread of the disease throughout a population.It models transition dynamics between four different states of a population: susceptible (S), exposed (E), infective (I), and recovered (R).The model depends on epidemiological parameters such as the infection rate (the number of people that an infective person infects per day), the disease latent time (the lag between contact with an infected person and the appearance of symptoms), the recovery rate, and the death rate.The basic SEIR model has been extended to categories such as the protected (P) and the quarantined people (Q) (Godio et al. 2020) and other case detection and symptom statuses, up to a total of eight or more compartments (Giordano et al. 2020).Stochastic transmission models have also been considered (Kucharski et al. 2020).All these extensions add details to the model but also more complexity, which does not necessarily mean greater forecasting reliability, since it increases the number of model parameters to be estimated (Roda et al. 2020).Roda et al. (2020) demonstrate a linkage between the transmission rate and the case-infection ratio, resulting in a continuum of best-fit parameter values.These can produce significantly different predictions for the epidemic: the hallmark of a non-identifiability problem.These difficulties motivated us to consider parametrically parsimonious models, such as the PG type, which are able to generate curves of the shapes generally associated with pandemic variables (positive (active) cases, hospitalizations, deaths): monotonic, humped, and S-shaped.Rypdal and Rypdal (2020) found that PG models are obtained from the SIR model by making reasonable assumptions about the SIR parameter trends over time.
Examples of growth models found in the literature include the Gompertz (Gompertz 1825), the Richards (Richards 1959), the Stannard (Stannard et al. 1985), and the Logistic model (Ricker 1979).They all start with exponential growth but each has a specific, gradually decreasing growth rate.All produce S-shaped curves describing the evolution of pandemic variables departing from one or a few initial cases, growing initially at an exponential rate before reaching a plateau, and then decreasing to zero when the pandemic expires.The equations describing the number of cases in population y, at time x, take the following form: We carried out two statistical analyses to elucidate the adequacy of PG models for representing and predicting the evolution of the pandemic caused by the SARS-CoV-2 virus.The first analysis evaluates the capacity of the four PG models to fit complete sets of real positive case data registered in the 20 most-affected countries during the first wave of the pandemic (as recorded in Worldometer on June 15, 2020).The results included in Appendix 1 show that the Gompertz, Richards, and Stannard models have similar goodness of fit; with all three outperforming the Logistic model.These results are consistent with Rypdal and Rypdal (2020) who found that the COVID-19 related death rate curves of most countries are well described by the Gompertz growth model.The cumulative positive case, hospitalization and death curves have similar shapes because they are all scaled by the factor s, and translated by the factor t. (5) where y I , y h , y d are the cumulative series of positive cases, hospitalizations and deaths, respectively; s h , s d are the scaling factors for hospitalizations and deaths, respectively, and t h , t d are the time lags between infection and hospitalization, and infection and death, respectively.
The second statistical analysis is designed to test the short-medium term predictive capacity of the PG models.For each of the 20 countries used in the first statistical analysis, the data up to the day on which cases exceed 25%, 40%, and 65% of the total cases registered at the end of the pandemic wave are used to predict the cases for the following successive 5, 10 and 15 day horizons.Thus, nine prediction exercises are performed for each country and each PG model.The results included in Appendix 2 show that the Gompertz model surpasses the predictive capacity of the other PG models, outperforming them in all nine cases, and being equaled by the Richards and Stannard models in only four.
Our statistical analysis supports the use of the Gompertz for modelling series of cumulative hospitalizations.The parameters of the original equation of this model, presented above, are more suited to mathematical than biological interpretation, like most equations describing sigmoidal growth curves.Therefore, before using it in our modeling, some transformation will aid interpretation of the curve.Zwietering et al. (1990) rewrite the Gompertz growth model as shown in Eq. (1).
where, e = exp(1), G(t) is the cumulative number of hospitalizations up to time t.
A is a growth model parameter corresponding to the total number of hospitalizations at the end of the outbreak.It is the upper asymptote of the curve.K is the absolute growth rate of the curve at its inflection point.D, known as the lag time, is the time at which G(t) = Aex p(−e), which means that it always occurs at the same percentage (6.6%) of the upper asymptote.This value is less intuitive than either of the others.
Suppose we are at pandemic day n + 1 and have recorded and denoted by h(t), t = −n, . . ., −1 the number of hospitalizations since the beginning and by H (t) the cumulative number of hospitalizations Using these data, the Gompertz growth model parameters p = (A, K , D) are estimated by minimizing the sum of the squared errors (SSE).The estimated parameter vector is denoted by p = Â, K , D and the Gompertz model by G p(t ).The values of G p (t) are used to predict the expected number of hospitalizations for the current and following days, t = 1, . . ., T , as required by the simulation methodology described in the following subsection.

Simulation of the patient arrival pattern
Once the curve G p(t ) is fitted to the hospitalization data H (t), t = −n, . . ., −1 of a certain region up to the present day, it is used to predict and simulate the number of new hospitalizations for each of the following days t = 1 . . ., T .The function argument t is continuous and we will assume that t = 0 represents both the end of the last day of recorded hospitalizations and the start of the current day.Therefore, t = 1 is the time at which the current day ends.
The simulation procedure is summarized in the following four steps: 1. Fit the Gompertz curve to cumulative hospitalization data series (e.g. by using the least squares method).Record the estimated parameter vector p = Â, K , D and its covariance matrix Σ p.
2. Simulate a parameter vector p = (A, K , D) from the Multivariate Normal distribution N p, Σ p .
3. Calculate the expected number of arrivals N (t) for day t, t = 1, . . ., T , where t = 1 represents the current day (start of the simulation) and T is the simulation horizon.Use the Gompertz curve G p (t) with parameters p to calculate 4. Simulate the number of hospitalizations, for each day t, t = 1, . . ., T , in the future, as observations from a Poisson distribution with mean λ(t) = E(N (t)): 5. Repeat steps 2-4 as many times as necessary for the different hospitalization sequences to be simulated.
This simulation procedure takes into account both variability due to uncertainty in the estimation of the Gompertz parameters and variability in hospital arrival numbers around the mean.Figure 1 illustrates the four steps.The upper-left hand corner of the graph shows the Gompertz model (green curve) fitted to the available data (black dots); the point estimator p and the covariance matrix Σ p are used in the second step to sample a parameter vector p.The upper-right hand corner of the graph shows the Gompertz curves associated with parameter vectors p in a tolerance region obtained from the multivariate normal distribution (Dong and Mathew 2015) as where χ 2 3 (q) denotes the qth percentile of a chi-square distribution with df = 3.Clearly, R is the central 100q% region of the multivariate normal distribution N p, Σ p .
Each parameter vector p in the region R is associated with a Gompertz curve G p (t) (shown in the lower left hand corner of Fig. 1) compatible with the observed data (and different from G p(t )).This Gompertz curve G p (t) provides the expected number of hospitalizations among simulated arrivals generated by sampling from a Poisson distribution.A sequence of trajectories with the simulated number of hospitalizations for future days t = 1, . . ., T (shown in the lower right hand corner of Fig. 1) is obtained by replicatimg the sampling of a vector p, the calculation of the expected number of future arrivals λ(t), t = 1, . . ., T , from the Gompertz curve G p (t), and the

Modelling the patient flow
This section focuses on modelling patient flow through the health system.First, we describe the possible patient pathways through the hospital, and then explain how the LoS of each patient is modelled.

Hospital patient pathway
COVID-19 patients can access the health system in a variety of ways: following diagnosis with COVID-19 in a primary healthcare facility, hospital emergency department, or nursing home; or after undergoing a SARS-CoV-2 test control (such as a Polymerase chain reaction (PCR) test), etc. Depending on the severity of his/her condition, the person is admitted to the health care system as a COVID-19 patient, either in a hospital ward or directly in the ICU.
The COVID-19 patient pathway through the hospital is the same as for other hospital patients, but, due to the highly contagious nature of the virus, COVID-19 patients require dedicated resources and cannot be mixed with other patients.Figure 2 shows the patient flow through the health system, highlighting the transitions between the hospital ward and the ICU.Patients can be admitted either to the ICU or first to a hospital ward with potential transfer to the ICU if their condition deteriorates.Discharge from a ward can follow either death or a health improvement.Patient transfers from the ICU to a hospital ward occur after a health improvement.

Stochastic modelling of hospital LoS
The following variables are used to model LoS in the hospital: • X 1 , the LoS in the hospital ward of a patient not needing ICU.
• Y , the LoS of a patient in the ICU.
• Z , time spent by a patient in the hospital ward before transfer to the ICU (applies only to patients transferred to the ICU from the hospital ward).
• X 2 , the LoS of a patient in the hospital ward after being discharged from the ICU.
The following probabilities determine the patient-pathway through the healthcare facilities: • p I , the probability of direct admission to ICU upon arrival.
• p W I , the probability of a patient initially admitted to a ward requiring transfer to ICU. • p I W , the probability of patient transfer from ICU to a ward.Then, 1 − p I W is the probability of death in the ICU.
Then, the probability of ICU requirement is p As the pandemic progresses and more COVID-19 patients need hospitalization, the new data collected from these patients can be used to update the probability distribution parameters estimates and patient pathway probabilities.Given that only a small percentage of ward admissions and an even smaller percentage of ICU admissions have been discharged after a few weeks from the start of the outbreak, the associated information on most of them is partly unknown, that is, they constitue censored data.For example, a patient admitted to the ICU 10 days ago provides an ICU LoS datum x such that x > 10.
This motivates us to perform daily updates of the distribution parameters and probabilities by adding the fresh data, thereby enlarging the sample size, reducing the degree of censorship bias and ultimately obtaining more accurate parameter estimates.The parameter estimation is done by the maximum likelihood method.For example, for the estimation of θ Y , the parameter vector of the distribution function of variable Y is performed by maximizing the following likelihood function: where {y i , i = 1, . . ., n Y } is the set of exact value observations, that is, those corresponding to the LoS of patients now discharged from the ICU, and f θ Y (y i ) is the density function.y * i , i = 1, . . ., n Y * is the set of censored values, that is, those corresponding to the LoS of patients remaining in the ICU at the time of the statistical analysis, and The use of probability plots facilitates identification of the parametric probability distribution family that best fits the data.The parameters of the selected probability distribution family are estimated by the maximum likelihood method.Weibull and Lognormal distribution families have proved to be good probability models for LoS variables, as will be shown in Sect.6.
At the beginning of a new pandemic, there is insufficient understanding of the disease and possibly no known effective treatment, as was the case with the COVID-19 outbreak.As medical and biological research progresses, the discovery of new drugs and therapeutic protocols improves patient care and alters lengths of stay in hospital wards or ICUs.This observation reinforces the need to gather every possible new piece of patient admission and discharge data for use in updating estimated distribution parameters and branching probabilities.

The discrete event simulation model
In this section, we present the mathematical modelling of hospital dynamics using a DES model.We pay attention to starting the simulation from the current state of the health system, which is one of the distinguishing features of this application of healthcare system simulation modelling.

Entities, state variables, and events
DES models create moving entities that are transformed by several processes until they exit the modelling system.In the healthcare system that concerns us, the entities are the COVID-19 patients and the processes are the health care received in the hospital ward and/or ICU.The system is described by a set of state variables, which provide at any time a complete representation of the simulated system, and the set of events, which modify the value of the state variables.The simulation model represents patient flow through the different hospitalization routes; that is, the area enclosed by dashed lines in Fig. 2. In this subsection, we present two types of healthcare system state variables and the set of events separated into two categories.
We consider two distinct types of state variables: global and patient-level.The two global variables, B(t) = (B W (t), B I (t)), describe bed occupancy by COVID-19 patients in hospital wards and the ICU, respectively, at any time t.Total COVID-19 hospitalizations at time t, N (t), is given by the sum of these two state variables, Each patient i admitted to hospital has two associated state variables.The patientlevel state variable S i (t), which records the condition of patient i at time t, can take one of three values: W 1 , when patient i is admitted to a ward without a previous stay in ICU; I , when patient i is admitted to ICU; and W 2 , when patient i is in a ward after transferral from ICU.The patient-level state variable R i (t) records the time at which patient i enters state S i (t).
Two different types of events can affect the values of the state variables.They have been classified by the nature of the variation in B(t): an increase or decrease in N (t), or a variation in B W (t) and B I (t) not affecting their sum.The first set of events E A are associated with patient arrival times.This group includes only external arrivals, i.e., positive cases detected outside the hospital that require hospitalization.These events are generated by the simulation methodology described in Subsection 0, and each arrival is classified as an ICU arrival or a ward arrival with probabilities p I and 1 − p I , respectively.The last type of patients are also subdivided into two groups, those who will require ICU admission after some time on the ward (with probability p W I ) and those who will not (with probability 1 − p W I ).
The second category includes the events E B leading to a patient's end of a stay in the ICU or the ward, and altering the value of their patient-level state variables, and also either B W (t), or B I (t) or both.As stated in Sect.4, there are several events of this type: Figure 3 outlines the DES model of the health system.

Starting the simulation run
The purpose of the simulation model is to predict short-term resource needs, with precision strongly depending on the model's accuracy both in representing the initial state of the healthcare system and the initial resource utilization rate.This last aspect of mathematical modeling is usually not very important when the aim of the simulation is to investigate the long-term behavior of a system in its stationary state, which is usually independent of its initial state.However, when the simulation is used as a predictive tactical decision-making support tool, the initial state of the simulation model and the initial dynamics of the system are the main determining factors of the state of the healthcare system in the near future.The simulation clock is set to zero at the time of the last update of the Electronic Health Record (EHR)-file, which we assume to have taken place at the end of the kth day of the pandemic.The simulation model begins to simulate future changes from day (k + 1)th, using the information collected during the first k days of the pandemic, taking hospitalizations at the end of the kth day as the initial state.The point of transition from the past to the future occurs at the beginning of the simulation run, for which the event calendar must be initialized (Law 2014).The simulation of event type E A , a new patient arrival, was explained in Subsection 0. We will now explain how to simulate type E B events for patients currently admitted, that is, at time zero in the simulation model.The value of the state variables, number of ward patients, B W (0), and number of ICU patients, B I (0), as well as times R i (0) in the current state S i (0) for each patient i, can be calculated from the EHR-file, which records admission, discharge, and ward/ICU transfer dates for each patient.This set of state variables defines the initial state of the healthcare system simulation model.The discharge time of each ICU patient i is calculated by sampling from the random variable Y conditioned to a stay longer than r i = (k + 1 − R i (0)), the number of days already spent in the ICU.Let y i be a value sampled from the conditional distribution Y |Y > r i , then the value y i − r i is the simulated ICU discharge date for patient i, which is assigned to the position E B I of the event calendar vector associated with patient i.
A patient admitted to a ward for r i days can ultimately be discharged from the hospital or transferred to ICU.The probability of ICU transfer for a patient hospitalized for r i days, denoted by p I CU|r i , is calculated with Bayes theorem: where B is the event of a ward patient requiring ICU transfer.C is the event of a ward patient not requiring ICU transfer.A hospital trajectory is simulated for each patient i already admitted to a hospital ward.The first step of the simulation concerns the decision as to whether the patient will be admitted to the ICU (with probability p I CU/r i ) or not (with probability 1− p I CU/r i ).Time to ICU transfer is then simulated by sampling from the conditional distribution Z |Z > r i , and assigning the value z i − r i to the event E B Z .If the patient i does not require ICU care, the hospital discharge event E B X 1 will occur at time x i − r i , where x i is sampled from the conditional distribution X 1 |X 1 > r i .
Similarly, for each ward patient i previously discharged from the ICU, the time of discharge from the hospital is simulated by sampling a value x i from the conditional variable X 2 |X 2 > r i .The value x i − r i is the simulated discharge time and is assigned to position E B X 2 of the event calendar vector associated with patient i.
Once discharge times and transfer times between ward and ICU have been simulated for each hospitalized patient, and recorded in the event calendar (together with the arrival time of the next COVID-19 patient) the DES model is ready to advance the simulation clock from time zero to the minimum of the times recorded in the event calendar.The state variables and calendar events are then updated accordingly and the main loop of Fig. 3 is repeated until the simulation run is complete.
The fitted Gompertz curve forecasts daily patient arrivals, which can be uniformly distributed over the following 24 h or according to a non-stationary pattern when, for example, arrivals drop significantly overnight.

Simulation output
The DES model works by generating patient arrivals, discharges, and transfers causing variations in ward and ICU occupancy levels, which are recorded by statistical counters.The simulation model includes two sources of randomness: the number of patient arrivals (hospital and ICU), and patients' LoS.Therefore, each time the DES model is run departing from the current situation of the healthcare system (generating randomness based on a different seed) the ward and ICU bed requirements differ.Figure 3 in Sect.5.1 shows one iteration of the simulation, with which different trajectories can be obtained with each replication of that routine.Thus, the simulation model is run many times (thousands) to get a statistical distribution of the number of ward and ICU beds needed each day.
The DES simulator generates percentile data, which are stored in an Excel file.The 5 th percentile (P5), 50th percentile (P50), and 95th percentile (P95) are plotted on a graph as confidence bands for future resource needs.Figure 4 shows an example of these graphic outputs.The green line represents the real occupancy trend and the black dot indicates the SSP, that is, the moment from which the hospital system dynamics are simulated.The left-hand side shows four different possible ICU bed demand trajectories (T1, T2, T3, and T4), while the right-hand side shows the confidence bands.

Application of the DES model
The methodology introduced in the previous Sections, implemented in software, has been used by the Governments of the Spanish Autonomous Regions of Navarre and La Rioja to support bed planning in their hospitals during the two pandemic waves experienced to date.We briefly describe how the virus has affected these two regions globally, then explain the stochastic modelling of the hospitalized patients and their pathway through the hospital, and then present the predictions obtained by the DES model at different times.We conclude this Section with some observations and tips for the practical use of this forecasting tool.

Incidence of COVID-19 disease
Navarre and La Rioja are two Regions of northern Spain with populations of about 650,000 and 350,000, respectively, more than half concentrated around the capitals (Pamplona and Logroño).With this population distribution, Navarre Health Services have a main hospital in Pamplona, with a bed capacity of more than 1000, and two secondary hospitals in two of the most populated cities (Estella and Tudela) bringing total bed capacity to 1,466 ward beds and 45 ICU beds.La Rioja has a main hospital in Logroño with 630 hospital beds and 21 ICU beds and a secondary 80-bed hospital in Calahorra.Both regions have the possibility of increasing bed numbers if necessary.
Navarre and La Rioja figure among the five Spanish autonomous regions with the highest cumulative COVID-19 rates during both waves of the pandemic, according to data collected by the Governments of Navarre and La Rioja.Daily numbers of new admissions have important implications for hospital management teams.Figure 5 shows hospital admission statistics for both regions from early March 2020 to mid-December 2020.Two waves can be appreciated each with its own characteristics.The first is shorter but steeper, while the second is more prolonged.By December 16, 4228 COVID-19 patients had been admitted to hospitals in Navarre (6.5 per 1,000) and 2253 COVID-19 patients in La Rioja (6.4 per 1000).

Stochastic modelling of hospitalizations and lengths of stay
As the pandemic spreads, the data load increases, making it possible to improve the simulation model.Since March 16, 2020, the arrival pattern is calculated from the hospital admission series.Figure 6 shows different results after fitting the Gompertz growth model to cumulative hospitalizations in La Rioja during the first wave.As the pandemic progresses and more data becomes available, the Gompertz curve fit usually improves.However, due to the wide variability of the real data, minor fit deviations, such as that of March 31, 2020, are possible.Separate curve fits are shown in Fig. 7, along with the real daily hospitalization series.These graphs show the wide variability of the data.
Both ward and ICU lengths of stay are estimated daily, as explained in Sect.0. Ward lengths of stay (X 1 ) fit reasonably well to a lognormal distribution (L N (μ, σ )), whilst the Weibull (W (α, β)) distribution is found to provide a better fit for ICU LoS (Y ).During the pandemic, each time the data was analyzed, the distribution parameters were reset for the simulation.Table 1 lists the probability distributions that best fit the lengths of stay in the two waves for Navarre (Na) and La Rioja (Ri), sorted by gender, male (M) and female (F).N stands for the number of patients analyzed.Differences can be observed between regions, waves, and genders, especially in ICU lengths of stay and the percentages of ICU admissions.Figure 8 shows two probability plots obtained from the fits of the ward and ICU LoS distributions (regardless of gender) during the first wave of the pandemic in Navarre.

Ward and ICU bed occupancy forecasts
Figure 9 shows the bed occupancy forecasts for the hospitals of Navarre based on the March 21, 2020 simulation for the following days.Note that the most important  predictions for the medical staff are for the short-medium term (yellow-colored area in Fig. 9), and there is a close match between the simulated and the real data, plotted in green.The simulator's ability to obtain accurate 10-day forecasts, even in the early stages of the pandemic, is demonstrated here.More ward and ICU bed occupancy predictions at different moments of the second wave in Navarre, in comparison with real occupancy can be seen in Fig. 10.Three dates have been selected to show the data trend pattern.The first is September 20, 2020, when the occupation began to increase significantly.The second is October 27, 2020, some days before the peak in ward and ICU bed occupancy.The last is November 13, 2020, when peak occupancy had passed

Tips for the use of the DES model in practice
All results shown in the previous subsections were obtained by fitting the growth model and probability distributions to the available data at the prediction times.However, during the first stages of an outbreak, when patient hospitalization data are scant, it could be hard to achieve accurate Gompertz model parameters and LoS probability distribution estimates to feed the simulation model.The beginning of an outbreak is usually marked by exponential growth in the data, potentially leading to a very high upper asymptote from the Gompertz model, which, in practical terms, could be considered as infinity (e.g.several orders of magnitude greater than the total population of the region).Taking this estimation as a simulation input, bed demand rises exponentially to figures much higher than the entire regional population.This is not a realistic estimation even in the worst case scenario of the entire population being infected.Nevertheless, in this case, the estimation would be valid for as many days as the exponential growth holds, and, as more data is collected, the accuracy of the upper asymptote estimate increases.However, to avoid unlimited exponential growth, and improve the accuracy of the estimates at the beginning of a new pandemic wave, we recommend conducting a mixed estimation of the Gompertz parameters, combining an estimate based on expert opinion for one parameter with statistical fit estimates for the other two.Specifically, experts are able to estimate total hospitalizations based on the population incidence rate scaled by a hospitalization factor.For example, at the beginning of the first wave of the pandemic, Navarre Health Administration professionals guessed that 1% of the population would catch the virus (based on flu incidence), and 40% of the cases would require hospitalization (estimating from initial data).Using values ranging around these estimates, we could run the simulation model to obtain possible hospitalization scenarios throughout the entire wave.These predictions overestimated total hospitalizations by the end of the first wave by only 30%.At the beginning of the second wave, the initial predicted maximum can be the value observed in the first wave or a percentage of it.However, as soon as enough data are available for an accurate parameter estimation, the simulation model should be completely data-driven.
A similar problem arises when estimating the parameters of the LoS probability distributions at the onset of a new wave.When insufficient data prevents the statistical estimation of all parameters, the simulation model has to be flexible enough to allow manual parameter input.We recommend the use of the triangular distribution to represent the LoS for different hospital status levels.The triangular distribution family is a popular choice for the estimation of task completion times because it embodies the idea of the 'three-point estimation' where subjective judgment is used to estimate a minimum, a 'best guess', and a maximum value of the variable of interest (Law 2014).Experts can rely on values reported for the countries first affected by the pandemic (for example the cases of China and Italy are described in Grasselli et al. (2020); Guan et al. (2020); Young et al. (2020);and Zhou et al. (2020)).For the second and successive waves, the probability distributions estimated at the end of previous waves can be used initially.For example, during the first days of the outbreak, Navarre Health Administration experts fixed the minimum, maximum, and most probable total LoS as 10, 18, and 13 days respectively.

Conclusion
Healthcare systems are overburdened as high demand for healthcare services from COVID-19 patients places strains on ICU capacity and creates excessive workloads for healthcare professionals.Accurate predictions of patient care resource needs are essential to advanced resource planning which can ease pressure on the system and relieve stress among hospital staff.Accurate predictions optimize response times and thus help to save lives.
Under normal circumstances, managers cope with demand surges through resource contingency plans based on predictions made about one week in advance.In this paper, we have developed a DES model to predict hospital resource needs, particularly in terms of ward and ICU beds.The simulation model is fed with new hospitalization predictions generated by a PG model.The Gompertz growth model was selected following an analysis of the fit and forecasting properties of four PG models: Logistic, Richards, Stannard, and Gompertz.Forecasting improvements could be achieved using an ensemble of these models, but such an exercise is beyond the scope of this paper and remains for future research.Forecasting accuracy can be improved by including other factors affecting resource consumption, such as age and the Adjusted Morbidity Group (AMG), in LoS stochastic models.
The structural simplicity of the simulation model makes it appropriate for general use, i.e., it can be adapted to estimate bed needs in any geographic area.The growth model requires only three parameter estimates, which can be obtained directly from the observed data.Easy online parameter estimation is one of the advantages of this model over other complex models, such as the SIR type.
It is worth mentioning the strength of simulation models in this context of uncertainty, that is, their capability to run what-if scenarios enabling decision-makers to explore the consequences of different policy choices, such as the spatial allocation and quantity of additional healthcare resources required by COVID-19 patients in a context of uncertain demand.The simulation model is data-driven, patient arrivals and lengths of stay can be estimated from data, but it also has the flexibility of allowing the use of simulation from user-determined input to explore additional scenarios.
A distinct technical/methodological feature of the simulation model is its focus on the transition period of the health system rather than the stationary state as is usual in simulation studies or on transition periods following regeneration points.This simulated transition period is unique, given that the outbreak has no regeneration points.Therefore, accurate representation of the initial health system status is paramount.The simulation of remaining LoS per hospitalization has shown to be a key point to the smooth projection of health system dynamics and the process of linking them (and mixing them) with the new dynamics obtained from simulated new patient arrivals and lengths of stay.However, the simulation of the remaining LoS depends on the amount of information known about hospitalized patients.In this paper, we have considered patient-level information (exact admission and discharge dates).In cases where only aggregated hospital-level information is available, that is, daily numbers of admissions and discharges, an estimated admission date per patient at time zero of the simulation is required.
The simulation model can be extended to include non-COVID-19 ICU and ward bed utilization.This extension would enable the creation of hospital scenarios on which to test the effects of decisions involving other hospital areas, such as a reduction in elective surgeries to free more beds for COVID-19 patients during epidemic waves.The ultimate purpose is to create a learning tool by developing an interactive simulation model to enable the inclusion of patients from all types of pathways (ordinary and nonordinary, such as pandemic patients), where bed management decisions are made by the program user.
The methodology presented in the previous Sections was implemented in software using the Python programming language.It takes input from a data file containing a record of six variables including sex and age of patient and four dates representing the hospital and ICU admission and discharge times for each patient admitted to the hospital so far.The dates for patients still awaiting discharge or ICU admission are left blank.These four dates enable estimation of all the LoS probability distributions and branching probabilities.The additional age and sex data enable segmentation of the patient population.The functionalities and manner of use of the software evolved through time and from one pandemic wave to another.Initially (from March 2020 to June 2020), the simulation model was computationally implemented on its own, such that a daily manual statistical analysis was required to fit probability distributions.Throughout this period, the predictions and reports were drawn up by the research group and sent to the hospital's COVID-19 logistics manager.Only one regional government used the results of our simulation model during this period.The statistical analysis was automatized and integrated in the software during the summer of 2020.In addition, the user interface, output analysis and automatic reports generation were also implemented.Then, from October 2020 and throughout the second pandemic wave, the analyses performed by local government health administration personnel (Govts. of Navarre and La Rioja), under the supervision of the research group.From December 2020, and throughout the third and fourth waves, the health administration analysts worked almost autonomously.During the last period (January to May, 2021) the research group also assisted the Spanish Health Ministry by providing predictions for each of the 17 Autonomous Communities in Spain.
After 15 months' cooperation with health authorities, we have reached the conclusion that the success of this operations-research support system for decision-makers in difficult pandemic times is due to the following factors: • Multidisciplinary teamwork and a background of cooperation with health managers.The research group q-UPHS (www.unavarra.es/quphs)has been cooperating for more than 10 years in the solving of real problems surrounding health service improvements.Problem analysis is always addressed through multidisciplinary teamwork involving academics (engineers and mathematicians) and health service personnel (managers, medical staff, and computer scientists).• A request for assistance from the health administration.At the beginning of the pandemic, health managers raised the need for a a short-and medium-term beddemand forecasting method to improve their bed management system.Medical space and equipment (including staff) planning is based on 10-bed modules.Prior knowledge of bed needs therefore facilitates resource planning.• Rapid response.Five days after the original request, the group presented the simulation model and the initial results (predictions) for validation by the region's hospital and healthcare logistics managers.• Joint development of the model.Decision-makers were involved in the development of the model and maintained continuous communication with the research team.• Continuous improvement of the computer application.Suggestions made by health managers and a user-friendly software interface were implemented to free users gradually from the need for supervision by the research team.• Joint monitoring of the results.Quality assessment and critical analysis of the predictions were performed jointly by the research team and health managers.
Thus, the simulation paradigm presented in this paper is suitable for the realistic representation of health service processes, which makes it more credible and easier to understand by the managers who will have to rely on the results in their decisionmaking.From March 2020 until the moment of writing this paper (end of June 2021), the simulation model has been used daily to predict hospital resource needs in the Spanish regions of Navarre and La Rioja, and all other Spanish Communities from January, 2021.We found that the involvement and continuous improvement suggestions of the hospital logistics manager in the development of the simulation model has been crucial for obtaining a user-centered simulator and a practical forecasting tool to enable daily updates of data from the hospital administration's information system.Positive cases predictions for the following 5, 10, and 15 days, at 25%, 45%, and 65% of total cases detected The prediction of the fitted curves for the next 5, 10, and 15 days is assessed by calculating the MAE.These time horizons are considered as sufficient for the hospital managers to adapt extra resources for new needs.As new positive case data is added every day, and predictions are refreshed also every day, the long-term predictive capacity of the model will not be analyzed.Table 3 summarizes the relevant information from all the tables shown at the end of this appendix.It indicates the number of countries in which each model is the best in terms of predictive capacity (as before, differences smaller than 0.1% have been considered equal).It is observed that the Gompertz model is the one that more accurately predicts future values, specifically in all time horizons analyzed.For this reason, the Gompertz model is recommended for the prediction of new cases of 5,6,7,8 and 9).The rest of the appendix is organized as follows.On the one hand, we present fits to the curves until the selected days, obtaining an MAE for each model and country.On

Table 3
The number of countries in which each model is equal or better than the others in terms of predicting new positive cases for the next 5, 10, and 15 days Model  To abbreviate the headings in the following tables, the Logistic (L), Gompertz (G), Richards (R), and Stannard (S) functions are concatenated with the number of days to be predicted (5, 10, and 15) using the symbol "_".Therefore, column G_10, for example, shows the normalized MAEs obtained for the next 10-day forecast, having fitted the data with the Gompertz function.

Fig. 1
Fig.1The simulation procedure for the patient arrival pattern.Steps 2-4 are replicated as often as necessary for the different patient arrival and hospitalization sequences to be simulated

Fig. 2
Fig. 2 Representation of COVID-19 patient flow in the health system

Fig. 3
Fig. 3 Flow diagram of the health system simulation model.Two types of events are considered, external arrivals and ward or ICU end of stays

Fig. 4
Fig. 4 Simulation output for ICU bed demand for the following days.The left-hand side shows four different trajectories starting from the SSP; the 3 lines on the right-hand side correspond to the 5th, 50th, and 95th percentiles

Fig. 5
Fig. 5 Daily recorded hospitalizations in Navarre and La Rioja from early March 2020 to mid-December 2020

Fig. 6 Fig. 7
Fig. 6 Cumulative hospitalizations in La Rioja from March 3 to June 9, 2020, and different curve fits obtained from the Gompertz growth model

Fig. 8 Fig. 9
Fig. 8 Probability plots of the fits of the ward and ICU LoS distributions during the first wave of the pandemic in Navarre

Fig. 10
Fig. 10 Comparison between the predictions made in Navarre at different moments of the second wave (2020/09/20, 2020/10/27, and 2020/11/13) for the number of beds occupied in both hospital and ICU, and real occupancies values represent the best scores * These values are of 18 countries because 2 of them (Pakistan and Bangladesh) have no data for that period 123 • E B Z events dictating end of ward stay prior transfer to ICU, which are generated by sampling from the variable Z .The variable B W (t) decreases by one unit and B I (t) increases by one.• E B X 1 events signaling end of ward stays with no need for ICU transfer, which are generated by sampling from the variable X 1 .The variable B W (t) decreases by one unit.• E B X 2 events associated with end of ward stay for a patient transferred from ICU, and generated by sampling from the variable X 2 .The variable B W (t) decreases by one unit.• E BY events associated with end of ICU stay and generated by sampling from the variable Y .The variable B I (t) decreases by one unit.The patient is transferred to a ward with probability p I W , and B W (t) increases by one unit.The event calendar vector at time t has B W (t) + B I (t) + 1 positions.One includes the time of the next patient arrival (associated with event E A ), B W (t) positions, one for each ward patient, containing their hospital discharge times (associated with events E B X 1 or E B X 2 ) or ICU transfer times (associated with event E B Z ), and B I (t) positions, one for each ICU patient, storing the discharge time from ICU.

Table 1
The parameters fitted to different populations at different moments during the pandemic, sorted by region (Navarre and La Rioja), wave, and gender, showing ward and ICU lengths of stay distributions and ICU admission probabilities Region

Table 2
The 20 most-affected countries by COVID-19 until June 15, 2020.The last four columns show the MAE calculated for the fit with each of the applied models

Table 4
MAE calculated for the fit of each model at 25% of total cases detected Bold values represent the best scores the other hand, the tables of the MAEs made in the predictions are shown.To facilitate the comparison of results, MAEs are normalized by the total number of positive cases on the selected days.From these results, we can conclude that the Gompertz model outperforms in predictive capacity the other PG models and it is recommended to predict new cases of COVID-19.

Table 5
Normalized MAEs obtained for each prediction and model at 25% of total cases detected

Table 6
MAE calculated for the fit of each model at 40% of total cases detected

Table 8
MAE calculated for the fit of each model at 65% of total cases detected