1 Introduction

Statistical modeling for describing and predicting patient arrivals to Emergency Departments (EDs) is a basic tool of every study concerning ED patient load and crowding. Indeed, all the approaches adopted to this aim require an accurate model of the patient arrival process. Such a process plays a key role in tackling the widespread phenomenon of overcrowding which afflicts EDs all over the world (see e.g., Ahalt et al. (2018), Bernstein et al. (2003), Daldoul et al. (2018), Hoot and Aronsky (2008), Hoot et al. (2007), J Reeder et al. (2003), Vanbrabant et al. (2020), Wang et al. (2015), Weiss et al. (2004), Weiss et al. (2006)). The factors that most significantly affect overcrowding are both external and internal: the former concern the patient arrival process, the latter the patient flow within the ED. Therefore, both aspects must be accurately considered for a reliable study on ED operation.

Several modeling approaches for analyzing ED patient flow have been proposed in the literature (see Wiler et al. (2011) for a survey). The main quantitative methods used are based on statistical analysis (time–series, regression) or general analytic formulas (queueing theory). In particular, a realistic model for patient arrivals is crucial for dealing with important issues regarding the patient flow through an ED. To this aim, time–dependent queueing models have been successfully adopted; for instance, in Vile et al. (2017) the staffing level problem has been efficiently dealt with using a queueing-theoretic approach in a time–dependent setting with a time–varying input. However, simulation modeling (both Discrete Event and Agent-Based Simulation) is currently one of the most widely used and flexible tools for studying the patient flow through an ED. It enables performing effective scenario analysis, aiming at determining bottlenecks (if any) and testing different ED settings. We refer to Salmon et al. (2018) for a recent survey on simulation modeling for ED operation.

Simulation modeling can also be combined with other techniques to improve the responses provided; for instance, in the recent paper by Gartner and Padman (2020) a Discrete Event Simulation model has been linked with machine learning models to better estimate patients' perception of service delays in emergency care.

In the time–dependent modeling approach, a methodology that appears to be a step forward is Simulation–Based Optimization. It combines a simulation model with a black-box optimization algorithm, aiming at determining an optimal ED setting, based on a suitable objective function (representing some KPIs) to be maximized or minimized (Ahmed and Alkhamis 2009; Guo et al. 2016, 2017).

Modeling methodologies are generally based on assumptions that, in some cases, may represent serious limitations when applied to complex real–world cases, such as ED operation. In particular, when dealing with stochastic modeling of ED patient arrivals, due to the nonstationarity of the process, a standard assumption is the use of a Nonhomogeneous Poisson Process (NHPP) (see e.g., Ahalt et al. (2018), Ahmed and Alkhamis (2009), Guo et al. (2017), Kim and Whitt (2014a), Kuo et al. (2016), Zeinali et al. (2015)). We recall that a counting process X(t) is an NHPP if 1) arrivals occur one at a time (no batch); 2) the process has independent increments; 3) increments have a Poisson distribution, i.e. for each interval \([t_1, t_2]\),

$$\begin{aligned} P\left( X(t_2)-X(t_1)=n\right) = e^{-m(t_1,t_2)} \frac{[m(t_1,t_2)]^n}{n!}, \end{aligned}$$

where \(m(t_1,t_2)= {\int _{t_1}^{t_2} \lambda (s)ds}\) and \(\lambda (t)\) is the arrival rate. Unlike the Poisson process (where \(\lambda (t)=\lambda\)), the NHPP has nonstationary increments and this makes it suitable for modeling the ED arrival process, which is usually strongly time–varying. Of course, appropriate statistical tests must be applied to available data to check if the NHPP fits. This is usually performed by assuming that the NHPP has a rate that can be considered approximately piecewise constant. Hence, the Kolmogorov–Smirnov (KS) statistical test can be applied in separate and equally spaced intervals, and usually the classical Conditional–Uniform (CU) property of the Poisson process is exploited (see Brown et al. (2005), Kim and Whitt (2014a), Kim and Whitt (2014b)). Unlike the standard KS test, in the CU KS test the data are transformed before applying the test. More precisely, by the CU property, the arrival times of the piecewise constant NHPP in each interval are transformed into a sequence of i.i.d. random variables uniformly distributed on [0, 1], so that the process can be considered a (homogeneous) Poisson process in each interval. In this manner, the data from all the intervals can be merged into a single sequence of i.i.d. random variables uniformly distributed on [0, 1]. This procedure, proposed in Brown et al. (2005), removes nuisance parameters, obtaining independence from the rate of the Poisson process on each interval. Hence data from separate intervals (with different rates on each of them) and also from different days can be combined, avoiding the common drawback due to large intra-day and inter-day variation of the ED patient arrival rate. Brown et al. (2005) apply the CU KS test after performing a further logarithmic data transformation. In Kim and Whitt (2014b), Kim and Whitt (2015), this approach has been extensively tested along with alternative data transformations proposed in the early papers Durbin (1961) and Lewis (1965).
However, in Kim and Whitt (2014a) the authors have observed that this procedure applied to ED patient arrival data is sound only if the data are “analyzed carefully”. This is because the following three issues must be seriously considered: 1) data rounding, 2) choice of the intervals, 3) overdispersion. The first issue may produce batch arrivals (zero-length interarrival times) that are not allowed in an NHPP, so that unrounded data (or an unrounding procedure) must be considered. The second is a major issue in dealing with ED patient arrivals since the arrival rate can change rapidly, so that the piecewise constant approximation is reasonable only if the intervals are properly chosen. The third issue regards combining data from multiple days. Indeed, in studying the ED patient arrival process, it is common to combine data from the same time slot from different weekdays, which becomes imperative when data from a single day are not sufficient for statistical testing. Data collected from ED databases usually show large variability over successive weeks, mainly due to seasonal phenomena like flu season, holiday season, etc. Therefore, this overdispersion phenomenon must be checked by using a dispersion test on the available data (e.g., Kathirgamatamby (1953)).

In this work, we propose a new modeling approach for the ED patient arrival process based on a piecewise constant approximation of the arrival rate accomplished with not equally spaced intervals. This choice is suggested by the typical situation that occurs in EDs, where the arrival rate is small and varying during the night hours, and higher and more stable in the daytime; this is indeed what happens in the chosen case study. It is worth noting that ED management typically plans resource allocation based on the average number of arrivals expected in a given time–slot (corresponding to an interval of the partition of the 24 h), for instance in staffing allocation. Therefore, it is important to have fewer intervals to represent the real arrival process, still ensuring good accuracy. In this respect, to obtain an accurate representation of the arrival rate \(\lambda (t)\) by a piecewise constant function \(\lambda _D(t)\), a finer discretization of the time–domain is required during the night hours, as opposed to daytime. For this reason, the proposed method finds the best partition of the 24 h into intervals not necessarily equally spaced.

As far as the authors are aware, the use of an optimization method for identifying stochastic processes characterizing the patient flow through an ED was already proposed in Guo et al. (2016), but that study aimed at determining the optimal service time distribution parameters (by using a metaheuristic approach) and it did not involve the ED arrival process. Therefore our approach represents the first attempt to adopt an optimization method for determining the best stochastic model for the ED arrival process. In the previous work De Santis et al. (2020) a preliminary study was performed following the same approach. Here, with respect to De Santis et al. (2020), we propose a significantly enhanced statistical model which allows us to obtain better results on the case study we consider.

In constructing a statistical model of the ED patient arrivals, a natural way to define a selection criterion is to evaluate the fit error between \(\lambda (t)\) and its approximation \(\lambda _D(t)\). However, the true arrival rate is unknown. In the approach we propose, as opposed to Kim and Whitt (2014a), no analytical model is assumed for \(\lambda (t)\), but it is substituted by an “empirical arrival rate model” \(\lambda _F(t)\) obtained by a sample approximation corresponding to the very fine uniform partition of the 24 h into intervals of 15 minutes. In each of these intervals, the average arrival rate values have been estimated from data obtained by collecting samples over the same day of the week, for all the weeks in some months, using experimental data for the ED patient arrival times. Hence, any other \(\lambda _D(t)\) corresponding to a coarser partition of the day must be compared to \(\lambda _F(t)\). In other words, an optimization problem is solved to select the best day partition in not equally spaced intervals, determining a piecewise constant approximation of the arrival rate over the 24 h with the best fit to the empirical model. Therefore, the objective function (to be minimized) of the optimization problem we formulate comprises the fit error, namely the mean squared error. Moreover, an additional penalty term is included to promote the overall regularity of the optimal approximation, which is measured by the sum of the squares of the jumps between the values in adjacent intervals. The rationale behind this term is to avoid optimal solutions with too rough a behavior, namely few long intervals with high jumps.

To make the result reliable, several constraints must be considered. First, the length of each interval of the partition cannot be less than a fixed value (e.g., half an hour or 1 h). Moreover, for each interval,

  • the CU KS test must be satisfied to support the NHPP hypothesis;

  • the dispersion test must be satisfied to ensure that data are not overdispersed, and could be considered as a realization of the same process (no week seasonal effects).

The resulting problem is a black-box constrained optimization problem and to solve it we use a method belonging to the class of Derivative-Free Optimization. In particular, we use the new algorithmic framework recently proposed in Liuzzi et al. (2020) which handles black-box problems with integer variables.

We performed extensive experimentation on data collected from the ED of a big hospital in Rome (Italy), also including some significant sensitivity analyses. The results obtained show that this approach enables determining the number of intervals and their length such that an accurate approximation of the empirical arrival rate is achieved, ensuring the consistency between the NHPP hypothesis and the arrivals data. The regularity of the optimal piecewise constant approximation can also be finely tuned by properly weighing the penalty term in the objective function relative to the fit error term.

It is worth noting that the use of a piecewise constant function for approximating the arrival rate function is usually required by the most common discrete event simulation software packages when implementing the ED patient arrivals process as an NHPP.
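As an illustration of why a piecewise constant rate is convenient in simulation, the following minimal Python sketch samples one day of arrivals from an NHPP with a piecewise constant rate. It is not taken from any simulation package: the function name `simulate_nhpp`, the breakpoints and the rates are hypothetical. Within each interval the process is homogeneous, so exponential interarrival times can be used, restarting at each boundary thanks to the memoryless property.

```python
import random

def simulate_nhpp(breakpoints, rates, rng):
    """Sample arrival times (hours) of an NHPP with piecewise constant rate.

    breakpoints: increasing interval boundaries, e.g. [0, 6, 12, 24]
    rates: arrivals per hour on each [breakpoints[i], breakpoints[i+1])
    rng: a random.Random instance (passed in for reproducibility)
    """
    arrivals = []
    for a, b, lam in zip(breakpoints[:-1], breakpoints[1:], rates):
        if lam <= 0:
            continue  # no arrivals can occur where the rate is zero
        # Homogeneous Poisson process on [a, b): exponential interarrival times.
        t = a + rng.expovariate(lam)
        while t < b:
            arrivals.append(t)
            t += rng.expovariate(lam)
    return arrivals

# Hypothetical rates: quiet night, busy morning, moderate afternoon/evening.
rng = random.Random(0)
day = simulate_nhpp([0, 6, 12, 24], [2.0, 10.0, 6.0], rng)
print(len(day), "simulated arrivals")
```

The returned times are already sorted because the intervals are processed in order.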

To summarize, we propose a model for the patient arrival process at an ED under NHPP hypotheses, aiming at defining the best piecewise constant approximation of arrival rate with not equally spaced intervals. This is obtained by solving an integer nonlinear black–box optimization problem where the number of intervals is not a priori fixed and the constraints ensure that the solution complies with the NHPP hypothesis.

The paper is organized as follows. In Sect. 2 we describe the statistical model we propose along with the optimization problem we consider. Sect. 3 briefly reports information on the hospital ED under study. The results of extensive experimentation are included in Sect. 4, while Sect. 5 reports a preliminary assessment of the approach we propose. Finally, Sect. 6 includes some concluding remarks.

2 Analytical model

The arrival process at EDs is usually characterized by a strong intra-day variation both in the arrival rate and interarrival times: typically, experimental data show rapid changes in the number of arrivals during the night hours, as opposed to a smoother profile at daytime. As we already mentioned in the Introduction, for this reason, the ED arrival process is usually modeled as an NHPP.

2.1 Statistical model

We describe the statistical model of the ED patient arrivals we propose. As opposed to Kim and Whitt (2014a), we do not assume any analytical model for the arrival rate \(\lambda (t)\), and therefore a suitable representation of the unknown function is needed. A realistic representation can be obtained by averaging the number of arrivals observed in experimental data on suitable intervals over the 24 h of the day, not necessarily equally spaced.

Let \(\{T_i\}\) denote a partition P of the observation period \(T = [0,24]\) (h) in N intervals, and let \(\{\lambda _i\}\) be the corresponding sample average rates. Then a piecewise constant approximation of \(\lambda (t)\) can be written

$$\begin{aligned} \lambda _D(t)= \sum _{i=1}^N \lambda _i\,{\mathbf{1}} _{T_i}(t), \quad t\in T, \end{aligned}$$

where \({\mathbf{1}} _{T_i}(t)\) is 1 for \(t\in T_i\) and 0 otherwise (the indicator function of set \(T_i\)). Any partition P gives rise to a different approximation \(\lambda _D(t)\), depending on the number of intervals and their lengths. Therefore a criterion is needed to select the best partition \(P^\star\) with some desirable features.
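The construction of \(\lambda _D(t)\) from data can be sketched as follows. This is only an illustrative Python fragment under simple assumptions: arrival times are given in hours within [0, 24), one list per week for the same weekday, and the names `average_rates` and `lambda_D` as well as the toy data are hypothetical.

```python
from bisect import bisect_right

def average_rates(breakpoints, arrivals_by_week):
    """Sample average rate (arrivals per hour) on each partition interval.

    breakpoints: increasing boundaries of the partition, from 0 to 24
    arrivals_by_week: one list of arrival times (hours in [0, 24)) per week
    """
    m = len(arrivals_by_week)
    counts = [0] * (len(breakpoints) - 1)
    for week in arrivals_by_week:
        for t in week:
            # Index of the interval [a_i, b_i) containing t.
            counts[bisect_right(breakpoints, t) - 1] += 1
    # Average weekly count divided by the interval length.
    return [k / (m * (b - a))
            for k, a, b in zip(counts, breakpoints[:-1], breakpoints[1:])]

def lambda_D(t, breakpoints, rates):
    """Evaluate the piecewise constant approximation at t in [0, 24)."""
    return rates[bisect_right(breakpoints, t) - 1]

# Toy data: two weeks of arrivals on the same weekday.
weeks = [[0.5, 3.2, 9.1, 9.7, 15.0, 18.3], [1.1, 8.4, 10.2, 20.5]]
bp = [0, 8, 24]
rates = average_rates(bp, weeks)
print(rates, lambda_D(9.0, bp, rates))
```

Changing `bp` changes both the number of intervals and the rates, which is exactly the dependence on the partition P discussed above.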

First of all, we need to ensure that there is no overdispersion in the arrivals data. We refer to the commonly used dispersion test proposed in Kathirgamatamby (1953) and reported in Kim and Whitt (2014a). If it is satisfied, then it is possible to combine arrivals for the same day of the week over different weeks. To this aim, for any partition P, let \(\{k_i^r\}\) denote the number of arrivals in the i-th partition interval \(T_i\) in the r-th week, \(r=1,\ldots , m\). Consider the statistics

$$\begin{aligned} Ds_i = \displaystyle \frac{1}{\mu _i}\displaystyle \sum _{r=1}^m \left( k_i^r - \mu _i\right) ^2, \quad i=1,\ldots ,N, \end{aligned}$$

where \(\mu _i = \frac{1}{m}\sum _{r=1}^m k_i^r\) is the average number of arrivals in the given interval for the same day of the week over the considered m weeks. Under the null hypothesis that the counts \(\{k_i^r\}\) are a sample of m independent Poisson random variables with the same mean count \(\mu _i\) (no overdispersion), then \(Ds_i\) is distributed as \(\chi ^2_{m-1}\), the chi-squared distribution with \(m-1\) degrees of freedom. Therefore the null hypothesis is not rejected with \(1-\alpha\) confidence level if

$$\begin{aligned} Ds_i \le \chi ^2_{m-1,\alpha }, \quad i=1,\ldots ,N, \end{aligned}$$

where \(\chi ^2_{m-1,\alpha }\) is, of course, the \(\alpha\) level critical value of the \(\chi ^2_{m-1}\) distribution.
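For a single interval, the dispersion test above amounts to a few lines of arithmetic. The Python sketch below is illustrative only: the function names and the weekly counts are made up, and the critical value is passed in as a parameter (here the tabulated upper 5% point of the chi-squared distribution with 12 degrees of freedom, corresponding to \(m=13\)) rather than computed.

```python
def dispersion_statistic(weekly_counts):
    """Ds = (1/mu) * sum_r (k_r - mu)^2 for the weekly counts of one interval."""
    m = len(weekly_counts)
    mu = sum(weekly_counts) / m
    if mu == 0:
        return 0.0  # no arrivals at all in this interval: nothing to test
    return sum((k - mu) ** 2 for k in weekly_counts) / mu

def not_overdispersed(weekly_counts, critical_value):
    """True when Ds does not exceed the chi-squared critical value."""
    return dispersion_statistic(weekly_counts) <= critical_value

# Upper 5% point of chi-squared with 12 degrees of freedom (standard table).
CHI2_12_05 = 21.026

counts = [10, 12, 9, 11, 8, 13, 10, 9, 12, 11, 10, 9, 11]  # made-up counts
print(dispersion_statistic(counts), not_overdispersed(counts, CHI2_12_05))
```

In the optimization problem this check would have to be repeated for every interval of every candidate partition.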

Furthermore, the partition is feasible if data are consistent with NHPP. Namely, if we denote by \(k_i\) the number of arrivals in each interval \(T_i=[a_i, b_i)\) obtained by considering data of the same weekday, in the same interval, over m weeks, i.e. \(k_i=\sum _{r=1}^m k_i^r\), \(i=1,\ldots ,N\), the partition is feasible if each \(k_i\) has a Poisson distribution with a rate \(\lambda _i\) obtained as \(\mu _i/(b_i-a_i)\). To check the validity of the Poisson hypothesis, the CU KS test can be performed (see Brown et al. (2005), Kim and Whitt (2014a)). We prefer to use CU KS rather than the Lewis KS test since this latter is highly sensitive to rounding of the numerical values and the CU KS test has more power against alternative hypotheses involving exponential interarrival times (see Kim and Whitt (2014b) for a detailed comparison between the effectiveness of the two tests).

To perform the CU KS test, for any interval \(T_i=[a_i, b_i)\), let \(t_{ij}\), \(j=1, \ldots ,k_i,\) be the arrival times within the i-th interval obtained as the union over the m weeks of the arrival times in each \(T_i\). Now consider the rescaled arrival times defined by \(\tau _{ij} =\displaystyle \frac{t_{ij}-a_i}{b_i-a_i}\). The rescaled arrival times, conditionally on the value \(k_i\), are a collection of i.i.d. random variables uniformly distributed over [0, 1]. Hence, in any interval, we compare the theoretical cumulative distribution function (cdf) \(F(t) = t\) with the empirical cdf

$$\begin{aligned} F_i(t) = \frac{1}{k_i}\sum _{j=1}^{k_i} {\mathbf{1}} _{\{\tau _{ij} \le t\}}, \qquad 0 \le t \le 1. \end{aligned}$$

The test statistic is defined as follows

$$\begin{aligned} D_i = \sup _{0 \le t \le 1}(\vert F_i(t)-t\vert ). \end{aligned}$$

The critical value for this test is denoted as \(T(k_i,\alpha )\) and its values can be found on the KS test critical values table. Accordingly, the Poisson hypothesis is not rejected if

$$\begin{aligned} D_i \le T(k_i,\alpha ), \quad i=1, \ldots , N. \end{aligned}$$

This test has to be satisfied on each interval \(T_i\) to qualify the partition P given by \(\{T_i\}\) as feasible, in the sense that the CU KS test is satisfied, too.
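The computation of \(D_i\) for one interval can be sketched as below. The code is a minimal illustration, not the implementation used in the experiments: the function names and the sample times are hypothetical, and the critical value is replaced by the well-known large-sample approximation \(1.36/\sqrt{k_i}\) for \(\alpha =0.05\), which is an assumption made here for brevity (for small \(k_i\) the tabulated values should be used instead).

```python
import math

def cu_ks_statistic(arrival_times, a, b):
    """KS distance between the rescaled arrival times and the uniform cdf.

    arrival_times: pooled arrival times in [a, b); must be nonempty.
    """
    taus = sorted((t - a) / (b - a) for t in arrival_times)
    k = len(taus)
    # The supremum of |F_i(t) - t| is attained at a jump of the empirical
    # cdf, so it suffices to look just before and at each order statistic.
    return max(max(j / k - tau, tau - (j - 1) / k)
               for j, tau in enumerate(taus, start=1))

def ks_critical_value_asymptotic(k, coeff=1.36):
    """Large-sample approximation of the two-sided 5% KS critical value."""
    return coeff / math.sqrt(k)

# Made-up pooled arrival times in the slot [9, 10).
pooled = [9.05, 9.20, 9.35, 9.50, 9.65, 9.80, 9.95]
D = cu_ks_statistic(pooled, 9.0, 10.0)
print(D, D <= ks_critical_value_asymptotic(len(pooled)))
```

Arrival times clustered at one end of the interval give a value of D close to 1, which leads to rejection of the Poisson hypothesis on that interval.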

A further restriction is imposed on the feasible partitions. Given the experimental data, realistic partitions cannot have too fine a granularity; otherwise some \(k_i\) may be so small as to unduly cause the rejection of the CU KS test. To this aim, a suitable lower threshold value for the interval length must be chosen, taking into account the specific case study considered.

Now let us evaluate the feasible partitions also in terms of the characteristics of the function \(\lambda _D(t)\). It would be desirable to define a fit error with respect to \(\lambda (t)\), which unfortunately is unknown. The problem can be resolved by considering a piecewise constant approximation \(\lambda _F(t)\) over a very fine partition \({P}_F\) of T. A set of 96 equally spaced intervals of 15 minutes was considered and the corresponding average rates \(\lambda _i^F\) were estimated from data.

The function \(\lambda _F(t)\) can be considered as an empirical arrival rate model. Note that partition \({P}_F\) need not be feasible since it only serves to define the finest piecewise constant approximation of \(\lambda (t)\). Therefore the following fit error can be defined

$$\begin{aligned} E(P)= \sum _{j=1}^N \sum _{i=1}^{N_j} (\lambda _j-\lambda _{i_j}^F)^2, \end{aligned}$$

where \(N_j\) is the number of intervals of 15 minutes contained in \(T_j\), and identified by the set of indexes \(\{i_j\}\subset \{1,\ldots ,96\}\).
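A direct way to evaluate this fit error is sketched below in Python. The sketch assumes that the partition boundaries are integer hours, so that each interval contains exactly four 15-minute slots per hour; the function name `fit_error` and the toy values of \(\lambda ^F\) are hypothetical.

```python
def fit_error(breakpoints, rates, fine_rates, slots_per_hour=4):
    """Sum of squared differences between lambda_j and every fine rate
    lambda^F whose 15-minute slot lies inside T_j.

    breakpoints: integer hours, from 0 to 24
    rates: one average rate per partition interval
    fine_rates: the 96 rates of the 15-minute partition
    """
    total = 0.0
    for a, b, lam in zip(breakpoints[:-1], breakpoints[1:], rates):
        # Slots a*4, ..., b*4 - 1 are the 15-minute intervals inside [a, b).
        for i in range(a * slots_per_hour, b * slots_per_hour):
            total += (lam - fine_rates[i]) ** 2
    return total

# Toy empirical model: rate 2 during the first 8 hours, rate 6 afterwards.
fine = [2.0] * 32 + [6.0] * 64
print(fit_error([0, 8, 24], [2.0, 6.0], fine))   # partition matching the jump
print(fit_error([0, 24], [448 / 96], fine))      # single interval, mean rate
```

The first partition reproduces the toy empirical model exactly, while the single-interval partition pays for ignoring the jump at hour 8.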

Finally, it is also advisable to characterize the “smoothness” of any approximation \(\lambda _D(t)\), to avoid very coarse partitions with high jumps between adjacent intervals, using the sum of the squared jumps

$$\begin{aligned} S(P) = \sum _{j=2}^N (\lambda _j-\lambda _{j-1})^2. \end{aligned}$$

In the following Sect. 2.2 the model features illustrated above are organized in a proper optimization procedure that provides the selection of the best partition according to conflicting goals.

The approach we propose enables us to address two of the major issues raised in Kim and Whitt (2014a) (and reported in the Introduction) when modeling ED patient arrivals, namely the choice of the intervals and the overdispersion. Concerning the third issue, the data rounding, the arrival times in the data we collected are rounded to seconds (format hh:mm:ss), and occurrences of simultaneous arrivals which would cause zero interarrival times are not present. Therefore, we do not need any unrounding procedure. Anyhow, as already pointed out above, the CU KS test we use is not very sensitive to data rounding.

2.2 Statement of the optimization problem

Any partition \(P=\{T_i\}\) of \(T = [0,24]\) is characterized by the boundary points \(\{x_i\}\) of its intervals and by their number N. Let us introduce a vector of variables \(x\in {{\mathbb {Z}}}^{25}\) such that

$$\begin{aligned} T_i=[x_i, x_{i+1}), \end{aligned}$$

\(i=1, \ldots ,24\), with \(x_1=0\) and \(x_{25}=24\).
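To make the encoding concrete, the short Python sketch below decodes a vector x into the list of intervals of the corresponding partition. It assumes, consistently with the constraint definitions given later in this section, that coincident boundary points \(x_i=x_{i+1}\) denote empty intervals to be skipped; the function name `decode_partition` is hypothetical.

```python
def decode_partition(x):
    """Turn the 25 integer boundary points into the nonempty intervals."""
    if len(x) != 25 or x[0] != 0 or x[-1] != 24:
        raise ValueError("x must have 25 entries with x[0] = 0 and x[24] = 24")
    if any(b < a for a, b in zip(x[:-1], x[1:])):
        raise ValueError("boundary points must be nondecreasing")
    # Coincident consecutive points describe empty intervals and are dropped.
    return [(a, b) for a, b in zip(x[:-1], x[1:]) if b > a]

x_hourly = list(range(25))                 # 24 one-hour intervals
x_coarse = [0, 2, 4, 6, 8, 16] + [24] * 19  # six intervals of unequal length
print(len(decode_partition(x_hourly)))
print(decode_partition(x_coarse))
```

With this encoding the number N of intervals does not need to be fixed in advance: it is simply the number of strictly increasing steps in x.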

Functions in (5) and (6) are indeed functions of x, and therefore will be denoted by E(x) and S(x), respectively. Therefore, the objective function that constitutes the selection criterion is given by

$$\begin{aligned} f(x) = E(x) + w S(x), \end{aligned}$$

where \(w>0\) is a parameter that controls the weight of the smoothness penalty term compared to the fit error: the larger w, the smaller the difference between average arrival rates in adjacent intervals; this, in turn, implies that on a steep section of \(\lambda _F(t)\) an increased number of shorter intervals is adopted to fill the gap with relatively small jumps.
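The role of the weight w can be illustrated with a small numerical sketch in Python. The function names and all the numbers are hypothetical; the fit error E is assumed to be already computed for each candidate partition.

```python
def smoothness(rates):
    """Sum of squared jumps between average rates of adjacent intervals."""
    return sum((cur - prev) ** 2 for prev, cur in zip(rates[:-1], rates[1:]))

def objective(fit_err, rates, w):
    """f = E + w * S for a partition whose fit error E is already known."""
    return fit_err + w * smoothness(rates)

# Two hypothetical candidates: few intervals with one large jump,
# or more intervals that climb in smaller steps.
coarse_rates, coarse_E = [2.0, 6.0], 1.5
fine_rates, fine_E = [2.0, 3.0, 4.5, 6.0], 2.0

for w in (0.0, 1.0):
    print(w, objective(coarse_E, coarse_rates, w),
          objective(fine_E, fine_rates, w))
```

In this toy comparison the candidate with one large jump has the smaller objective value for w = 0, whereas for w = 1 the candidate with several small jumps is preferred, in line with the behavior described above.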

The set \({\mathcal {P}}\) of feasible partitions is defined as follows:

$$\begin{aligned} \begin{array}{l} {{\mathcal {P}}}=\Bigl \{x\in {{\mathbb {Z}}}^{25} ~ | ~ x_1 = 0, \quad x_{25}=24, \quad x_{i+1}-x_i\ge \ell _i, \quad g_i(x)\le 0, \bigr . \\ \ \\ \bigl . h_i(x)\le 0, \quad i=1, \ldots , N \Bigr \} \end{array} \end{aligned}$$


$$\begin{aligned} \ell _i= & {} {\left\{ \begin{array}{ll} 0 \quad \hbox {if} \quad x_i=x_{i+1},\\ \ell \quad \hbox {otherwise},\\ \end{array}\right. } \end{aligned}$$
$$\begin{aligned} g_i(x)= & {} {\left\{ \begin{array}{ll} 0 \quad \hbox {if} \quad x_i=x_{i+1},\\ D_i - T(k_i,\alpha ) \quad \hbox {otherwise},\\ \end{array}\right. } \end{aligned}$$
$$\begin{aligned} h_i(x)= & {} {\left\{ \begin{array}{ll} 0 \quad \hbox {if} \quad x_i=x_{i+1},\\ Ds_i - \chi ^2_{m-1,\alpha } \quad \hbox {otherwise},\\ \end{array}\right. } \end{aligned}$$

\(i=1, \ldots , N\). The value \(\ell\) in (9) denotes the minimum interval length allowed and we assume \(\ell \ge 1/4\). Of course, constraints \(g_i(x)\le 0\) represent the satisfaction of the CU KS test in (4), while constraints \(h_i(x)\le 0\) concern the dispersion test in (2). Therefore, the best piecewise constant approximation \(\lambda _D^\star (t)\) of the time-varying arrival rate \(\lambda (t)\) is obtained by solving the following black-box optimization problem:

$$\begin{aligned} \begin{aligned} \min ~~&f(x) \\ s.t. ~~&x\in {{\mathcal {P}}}. \\ \end{aligned} \end{aligned}$$

We highlight that the idea of using as constraints of the optimization problem a test to validate the underlying statistical hypothesis on data along with a dispersion test is completely novel in the framework of modeling the ED patient arrivals process. The only proposal which uses a similar approach is in our previous paper (De Santis et al. 2020).

It is important to note that in (7) the objective function has no analytical structure in terms of the independent variables and it can only be computed by a data-driven procedure once the \(x_i\)’s values are given. The same is true for the constraints \(g_i(x)\) and \(h_i(x)\) in (8). Therefore the problem at hand is an integer nonlinear constrained black-box problem; moreover, both the objective function and the constraints are relatively expensive to compute, which makes the problem difficult to solve efficiently. Consequently, classical optimization methods either cannot be applied (since they rely on the analytic knowledge of the functions involved) or are not efficient, especially when evaluating the functions at a given point is computationally very expensive. Therefore, to tackle problem (12) we turned our attention to the class of Derivative-Free Optimization and black-box methods (see, e.g., Audet and Hare (2017), Conn et al. (2009), Larson et al. (2019)). Specifically, we adopt the algorithmic framework recently proposed in Liuzzi et al. (2020). It represents a novel strategy for solving black-box problems with integer variables and it is based on the use of suitable search directions and a non-monotone line search procedure. Moreover, it can handle generally-constrained problems by using a penalty approach. We refer to Liuzzi et al. (2020) for a detailed description and we only highlight that the results reported there clearly show that this algorithmic framework is particularly efficient in tackling black-box problems like the one in (12). In particular, the effectiveness of the adopted exploration strategy with respect to state-of-the-art black-box methods is shown, since the approach proposed in Liuzzi et al. (2020) combines computational efficiency with a high level of reliability.

3 The case study under consideration

The case study we consider concerns the ED of Policlinico Umberto I, a very large hospital in Rome, Italy. It is the biggest ED in the Lazio region in terms of yearly patient arrivals (about 140,000 on average). Thanks to the cooperation of the ED staff, we were able to collect data concerning patient arrivals for the whole year 2018. In particular, for this work, we focus on the patient arrival data collected in the first 13 weeks of the year, i.e. on data collected from the 1st of January to the 31st of March. Both walk-in patients and patients transported by emergency medical service vehicles are considered. The total number of arrivals, for each weekday, over the 13 weeks is reported in Table 1.

Table 1 The total number of arrivals, for each weekday, over the 13 considered weeks

In Fig. 1 the plot of the average rates \(\lambda _i^F\) estimated from data over 96 equally spaced intervals of 15 minutes is reported.

Fig. 1 Plot of the daily average arrival rate \(\lambda _i^F\), \(i=1, \ldots , 96\), over the 13 considered weeks

From this figure, it can be easily observed that, as expected, the arrival rate drastically changes from night hours to day hours, with significant growth during the morning hours.

In Fig. 2, the weekly hourly average arrival rate obtained by averaging the number of arrivals occurring in the same hourly time slot over the 13 considered weeks is reported.

Fig. 2 Plot of the weekly average arrival rate for the 13 considered weeks

It is worth highlighting that, in accordance with the literature (see, e.g., Kim and Whitt (2014a)), the average arrival rates among the days of the week are significantly different. Therefore, since averaging over these days would lead to inaccurate results, the different days of the week must be considered separately. Specifically, we can choose any day of the week to apply the methodology under study and the same procedure would apply to the other days, thus obtaining a different partition for each day. As an example, we choose Tuesday.

In Fig. 3, the plot of the hourly average arrival rate for the Tuesdays over the 13 considered weeks is reported, while Fig. 4 shows the mean and variance of the interarrival times that occurred on the first Tuesday of the year 2018.

Fig. 3 Plot of the average hourly arrival rate for the Tuesdays over the 13 considered weeks of the year

Fig. 4 Plot of the average (in solid green) and variance (in dashed red) of the interarrival times for the first Tuesday of year 2018. On the abscissa axis, 3-h time slots are considered

From this latter figure, we observe that these two statistics have similar values within each 3-h time slot and this complies with the property of the Poisson probability distribution for which mean and variance coincide.

We finally remark that seasonal phenomena might affect the number of weeks to be considered for model estimation due to large variability over successive weeks. Indeed, the overdispersion phenomenon may require a model calibration for each particular period of the year to take into account typical situations which occur, for instance, during flu season. This important aspect clearly emerges also from our experimentation reported in the next section.

4 Experimental results

In this section, we report the results of extensive experimentation on data concerning the case study described in Sect. 3, namely the ED patient arrivals collected in the first m weeks of the year 2018. Different values of the number m of weeks have been considered. Standard significance level \(\alpha =0.05\) is used in the CU KS and dispersion tests.

In the optimization problem at hand the value of \(\ell\) in (9) is set to 1 hour. Moreover, it is important to note that different values of the weight w in the objective function (7) lead to various piecewise constant approximations with different fitting accuracy and degree of regularity. Therefore, we performed a careful tuning of this parameter, aiming at determining a value that represents a good trade-off between a small fit error and the smoothness of the approximation.

In our experimentation, we used the default parameter values of the optimization algorithm adopted in Liuzzi et al. (2020). The stopping criterion is based on the maximum number of function evaluations set to 5000. As starting point \(x^0\) of the optimization algorithm we adopt the following

$$\begin{aligned} x^0_i=i-1, \qquad i=1,\ldots ,25, \end{aligned}$$

which corresponds to the case of 24 intervals of unitary length. This choice is a commonly used partition in most of the approaches proposed in the literature (see e.g., Ahalt et al. (2018), Kim and Whitt (2014a)).

We used R language and all the runs were performed on a PC with an Intel Core i7-2600 quad-core 3.4 GHz Processor and 16 GB RAM.

Table 3 in the “Appendix” reports the results of the CU KS and dispersion tests applied to the partition corresponding to the starting point \(x^0\), considering \(m=13\) weeks. In particular, in Table 3 for each one-hour slot the sample size \(k_i\) is reported along with the p-value and the rejection/non-rejection of the null hypothesis of the corresponding test. We observe that the arrivals are not overdispersed in any interval of the partition corresponding to \(x^0\), i.e. all the constraints \(h_i(x)\le 0\) are satisfied and this allows us to combine data for the same day of the week over successive weeks. However, this partition is infeasible, i.e., \(g_i(x) >0\) for some i; this corresponds to rejecting the statistical hypothesis on some \(T_i\). Nevertheless, even though the starting point is infeasible, the optimization algorithm we use can find an optimal solution.

As we already mentioned, the choice of a proper value for the weight w in the objective function (7) is important and not straightforward. On the other hand, the number m of weeks considered also affects both the accuracy of the approximation, through the average rates estimated on each interval, and the consistency of the results, which is ensured by constraints (10) and (11). However, while w is related to the statement of the optimization problem (12) and it can be arbitrarily chosen, the choice of m is strictly connected to the available data. In (Kim and Whitt 2014a, Section 4), the authors assert that, with 10 arrivals in the one–hour slot 9–10 a.m., it is necessary to combine data over 20 weeks in order to have a sufficient sample size (200 patient arrivals). However, since their approach is based on equally spaced intervals, one–hour slots are also adopted during off–peak hours, for instance during the night. This implies that the sample size corresponding to data combination over 20 weeks for these slots could no longer be sufficient to guarantee good results. This is clearly shown in Table 3 in the “Appendix”, where the sample size \(k_i\) corresponding to some of the one-hour night slots is very low considering \(m=13\) weeks and it remains insufficient even if 26 weeks are considered (see subsequent Table 5). The approach we propose overcomes this drawback since, for each choice of m, we determine the length of the intervals by solving the optimization problem (12). Of course, there could be values of m such that problem (12) does not have feasible solutions, i.e. a partition such that the NHPP hypothesis holds and the results are consistent does not exist for such m.

To give an idea of the computational burden required by the application of our approach, we report the CPU elapsed time corresponding to one function/constraints evaluation used for solving the optimization problem (12). Table 2 indicates this time (in seconds) for different values of the parameter m.

Table 2 CPU elapsed time corresponding to one function/constraints evaluation for different values of m

This table shows that the required computational effort increases with the parameter m. We believe that the choice \(m=13\) represents a good trade–off between the accuracy of the results and computing time.

Now, to examine in more depth how the parameters w and m affect the optimal partition, we performed a sensitivity analysis, focusing first on the case with m fixed and w varying. In particular, we have chosen to focus on \(m = 13\) weeks since, as discussed above, this case does not require an excessive computational burden and we expect that no substantial changes in the conclusions would be obtained with different values of m. This is confirmed by further experimentation whose results are not reported here for the sake of brevity.

This analysis allows us to obtain several partitions that may be considered for a proper fine-tuning of w. In particular, we consider different values of w within the set \(\{0, ~ 0.1,~ 1, ~ 10, ~ 10^3\}\). Table 4 in the “Appendix” reports the optimal partitions obtained by solving problem (12) for these values of w. In particular, Table 4 includes the intervals of the partition, the sample size \(k_i\) corresponding to each interval over 13 weeks and the results of the CU KS and dispersion tests, namely the p-value and whether the null hypothesis of the corresponding test is rejected.

In Fig. 5, for graphical comparison, we report the plots of the empirical arrival rate model \(\lambda _F(t)\) and its piecewise constant approximation \(\lambda _D(t)\) corresponding to the optimal partitions obtained.

Fig. 5 Graphical comparison between the empirical arrival rate model \(\lambda _F(t)\) (in green) and the piecewise constant approximation \(\lambda _D(t)\) (in red) corresponding to the optimal partition obtained by solving the problem (12) (with \(m=13\)) for different values of the parameter w. From top to bottom: \(w=0, 0.1, 1, 10, 10^3\) (Color figure online)

Two effects can be clearly observed as w increases: on the one hand, on steep sections of \(\lambda _F(t)\), shorter intervals are adopted to reduce large gaps between adjacent intervals; on the other hand, when \(\lambda _F(t)\) is approximately flat, a smaller number of intervals may be sufficient to guarantee small gaps. This is confirmed by the two top plots in Fig. 5, which correspond to \(w=0\) and \(w=0.1\). In fact, in the first plot (\(w=0\)), where only the fit error is included in the objective function, and in the second one (\(w=0.1\)), where the fit error is still the dominant term of the objective function, the optimal partition is composed of a relatively large number of intervals. In particular, in the partition corresponding to \(w = 0.1\), fewer intervals are adopted during the daytime. As expected, a smaller number of intervals is attained when \(w = 1\), \(w = 10\) and \(w = 10^3\). Note that, since on the steep section corresponding to the time slot 7:00–10:00 a.m. the maximum number of allowed intervals (due to the lower threshold value of one hour given by the choice \(\ell =1\) in (9)) is already used, the only way to decrease the smoothness term of the objective function is to enlarge the intervals during both the day and the night. It is worth noting that for \(w=10^3\) the number of intervals increases compared with the case \(w=10\). This occurs to offset the increase in the fit error term due to the use of a smaller number of intervals on the flatter sections. As a consequence, the partition has an unexpected interval at the end of the day.
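The interplay between the fit error and the smoothness term can be illustrated with a small sketch. The exact form of the objective function (7) is not restated in this section, so the code assumes a simple stand-in: the fit error is the sum of squared differences between an hourly empirical rate and its interval average, and the smoothness term is the sum of squared jumps between adjacent intervals, weighted by w. The hourly rate profile is hypothetical, and all function names are illustrative.

```python
def piecewise_rates(hourly_rate, breakpoints):
    """Average the hourly rate over each interval [b_i, b_{i+1})."""
    return [sum(hourly_rate[a:b]) / (b - a)
            for a, b in zip(breakpoints, breakpoints[1:])]

def objective(hourly_rate, breakpoints, w):
    """Assumed objective: fit error plus w times the smoothness term."""
    rates = piecewise_rates(hourly_rate, breakpoints)
    fit = 0.0
    for (a, b), r in zip(zip(breakpoints, breakpoints[1:]), rates):
        # Squared deviation of each hourly value from its interval average.
        fit += sum((hourly_rate[h] - r) ** 2 for h in range(a, b))
    # Squared jumps of the approximation between adjacent intervals.
    smooth = sum((r2 - r1) ** 2 for r1, r2 in zip(rates, rates[1:]))
    return fit + w * smooth

# Hypothetical hourly arrival rates (patients per hour) over 24 h.
rate = [2, 2, 2, 2, 2, 2, 3, 6, 10, 12, 12, 12,
        11, 11, 10, 10, 9, 9, 8, 7, 5, 4, 3, 2]

hourly = list(range(25))                 # 24 one-hour slots
merged_night = [0] + list(range(6, 25))  # hours 0-6 merged into one interval
print(objective(rate, hourly, 10), objective(rate, merged_night, 10))
```

In this toy profile the night hours are exactly flat, so merging them leaves both terms unchanged: the same objective value is reached with five fewer intervals, which mirrors the second effect described above.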

We point out that, for each value of w, the optimization algorithm finds an optimal partition (which is, of course, feasible with respect to all the constraints), even though some constraints related to the CU KS test are violated at the initial partition, i.e. the one corresponding to \(x^0\) in (13), namely the standard assumption of one-hour slots usually adopted. This means that the data used comply with the NHPP hypothesis and are sufficient to appropriately define the piecewise constant approximation of the ED arrival rate.

Conversely, when the optimization algorithm does not find a feasible partition, the CU KS test or the dispersion test related to some \(T_i\) is never satisfied. This implies that the process does not conform to the NHPP hypothesis or that the data are overdispersed. This is clearly highlighted by our subsequent experimentation, where we set \(w=1\) and let m vary within the set \(\{5,9,17,22,26\}\).
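As an illustration of what an overdispersion check on one interval might look like, the sketch below uses the classical Poisson index of dispersion: for weekly counts \(N_1,\dots ,N_m\) in the same interval, the statistic \(\sum _j (N_j-\bar{N})^2/\bar{N}\) is approximately chi-square with \(m-1\) degrees of freedom under the Poisson hypothesis. This specific statistic, the Wilson-Hilferty approximation used for the p-value, the 0.05 level and the sample counts are assumptions made here for illustration and may differ from the dispersion test actually used in the paper.

```python
import math

def dispersion_statistic(counts):
    """Poisson index of dispersion statistic for weekly counts."""
    m = len(counts)
    mean = sum(counts) / m
    if mean == 0:
        raise ValueError("no arrivals observed in this interval")
    return sum((c - mean) ** 2 for c in counts) / mean

def chi2_sf_wilson_hilferty(x, dof):
    """Approximate P(chi2_dof > x) via the Wilson-Hilferty transformation."""
    if x <= 0:
        return 1.0
    c = 2.0 / (9.0 * dof)
    z = ((x / dof) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def dispersion_test(counts, alpha=0.05):
    """One-sided test: reject when the counts look overdispersed."""
    stat = dispersion_statistic(counts)
    p = chi2_sf_wilson_hilferty(stat, len(counts) - 1)
    return stat, p, p < alpha

# Hypothetical weekly counts for one interval over m = 13 weeks.
regular = [10, 12, 9, 11, 8, 10, 13, 9, 10, 11, 7, 12, 10]
bursty = [2, 25, 4, 30, 3, 1, 28, 5, 2, 27, 3, 26, 4]
print(dispersion_test(regular)[2], dispersion_test(bursty)[2])
```

In this toy example the first series is compatible with Poisson counts, while the second is flagged as overdispersed, which would make any partition containing that interval infeasible.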

First, in Table 5 in the “Appendix” we report the results of the CU KS and dispersion tests applied to the partition corresponding to the starting point \(x^0\) in (13), for these different values of m. Once more, this table shows that the use of equally spaced intervals of one-hour length during the whole day can be inappropriate. As an example, see the results of the tests on the time slot 02:00–03:00. Moreover, note that, for all these values of m, the initial partition corresponding to the starting point \(x^0\) is infeasible, except when \(m = 5\). Indeed, the constraints corresponding to the CU KS and dispersion tests are violated for some \(T_i\), meaning that the validity of the standard assumption of one-hour time slots strongly depends on the time period over which the collected data are used. In this respect, a strength of our approach is its ability to assist in the selection of a reasonable value for m. If there is no value of m such that the optimization algorithm determines an optimal solution (due to infeasibility), then it may be inappropriate to consider the ED arrival process at hand as an NHPP.

The subsequent Table 6 in the “Appendix” includes the optimal partitions obtained by solving problem (12) for the considered values of \(m\in \{5,9,17,22,26\}\). Like the previous tables, Table 6 includes the intervals of the partition, the sample size \(k_i\) corresponding to each interval and the results of the CU KS and dispersion tests. For all the considered values of m, the optimization algorithm determines an optimal solution, with the only exception of \(m=26\). In this latter case, the maximum number of function evaluations allowed is not enough to compute an optimal solution: in fact, we obtain an infeasible solution since the CU KS test related to the last interval of the day is not satisfied. This could be partially unexpected, since more accurate results should be obtained when considering a larger sample size. However, by adding the last four weeks (passing from \(m=22\) to \(m=26\)), which correspond to June, the data become affected by a seasonal trend and the NHPP assumption is no longer valid. This confirms the remark reported at the end of Sect. 3 about the need for model calibration to possibly take seasonality into account.

In Fig. 6 we report a graphical comparison between the empirical arrival rate model \(\lambda _F(t)\) and the piecewise constant approximation \(\lambda _D(t)\) corresponding to the optimal partitions obtained for the considered values of m. We observe that the variability of \(\lambda _F(t)\) reduces as the value of m increases, since averaging over more data flattens the fluctuations. Despite these rapid oscillations and unlike the other considered values of m, for \(m = 5\) the empirical model \(\lambda _F(t)\) shows a constant trend during both the night and day hours. This results in a piecewise constant approximation \(\lambda _D(t)\) that is flat in all the time slots of the 24 h of the day except the ones related to the morning hours, for which many intervals are used. To obtain a small fitting error between \(\lambda _D(t)\) and \(\lambda _F(t)\), it would be necessary to use shorter intervals, but this is not allowed by the choice \(\ell = 1\) in the constraints (9). For the other considered values of m, the number of intervals increases, leading to partitions that improve the fitting error compared with the case \(m = 5\). In particular, we observe that the piecewise constant approximation \(\lambda _D(t)\) obtained for \(m = 22\) benefits from the lower fluctuations resulting from averaging more data. Therefore, as expected, using the maximum number of available data leads to the most accurate piecewise constant approximation. However, when considering too much data, seasonal phenomena could give rise to the rejection of the null hypothesis of the considered tests, as observed for the case \(m = 26\). Moreover, as highlighted at the end of Section 5 in Kim and Whitt (2014a), a tendency to reject the NHPP hypothesis (i.e. the null hypothesis of the CU KS test) may be encountered when the sample size is large. In fact, with a larger sample size, the data must agree more closely with the null hypothesis for the test to be passed. Nevertheless, our approach can overcome these drawbacks, providing us with an optimal strategy to identify the best way of using the collected data.

Fig. 6 Graphical comparison between the empirical arrival rate model \(\lambda _F(t)\) (in green) and the piecewise constant approximation \(\lambda _D(t)\) (in red) corresponding to the optimal partition obtained by solving the problem (12) (with \(w=1\)) for different values of the parameter m. From top to bottom: \(m=5, 9, 17, 22, 26\) (Color figure online)

5 A preliminary assessment

As we already mentioned, this paper represents the first step toward the development of a complete and accurate DES model of the ED under study, and is therefore focused only on modeling patient arrivals. However, it is interesting to carry out a preliminary assessment of the impact of the proposed method on modeling the overall patient flow through the ED. Unfortunately, at this stage, we do not have detailed information concerning all the processes inside the ED after triage, hence we are still unable to construct a complete simulation model of the ED. On the other hand, for evaluating the actual consequences of the proposed arrival process approximation, it is enough to construct a simplified model which includes only patient arrivals and triage, the latter being the first process encountered by a patient after arrival and the one most affected by the patient arrival process. This, of course, must be considered only a preliminary assessment.

In this simplified model, triage is modeled as a “seize-delay-release” process where the seized resource is a dedicated nurse and the delay (the triage time in hours) is assumed to follow the Weibull distribution WEIBULL\((\alpha ,\beta )\) with \(\alpha =3\) and \(\beta =0.1\). This distribution and its parameters have been obtained from a statistical analysis of available data on triage process times. The model has been implemented by using simmer (Ucar et al. 2019), a process–oriented and trajectory–based DES package for the R language. In running simulations, we used the following setting: 30 independent replications, each 14 weeks long, with a warm–up period of 1 week.
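The “seize-delay-release” logic with a single resource is equivalent to a first-come-first-served single-server queue, which can be sketched in a few lines of plain Python (used here in place of the R package adopted by the authors). The sketch assumes that the nurse has capacity one and that \(\alpha =3\) is the shape and \(\beta =0.1\) the scale (in hours) of the Weibull distribution, which gives a mean triage time of about 0.089 h, i.e. roughly 5 minutes; the opposite reading would give an implausibly large mean. Note that Python's `random.weibullvariate` takes the scale first and the shape second. The function name and the seed are illustrative.

```python
import random

def triage_waits(arrival_times, shape=3.0, scale=0.1, seed=0):
    """Waiting times (hours) before triage for sorted arrival times.

    Assumes one nurse serving patients in order of arrival.
    """
    rng = random.Random(seed)
    nurse_free_at = 0.0
    waits = []
    for t in arrival_times:
        start = max(t, nurse_free_at)               # seize the nurse
        service = rng.weibullvariate(scale, shape)  # delay: triage time
        nurse_free_at = start + service             # release the nurse
        waits.append(start - t)
    return waits

# Three patients arriving together: the second and third must queue.
print(triage_waits([0.0, 0.0, 0.0]))
```

Averaging such waiting times by hour of arrival over several replications would give a KPI of the kind considered in this section, under the stated assumptions.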

In order to perform a meaningful assessment of the method we propose for patient arrivals, we compared the results obtained in the following scenarios:

  • use of real data for generating patient arrivals for the first 13 weeks of the year 2018;

  • use of our method (with different choices of the weight w) for defining the NHPP process of patient arrivals;

  • use of the standard method which considers 24 one–hour intervals for defining the NHPP process of patient arrivals.
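Both NHPP scenarios above amount to sampling arrivals from a piecewise constant rate; they differ only in the partition used. A minimal sketch of such a sampler follows. Within each interval the process is a homogeneous Poisson process, so exponential inter-arrival times can be generated and the clock restarted at each breakpoint, which is valid by the memoryless property. The partition and rates in the example are hypothetical and the function name is illustrative.

```python
import random

def sample_nhpp_day(breakpoints, rates, rng):
    """Arrival times (hours) over one day for a piecewise constant rate.

    breakpoints: increasing interval endpoints, e.g. [0, 7, 10, 24]
    rates: arrivals per hour on each interval, one per interval
    """
    arrivals = []
    for a, b, lam in zip(breakpoints, breakpoints[1:], rates):
        if lam <= 0:
            continue  # no arrivals on an interval with zero rate
        t = a + rng.expovariate(lam)
        while t < b:
            arrivals.append(t)
            t += rng.expovariate(lam)
    return arrivals

# Hypothetical three-interval partition of the 24 h.
rng = random.Random(42)
day = sample_nhpp_day([0, 7, 10, 24], [2.0, 10.0, 6.0], rng)
print(len(day))
```

The standard method corresponds to the special case in which the breakpoints are the 25 integer hours from 0 to 24 and one rate is given per one-hour slot.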

As KPIs for this comparison, we chose those most affected by the arrival process, namely

  • average number of patients queued waiting for the triage;

  • average patient waiting time before the triage.

In Figs. 7 and 8 we report these KPIs on an hourly basis. In particular, we report the results obtained with the real data (as-is status), those obtained using our approach (with \(w=0\) and \(w=10\)) and those derived from the partition with 24 one–hour intervals.

Fig. 7 Average number of patients queued waiting for the triage

Fig. 8 Average of waiting times before the triage

We can observe from both figures that the finer discretizations, i.e. those corresponding to one-hour slots and to \(w=0\) (18 intervals), provide results closer to the real process. On the other hand, the coarser partition obtained with \(w=10\), corresponding to 10 intervals, maintains a good fit with the real process. A small decrease in accuracy appears in the time interval 11:00–13:00, where the real data feature high variability while the partition provided by our method with \(w=10\) generates a wide interval with a constant rate from 9:00 to 14:00 (see Fig. 5). Therefore, good accuracy is obtained with a limited number of intervals, which provides the management with a better scenario for planning the resource allocation for the ED services.

6 Conclusions

In this work, we examined the arrival process to EDs by providing a novel methodology that can improve the reliability of a modeling approach frequently used to deal with this complex system, namely Discrete Event Simulation modeling. Following the literature, we adopted the standard assumption of representing the ED arrival process as an NHPP, which is suitable for modeling strongly time-varying processes. In particular, the final goal of the proposed approach is to accurately estimate the unknown arrival rate, i.e. the time-dependent parameter of the NHPP, by using a reasonable piecewise constant approximation. To this aim, an integer nonlinear black–box optimization problem is solved to determine the optimal partition of the 24 h into a suitable number of non-equally spaced intervals. To guarantee the reliability of this estimation procedure, two types of statistical tests are considered as constraints for each interval of any candidate partition: the CU KS test must be satisfied to ensure the consistency between the NHPP hypothesis and the ED arrivals; the dispersion test must be satisfied to avoid overdispersion of the data. To the best of our knowledge, our methodology represents the first optimization-based approach adopted for determining the best stochastic model for the ED arrival process.

The extensive experimentation we performed on data collected from an ED of a big hospital in Italy shows that our approach can find a piecewise constant approximation which represents a good trade-off between a small fit error with respect to the empirical arrival rate model and the smoothness of the approximation. This result is accomplished by the optimization algorithm even though some constraints are violated at the starting point, which corresponds to the commonly adopted partition composed of one-hour time slots. Moreover, sensitivity analyses are performed to investigate the fine-tuning of the two parameters affecting the quality of the piecewise constant approximation: the weight of the smoothness of the approximation in the objective function (relative to the fit error) and the number of weeks considered from the arrivals data. While the former can be arbitrarily chosen by a user according to the desired level of smoothness, the latter affects the accuracy of the arrival rate estimation. In general, the more weeks are considered, the more accurate the arrival rate approximation is, as long as the NHPP assumption still holds and the data do not become overdispersed.

Further experimentation allowed us to perform a preliminary assessment of the proposed approach, monitoring the number of patients queued at the triage and the corresponding waiting time obtained in a simplified simulation model representing only the initial patient flow through the ED, namely the arrival and triage processes. The results showed that it is possible to adopt a coarser partition of the 24 h, which is preferable from the management point of view, while still ensuring a good fit with the real process when compared with that of a plain partition with shorter equally spaced intervals.

As regards future work, to analyze the robustness of the proposed approach in more depth, we could use alternative statistical tests, such as the Lewis and the Log tests described in Kim and Whitt (2014a), in place of the CU KS test. Moreover, whenever Discrete Event Simulation modeling is the chosen methodology to study ED operation, a model calibration approach could also be used to determine the best value of the weight used in the objective function to penalize the “smoothness term”. The optimal value of this parameter could be obtained by minimizing the deviation between the simulation outputs and the corresponding key performance indicators computed from the data. This makes it possible to obtain a representation of the ED arrival process that leads to an improved simulation model of the system under study.