1 Introduction

Choice has become more important in health care, as patients can compare hospitals and health care providers much more easily using the wealth of statistics that are available on the Internet. In England, patients within the National Health Service (NHS) have been able to choose which hospital they attend for elective treatments since 2006, and similar policies are in place in other European countries, e.g. Sweden, France, the Netherlands and Italy. The introduction of choice has led to uncertainty among health care providers as to expected patient numbers and how different attributes of the hospitals will influence patients’ decisions. Using admissions data for patients resident in Derbyshire, UK, we use discrete choice theory to describe how patients choose their hospitals, as an aid to predicting future admissions. Derbyshire is a county in the Midlands of England, including one large city, the city of Derby, which is surrounded by open countryside and some small towns. The county of Nottinghamshire is to the east: patients from the eastern part of Derbyshire may attend hospitals in Nottingham, which are included in our study. The study is more comprehensive than many that have gone before in that the region contains a mix of rural and urban areas. Moreover, we consider patient choices for both elective hospital admissions (those for which the decision to admit a patient and admission itself are separated in time) and non-elective (emergency) admissions. Such a mixed urban/rural region is typical of many English counties and thus offers an opportunity for generalisation of the results.

The provision of NHS healthcare and its administration in England have been subject to much change since its inception. For example, at the time of writing, Sustainability and Transformation Plans (STPs) (https://www.england.nhs.uk/ourwork/futurenhs/deliver-forward-view/stp/) are being drawn up to describe the ways that services will evolve regionally in coming years to serve the local population in a sustainable manner. The motivation of this research is thus to find appropriate models that can be used for predictions of patients likely to access services that are planned to change location or level of service. Our study of Derbyshire began at the time of a transfer of services from a hospital in the centre of the City of Derby to what is now the Royal Derby Hospital to the west of the city. This transfer of location was likely to cause some changes in access patterns: the question was raised as to whether extra beds in general wards should be made available in Nottingham hospitals, to serve patients close to Nottingham but living in Derbyshire, who would now be more distant from the new hospital. A simple “go-to-nearest” approach was initially used to predict access, but it was felt that a study of other factors likely to affect patient numbers would be relevant for similar geographical shifts of general patient services.

The main objective of our study is to compare the use of two discrete choice models, the Multinomial Logit (MNL) model and the Utility Maximising Nested Logit (UMNL) model, in describing how patients access hospitals in a region where there are several different hospitals within reach. This will allow us to determine whether the decision over which hospital to attend is a multiple or single stage process, i.e. whether patients first choose a city or group of hospitals and then make a choice from within that group, or simply choose the hospital that best suits them from a full list within the region. We treat elective admissions and non-elective admissions separately. We are not, however, making predictions for admissions for specific diseases, because of small numbers of patients by disease in our postcode sector level data. Moreover, it is the prediction of overall patient numbers that is the focus of this study. A further objective is to determine whether data available on websites concerning hospitals play a significant part in choices made by patients.

We continue in Section 2 with a description of the background to discrete choice modelling and its application in healthcare settings. We detail the data used in our study and the models we use in Section 3. We present results in Section 4, with a comparison of MNL and UMNL models, and conclude with a discussion in Section 5.

2 Background: discrete choice modelling

Discrete choice theory seeks to model choices between distinct alternatives; the theory was originally applied to modelling how people choose their mode of transport [3, 4]. A discrete choice model estimates the probabilities of a decision maker selecting each of a given set of discrete alternatives. An estimate of the probability of making a particular choice is found by first calculating the utility associated with that choice, according to the characteristics of the chooser and the attributes of the choices. It is assumed that the preferred choice is the one that maximises the utility of the chooser. Reviews of the subject are given by [17] and [36].

In choice analysis, a distinction can be made between revealed and stated preference data [17]. Revealed preference (RP) data are collected after choices have been made, while stated preference (SP) data are collected by survey, and can be used to discover preferences between choices that may be hypothetical or real, or new and as yet not utilised. Such surveys are known as stated choice experiments or discrete choice experiments (DCEs). A particular study may have RP data available while additionally collecting related SP data.

We have access to detailed RP data, which arguably describe how patients actually behave when faced with a decision better than SP data, which instead capture how they think they might behave. RP data from the Netherlands are used in [37] and [5], and from the UK in [2], all of which focus mainly on choice of orthopaedic care. Several hospital choice studies use similar data to ours, hospital discharge records, but in a United States setting [10, 24], and [35] use Medicare claims data.

DCEs and the SP data they provide have been more widely used than RP data in health care choice applications in the past. For example [22] use SP data from a binary choice experiment to determine the factors that influence older patients’ decisions when undergoing cataract surgery. Here, the options were defined as accept treatment or reject treatment, with the factors being tested including the distance to the hospital, waiting time, cost and the staff available to carry out the procedure. Their justification for conducting a binary choice experiment rather than a full DCE was that the choice being made is much simpler for participants to understand for a binary choice experiment, suggesting that it might be possible to see more evidence of trade-offs between different characteristics. [29] use a full DCE administered as a postal survey to determine how patient characteristics might affect the decisions made; in particular a patient’s education and social status are considered. This followed on from a large DCE carried out in the UK and described in [9]. Other examples of stated preference data include [31] which evaluates clinical practice, considering health and non-health outcomes such as reassurance and dignity, waiting times, continuity of care and location; while [15] apply a choice model to emergency care.

The MNL model [4, 23] is the basis of many of the choice models that allow for more than two options. It is one of a general class of models described as “random utility” models, since the utility is assumed to be the sum of an observable part and an unobservable random part or error term. Different random utility models make different assumptions about the error terms. The form of the MNL model is tractable and intuitive to use; it has proved popular in many applications [36] and it has been widely used to model choices in health care (e.g. [24, 33]). With the MNL model, alternatives are assumed to be uncorrelated, which implies that the ratio of the probability of choosing one alternative to the probability of choosing a given other alternative remains constant even if new alternatives are inserted or existing alternatives are deleted. This results in the well-known assumption of the “independence of irrelevant alternatives” (IIA) [23]. One situation where the IIA assumption will not hold is when decision-makers follow a two-stage process: for example, they first choose a group or nest of options and then make their final selection from within that group. In the example we consider here, we wish to test whether patients first select all of the hospitals from a given city and then choose their favoured hospital from within that city.

Models such as the Nested Logit (NL) [3, 26] group together or “nest” similar alternatives. In an NL model, subsets of similar alternatives are nested together, where the alternatives included in the same nest have a greater degree of similarity than the alternatives in other nests. This can be interpreted as choices being made in a hierarchical manner. This structure seems to have the potential to work well with the example presented here, where we observe similarities between two pairs of hospitals: a choice of city for treatment may precede a choice of hospital. Determining whether this is the case is one of the objectives of the research and we employ an NL model with a set hierarchical structure, with mixed results. Other authors have employed NL models for hospital choice, e.g. [32] who use a two-level nested multinomial logit model to distinguish between hospital and clinic-based care in rural Tanzania; [8] who have a similar structure with the first decision being made between seeking inpatient care or not and the second level decision being the choice of hospital. A more complex decision process is presented by [28], where there are three levels to the decision process: the first decision is between self-care and formal care, the second between GP, emergency room visit and specialised clinic and the third between NHS, direct payment and private insurance.

There are two formulations of the NL model used in practice: the Utility Maximising Nested Logit (UMNL) model and the Non-Normalized Nested Logit (NNNL) model. The UMNL model, which is a special case of the generalized extreme value (GEV) discrete-choice model family, is defined as McFadden’s model [26] and allows for utility maximisation. The NNNL model [12] is more computationally efficient but is not compatible with utility maximisation. Koppelman and Wen [21] describe a comparison between the UMNL and NNNL models using data on the choice of transport modes in Canada and demonstrate that the NNNL and UMNL models can predict substantial differences in the estimation of choice behaviour. They recommend the UMNL model because it is based on utility maximization, as already stated, but also because the interpretation of the parameters is more intuitive. We use the UMNL in this study.

One of the early studies of choice in healthcare [24] describes the use of an MNL model to study the impact of quality on hospital choice. Their results suggest that quality is important in the choice of hospital for surgical procedures, amongst other factors. A subsequent paper [10] suggests that hospital size has a strong positive effect upon hospital choice. Mixed Logit models are used to study and examine the hospital choice for Acute Myocardial Infarction patients [35], for kidney transplants [19], and for pneumonia patients [16]. Mixed Logit models can relax some of the more restrictive assumptions of MNL models such as the IIA discussed above and can capture unobserved heterogeneity in the coefficients of the hospital attributes [36]. They are therefore particularly flexible. These papers emphasize hospital quality and cost as the main determinants for hospital choice. In contrast, later papers demonstrate that distance from the hospital and waiting times can also have a strong impact on how patients choose their hospital. Burge et al. [9] develop a discrete choice model based on SP data from patients in the London area to analyse the trade-off between waiting times, distance and quality. However, the survey audience included only patients who had been waiting for more than 6 months and therefore can be thought of as a special case.

The variables we include in our RP study are similar to those of [9]; however, the geography of our study area is very different, comprising a mostly rural area with two main cities and a further two important towns. This geography is arguably more typical of many regions and therefore allows for a more general exploration of the effect of location on patient choice. We also make use of RP data rather than carrying out a DCE, making it more likely that similar analysis to ours could be carried out in other health care organisations which may not be able to afford a full DCE.

We include an average waiting time for elective care treatment, available from NHS websites, in our modelling. The effects of waiting times on patient choice are studied by several authors. [37] develop a binary choice model to study how waiting times affect the decision to bypass the nearest hospital. They find that the probability of bypassing a hospital decreases by between 2% and 10% when waiting times are less than the average. Travelling time is also considered in further work by [38] in order to estimate competition in the hospital markets. The effect of waiting times is summarised effectively by [33] using waiting time elasticities, using a case study of cataract surgery. By using a latent-class, multinomial logit model, they are able to characterise the heterogeneity of choice behaviour exhibited by the decision-makers, in this case general practitioners.

Moving beyond the calculation of the effects of different features on choice, [7] build a simulation model of musculoskeletal services in Scotland that uses models of patients’ willingness to travel for treatment to determine the effect this might have on waiting times. They show that the introduction of choice can reduce the variability in waiting times between different healthcare providers. Although variability may be reduced, [20] use a simulation model of elective knee operations in Wales to show that introducing choice results in increases in the mean waiting time for the whole system of healthcare facilities.

3 Methods

3.1 Factors affecting a patient’s choice of hospital

The UK NHS allows patients to make decisions about their health, and patients generally have the right to choose a hospital for consultant-led care. There is a range of criteria that patients might use to select a hospital and several organizations, including the NHS, provide information about hospitals, e.g. Dr Foster (http://www.drfosterhealth.co.uk/), NHS Choices (http://www.nhs.uk/Service-Search/) and the Care Quality Commission (http://www.cqc.org.uk/). We discuss some possible criteria for patient choice below.

  • Travelling time and distance: the location of hospitals relative to patients and their companions is expected to be a crucial factor in patients’ hospital choice. The report on the National Patient Choice Survey carried out in February 2010 in England reported that choosing a hospital close to their home or work place is the most important factor for patients [14]. Proximity is especially important for those on low incomes or those who rely on hospital transport. Nonetheless, there is always a trade-off between distance and other factors such as confidence in the treatment and waiting time [37]. In this study we use travel time by car as a proxy for the inconvenience of travel; however, we acknowledge that the assumption that all patients travel by car may introduce inaccuracies, particularly in the urban parts of this study area.

  • Size of hospital: larger hospitals are more likely to offer a greater variety of services than smaller ones and therefore attract more patients [10]. Moreover, effects of patients bypassing smaller local hospitals in favour of more distant larger hospitals, such as teaching hospitals, have been observed in several countries [37]. In this study we use number of beds as a proxy for the size and level of functioning (secondary or tertiary) of hospitals.

  • Waiting time: in the report on the National Patient Choice Survey, the length of wait for an appointment is ranked third among a list of the most important factors for patients when choosing a hospital [14]. It seems reasonable that some patients might be happy to travel to a more distant hospital if it reduces their waiting time. Interestingly, an SP study of patients on the Isle of Wight to compare the trade-off between waiting time and travelling distance showed that patients are willing to travel to hospitals that are far away, if the waiting times of those hospitals are decreased by at least 3.9 months [30].

  • Cleanliness: a topical issue in recent years, patients have become more concerned about the cleanliness of facilities. Their perception of a hospital’s cleanliness is thought to be based on the appearance of the environment, with dirty public areas introducing the worry that more critical areas such as operating theatres may not be cleaned to a high standard [39].

  • Availability of facilities: facilities include car parking spaces, accommodation, food, and translation services. Parking spaces are often cited as being an important determining factor in the choice of hospital, although [27] suggests that this is only of high importance for visitors and outpatients. They also find that the provision of single rooms makes a hospital more attractive, while food is of lower importance.

  • Quality of care and services: quality scores for each hospital in the UK are provided annually by the Care Quality Commission, which regulates the health and social care services in England. Hospitals are graded on inspection as Outstanding, Good, Requires Improvement or Inadequate, where the grading takes into account accidents and mistakes, infection control, and the death rate. The Care Quality Commission also carries out the Adult Inpatient Survey (http://www.cqc.org.uk/content/adult-inpatient-survey-2015), which covers many aspects of the patient experience and patients’ perceptions of their care. We use averages of the latter quality indicators as a variable in our modelling, as these are likely to be representative of the opinions of patients regarding the quality of their local hospitals. Hospital quality is reported as being an important influence on choice for surgical procedures [24].

Patients’ choice of hospitals will also depend on the type of illness. [13] states that patients with chronic illnesses have different hospital choices from those with acute illnesses. For example, continuity of care becomes the most important issue for patients with a chronic illness. This is confirmed by a study in 2006 [27] which suggests that a patient is most likely to attend the nearest hospital under the following circumstances: after an accident; with severe, serious and unpleasant pain; or suffering from acute illness. A patient who falls ill with a more complicated disease is more likely to attend a hospital where there are specialists in that field.

Throughout this paper, we make the assumption that the patient is making the decision about which hospital to attend, whereas research suggests that many patients rely on their general practitioner (GP) to make the decision for them [25], believing that they are better able to process the available information on performance indicators. As long as the decision makers use similar factors in choosing the hospital this should not impact on the results significantly. It is possible that some errors may be introduced here, because in a rural area a patient’s GP may be situated some distance from the patient’s place of residence. In these cases, the GP may be giving advice based on a different distance measure.

3.2 Data

We use Hospital Episode Statistics (HES) data for 2008 and 2009, made available by the (then existing) Derbyshire Primary Care Trust (PCT), aggregated by the postcode sector of the residences of patients attending the hospitals. These data cover both the urban and rural regions of Derbyshire, including the city of Derby. There are 226,492 attendances included in the data, of which 106,248 activities, or 46.9%, are elective and 120,244 activities, or 53.1%, are non-elective. We use separate choice models for electives and non-electives. The distance to hospitals from each of the 133 Derbyshire postcode sectors is measured in expected travel time by car; Microsoft MapPoint software was used to calculate these travel times. Six hospitals are included in the data: Chesterfield Royal Hospital, Derby City General Hospital (Derby CGH, now the Royal Derby Hospital), Derby Royal Infirmary (Derby RI, now closed), Nottingham City Hospital (Nott CH), Nottingham Queen’s Medical Centre (Nott QMC) and Queen’s Hospital Burton-on-Trent.

As a measure of waiting times, we use the average time in days that patients wait for their first outpatient appointment and their subsequent admission to hospital, for the nine most common procedures at the six hospitals. These waiting times come from Dr Foster (www.drfoster.com), a well-known organisation that collects and analyses hospital data in England and worldwide in a rigorous and transparent manner. We also use data from Dr Foster on the numbers of beds, the patient safety scores and the numbers of car parking spaces for each of the hospitals. The scores for cleanliness of wards and the inpatient survey scores for overall care, averaged over the survey questions, were obtained from the NHS Choices website (www.nhs.uk), which provides data on each hospital in the UK in a standardised way that is easy to search. Scores were not available for the Derby hospital that is now closed: for this case we assume the same data as for the other Derby hospital.

3.3 Models

The MNL model can be used to describe decisions between three or more alternatives. The decision-maker is assumed to have an unobservable preference or utility for each alternative, which is made up of a deterministic component and a random component. The model assumes that the distribution of the random component can be determined and that a decision-maker will choose the option with the greatest utility. Under the standard MNL model, a decision-maker i will choose option n out of N possible alternatives with probability

$$ P(y_{i} = n) = \frac{\exp\left( x_{in}^{T}\gamma\right)}{{\sum}_{k=1}^{N} \exp\left( x_{ik}^{T}\gamma\right)}, \tag{1} $$

where \(y_{i}\) is the decision made by decision-maker i. The vector \(x_{in}\) contains the explanatory variables that describe the attributes of alternative n and those characteristics of the patient making the decision which might affect their choice behaviour. We assume that each patient places the same weighting on the attributes \(x_{in}\); this weighting is denoted by the vector γ, which must be estimated.
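As an illustration of Eq. 1, the following minimal Python sketch computes MNL choice probabilities for one decision-maker. The attribute values (travel time in minutes, beds in hundreds) and the weights in `gamma` are hypothetical numbers chosen for illustration only; they are not estimates from this study.

```python
import math

def mnl_probabilities(x, gamma):
    """Return MNL choice probabilities (Eq. 1).

    x     : list of attribute vectors, one per alternative (x_in)
    gamma : list of weights shared by all decision-makers
    """
    # Deterministic utility of each alternative: x_in^T gamma
    utilities = [sum(g * v for g, v in zip(gamma, xn)) for xn in x]
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Hypothetical example: three hospitals described by
# (travel time in minutes, number of beds in hundreds)
x = [[15.0, 5.0], [30.0, 10.0], [45.0, 12.0]]
gamma = [-0.1, 0.05]  # assumed weights, not fitted values
probs = mnl_probabilities(x, gamma)
```

With these assumed weights the nearest hospital receives the highest probability, because the negative travel-time weight outweighs the positive weight on beds.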

The random component of the utility is assumed in an MNL model to have independent and identically distributed errors drawn from a Gumbel distribution. The independence of the errors implies the property of independence of irrelevant alternatives (IIA), as discussed in Section 2. This property is not always valid for a given data set and nested logit (NL) models can be used in these cases.
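To make the IIA property concrete, consider the ratio of the probabilities of two alternatives n and j under Eq. 1. The common denominator cancels, giving

$$ \frac{P(y_{i} = n)}{P(y_{i} = j)} = \frac{\exp\left( x_{in}^{T}\gamma\right)}{\exp\left( x_{ij}^{T}\gamma\right)} = \exp\left( (x_{in} - x_{ij})^{T}\gamma\right), $$

which depends only on the attributes of alternatives n and j and is therefore unaffected by adding or removing any other alternative.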

In NL models, the alternatives are split into non-overlapping nests. McFadden’s UMNL model is structured so that a decision-maker (the patient in this example) chooses firstly the nest m, m = 1,…,M, from which their final selection will come and secondly the alternative n, \(n \in N_{m}\), from the set of alternatives \(N_{m}\) for nest m. The probability of choosing alternative n is therefore equal to the product of the probability of choosing alternative n from nest m and the probability of choosing nest m

$$ P_{n} = P_{n \vert m}P_{m}. \tag{2} $$

Let \(V_{n}\) and \(\epsilon_{n}\) represent the observable and unobservable components of the utility for alternative n, such that \(U_{n} = V_{n} + \epsilon_{n}\), n = 1,…,N, and using the notation above, \(V_{n} = x_{in}^{T}\gamma \). The marginal probability of choosing nest m, \(P_{m}\), is given by

$$P_{m}=\frac{e^{\mu_{m}{\Gamma}_{m}}}{{\sum}_{j=1}^{M}e^{\mu_{j}{\Gamma}_{j}}}, $$

where \(0 \leq \mu_{m} \leq 1\) and

$${\Gamma}_{m} = \ln\left[{\sum}_{k\in N_{m}}e^{V_{k}/\mu_{m}}\right] $$

is the logsum or inclusive value (IV) variable of nest m and is equal to the expected value of the maximum of the random utilities of alternatives in nest m. The variable \(\mu_{m}\) is the IV parameter. From the above equation, we can see that the utility of nest m is \(\mu_{m}{\Gamma}_{m}\).

Rearranging Eq. 2, the conditional probability \(P_{n \vert m}\) can then be written as

$$P_{n \vert m} = \frac{e^{V_{n}/\mu_{m}}}{{\sum}_{k\in N_{m}}e^{V_{k}/\mu_{m}}}, $$

which has a similar form to Eq. 1 above.
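The nested probabilities above can be sketched in a few lines of Python. This is an illustrative implementation of the formulas for \(P_{m}\), \({\Gamma}_{m}\) and \(P_{n \vert m}\), not the SAS code used in the study; the utilities in `V` and the IV parameters in `mu` are hypothetical values, and single-hospital nests are given an IV parameter of 1 as an assumption.

```python
import math

def umnl_probabilities(V, nests, mu):
    """Return P_n = P_{n|m} * P_m for every alternative.

    V     : dict mapping alternative -> deterministic utility V_n
    nests : dict mapping nest -> list of alternatives in that nest
    mu    : dict mapping nest -> IV parameter mu_m
    """
    # Sum over the nest of exp(V_k / mu_m); its log is the inclusive value Gamma_m
    within_sum = {m: sum(math.exp(V[k] / mu[m]) for k in alts)
                  for m, alts in nests.items()}
    inclusive = {m: math.log(s) for m, s in within_sum.items()}
    # Denominator of the marginal nest probability P_m
    denom = sum(math.exp(mu[m] * inclusive[m]) for m in nests)
    probs = {}
    for m, alts in nests.items():
        p_nest = math.exp(mu[m] * inclusive[m]) / denom
        for n in alts:
            p_within = math.exp(V[n] / mu[m]) / within_sum[m]
            probs[n] = p_within * p_nest
    return probs

# Hypothetical utilities for the six hospitals (not fitted values)
V = {"Chesterfield": -1.0, "Derby CGH": -0.5, "Derby RI": -0.7,
     "Nott CH": -2.0, "Nott QMC": -1.8, "Burton": -1.5}
nests = {"Derby": ["Derby CGH", "Derby RI"],
         "Nottingham": ["Nott CH", "Nott QMC"],
         "Chesterfield": ["Chesterfield"],
         "Burton": ["Burton"]}
mu = {"Derby": 0.6, "Nottingham": 0.8, "Chesterfield": 1.0, "Burton": 1.0}
probs = umnl_probabilities(V, nests, mu)
```

Setting every IV parameter to 1 makes the function return the MNL probabilities of Eq. 1, which is a convenient check that the nesting has been coded correctly.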

Of the six hospitals in our data, the two Derby hospitals (Derby Royal Infirmary and Derby City General Hospital) are nested together, as are the two Nottingham hospitals (Nottingham City Hospital and Queen’s Medical Centre), while Chesterfield Royal Hospital and Queen’s Hospital Burton are treated individually. The choice of nesting structure was made based on initial correlation and cluster analyses.

We used SAS® 9.4 to fit both the MNL and the UMNL models to the available data, using maximum likelihood.
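For readers who wish to see the objective that maximum likelihood optimises, the sketch below evaluates the MNL log-likelihood for admissions counts aggregated by postcode sector. It is a simplified illustration in Python rather than the SAS procedures used here; the two sectors, two hospitals, travel times and counts are invented for the example.

```python
import math

def mnl_log_likelihood(gamma, sectors):
    """Log-likelihood of aggregated choice counts under the MNL model.

    gamma   : list of attribute weights
    sectors : list of (attributes, counts) pairs, where attributes[n] is the
              attribute vector of hospital n as seen from that sector and
              counts[n] is the number of admissions to hospital n
    """
    ll = 0.0
    for attributes, counts in sectors:
        utilities = [sum(g * v for g, v in zip(gamma, xn)) for xn in attributes]
        # log of the denominator of Eq. 1, computed stably
        u_max = max(utilities)
        log_denom = u_max + math.log(sum(math.exp(u - u_max) for u in utilities))
        # each admission contributes log P = V_n - log(denominator)
        ll += sum(c * (u - log_denom) for c, u in zip(counts, utilities))
    return ll

# Hypothetical data: the single attribute is travel time in minutes
sectors = [
    ([[10.0], [40.0]], [90, 10]),  # sector close to hospital 0
    ([[35.0], [15.0]], [20, 80]),  # sector close to hospital 1
]
ll_null = mnl_log_likelihood([0.0], sectors)    # no effect of travel time
ll_time = mnl_log_likelihood([-0.1], sectors)   # assumed negative weight
```

In this toy data set the log-likelihood is higher with a negative travel-time weight than with a zero weight, so an optimiser searching over `gamma` would move in that direction.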

4 Results

The Results Section begins with some descriptive statistics on numbers of patients and correlations between variables. Results of running MNL and UMNL models then follow, with a discussion of boundary conditions for IV parameters of NL models, whereby we conclude that the NL model structure may not be valid for these data.

Table 1 shows the break-down of admissions coming from Derbyshire in the study period by hospital, for both the elective and non-elective data (see Section 3.2 for a list of hospitals and abbreviations of names). Pearson’s correlation coefficients were computed between all variables related to hospital characteristics: numbers of beds and parking places, waiting times and scores from the cleanliness, safety and inpatient surveys. These data are at a hospital level and apply to all patients, with the exception of waiting times, which do not apply to non-elective patients. The p-values for the correlations were all found to be greater than 0.05, except for that between the cleanliness and inpatient survey scores, which is less than 0.0001. The high correlation between the cleanliness and inpatient surveys, 0.9953, can thus be deemed to be significant. In the modelling that follows, we have therefore entered these two variables one at a time, in separate models.

Table 1 Numbers of elective and non-elective admissions to hospitals in Derbyshire and adjacent parts of Nottinghamshire

4.1 MNL models

MNL models were successfully fitted to both elective and non-elective data using SAS Proc MDC with the Newton-Raphson optimisation method. To achieve maximisation of the likelihood function, the Hessian matrix (matrix of 2nd order partial derivatives) must not be singular (indicating collinearity) and must have all eigenvalues negative (for a maximum point). Results from the MNL models for elective admissions are shown in Table 2. For these patients, the “website” variables (average waiting time, cleanliness survey, patient safety score and inpatient survey results) were found to be non-significant (p-values ranging from 0.861 to 0.997), as a group, in the presence of the other “physical” variables (travel time from postcode sector and numbers of beds and parking places).

Table 2 Estimated parameter values for the MNL model for elective patients

For non-elective admissions, both the “physical” and the “website” variables were found to be significant in initial models. Waiting time for appointments was not entered as a factor into the regression, as this was not relevant to emergency situations. Models containing either cleanliness survey or inpatient survey (found to be highly correlated) were compared using the log-likelihood ratio (LLR), giving values of 254,206 and 254,232 respectively. We included them only one at a time in models, since interpretation of the results with such high correlation would be difficult. Table 3 shows the resulting MNL model including the inpatient survey. We compare this with the MNL model for “physical” variables only, in Table 4. The latter model has an LLR of 254,073. We find that use of the extra “website” variables gives only a small improvement in LLR, and also produces coefficients with unexpected signs, e.g. patients choosing fewer parking spaces and lower safety. We therefore recommend the “physical” variables alone for MNL modelling.

Table 3 Estimated parameter values for MNL model for non-elective patients, mixed “physical” and “website” variables
Table 4 Estimated parameter values for MNL models for non-elective patients, final variables

Using the LLR, we test the goodness-of-fit of the MNL model against a reference model that takes no account of the hospital characteristics. The test shows that the MNL model is statistically significantly better at explaining the data than the reference model.
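A likelihood ratio comparison of this kind can be sketched as follows. The statistic is twice the difference in log-likelihood between a fuller model and a restricted one, compared with a chi-squared critical value whose degrees of freedom equal the number of extra parameters. The log-likelihood values below are hypothetical and the function name is our own; only the standard 5% chi-squared critical values are taken as given.

```python
# Standard 5% critical values of the chi-squared distribution
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def likelihood_ratio_test(loglik_restricted, loglik_full, extra_params):
    """Return the LR statistic and whether the restricted model is rejected at 5%.

    extra_params is the number of additional parameters in the full model
    (1 to 4 in this sketch, matching the hard-coded critical values).
    """
    statistic = 2.0 * (loglik_full - loglik_restricted)
    return statistic, statistic > CHI2_CRIT_5PCT[extra_params]

# Hypothetical log-likelihoods, for illustration only
stat_a, reject_a = likelihood_ratio_test(-1000.0, -990.0, extra_params=3)
stat_b, reject_b = likelihood_ratio_test(-1000.0, -999.0, extra_params=1)
```

In the first hypothetical case the improvement in fit is large enough to reject the restricted model; in the second it is not.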

4.2 UMNL models

We continue with results for fitting the UMNL model to both elective and non-elective admissions data, using SAS Proc NLP with Nelder-Mead optimisation. We assign the two Derby hospitals to one nest and the two Nottingham hospitals to another; the Chesterfield and Burton hospitals are separate. For these models, we consider only the “physical” variables found suitable with MNL modelling. Tables 5 and 6 give elective and non-elective results. All variables are significant for non-electives. For electives, including both time and parking variables gives the maximum log likelihood with two variables; using all three variables makes parking non-significant.

Table 5 Estimated parameter values for the UMNL model for elective patients
Table 6 Estimated parameter values for the UMNL model for non-elective patients, all “physical” variables

As discussed below, the IV values can be used to assess how appropriate the nesting structure is. In all versions of UMNL elective and non-elective models, the coefficients for the IV values are greater than one. A reduction in the IV values is obtained by reducing the numbers of variables considered. We present in Table 7 the UMNL results with just time and parking, which also gave a higher log likelihood than just time and beds (-84984 vs -85590 respectively). For the purposes of model suitability, this version of the UMNL for non-elective admissions is preferable to that with higher IV values.

Table 7 Estimated parameter values for the UMNL model for non-elective patients, reduced variables

The Daly-Zachary-McFadden (DZM) conditions described in [6] state that for the UMNL model to be globally consistent with stochastic utility maximization, the IV parameters should be between zero and one, with values outside this range suggesting a mis-specification of the nesting structure. Börsch-Supan suggests that these conditions are overly restrictive [6] and that it is only necessary to find IV parameters that ensure the model is consistent over the expected range of deterministic parts of the utility, \(V_{n}\), n = 1,…,N. While the Börsch-Supan conditions were corrected by Herriges and Kling [18], the condition for two options in a nest remains unchanged and states that

$$P_{j} \geq (IV_{j}-1)/IV_{j}, $$

where \(P_{j}\) is the probability of choosing nest j [11]. Using the results presented in Table 1, the probability of selecting the Derby nest for elective patients is 0.600 (obtained by summing together the actual proportions of patients attending the Derby hospitals) and the probability of selecting the Nottingham nest is 0.092. The boundary conditions for these probabilities are 0.382 and 0.441 respectively, suggesting that the nesting structure is not valid for the elective patients, with irregularity arising with the Nottingham nest. For non-elective admissions, the probabilities of selecting the Derby and Nottingham nests are 0.555 and 0.086 respectively, with boundary conditions of 0.600 and 0.492. Again this suggests that the nesting structure is invalid for non-electives. It may be noted, however, that the small proportion of patients from Derbyshire attending Nottingham hospitals may introduce errors into this result, since the population is relatively low in the countryside between Derby and Nottingham. There are some concerns over this relaxation of the DZM condition (e.g. see [11]) but taking it into account gives us more certainty that the proposed nesting structure is not valid for these data.
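The comparison just described can be reproduced with a short Python check. The nest shares and boundary values are those quoted in the paragraph above; the helper `nest_probability_bound` is our own name for the right-hand side of the condition and is included only to show how a bound follows from an IV parameter.

```python
def nest_probability_bound(iv):
    """Lower bound (IV - 1) / IV on the nest probability for a two-option nest."""
    return (iv - 1.0) / iv

# (observed nest share, boundary value) as quoted in the text
reported = {
    ("elective", "Derby"): (0.600, 0.382),
    ("elective", "Nottingham"): (0.092, 0.441),
    ("non-elective", "Derby"): (0.555, 0.600),
    ("non-elective", "Nottingham"): (0.086, 0.492),
}

# The condition holds when the observed share is at least the boundary value
satisfied = {key: share >= bound for key, (share, bound) in reported.items()}
violations = [key for key, ok in satisfied.items() if not ok]
```

Only the Derby nest for elective patients meets the condition; the other three cases violate it. Note also that the bound is zero or negative whenever the IV parameter is at most one, so the relaxed condition is automatically satisfied whenever the DZM condition holds.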

4.3 Model Comparison

We summarise the estimated parameters for elective and non-elective admissions in the final MNL and UMNL models in Table 8. Consistency can be noted in the coefficients estimated for travel time in all models, and also for the coefficients of parking spaces in both MNL models.

Table 8 Summary of estimated parameter values in the final MNL and UMNL models for elective and non-elective patients

We use the Akaike Information Criterion (AIC),

$$AIC = 2k-2\ln(L), $$

to compare the goodness-of-fit of the UMNL and MNL models (see Table 8). Here k is the number of estimated model parameters and L is the maximised likelihood of the fitted model, so the criterion takes into account both the likelihood of a model and the number of parameters required to achieve it [1]. The UMNL model has a lower AIC than the MNL model for both the electives and non-electives, suggesting the additional parameters are justified by the improved fit. Nonetheless, the IV values for all of the UMNL models lie outside the acceptable range, suggesting that the nesting structure used may not be valid for these data. As a result we conclude that the MNL model is likely to give more accurate predictions for these data.
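As an illustration of the AIC comparison, the following sketch evaluates the formula for two models. The parameter counts and log-likelihoods are hypothetical placeholders chosen for the example; they are not the values reported in Table 8.

```python
def aic(k, log_likelihood):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


# Hypothetical values for illustration only (not taken from Table 8).
mnl_aic = aic(k=4, log_likelihood=-1050.0)   # 2*4 + 2*1050 = 2108
umnl_aic = aic(k=6, log_likelihood=-1040.0)  # 2*6 + 2*1040 = 2092

# The model with the lower AIC is preferred on fit alone.
preferred_on_fit = "UMNL" if umnl_aic < mnl_aic else "MNL"
print(mnl_aic, umnl_aic, preferred_on_fit)
```

A lower AIC indicates a better trade-off between fit and parsimony, but it says nothing about whether the IV parameters are admissible, which is why the comparison is read alongside the consistency check.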

5 Discussion

The contribution of this research is to demonstrate the potential for use of MNL models to describe choice of hospital for general admissions, both elective and emergency, by patients residing in mixed urban/rural regions. The RP data analysed in this case study come from the period after the introduction of patient choice of hospital for elective services in 2006 in the UK. We compare the use of MNL and UMNL models in describing these data; we do not find the UMNL model to be valid for modelling a hierarchical choice pattern of, firstly, city or group of hospitals and, secondly, individual hospital.

The results show that travelling time is the major influencing factor for all admissions in Derbyshire, for both elective and non-elective patients, implying that utility will increase if travelling time decreases. This agrees with the results of the majority of previous studies (see e.g. [5, 33]). It is also worth emphasizing here that we are working with a large, comprehensive data set that includes admissions data for all hospital departments in a region that incorporates both urban and rural areas, rather than focusing on just one specialty as other authors have done (e.g. [2, 37]). Our results can therefore confirm that the expected dependence on travel times and other factors still holds for all hospital admissions in a geographically diverse region.
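The direction of the travel-time effect can be illustrated with a small MNL calculation. This is a sketch under stated assumptions: the coefficient `BETA_TIME` and the travel times are hypothetical, and utility here depends on travel time only, whereas the fitted models include further variables.

```python
import math


def mnl_probabilities(utilities):
    """MNL choice probabilities: exp(V_i) / sum_j exp(V_j), computed stably."""
    v_max = max(utilities)
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


BETA_TIME = -0.1  # hypothetical coefficient per minute of travel (negative sign)


def utilities_from_times(times):
    """Deterministic utility with travel time as the only attribute."""
    return [BETA_TIME * t for t in times]


# Travel times in minutes to three hypothetical hospitals.
base = mnl_probabilities(utilities_from_times([20.0, 30.0, 45.0]))
# Same patient, with the first hospital now 10 minutes closer.
closer = mnl_probabilities(utilities_from_times([10.0, 30.0, 45.0]))

print([round(p, 3) for p in base])
print([round(p, 3) for p in closer])
```

With a negative travel-time coefficient, shortening the journey to one hospital raises its utility and its predicted share at the expense of the alternatives.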

Perhaps of more interest in our results is the impact of other factors on the decisions being made. Our results show that the numbers of parking spaces increase utility, as well as the numbers of beds, which may be regarded as a proxy for the size or level, secondary or tertiary, of a hospital. We find that data derived from web sources does not have a significant impact on choices made by elective patients and appears to be unreliable for non-elective patients. This situation could arise either if most patients do not consult websites for data concerning hospitals in the region, or, alternatively, because physicians are advising patients based on personal knowledge, for example concerning consultants in particular specialties.

We have demonstrated a successful MNL model fit to the data but the added complexity of the UMNL has not been shown to be valid. We chose to implement the UMNL model, a nested logit model, to capture some possible structure in the choices. The UMNL model was chosen in preference to the non-normalized nested logit model [12] because of its emphasis on utility maximisation. Results suggest that the nesting structure is not well-defined for either elective or non-elective admissions. It appears that patients do not choose hospital firstly on the basis of the destination city before choosing the hospital within the city. However, the small proportion of attendances from Derbyshire in Nottingham (9%) may have affected this result: it would be of interest to model data from a region with high volumes of attendances in two cities with several hospitals. Other nesting structures were tested but did not improve the results sufficiently to warrant use of an unintuitive structure.

A limitation of this modelling is that data is aggregated at postcode sector level, rather than patient level, and it has not been possible to analyse the impact of patient characteristics. For example, people without cars will not be interested in numbers of car parking spaces available. Moreover, we have not considered levels of deprivation in the different areas, which could reasonably affect the distances that patients are prepared to travel and car ownership. A further limitation is that distance of travel may have a non-linear effect on patient attendance at healthcare facilities; exponential drop-off of demand with distance has been modelled in the literature [34].

Car ownership, rurality, index of deprivation (an indicator of the wealth of an area) and propensity to seek healthcare are omitted variables that could cause endogeneity problems. Richer patients who own cars could be more willing to travel further to see a desired consultant, but might also choose private treatment or some other means to avoid accessing public health care. People living close to a hospital could have a lower threshold for choosing to seek care: those at a distance in a rural region might be more self-sufficient in preventive healthcare and prefer to avoid loss of time in seeking support.

It can be shown that the MNL model distributes the total expected choices of the different hospitals according to the proportions in the data on which the model is trained [36]. An extension of this study would be to validate the modelling using a different dataset for a region with choices of several hospitals.
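A sketch of why this property holds, under the assumption that the deterministic utility includes an alternative-specific constant for each hospital (one of them normalised to zero); this assumption and the notation below are introduced here for illustration. Let $y_{ih} = 1$ if patient i attends hospital h and 0 otherwise, and let $V_{ih} = \alpha_h + \beta^{\top} x_{ih}$. The log-likelihood and choice probabilities are

$$\ln L = \sum_{i}\sum_{h} y_{ih}\ln P_{ih}, \qquad P_{ih} = \frac{e^{V_{ih}}}{\sum_{g} e^{V_{ig}}}, $$

and, because each patient attends exactly one hospital, the first-order condition for $\alpha_h$ reduces to

$$\frac{\partial \ln L}{\partial \alpha_{h}} = \sum_{i}\left(y_{ih}-P_{ih}\right) = 0. $$

At the maximum likelihood estimates the expected number of patients choosing hospital h therefore equals the number observed in the training data; the hospital whose constant is normalised to zero matches as well, since the totals over all hospitals must agree.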

Whilst our model predicts how patients choose their hospital to a reasonable extent, there will be some errors in the prediction which are most likely due to factors that are not included in our model, such as the level of expertise available at each hospital. Our modelling has taken a broad view of general patient access, rather than detailing individual services offered. Determining how to include more subjective and qualitative data in the choice model would be a valuable extension to this work.

The Mixed Logit model, though beyond the scope of this research, would be appropriate for future work in this area, with its flexibility in relaxing assumptions of the MNL model. Furthermore, ideas from Artificial Intelligence could be used to better predict patients’ choices.

This research is relevant to health policy makers and service commissioners who are in the position of planning services across regions which are mixed urban/rural in nature, and where there are several hospitals or other facilities available to the public to choose from. Multinomial Logit models have been demonstrated to be able to provide forecasts of general hospital admissions, both elective and emergency. Several factors in addition to travel times can be included in such models; however, it is not evident that patients make a “hierarchical” choice of services, of, firstly, location or city and, secondly, hospital. The use of Utility Maximising Nested Logit models has not therefore been shown to be appropriate in the situation modelled.