1 Introduction

Stated preference methods such as Contingent Valuation (CV) using Discrete Choice Experiments (DCE) involve presenting detailed information to a randomly selected sample of the population and subsequently eliciting their preferences and willingness to pay. Rather early on in the application of CV to the valuation of environmental goods, researchers found that providing respondents with new or differently-phrased information on these goods could change measures of willingness to pay (Thaler 1980). This should not come as a surprise, as the same phenomena apply to goods and services traded in markets—what people know about the characteristics of a good, and of substitutes and complements, can co-determine their values (Milgrom 1981; Munro and Hanley 2002).

In this paper, we present an approach for controlling for the effects of different information sets provided to respondents in a very popular class of generalized multinomial logit (G-MNL) models. Specifically, we allow information to affect both preferences and the mean and variance of individual-specific scale parameters in a random utility model (RUM). In this way we incorporate both observed and unobserved preference and scale heterogeneity to investigate differences across and within information treatments. As a result, our technique better controls for different information effects than existing methods in the literature.

Random Utility forms the basis for many applications of both stated and revealed preference approaches. In a random utility model, the formulation of the utility function leads to an empirical model in which the observed choices of an individual are used to link choice alternatives with utility levels. An agent’s utility depends on a deterministic component \(V\) and a random component \(\varepsilon \) (McFadden 1974). The deterministic component, \(V\), is composed of estimated preference parameters which map attributes and individual characteristics to utility. The error term \(\varepsilon \) is introduced because the researcher cannot observe all attributes of choice and all significant characteristics of respondents which influence their choices (McFadden 1976).Footnote 1 Pragmatically, this makes it possible to explain why apparently equivalent individuals (equal in all attributes which can be observed) may make different choices.

The magnitude of the deterministic component of utility relative to the variation of the random component \((\sigma _{\upvarepsilon })\) is often called the scale parameter. As the scale parameter increases, the size of the deterministic portion of utility increases relative to the idiosyncratic portion of utility. Thus as the scale parameter increases, respondents’ choices appear less random from the econometrician’s perspective.Footnote 2 As a result, the scale parameter influences the confidence intervals of WTP estimates.

Consistent with the literature, our study is predicated upon the notion that altering the information set presented to a subject in a stated preference study could affect the predictive power of the econometrician’s RUM (Carson and Czajkowski 2014). For example, if subjects have a more complete information set, it could lead to the value of certain attributes of a public good being estimated more precisely. In this paper, we propose a methodology for making the preference and scale parameters a function of the information set a respondent holds and, as a result, provide a method for accounting for the effects of information in random utility-based stated preference methods. Using a simple theoretical model, we highlight how this approach is consistent with modeling agents as Bayesian updaters who incorporate newly presented information provided during a study into their stated preferences. Our approach also provides a convenient way for combining datasets from two different but related CV studies for joint estimation thereby increasing estimator efficiency.Footnote 3 Seen in this light, our approach is based on an extension of the generalized multinomial logit (G-MNL) model (Fiebig et al. 2010). Most generally, the approach allows for a flexible treatment of both observed and unobserved preference and scale heterogeneity.

We illustrate our approach by applying it to preferences for biodiversity conservation. Biodiversity conservation is a well-suited public good for our study for two reasons. First, the non-use benefits of enhanced biodiversity conservation in the case examined here are non-rival and non-excludable in consumption. Second, there is no well-functioning market for biodiversity, meaning that stated preference methods are common in evaluating management options for this type of good. Our results confirm that the estimated contribution of the deterministic portion of respondents’ utility relative to the stochastic element both varies across individuals and is sensitive to the information set they are given. In contrast, changes in the information provided did not influence the preference (taste) parameters of respondents’ utility functions. We find, then, that from the econometrician’s perspective, the information set provided to respondents may affect the precision with which the deterministic portion of utility is estimated, rather than the level of coefficient estimates. This finding is consistent with subjects refining their preferences for biodiversity conservation over the econometrician’s observable characteristics as they are presented with different information sets.

The remainder of the paper is structured as follows. Section 2 begins with an overview of information effect studies in stated preferences. We then conduct a brief theoretical modeling exercise which shows variance in preferences for a good could vary with informative signals about the good if consumers are Bayesian updaters. Section 3 offers precise discussion of how scale and preference heterogeneity has been modelled in discrete choice studies. We then set out a new approach to represent differences in unobserved preference and scale heterogeneity in combined datasets, namely differences in mean preference and scale coefficients, as well as differences in their variances. The design and implementation of a choice experiment with two information treatments is then described. Results from applying this framework to our study follow, and we conclude with some observations on implications for future work.

2 Information Effects in Stated Preferences

The effect of information about environmental goods on willingness to pay was one of the early concerns amongst stated preference researchers, and reflects a long-standing interest in information, complexity and human behavior in decision science (Payne 1976).Footnote 4 Munro and Hanley (2002) consider eight contingent valuation studies finding statistically significant effects of different information sets presented to subjects on mean WTP. Munro and Hanley (2002) also formally show in an expected utility model how the mean and variance of WTP might be affected by more “positive” information about an environmental good.Footnote 5 Other studies attempt to test whether certain neoclassical and behavioral economic theories are the causes behind these changes. Specifically, the effect of costless signals and cheap talk, bounded rationality, and Bayesian updating conditional on previous levels of familiarity with a good have been examined and been found to be important (MacMillan et al. 2006; Aadland et al. 2007; Hoehn et al. 2010; LaRiviere et al. 2014) or unimportant (Alberini et al. 2005; Czajkowski et al. forthcoming) for estimating mean WTP for non-market goods. The literature shows, then, that influencing the information set of agents in a CV study can affect estimated mean WTP.Footnote 6

A less studied aspect of the CV literature is how information provided in a CV study could influence the econometrician’s ability to predict mean WTP levels (Czajkowski et al. forthcoming). In a CV study, the subject is often presented with information about the public good being valued before the econometrician elicits their WTP. It is reasonable to expect that the provided information could interact with the subject’s previous information set and experiences to influence their stated preferences. For example, Bayesian updating due to a subject’s previous information set interacting with newly provided information in a CV study has been shown to affect estimated preference parameters and potentially affect \(\sigma _{\varepsilon }\) relative to \(V\) (Christie and Gibbons 2011). Efficiently and consistently estimating \(\sigma _{\varepsilon }\) relative to \(V\) is vital because it allows the econometrician to correctly infer the level and confidence intervals of WTP.

The motivation behind our approach is to account for how providing different information sets to subjects can affect the estimated scale parameter (e.g., the ratio of \(\sigma _{\varepsilon }\) relative to \(V\)) in a random utility model. It is relatively straightforward to understand how information could influence the estimated scale parameter (and hence, potentially also the predictability of subjects’ WTP estimates) for the econometrician. We now present a simple theoretical model showing how information can influence the variance of observed WTP from the econometrician’s perspective if subjects are Bayesian updaters.Footnote 7

Consider a model in which there are two possible states of the world \(x\in \left\{ {A,B} \right\} \). An agent trying to infer the true state of the world has a prior that the state of the world is \(A: pr\left( {x=A} \right) =\rho \) such that \(pr\left( {x=B} \right) =1-\rho \). In our case, \(\rho \) might be the probability an agent believes their home is at risk from flooding. In this case, \(A\) may be associated with an increased value the agent receives from additional flood protection (e.g., in state \(A\) the agent’s home is at high risk of flooding). As a result, state \(A\) is positively related to the level of utility from an additional public works project that defends against flooding whereas state \(B\) is negatively related to the utility the agent gains from the public works project. A different example would have that state A corresponds to the re-introduction of a species with no adverse effects on other flora and fauna, whereas in state B the species re-introduction causes negative impacts on existing animals and plants. Without loss of generality, then, \(V_{A}\) can be thought of as the agent’s value for the public project in state \(A\) and \(V_{B}\) can be thought of as an agent’s value for the project in state \(B\).

Define the information a subject receives in a CV study at time \(t\) as \(s_{t}\). The signal \(s_{t}\) informs the subject as to the state of the world such that \(s_t \in \left\{ {a,b} \right\} \) and \(pr\left( {s_t =a|A} \right) =pr\left( {s_{t}=b|B} \right) =\theta \in \left( {0.5,1} \right) \). Note that signals are informative but not perfectly informative given the support of \(\theta \): with probability \(1-\theta \) the subject will receive a signal of \(b\) even if the true state of the world is \(A.\) This set-up is meant to describe a situation in which a subject does not know the true state of the world (e.g., does not know their valuation for a public good with certainty at the time of a survey) and so must infer it from imprecise signals (e.g., a signal \(s_{1}\) in the information provided during the survey).

The expected value and variance of the public good conditional on a subject’s prior are

$$\begin{aligned} E\left( {\hbox {V}|\rho } \right)&= \rho V_{A} +\left( {1-\rho } \right) V_{B} \\ \hbox {var}\left( {\hbox {V}|\rho } \right)&= \left( {V_{A} -V_{B} } \right) ^{2}\rho \left( {1-\rho } \right) \end{aligned}$$

The variance of utility is single peaked with a maximum at \(\rho =0.5\) and equal to zero for \(\rho \in \left\{ {0,1} \right\} \). Now consider the properties of utility conditional on receiving the CV study’s information signal, \(s_{t}\), for an agent in the above model who holds a prior, \(\rho \), that the true state of the world is \(A\). As long as the information causes the consumer’s posterior, \(\rho _{t+1} |s\), to be updated toward either zero or one, the variance in a representative consumer’s expected valuation would decrease.Footnote 8 As a result, the more informative signals an agent receives, the lower the variance from consumption as \(\rho \) is updated with new information.Footnote 9 Importantly, the agents in this model do not have random preferences conditional on a state of the world. Rather, new information can affect the agent’s belief about the true state of the world and therefore affect their WTP for a good.
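The updating mechanism described above can be illustrated numerically. The following is a minimal sketch rather than part of the formal model: the valuations `V_A` and `V_B`, the signal accuracy `THETA`, and the function names are hypothetical choices made for illustration. The posterior follows Bayes' rule for the two-state, two-signal set-up, and the variance is computed directly from the two-point distribution of \(V\).

```python
def posterior(rho, signal, theta):
    """Posterior Pr(x = A | signal) given prior rho and signal accuracy theta."""
    like_a = theta if signal == "a" else 1.0 - theta   # Pr(signal | A)
    like_b = 1.0 - theta if signal == "a" else theta   # Pr(signal | B)
    return rho * like_a / (rho * like_a + (1.0 - rho) * like_b)


def value_moments(rho, v_a, v_b):
    """Mean and variance of V when V = v_a with probability rho, else v_b."""
    mean = rho * v_a + (1.0 - rho) * v_b
    second_moment = rho * v_a ** 2 + (1.0 - rho) * v_b ** 2
    return mean, second_moment - mean ** 2


V_A, V_B = 10.0, 2.0   # hypothetical valuations in states A and B
THETA = 0.8            # hypothetical signal accuracy, inside (0.5, 1)

rho = 0.5              # maximally uncertain prior
path = [(rho, value_moments(rho, V_A, V_B)[1])]
for s in ["a", "a", "a"]:          # three signals pointing to state A
    rho = posterior(rho, s, THETA)
    path.append((rho, value_moments(rho, V_A, V_B)[1]))

for belief, variance in path:
    print(f"rho = {belief:.4f}  var(V | rho) = {variance:.4f}")
```

Starting from \(\rho = 0.5\), each confirming signal moves the belief toward one and the variance of the valuation falls. A contradicting signal received when \(\rho\) is far from 0.5 would move the belief back toward 0.5 and raise the variance, which is why the statement in the text is conditional on the direction of updating.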

Now consider that there are two classes of signals, one class noisier than the other. This situation mirrors what would occur in a survey in which one information packet is written by an interest group and another by an objective surveyor. In that case, the noisier signal (e.g., the information from the interest group) would lead to different updating than the cleaner signal (e.g., the information from the objective surveyor). As a result, the observed variance in WTP for a good would vary by signal type.

The above theoretical exercise shows how additional information can affect a representative agent’s variance in WTP for a good via Bayesian updating. In a discrete choice model, variance in WTP is summarized by the relative magnitude of the unobserved (to the econometrician) portion of utility (i.e., the random component \(\varepsilon \)). The magnitude of that unobserved component is dictated by the scale parameter \((\upsigma _\upvarepsilon )\). As a result, the simple model above shows that different information sets could have different effects on the estimated scale parameter. Further, the information set a subject enters a survey with is unobserved to the econometrician.Footnote 10 Importantly, the subject’s unobserved information can interact with the information provided during the survey process, leading to changes in the variance of WTP conditional on covariates. From the econometrician’s perspective, this also manifests as new information affecting the magnitude of the estimated deterministic portion of utility relative to the idiosyncratic portion, or the scale parameter. As a result, if different information sets were presented to various subjects in a CV study, it is reasonable to expect their scale parameters could be influenced differently.

It is not uncommon for agents in CV studies to be presented with different information sets in order to either satisfy various stakeholders or be part of an economic field experiment (Carson et al. 1994; MacMillan et al. 2006). In that situation, it is reasonable for the econometrician to allow different information sets to have heterogeneous effects on subjects’ scale parameters. Indeed, the above exercise motivates an econometric approach for allowing different information sets provided to subjects in a CV study to asymmetrically affect the scale parameter in RUMs.

In the next section, we show how allowing an information fixed effect to enter the estimated scale parameter is a relatively straightforward extension of discrete choice models. Allowing heterogeneous treatment effects of information on the scale parameter permits heterogeneous predictability (e.g., ‘perceived randomness’) of agents’ choices by the econometrician. This is the main contribution of our econometric model.

3 Modelling Discrete-Choice Data

Before introducing how information fixed effects can enter the scale parameter in an econometric model, we first briefly introduce the standard Random Parameters Logit (RPL) model (McFadden and Train 2000; Hensher and Greene 2003). The RPL model allows for an economic agent’s preferences to vary with their observable characteristics. In the RPL model respondent \(i\)’s utility associated with selecting alternative \(j\) out of a set of \(J\) available alternatives on choice occasion \(t\) can be represented as:

$$\begin{aligned} U_{ijt} =\sigma {\varvec{\upbeta }'}_i \mathbf{x}_{ijt}+\varepsilon _{ijt} , \end{aligned}$$
(1)

where \(\mathbf{x}_{ijt} \) is a vector of respondent- and alternative-specific choice attributes, and \({\varvec{\upbeta }}_{i} \) represents a vector of individual-specific taste parameters associated with marginal utilities of the choice attributes, such that they follow a multivariate distribution \({\varvec{\upbeta }}_i \sim f\left( {\mathbf{b},{\varvec{\Sigma }}} \right) \), with means \(\mathbf{b}\) and variance-covariance matrix \({\varvec{\Sigma }}\).Footnote 11 Finally, since the stochastic component of the utility function \(\varepsilon \) is typically assumed to follow the extreme value distribution (with a known mean and variance), the parameter \(\sigma \) can be thought of as introducing the required amount of perceived randomness into respondents’ choices by scaling the deterministic part of their utility function: the higher the scale, the more deterministic (predictable) the choices from the econometrician’s perspective.Footnote 12
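The role of the scale parameter can be seen in a small numerical sketch. It conditions on a given draw of \({\varvec{\upbeta }}_i\), so the deterministic utilities are taken as fixed numbers, and it uses the standard logit formula implied by extreme value errors. The utility values and the function name are hypothetical.

```python
import math


def choice_probabilities(utilities, scale=1.0):
    """Logit choice probabilities for deterministic utilities times `scale`."""
    scaled = [scale * v for v in utilities]
    top = max(scaled)                          # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]


V = [1.0, 0.5, 0.0]   # hypothetical values of beta_i'x for three alternatives
for sigma in (0.1, 1.0, 10.0):
    probs = choice_probabilities(V, sigma)
    print(f"sigma = {sigma:>4}: {[round(p, 3) for p in probs]}")
```

With a low scale the three alternatives are chosen with nearly equal probability, whereas with a high scale almost all probability mass falls on the alternative with the highest deterministic utility.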

A method that allows preference and scale heterogeneity to be modelled simultaneously is the G-MNL model (Fiebig et al. 2010). In this model, the utility function takes the form:

$$\begin{aligned} U_{ijt} =\left[ {\sigma _{i}\mathbf{b}+\gamma \varvec{\upupsilon }_i +\left( {1-\gamma } \right) \sigma _i \varvec{\upupsilon }_i } \right] ^{\prime }\mathbf{x}_{ijt} +\omega _{ijt} . \end{aligned}$$
(2)

As in the RPL model, the coefficients in the utility function are individual-specific (\(\mathbf{b}\) represents the population means of the parameters, while \({\varvec{\upupsilon }}\) are individual-specific deviations from these means). Unlike in the RPL, however, the scale coefficient is also individual-specific, with \(\sigma _i \sim LN\left( {1,\tau } \right) \) or \(\sigma _i =\exp \left( {\bar{{\sigma }}+\tau \upsilon _i } \right) \) with \(\upsilon _i \sim N\left( {0,1} \right) \). Since it is still necessary to normalize scale, we want \(E\sigma _i =\exp \left( {\bar{{\sigma }}+{\tau ^{2}}/2} \right) =1\). This may be achieved by assuming \(\bar{{\sigma }}={-\tau ^{2}}/2\). This way the scale is no longer constant across respondents; instead it is assumed to follow a lognormal distribution, with the new parameter \(\tau \) reflecting the level of scale heterogeneity in the sample.
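A quick simulation can be used to check the normalization: with \(\bar{{\sigma }}={-\tau ^{2}}/2\), the lognormal scale has a mean of one for any \(\tau \). The sketch below is illustrative only; the values of \(\tau \), the seed, and the function name are arbitrary assumptions.

```python
import math
import random


def draw_scale(tau, rng):
    """One draw of sigma_i = exp(sigma_bar + tau * v), with sigma_bar = -tau^2 / 2."""
    sigma_bar = -tau ** 2 / 2.0
    return math.exp(sigma_bar + tau * rng.gauss(0.0, 1.0))


rng = random.Random(12345)   # fixed seed so the run is reproducible
for tau in (0.25, 0.5, 1.0):
    draws = [draw_scale(tau, rng) for _ in range(200_000)]
    mean = sum(draws) / len(draws)
    print(f"tau = {tau}: sample mean of sigma_i = {mean:.3f}")
```

The sample means stay close to one while the dispersion of the draws grows with \(\tau \), which is the sense in which \(\tau \) measures scale heterogeneity.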

The coefficient \(\gamma \in \left[ {0,1} \right] \) controls how the variance of residual taste heterogeneity varies with scale. If \(\gamma =0\) the individual coefficients become \({\varvec{\upbeta }}_i =\sigma _i \left( {\mathbf{b}+{\varvec{\upupsilon }}_i } \right) \), while if \(\gamma =1\) they are \({\varvec{\upbeta }}_i =\sigma _i \mathbf{b}+{\varvec{\upupsilon }}_i \). These are the two extreme cases of scaling (or not scaling) residual taste heterogeneity in the G-MNL model (type II and type I, respectively); however, all intermediate solutions are possible.
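The two limiting cases can be verified with a direct implementation of the coefficient expression in Eq. (2). This is a sketch for a single individual with given draws; the numerical values and the function name are hypothetical.

```python
def gmnl_coefficients(b, upsilon_i, sigma_i, gamma):
    """beta_i = sigma_i*b + gamma*upsilon_i + (1 - gamma)*sigma_i*upsilon_i."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return [
        sigma_i * b_k + gamma * u_k + (1.0 - gamma) * sigma_i * u_k
        for b_k, u_k in zip(b, upsilon_i)
    ]


b = [1.0, -0.5]         # hypothetical population means
upsilon = [0.2, 0.1]    # hypothetical individual deviations from the means
sigma = 2.0             # hypothetical individual scale

print(gmnl_coefficients(b, upsilon, sigma, gamma=0.0))   # sigma * (b + upsilon)
print(gmnl_coefficients(b, upsilon, sigma, gamma=1.0))   # sigma * b + upsilon
print(gmnl_coefficients(b, upsilon, sigma, gamma=0.5))   # an intermediate case
```

With \(\gamma =0\) the deviations are scaled together with the means, and with \(\gamma =1\) only the means are scaled; any value in between mixes the two.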

Finally, it should be noted that the preference (taste) and scale parameters are not separately identifiable, as they are always observed in a multiplicative form. Hess and Rose (2012) demonstrate that in the case when (1) all parameters are modelled as random and (2) all parameters are allowed to be correlated, introducing the random scale coefficient is equivalent to allowing for a more flexible distribution for the taste-scale mixture. In many cases, however, introducing a random scale coefficient is useful because it allows one to account for all (random or non-random) parameters for a particular individual becoming larger or smaller, relative to the utility function error term (whose variance is normalized to one). In this way, a single parameter allows us to observe how the deterministic part of a respondent’s utility function varies relative to the random component, from the perspective of the analyst. This approach provides a convenient way of observing and interpreting the level of heterogeneous predictability (‘perceived randomness’) of agents’ choices by the econometrician. In our case, as described in the next section, we not only make the scale coefficient random, but also introduce information-set-specific covariates into its mean and variance, thus proposing a useful, reduced form method of empirical investigation of the effects of information and updating in a public goods discrete choice model.Footnote 13

3.1 Methods for Accounting for Information-Related Effects on Preference and Scale Heterogeneity

The information sets respondents hold may in some cases be observed. A typical way of controlling for the effects of information on preference parameters is making the means (and possibly variances) of the taste parameters a function of information-related covariates \(\left( \mathbf{z} \right) \), so that \({\varvec{\upbeta }}_{i} \sim f\left( \mathbf{b}+{\varvec{\upphi }}'\mathbf{z}_i ,{\varvec{\Sigma }}+{\varvec{\uppsi }}'\mathbf{z}_i \right) \). By comparing the means or the variances of random preference (taste) parameters in one treatment with the parameters in another, the modeler is able to identify the relative changes in the observed preferences resulting from different information treatments. This way, it is possible to investigate the effects of different information sets on respondents’ preferences (tastes) while allowing for unobserved preference heterogeneity.

The main contribution of our paper, however, comes from accounting for not only the effects of information on preferences, but also the effects on scale. Since in the case of the G-MNL model individual scale is a random variable, there are two possible effects which can be taken into account: the systematic differences in the mean scale, and the systematic differences in its variance. We propose to account for these effects by making the mean of the random scale parameter and its variance functions of information-related covariates:

$$\begin{aligned} \sigma _{i}\sim LN\left( 1+{\varvec{\upphi }}'\mathbf{z}_{i}\mathbf{,}\tau +{\varvec{\upxi }}'\mathbf{z}_{i} \right) . \end{aligned}$$
(3)

Exploring issues related to the observed scale differences by including observation-specific covariates of (non-random) scale has been done before. There are several models which allow for this, including the covariance heterogeneity model (DeShazo and Fermo 2002), the error components model (Hensher et al. 2008; Savage and Waldman 2008), modeling Gumbel variance directly by using socio-economic characteristics (Scarpa et al. 2003), the heteroskedastic extreme value model (Salisbury and Feinberg 2010), multiplicative errors model (Fosgerau and Bierlaire 2009) and, perhaps most notably, the heteroskedastic MNL model (e.g. Hensher et al. 1998; Dellaert et al. 1999; Swait and Adamowicz 2001; Caussade and Ortúzar 2005). None of these approaches allow for unobserved preference and scale heterogeneity, however.

The possibility of including covariates of scale, while allowing for unobserved scale heterogeneity, was first mentioned by Fiebig et al. (2010), although the authors did not pursue this approach. We apply their concept and extend it by also including information-related covariates in the scale variance \(\left( \tau \right) \). The rationale of our approach is as follows. Just as two samples can differ with respect to mean WTP and its variance, they can differ with respect to (1) how random or how deterministic the respondents appear (on average) from the econometrician’s perspective and (2) how differentiated each sample of respondents is, in terms of whether the respondents have similar scale parameters. The former effect is captured by the mean of the individual scale parameters, the latter by the variance. Our approach, therefore, allows for a greater flexibility in accounting for scale differences between groups of observations.

In our case, the information sets were dataset-specific. Therefore, we use dataset-specific covariates of mean scale and its variance to control for the possible effects of information, so thatFootnote 14:

$$\begin{aligned} \sigma _{i} =\exp \left( {\bar{\sigma }}+\exp \left( {\varvec{\lambda }}'\mathbf{z}_{i} \right) \tau \upsilon _i +{\varvec{\uptheta }}'\mathbf{z}_{i} \right) . \end{aligned}$$
(4)

In this formulation, even though the absolute levels of scale or its variance are not identified, positive coefficients \({\varvec{\uptheta }}\) indicate observations with a higher scale, i.e. less uncertainty surrounding choices, in comparison with the reference treatment. Positive coefficients of \({\varvec{\lambda }}\), on the other hand, represent observations with higher scale heterogeneity, e.g., a group of respondents who are more diversified in terms of how predictable their choices are, in relation to the reference group.Footnote 15
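To make the interpretation of \({\varvec{\uptheta }}\) and \({\varvec{\lambda }}\) concrete, Eq. (4) can be evaluated for a reference dataset and a second dataset. The sketch below assumes a single dummy covariate \(z_i\) (0 for the reference dataset, 1 for the other) and arbitrary coefficient values; none of these numbers are estimates from our data, and \(\bar{{\sigma }}\) is simply set to zero for illustration.

```python
import math


def individual_scale(z, theta, lam, tau, upsilon, sigma_bar=0.0):
    """sigma_i = exp(sigma_bar + exp(lam'z) * tau * upsilon + theta'z)."""
    theta_z = sum(t * x for t, x in zip(theta, z))
    lam_z = sum(l * x for l, x in zip(lam, z))
    return math.exp(sigma_bar + math.exp(lam_z) * tau * upsilon + theta_z)


THETA = [0.3]    # hypothetical shift in mean (log) scale for dataset 2
LAMBDA = [-0.4]  # hypothetical shift in (log) scale dispersion for dataset 2
TAU = 0.5        # hypothetical baseline scale heterogeneity

for label, z in (("reference", [0.0]), ("dataset 2", [1.0])):
    median = individual_scale(z, THETA, LAMBDA, TAU, upsilon=0.0)
    upper = individual_scale(z, THETA, LAMBDA, TAU, upsilon=1.0)
    spread = math.log(upper / median)   # std. dev. of log scale within the group
    print(f"{label}: scale at upsilon = 0 is {median:.3f}, log-scale spread = {spread:.3f}")
```

In this example the positive \(\theta \) raises the typical scale in dataset 2, while the negative \(\lambda \) shrinks the dispersion of scale across its respondents relative to the reference group.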

The resulting extension of the G-MNL model is flexible enough to capture observed and unobserved preference heterogeneity, as well as observed and unobserved scale heterogeneity. In the case of our empirical application, it allows us to control for different information levels of respondents. We note, however, that it can easily be applied to model other phenomena.

3.2 Accounting for Scale Differences When Combining Datasets

Our approach has one other practical application: accounting for scale differences when two or more datasets are combined. It has long been recognized that if observations from two datasets are to be combined, accounting for scale differences is necessary (Swait and Louviere 1993). This is because utility function parameters are confounded with scale and so failing to take this into account (i.e. assuming the scale parameter in two datasets is the same) may lead to misleading conclusions being drawn. Scale may vary across datasets due to, e.g., differences in sampling or in the information provided to respondents. Only after the scale differences have been accounted for is it possible to formally test the equality of utility function parameters and their variances if unobserved preference heterogeneity is allowed for (Hensher et al. 1998).Footnote 16

Several methods to control for scale differences between datasets have been suggested. Ben-Akiva and Morikawa (1990) proposed a procedure to efficiently estimate the scale differences between two data sources. Their procedure simultaneously maximizes a joint likelihood function for observations from two or more datasets. A relative scale factor can be estimated for each type of data (except one which is arbitrarily chosen as the base level; Morikawa 1989). Bradley et al. (1992, 1994) incorporated the one-step estimation approach of Morikawa and Ben-Akiva into the nested logit setting. They call this approach the logit-based scaling approach (Hensher and Bradley 1993; Bradley and Daly 1994). Perhaps the most commonly used method of controlling for scale differences when combining datasets was proposed by Swait and Louviere (1993). It allows data from two sources to be combined by exploring, through a grid search, a range of plausible relative scale factors for which the parameters of the models estimated for two samples are statistically equal. This “tedious, but straightforward” procedure results in a unique maximum for the log likelihood function, at least for the linear-in-parameters MNL model for which the log likelihood is concave. It remains probably the most commonly used way to test for scale differences between two datasets, at least in environmental economics applications (e.g. von Haefen and Phaneuf 2008; Christie and Azevedo 2009; Olsen 2009; Brouwer et al. 2010).
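The grid-search idea can be sketched in a deliberately simplified setting. The code below is not the Swait and Louviere (1993) procedure in full: it assumes a binary choice with a single attribute difference, imposes a common taste coefficient, and omits the likelihood ratio tests of parameter equality. The synthetic data, sample sizes, grid, and function names are all assumptions made for illustration.

```python
import math
import random


def log_likelihood(beta, data):
    """Binary logit log-likelihood; data holds (x_diff, chose_first) pairs."""
    ll = 0.0
    for x, y in data:
        v = beta * x
        # log Pr(first) = -log(1 + exp(-v)); log Pr(second) = -log(1 + exp(v))
        ll -= math.log1p(math.exp(-v if y else v))
    return ll


def fit_beta(data, steps=50):
    """Newton-Raphson for the single coefficient (the log-likelihood is concave)."""
    beta = 0.0
    for _ in range(steps):
        grad = hess = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-beta * x))
            grad += (y - p) * x
            hess -= p * (1.0 - p) * x * x
        if hess == 0.0:
            break
        step = grad / hess
        beta -= step
        if abs(step) < 1e-10:
            break
    return beta


def grid_search(data1, data2, grid):
    """For each relative scale mu, rescale dataset 2, pool, and refit beta."""
    results = []
    for mu in grid:
        pooled = data1 + [(mu * x, y) for x, y in data2]
        beta = fit_beta(pooled)
        results.append((log_likelihood(beta, pooled), mu, beta))
    return max(results)      # tuple with the highest pooled log-likelihood


def simulate(n, beta, rng):
    """Synthetic binary choices with attribute differences on [-2, 2]."""
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        p = 1.0 / (1.0 + math.exp(-beta * x))
        data.append((x, 1 if rng.random() < p else 0))
    return data


rng = random.Random(2024)
d1 = simulate(2000, 1.0, rng)   # reference dataset, scale normalized to 1
d2 = simulate(2000, 0.5, rng)   # same taste, relative scale 0.5 (synthetic)
grid = [0.1 * k for k in range(1, 21)]
best_ll, best_mu, best_beta = grid_search(d1, d2, grid)
print(f"relative scale = {best_mu:.1f}, beta = {best_beta:.3f}, logL = {best_ll:.1f}")
```

Because the second synthetic dataset is generated with half the scale of the first, the pooled log-likelihood should peak at a relative scale below one; the pooled fit at that value can then be compared with the separately estimated models.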

We note that the methods presented above do not allow for unobserved scale heterogeneity, despite a growing body of literature suggesting that modelling of unobservable scale differences may be a significant component in accounting for overall heterogeneity (e.g. Louviere et al. 2002). Once unobserved scale heterogeneity is allowed, groups of observations (e.g. datasets) can differ not only in terms of preferences (e.g. means of random parameters), preference heterogeneity (e.g. variances of random parameters), and mean scale, but also with respect to the scale heterogeneity (i.e. scale variance). The approach we propose allows one to simultaneously take all these differences into account. In addition, it does not require that all taste parameters are assumed equal across the sample. We therefore argue it is more flexible and, at the same time, more convenient to use when data from two or more sources is combined.

4 Case Study and Survey Design

This section applies the empirical model introduced in Sect. 3 to a specific issue in biodiversity conservation. The management of Red Grouse (Lagopus lagopus scoticus) in the UK uplands provides an interesting case study. Management of moorlands for Red Grouse shooting since the mid-nineteenth century has led to declines in many species of predators (Newton 1998), since the aim of grouse management is to maximize numbers of birds available for shooting in the autumn. One particular conflict which has arisen in this context concerns the management of Hen Harriers (Circus cyaneus) on sporting estates. Hen Harriers are listed as endangered due to population declines in the last 200 years (Baillie et al. 2009). Economic costs to grouse moor owners arise because harriers prey on grouse (Thirgood et al. 2000), and arguments between the conservation lobby and the sporting estate community have become polarized over time (Redpath et al. 2004; Thirgood and Redpath 2008). Evidence shows that (1) Hen Harrier densities can increase to the extent that they make management for grouse shooting economically unviable; (2) illegal killing has resulted in a suppression of harrier populations in both England and Scotland (Etheridge et al. 1997); and (3) enforcement of current laws prohibiting lethal control has been ineffective (Redpath et al. 2010). Golden Eagles are often found in Hen Harrier habitat, and are also top predators which have been subject to illegal persecution, particularly in managed grouse moors (Watson et al. 1989; Whitfield et al. 2007).

To understand public preferences over the conservation of Hen Harriers on heather moorland, we designed a choice experiment (Hanley et al. 2010). The choice experiment design consisted of four attributes. These were:

  • Changes in the population of Hen Harriers on heather moorlands in Scotland. The levels here were a 20 % decline (used as the status quo), maintaining current populations, and a 20 % increase in the current population.

  • Changes in the population of Golden Eagles on heather moorlands in Scotland. The levels here were a 20 % decline (used as the status quo), maintaining current populations, and a 20 % increase in the current population.

  • Management options. These included the current situation, moving Hen Harriers (“MOVE”), diversionary feeding (“FEED”) and tougher law enforcement (“LAW”). These levels were included as labelled choices. That is, in each choice card, 4 options were available. One represented the status quo, and then 3 choice columns showed variations in other attribute levels given a particular, labelled management strategy.

  • Cost of the policy. We told respondents that “the cost level indicated is the amount of extra tax which a household like yours might have to pay if the government went ahead with that option.” The levels used were 0 (the status quo), 10, 20, 25, and 50 GBP.

Table 1 gives an example of a choice card; each questionnaire included 6 choice cards.Footnote 17

Table 1 Example choice card

The choice experiment was designed to minimize the determinant of the asymptotic variance-covariance (AVC) matrix of the parameters (D-error) given the priors on the parameters of a representative respondent’s utility function using a Bayesian efficient design (Scarpa and Rose 2008). The parameters of the prior distribution were derived from a preliminary model estimated on data available from a pilot study. Pilot surveys were undertaken using in-person surveys of a random sample of Edinburgh households.Footnote 18
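For readers unfamiliar with the D-error criterion, the following sketch computes it for a toy MNL design with two attributes at fixed prior coefficients. It is not the design procedure used for our survey: a Bayesian efficient design would average this quantity over draws from the prior distribution and search over candidate designs. The two small designs, the prior values, and the function name are hypothetical.

```python
import math


def d_error(design, beta):
    """D-error of a two-attribute MNL design at fixed prior coefficients `beta`."""
    info = [[0.0, 0.0], [0.0, 0.0]]            # Fisher information matrix
    for choice_set in design:
        utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in choice_set]
        top = max(utilities)
        weights = [math.exp(u - top) for u in utilities]
        probs = [w / sum(weights) for w in weights]
        # probability-weighted mean attribute vector of the choice set
        mean = [sum(p * alt[k] for p, alt in zip(probs, choice_set)) for k in range(2)]
        for p, alt in zip(probs, choice_set):
            dev = [alt[k] - mean[k] for k in range(2)]
            for r in range(2):
                for c in range(2):
                    info[r][c] += p * dev[r] * dev[c]
    det_info = info[0][0] * info[1][1] - info[0][1] * info[1][0]
    if det_info <= 0.0:
        return math.inf                        # parameters are not identified
    # det(AVC) = 1 / det(info); D-error = det(AVC) ** (1 / K) with K = 2
    return (1.0 / det_info) ** 0.5


# Each design is a list of choice sets; each alternative is (attribute 1, attribute 2).
design_a = [[(1.0, 0.0), (0.0, 1.0)], [(1.0, 1.0), (0.0, 0.0)]]
design_b = [[(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.5), (0.0, 0.5)]]

for prior in ((0.0, 0.0), (0.5, -0.5)):        # hypothetical prior coefficients
    print(prior, d_error(design_a, prior), d_error(design_b, prior))
```

A lower D-error indicates a design that is expected to yield more precise parameter estimates for a given number of choice sets, conditional on the assumed priors.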

There were two different versions of the survey, which differed only in the information provided to respondents. The first survey (study 1), reported in Hanley et al. (2010), used an information pack developed solely by the research team, based on existing research findings. The second survey (study 2) used an information pack which was re-written by a group of stakeholders engaged in moorland ownership, management and grouse shooting. In each case, the information pack covered the following items:

  • a description of what we meant by “the uplands” in the UK, and how some uplands areas are managed as grouse moors;

  • the contribution that grouse shooting makes to the Scottish economy;

  • the contribution of grouse management to maintaining heather moorlands, rather than allowing moorlands to be converted to rough grassland or plantation forestry;

  • a description of the Hen Harrier, including conservation status and threats from illegal persecution;

  • a description of Golden Eagles, their conservation status and current threats to the species;

  • three alternatives for moorland management aimed at Hen Harriers.

Given that the biodiversity issues involved in moorland conservation are likely to be unfamiliar to many respondents, we might anticipate that differences in information provided will impact their choices. The most significant differences between how these items were relayed to respondents in the two treatments were that (1) moorland management is depicted as more beneficial to a variety of species in survey 2; (2) information was provided about how organizations involved in shooting and conservation are trying to find a solution to this conservation conflict in survey 2; (3) moorlands were described as “an important part of our cultural heritage” and “internationally important species of animals and plants” in survey 2, but not in survey 1; (4) a higher figure was provided for the number of jobs generated by moorland management for red grouse shooting in survey 2Footnote 19; (5) Hen Harriers were depicted as less threatened in survey 2 than in survey 1; and (6) more information was provided on Golden Eagles in survey 1, such as how they mate for life, how population numbers have recovered during the twentieth century, and how illegal persecution is still carried out.

5 Results

In the first survey, we obtained 233 responses from 1,000 mail outs, a 23 % response rate. In the second survey, we obtained 347 responses from 1,700 mail outs, a 20 % response rate. Table 2 presents the comparison of sample characteristics. Overall, the samples were very similar to each other with respect to sex, age and income. Both samples suffered from some over-representation of older respondents. We believe this to be an artefact of the mail format of our survey. The ratio of male to female respondents and the mean household income were very close to the national averages.

Table 2 A comparison of sample characteristics

The observations from the two studies were combined and modelled using the approach outlined in Sect. 3. We applied three different estimators to provide an illustration for our approach and allow for a case-study comparison with existing methods; the results are reported in Table 3. The first approach is an MNL model, followed by the RPL model, and the conventional G-MNL model. The last approach we report allows not only the mean scale to differ between studies, but also the variance of the scale parameter \(\left( \lambda \right) \). In all cases we followed the standard practice in joint estimation on data combined from different sources (Ben-Akiva et al. 1994) by allowing the scale coefficient to vary between the two studies. This is represented by the scale correction factor \(\theta \), which is included for the observations from study 2 (study 1 is used as a reference). This dataset-specific coefficient reveals significant differences between the two studies in terms of how predictable respondents’ choices were (effective scale was higher in the second study).
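The link between scale and predictability can be seen in a minimal sketch of logit choice probabilities for a single choice card. The utilities and scale values below are hypothetical and are not taken from Table 3.

```python
import math


def mnl_probabilities(utilities, scale=1.0):
    """Logit choice probabilities for one choice card, with utilities multiplied by `scale`."""
    scaled = [scale * v for v in utilities]
    m = max(scaled)  # subtract the maximum for numerical stability
    expv = [math.exp(v - m) for v in scaled]
    total = sum(expv)
    return [e / total for e in expv]


# Hypothetical deterministic utilities for the status quo, LAW, FEED and MOVE.
v = [0.0, 0.6, 0.5, 0.2]

low_scale = mnl_probabilities(v, scale=0.5)   # choices look more random
high_scale = mnl_probabilities(v, scale=2.0)  # choices look more deterministic
```

With a higher scale, probability mass concentrates on the alternative with the highest deterministic utility, which is what "more predictable from the econometrician's perspective" means here.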

Table 3 The results of the G-MNL model

The variables used in the model include alternative specific constants associated with different protection programs (LAW, FEED, MOVE), dummy-coded levels of improvement of Hen Harriers \((HH_{1}, HH_{2})\) and Golden Eagles \((GE_{1}, GE_{2})\), and the continuously coded cost \((FEE)\). All preference parameters were assumed to be normally distributed (including the dis-utility of higher costs). We allowed all parameters to be study-specific (superscripts on variable names indicate the two different samples), except for cost \(\left( {FEE} \right) \), which was constrained to be equal in both studies for identification purposes.Footnote 20 The model allows for correlations between all random parameters within each study.Footnote 21 As a result, the estimated utility function was of the following structure:

$$\begin{aligned} U_{itj}&=\left[ \sigma _{i}\mathbf{b}+\gamma {\varvec{\upsilon }}_{i}+\left( 1-\gamma \right) \sigma _{i}{\varvec{\upsilon }}_{i}\right] ^{\prime }\mathbf{X}_{itj}+\varepsilon _{itj} \\ &\hbox {where:} \\ \sigma _{i}&=\exp \left( \bar{\sigma }+\exp \left( \lambda S_{i}\right) \tau \varepsilon _{0i}+\theta S_{i}\right) \\ \gamma&=\frac{\exp \left( \gamma ^{*}\right) }{1+\exp \left( \gamma ^{*}\right) } \\ \mathbf{X}&=\left[ \begin{array}{l} LAW^{1},FEED^{1},MOVE^{1},HH_{1}^{1},HH_{2}^{1},GE_{1}^{1},GE_{2}^{1}, \\ LAW^{2},FEED^{2},MOVE^{2},HH_{1}^{2},HH_{2}^{2},GE_{1}^{2},GE_{2}^{2},FEE \\ \end{array} \right] \end{aligned}$$
(5)

and \(S\) is a binary variable associated with one of the studies. The estimated parameters are the mean tastes matrix \(\mathbf{b}\) along with the elements of the Cholesky decomposition of their variance-covariance matrix \({\varvec{\Sigma }}\), and the individual-scale-specific parameters \(\tau ,\, \lambda ,\, \theta \).Footnote 22 The estimation was performed in Matlab 8.1 using 1,000 shuffled Halton draws (Sándor and Train 2004). Since the log-likelihood function in the case of G-MNL is not necessarily globally concave, we used multiple starting points to ensure convergence at the global maximum. Standard errors of coefficients associated with standard deviations of random parameters were simulated using \(10^{6}\) draws (Krinsky and Robb 1986).
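As a minimal sketch of how the components of Eq. (5) combine for one respondent, the following code builds an individual scale and the implied individual taste vector. It assumes \(\varepsilon _{0i}\) and the taste deviations are built from standard normal draws, and all parameter values (including \(\bar{\sigma }=0\)) are hypothetical; the function names are introduced here for illustration only.

```python
import math
import random


def individual_scale(sigma_bar, tau, lam, theta, study, eps0):
    """sigma_i = exp(sigma_bar + exp(lam * S_i) * tau * eps0 + theta * S_i)."""
    return math.exp(sigma_bar + math.exp(lam * study) * tau * eps0 + theta * study)


def individual_coefficients(b, chol, sigma_i, gamma_star, z):
    """beta_i = sigma_i * b + gamma * v_i + (1 - gamma) * sigma_i * v_i."""
    gamma = math.exp(gamma_star) / (1.0 + math.exp(gamma_star))
    # Correlated taste deviations v_i from a lower-triangular Cholesky factor.
    v = [sum(chol[r][c] * z[c] for c in range(r + 1)) for r in range(len(b))]
    return [sigma_i * bk + gamma * vk + (1.0 - gamma) * sigma_i * vk
            for bk, vk in zip(b, v)]


rng = random.Random(0)
b = [0.8, -0.05]                   # hypothetical means: one attribute and cost
chol = [[0.5, 0.0], [0.01, 0.02]]  # hypothetical Cholesky factor of Sigma

eps0 = rng.gauss(0.0, 1.0)             # assumed standard normal scale shock
z = [rng.gauss(0.0, 1.0) for _ in b]   # standard normal draws for tastes

# Same respondent-level draws evaluated under S = 0 (study 1) and S = 1 (study 2).
sigma_study1 = individual_scale(0.0, 0.4, -0.5, 0.3, study=0, eps0=eps0)
sigma_study2 = individual_scale(0.0, 0.4, -0.5, 0.3, study=1, eps0=eps0)
beta_study1 = individual_coefficients(b, chol, sigma_study1, gamma_star=0.0, z=z)
```

In this parameterization \(\theta \) shifts the mean of log-scale for study 2, while \(\exp (\lambda )\) rescales its standard deviation from \(\tau \) to \(\tau \exp (\lambda )\), so a negative \(\lambda \) implies lower scale variance in study 2.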

We start by noting that, overall, all attribute parameters in Table 3 are highly significant and of the expected sign. The statistical significance of the coefficients associated with the standard deviations of normally distributed parameters indicates that there is substantial unobserved preference heterogeneity with respect to all model parameters. The alternative specific constants associated with each protection program (LAW, FEED, MOVE) were relatively high. Coefficients associated with improvements in Hen Harrier \((HH)\) and Golden Eagle \((GE)\) populations indicate that overall respondents were more concerned with the latter than the former. The differences in preference parameters and WTP for different improvement levels are not linear, and could be interpreted as an asymmetric loss aversion effect (avoiding a 20 % loss is more important than a 20 % gain) or alternatively as being due to sharply declining marginal WTP in some cases. In addition, the respondents in the second sample revealed a higher degree of preference heterogeneity with respect to protecting these bird species, as indicated by higher estimates of standard deviations associated with these parameters in study 2 in comparison with study 1. Comparing the relative importance of different attribute levels is not straightforward, however, as we allowed for correlations between the attributes. Because the random parameter associated with an attribute could be positively or negatively correlated with the cost, the implicit prices associated with these attributes do not necessarily reflect coefficients for the means associated with each attribute. We investigate this further below when we present simulated implicit prices of the attribute levels for both studies.

Comparing the different approaches, modelling unobserved heterogeneity of scale (the change from the RPL to the conventional G-MNL model) allows for an improvement in model fit, as indicated by the highly significant increase in the value of the simulated LL function, as measured by the LR test (p value \(<\)0.0001). The scale variance coefficient \(\tau \) is significantly different from zero, which indicates the presence of significant unobserved scale heterogeneity in the sample: respondents differed significantly in how deterministic or how random their choices appeared from the modeler’s perspective. The third and most flexible extended G-MNL estimator additionally allows for the variance of individual scale parameters to differ between the two studies. By introducing an additional component \(\lambda \) in the scale variance \(\tau \) of the observations from study 2, we were able to allow not only the mean scale, but also its variance, to differ between the two datasets. This also proved to be a statistically significant improvement (p value resulting from the LR test \(<\)0.0001). The significant negative coefficient on \(\lambda \) indicates that the respondents of study 2 displayed lower scale variance than the respondents of study 1, despite having a higher mean scale.
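A likelihood ratio comparison of nested models of this kind can be sketched as follows. The log-likelihood values are hypothetical placeholders rather than the figures from Table 3, and the sketch only covers one or two additional parameters, for which the chi-square tail probability has a closed form.

```python
import math


def lr_test(ll_restricted, ll_unrestricted, df):
    """Return the LR statistic and its chi-square p-value (closed forms for df = 1 or 2)."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    if stat < 0:
        raise ValueError("the unrestricted model should not fit worse")
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise NotImplementedError("only df = 1 or 2 are covered in this sketch")
    return stat, p_value


# Hypothetical log-likelihoods for a restricted and a more flexible model.
stat, p_value = lr_test(ll_restricted=-3050.0, ll_unrestricted=-3035.0, df=1)
```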

Our results show that a more flexible estimator, one that allows both random scale and its variance to differ between the two datasets, improves model fit according to the AIC, BIC and pseudo \(\hbox {R}^{2}\) metrics. This helps the econometrician avoid potential bias resulting from misspecification, to which nonlinear models are particularly vulnerable (Greene 2011). In addition, it permits a useful comparison of how well our model predicts the choices of the respondents in studies 1 and 2. In our case, the respondents’ choices in the second study appeared less random from the econometrician’s perspective. At the same time, respondents in the second study exhibited less variation in the econometrician’s ability to predict their preferences (i.e., lower scale variance across subjects). Given that sampling was random, we attribute these differences in scale to changes in the information provided to respondents.Footnote 23
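For reference, the three fit metrics can be computed from a model's log-likelihood as in the sketch below. It assumes the textbook definitions (unnormalized AIC and BIC, and McFadden's pseudo \(\hbox {R}^{2}\)), which may differ from the exact variants reported in Table 3; the input values are hypothetical.

```python
import math


def fit_metrics(ll_model, ll_null, n_params, n_obs):
    """Return AIC, BIC and McFadden's pseudo R-squared for a fitted choice model."""
    aic = 2.0 * n_params - 2.0 * ll_model
    bic = n_params * math.log(n_obs) - 2.0 * ll_model
    pseudo_r2 = 1.0 - ll_model / ll_null
    return {"aic": aic, "bic": bic, "pseudo_r2": pseudo_r2}


# Hypothetical example: 3,000 choices among 4 alternatives, so the null
# log-likelihood corresponds to equal choice probabilities of 1/4.
n_choices = 3000
ll_null = n_choices * math.log(0.25)
metrics = fit_metrics(ll_model=-3035.0, ll_null=ll_null, n_params=40, n_obs=n_choices)
```

Lower AIC and BIC and a higher pseudo \(\hbox {R}^{2}\) indicate better fit; BIC penalizes additional parameters more heavily than AIC once the sample is moderately large.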

5.1 Implicit Prices

Finally, we turn to the analysis of respondents’ welfare measures associated with the changes in each attribute level. The WTP for particular attribute levels are calculated as marginal rates of substitution between respective utility function components (a public good attribute for the monetary attribute) and so they allow an additional insight into respondents’ preferences.
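For a utility function that is linear in the attributes and in cost, this marginal rate of substitution takes the familiar form below. The symbols \(\beta _{ik}\) and \(\beta _{i,FEE}\), denoting respondent \(i\)'s coefficients on attribute level \(k\) and on cost, are notation introduced here for illustration:

$$\begin{aligned} WTP_{ik}=-\frac{\partial U_{itj}/\partial X_{k}}{\partial U_{itj}/\partial FEE}=-\frac{\beta _{ik}}{\beta _{i,FEE}} \end{aligned}$$

Because the individual scale \(\sigma _{i}\) multiplies the mean of both the numerator and the denominator, differences in scale alone do not mechanically shift this ratio.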

Since we impose that all parameters in our model (including cost) are normally distributed, the resulting ratio distribution of WTP does not have finite moments, i.e. it does not have a well-defined mean or standard deviation (Fieller 1932; Meijer and Rouwendal 2006; Carson and Czajkowski 2013). We therefore employed the following procedure to simulate the median of each distribution of WTP:

  1. We took \(n=10^{5}\) draws from the multivariate normal distribution described by a vector of estimated parameters and the asymptotic variance-covariance matrix;

  2. For each of the \(n\) draws we decomposed the resulting vector of parameters into a vector of means and standard deviations of random parameters associated with 15 choice attributes;

  3. For each of the \(n\) draws we took \(m=10^{5}\) draws from the multivariate normal distribution described by the means and variances of the parameters drawn in step 2;

  4. For each of the \(n\cdot m\) draws we calculated implicit prices of the attribute levels, and calculated their medians, standard deviations and 95 % quantile ranges.
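The procedure above can be sketched in simplified form as follows. The sketch makes several assumptions that differ from the full procedure: it handles a single attribute and cost, treats all parameters as uncorrelated (a diagonal variance-covariance matrix), uses far fewer draws, and summarizes the medians across outer draws, which is one possible reading of step 4. All numerical values are hypothetical.

```python
import random
import statistics


def simulate_wtp_median(est, se, n=200, m=500, seed=0):
    """Simulate the median WTP and a 95 % quantile range.

    est, se: dicts with keys mean_attr, sd_attr, mean_cost, sd_cost holding
    hypothetical point estimates and their standard errors.
    """
    rng = random.Random(seed)
    medians = []
    for _ in range(n):
        # Steps 1-2: draw the parameters from their sampling distribution.
        p = {k: rng.gauss(est[k], se[k]) for k in est}
        wtp = []
        for _ in range(m):
            # Step 3: draw individual tastes from the implied normal distributions.
            b_attr = rng.gauss(p["mean_attr"], abs(p["sd_attr"]))
            b_cost = rng.gauss(p["mean_cost"], abs(p["sd_cost"]))
            # Step 4: implicit price as the negative ratio of the two coefficients.
            wtp.append(-b_attr / b_cost)
        medians.append(statistics.median(wtp))
    medians.sort()
    lower = medians[int(0.025 * (n - 1))]
    upper = medians[int(0.975 * (n - 1))]
    return statistics.median(medians), (lower, upper)


estimates = {"mean_attr": 1.0, "sd_attr": 0.8, "mean_cost": -0.05, "sd_cost": 0.02}
std_errors = {"mean_attr": 0.1, "sd_attr": 0.1, "mean_cost": 0.005, "sd_cost": 0.005}
median_wtp, (q_low, q_high) = simulate_wtp_median(estimates, std_errors)
```

Working with medians and quantiles rather than means is what keeps the summary well defined when the cost coefficient is normally distributed and can be arbitrarily close to zero.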

The results are presented in Table 4. From a policy perspective, the interesting questions relate to the willingness to pay estimates for each management alternative (LAW, FEED, MOVE) and for changes in the two raptor populations, as determined by (1) the modelling strategy and (2) the information provided to respondents. Regarding the former, observing sometimes substantial differences in WTP depending on which modelling approach was used is not uncommon (e.g., Czajkowski et al. 2014a). In our case, the largest differences come from accounting for unobserved preference heterogeneity (i.e. the change from the MNL to any of the models with random taste parameters), particularly for the alternative specific constants (ASC) as represented by LAW, FEED or MOVE. The MNL model seems to suggest no status quo effect in some cases and a positive status quo effect in others, while the other specifications are consistent with an anti-status quo effect: all of the ways of implementing the new plan are preferred to doing nothing, but there is no differentiation on average in terms of preferring one approach. These differences can be explained by the presence of substantial preference heterogeneity with respect to different protection programs, as can be seen from the ratios of means of the ASCs to their variances in the RPL or G-MNL model (Table 3). As a result, the point estimates of implicit prices of LAW, FEED and MOVE in the MNL model are not significantly different from 0 or are much different from the medians of the distributions of WTP allowing for unobserved preference heterogeneity.Footnote 24 The change in WTP when moving from the RPL to the G-MNL model is much less evident, with a modest (although not statistically significant) increase in medians and decrease in confidence intervals of the simulated WTPs. This is the result of allowing for random scale, which increases the model fit and reduces the standard errors associated with model parameters. Accounting for observed (information set specific) scale variance does not seem to cause significant changes in WTPs, or their confidence intervals, indicating that at least in the case of our dataset, the bias resulting from failing to account for the effect of information on scale variance may be small.

Table 4 Simulated median implicit prices of the attribute levels (GB Pounds per household per year)

The approach advanced in this paper can also determine whether providing an alternative information set shifts utility function taste parameters, scale parameters or both and how they can be reflected in respondents’ WTP. We start by noting that although there were noticeable changes in the WTPs between information treatments, these differences were generally not statistically significant at the 95 % level. Overall, we found that the respondents did not attach importance to how increases in raptor populations were achieved. This finding holds across information treatments and across econometric methods.Footnote 25

The absolute values of WTP for each management alternative were substantially different comparing information pack one with information pack two in the MNL and the RPL model. However, this difference does not emerge from either of the G-MNL models. This could be an indication that, at least in the case of our dataset, allowing for unobserved scale heterogeneity (while unobserved taste heterogeneity is also accounted for) controls for a substantial part of the effect of different information sets. Allowing for observed scale heterogeneity provided only a small improvement in this respect.

With regard to changes in populations for either Hen Harriers or Golden Eagles, information pack two (study 2) resulted in smaller absolute WTP values for both species in all econometric treatments relative to information pack one, although these differences are not significant at the 95 % level. As expected, higher implicit prices are observed for a 20 % population increase in either species relative to stabilizing populations at current levels. Comparing WTP for a 20 % increase in either Hen Harriers or Golden Eagles across models for a given information set shows changes in absolute values (e.g., from 10.24 GBP for Hen Harriers in the RPL model, to 18.17 GBP in the conventional G-MNL model, to 17.78 GBP in our modified G-MNL model), but again, these differences are not significant. Finally, in our dataset we find that our modified G-MNL model leads to only modest changes, both increases and decreases, in the standard error of estimated WTP for attributes.Footnote 26

Overall, the differences in WTP resulting from information treatments may be difficult to observe because (1) there is large uncertainty and standard deviation associated with all of the estimated WTPs and (2) information provided in the questionnaire can be contested by some respondents. In general, however, our approach shows how one can investigate whether changes in an information set shift not only the location parameters, but also the scale parameter, and how these changes are reflected in respondents’ preference parameters (discussed above) and WTP. Future research is needed to identify precisely when the effect of information manifests in the location versus the scale parameters. Structural models such as ours may be best suited for field experiments embedded in surveys and/or simulation studies aimed at addressing this question.

6 Discussion and Conclusions

For many years, researchers have been interested in the effects of information provided to respondents on their stated values. In a random utility model, changes in information can affect both the relative magnitudes of the estimated deterministic and estimated random components of utility, something which has not so far been highlighted in stated preference work. This paper demonstrates an econometric approach which allows both preference and scale variance heterogeneity to be included, which we argue to be useful for considering how information provided in a CV study can influence the econometrician’s ability to predict subjects’ choices. The method allows for heterogeneity in preferences across and within information treatments, as well as variations in relative scale within and between treatments. This may be of particular interest to researchers who wish to investigate the effects of presenting different types or varying amounts of information to respondents, for example to reflect conflicting views or uncertainty over the non-market impacts of a project. Whilst we illustrate the method in the context of a public good, it would be applicable to the use of stated preference methods for measuring demand for private goods where different consumers have access to differing levels of information.

Results show that the estimated random element of choice varied across the two information treatments, both in terms of its mean magnitude and its variance. Respondents given a more complete and positively framed information set displayed higher mean relative scale and lower scale variance (i.e., more predictable choices from the econometrician’s perspective). This is in addition to within-treatment unobservable variations in the random component of utility. We also find considerable evidence of preference variation in the deterministic component of choices, but small differences in willingness to pay between treatments. Our new G-MNL estimation offered a significant improvement in fit over either a random parameters logit or a standard G-MNL model. Moreover, there were significant differences in parameter estimates between our preferred G-MNL model and the other two models.

None of the above addresses some key questions for information provision in stated preferences. These include how well-informed preferences should be before policy-makers and regulators rely on them for making decisions using cost-benefit analysis, and how much information should be provided to respondents (MacMillan et al. 2006). Moreover, information is often contested, a very relevant example being disagreements within the stakeholder community over the effects of grouse moor management on biodiversity and its contribution to local economic activity, which partly motivated the design of this choice experiment (Thirgood and Redpath 2008). We do not have a quantitative measure of how much information respondents received or assimilated in each treatment, and so are unable to relate this to the precision of willingness to pay estimates (and thus their credibility with decision-makers).

Finally, we note that the paper suggests some fruitful avenues for follow-up work. These include an investigation of the effects of measures of prior familiarity with an environmental good on scale heterogeneity, and a treatment which measures awareness of the characteristics of a good before and after the provision of new information, then relates this to preference and scale heterogeneity. Individuals’ responses to complexity in information sets could be related to observables such as education and age. There are also interesting questions relating to the interplay between information provision, learning and consequentiality. Structural models, field experiments embedded in surveys and/or simulation may be best suited to address these questions.