
Environmental and Resource Economics, Volume 69, Issue 2, pp 365–393

Single-Choice, Repeated-Choice, and Best-Worst Scaling Elicitation Formats: Do Results Differ and by How Much?

  • Daniel R. Petrolia
  • Matthew G. Interis
  • Joonghyun Hwang

Abstract

This paper presents what we believe to be the most comprehensive suite of comparison criteria regarding multinomial discrete-choice experiment elicitation formats to date. We administer a choice experiment focused on ecosystem-service valuation to three independent samples: single-choice, repeated-choice, and best-worst scaling elicitation. We test whether results differ by parameter estimates, scale factors, preference heterogeneity, status-quo effects, attribute non-attendance, and magnitude and precision of welfare measures. Overall, we find limited evidence of differences in attribute parameter estimates, scale factors, and attribute increment values across elicitation treatments. However, we find significant differences in status-quo effects across elicitation treatments, with repeated-choice resulting in greater proportions of “action” votes, and consequently, higher program-level welfare estimates. Also, we find that single-choice yields drastically less-precise welfare estimates. Finally, we find some differences in attribute non-attendance behavior across elicitation formats, although there appears to be little consistency in class shares even within a given elicitation treatment.

Keywords

Best-worst scaling · Choice experiment · Contingent valuation · Ecosystem-service valuation · Stated preference · Survey · Willingness to pay

1 Introduction

Several different formats of preference questions have been used in discrete choice experiments.1 The most basic format is a single choice task in which a respondent chooses between two alternatives, as proposed by the NOAA Blue Ribbon panel (Arrow et al. 1993) in response to the contingent-valuation debate following the Exxon Valdez disaster. Hanemann (1985) and Carson (1985) proposed the double-bound binary-choice format, in which respondents were asked a follow-up question that proposed a higher or lower price for the good or program depending on the initial response. In recent years, the valuation literature has shifted toward the multinomial-choice format, which developed in the marketing literature.2 The multinomial-choice format presents respondents with a choice task with three or more alternatives from which to choose, and instead of only price varying across respondents, multiple attributes vary across both alternatives and respondents. Usually, respondents are asked to evaluate multiple choice tasks, in spite of the fact that numerous behavioral anomalies have been observed under repeated-choice formats, including status-quo effects and strategic behavior, that would seem to at least call into question the use of repeated-choice in certain situations.3 Perhaps in an effort to avoid the pitfalls of the repeated-choice format, at least three field survey papers that we are aware of use a single choice task only: List et al. (2006), Newell and Swallow (2013) and Petrolia et al. (2014).

Finally, the best-worst scaling (BWS) format has also emerged in the past few years as an alternative to the above formats (see Louviere et al. 2015; Flynn and Marley 2014; Flynn et al. 2007; Marley and Louviere 2005; Potoglou et al. 2011; Scarpa et al. 2011). This format asks respondents to indicate the “best” alternative among a set and then to indicate the “worst” alternative, and then, of the remaining alternatives, to indicate the “best” of those remaining, then the “worst”, etc., until a full ranking is achieved.

In this paper, we employ what we believe is the most comprehensive suite of comparison criteria for elicitation formats to date in the literature. Specifically, we test whether results differ by parameter estimates, scale factors, preference heterogeneity, status-quo effects, attribute non-attendance, and magnitude and precision of welfare measures. Analysis is conducted using a specification of the random-parameters logit model that accounts for scale differences across both alternatives (i.e., relaxing the Independence from Irrelevant Alternatives assumption) and elicitation formats, and that implements Carson and Czajkowski’s (2013) reparameterization of the coefficient on (the negative of) price to enforce a theoretically correct positive coefficient.4 Attribute non-attendance comparisons are made using a variant of Hensher et al.’s (2012) “\(2^{\mathrm{K}}\)” model. Tests of equality of parameter vectors and scale factors across elicitation formats follow a variation on the method of Swait and Louviere (1993) and Blamey et al. (2002).

The use of so many different preference question formats in the literature reveals the lack of consensus regarding the best format to be used. For example, the single binary-choice format proposed by the Blue Ribbon panel can be made incentive compatible (Carson and Groves 2007; Vossler et al. 2012). But an understandable temptation among practitioners is to collect more information from each respondent, with the hope that doing so will save money or increase the reliability of estimates. Unfortunately for practitioners, however, deviating from the single binary-choice format often introduces behavioral anomalies. For example, respondents reacted to the follow-up question of the double-bound format in ways that researchers did not intend, which cast doubt on the legitimacy of those responses. Consequently, the double-bound format was largely abandoned. Similarly, multinomial-choice formats gather more information per respondent than a single binary-choice question; however, they can be made incentive-compatible only under extremely restrictive conditions (see Carson and Groves 2007 for a discussion of these conditions). And like the double-bound binary-choice format, the repeated multinomial-choice format has been found to yield unexpected behavioral anomalies.5 McNair et al. (2012) find that relatively few respondents answer consistently with traditional assumptions of truthful, independent responses with stable preferences. Day et al. (2012) find evidence of position-dependent order effects. McNair et al. (2011) find no significant difference between responses to a single binary-choice question and the first of a repeated binary-choice question sequence, but find differences between the former and subsequent responses in the repeated sequence. Silz-Carson et al. (2010) find a greater proportion of status-quo responses under the repeated multinomial-choice format relative to one-shot binary-choice. Bateman et al. (2004) find evidence of order effects on sensitivity to scope. Day and Prades (2010) find that the probability of a particular alternative being chosen changes significantly under certain price and commodity sequences.

The argument is made that choosing “bests” and “worsts” in the BWS format is a relatively easy task for respondents, and that this cognitive ease yields more accurate preference information compared with other formats, such as a direct ranking of alternatives, where respondents can quickly become overwhelmed when there are more than a few alternatives.6 The BWS format also yields more information per choice set compared to a multinomial-choice format because a full ranking is achieved. Although the literature on this format is relatively young, early evidence indicates that it may have its own challenges. For example, Rigby et al. (2015) find significant differences in error variance between “best” and “worst” choices.

As applied researchers seek to collect information which can be used in policy and other decisions, it is important to understand the tradeoffs between cost-effectiveness, estimate reliability, and estimate validity among different question formats. In this paper, we empirically examine differences between three closely-related preference question formats used in the literature: the single multinomial-choice (SMC) format, the repeated multinomial-choice (RMC) format, and the best-worst scaling (BWS) format. If respondents behave similarly under the RMC format as they do under the SMC format, researchers can feel free to ask multiple preference questions and thereby gain more information per respondent and save money in the process. Likewise, if respondents behave similarly under the BWS format as they do under the SMC format, researchers can feel free to elicit a full preference ranking and thereby increase the amount of information collected per response even further.7 Given the comparisons of multinomial-choice or repeated-choice question formats to the incentive-compatible single binary-choice format extant in the literature, our focus here is on whether inferences (e.g., parameter direction and significance) and post-estimation measures (e.g., willingness-to-pay) are robust to the question format used.

To our knowledge, no study has compared these three elicitation formats directly. Scheufele and Bennett (2012), which compares single and repeated choice formats, is the closest to our study, but focuses only on the binary-choice question format, the repeated version of which is not typical of choice experiments. Three other papers somewhat related to our study are Bateman et al. (2004), Day et al. (2012), and Beaumais et al. (2015). Bateman et al. (2004) focuses on the repeated binary-choice format with increasing project scope over choice tasks. They do not directly compare single to repeated choice: rather, they compare repeated choice with varying disclosure formats. Day et al. (2012) also use a repeated binary-choice experiment with “better” or “worse” attribute sequences and they focus on disclosure (of information to respondents) formats. Beaumais et al. (2015) focus on respondents’ ability to fully rank a set of alternatives, and administered a survey that allowed respondents to state their actual ranking capability, which was then used to condition their econometric model. Given the widespread use of the RMC format, the growing use of BWS formats, and the limited use of the closely-related SMC format, we believe that a careful empirical examination of differences between these formats is warranted, just as there have been extensive examinations of differences between single- and repeated binary-choice questions and between binary- and multinomial-choice question formats.

Overall we find limited evidence of any differences in attribute parameter estimates among the three elicitation treatments. We also find little evidence of differences in attribute increment values across elicitation treatments. We do, however, find significant differences in status-quo effects across elicitation treatments, with the RMC treatment resulting in greater proportions of “action” votes, i.e., votes in favor of one of the non-status-quo alternatives, and consequently, higher program-level welfare estimates relative to the SMC and SBW formats. Also, we find that the SMC format yields drastically less precise welfare estimates compared with the other formats. We also find significant differences in attribute non-attendance behavior across elicitation formats, although there appears to be little consistency in class shares even within a given elicitation treatment.

2 Experimental Design and Data

The analysis utilizes data from a study on ecosystem service valuation for services delivered by two habitats, oyster reefs and salt marshes, along the Gulf of Mexico. Additional details not covered here, and other analyses, can be found in Interis and Petrolia (2016). The choice experiment focused on four specific ecosystem services: increased water quality, improved flood protection, increased commercial fisheries support, and increased wading bird population.8 The specific levels of each service provided, as well as the proposed bid levels, were expressed using the language reported in the right-hand column of Table 1. Given the above service attributes and levels, the choice experiment design was developed using Ngene software, in which 24 choice sets were created in order to maximize D-efficiency (see ChoiceMetrics 2011).9

Three question format treatments were designed: SMC, RMC (with four choice questions), and a single-question version of the BWS format, which we refer to here as “single best-worst” (SBW). For the SMC and SBW treatments, a respondent was randomly assigned to one of the 24 choice sets. For the RMC treatment, the 24 choice sets were grouped into 6 blocks of 4 choice sets each. Each respondent was randomly assigned to one of the 6 blocks. The order of presentation of the choice sets within each block was fixed.10
Table 1 Attributes, attribute levels, and descriptions

| Habitat construction program attribute | Levels |
|---|---|
| Increased water quality | (No, 10, or 20%) reduction in nitrogen and phosphorus |
| Improved flood protection | (5, 10, or 15%) increase in the number of homes protected |
| Increased commercial fisheries support | (10, 20, or 30%) increase in annual seafood catch |
| Increased wading bird population | (No, 5, or 10%) increase in wading bird population |
| Total one-time cost to your household | ($5, $10, $25, $50, $75, $100, $150, $200) |

Our SBW format is an application of “Case III” BWS (see Flynn and Marley 2014), in which there are three alternatives (as with the other formats in our study), and the “best” and “worst” of the three presented alternatives are elicited, thus yielding a full ranking.11 This ranking was then decomposed following the method of rank-ordered explosion proposed by Chapman and Staelin (1982), which, in our case, yields two choice observations for each choice question asked: a three-alternative observation (first-best case) and a two-alternative observation (second-best case).12 Thus in our design, where N is the number of choices observed and J is the number of choice questions a respondent faces, we observe, for the SMC treatment, a total of \(\hbox {N}_{\mathrm{SMC}}\) choices over a total of \(\hbox {N}_{\mathrm{SMC}}/\hbox {J}_{\mathrm{SMC}}\) respondents (where \(\hbox {J}_{\mathrm{SMC}}=1\)); for the RMC treatment, we observe \(\hbox {N}_{\mathrm{RMC}}\) choices over \(\hbox {N}_{\mathrm{RMC}}/\hbox {J}_{\mathrm{RMC}}\) respondents (where \(\hbox {J}_{\mathrm{RMC}}=4\)); and for the SBW treatment, we observe \(\hbox {N}_{\mathrm{SBW}}\) choices over \(\hbox {N}_{\mathrm{SBW}}/\hbox {J}_{\mathrm{SBW}}\) respondents (where \(\hbox {J}_{\mathrm{SBW}}=2\)). Specifically, we set a target of 500 choice observations for each treatment. Thus, for SMC, we had a target of 500 respondents who evaluated one choice set each; for RMC, a target of 500 / 4 = 125 respondents that evaluated 4 choice sets each; and for SBW, a target of 500 / 2 = 250 respondents that evaluated one choice set each but provided two responses (first-best and second-best). See Fig. 1 for an example choice set.
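
As an illustration of this explosion step, the following minimal sketch (in Python; the estimation itself was done in NLOGIT) shows how a single best-worst response over three alternatives is decomposed into the two pseudo-observations just described. The alternative labels are hypothetical.

```python
# Illustrative sketch only (not the estimation code): exploding one "Case III"
# best-worst response over three alternatives into two choice observations,
# following the rank-ordered explosion of Chapman and Staelin (1982).

def explode_best_worst(alternatives, best, worst):
    """Return the two exploded choice observations implied by a full ranking."""
    # First-best case: all three alternatives available, 'best' is chosen.
    first_best = {"choice_set": list(alternatives), "chosen": best}
    # Second-best case: drop the best; of the two remaining alternatives,
    # the one that is not 'worst' is implicitly chosen.
    remaining = [a for a in alternatives if a != best]
    second_best = {"choice_set": remaining,
                   "chosen": next(a for a in remaining if a != worst)}
    return [first_best, second_best]

# Hypothetical respondent: picks Program B as best, the status quo as worst.
print(explode_best_worst(["Program A", "Program B", "No action"],
                         best="Program B", worst="No action"))
```

With more than three alternatives, the same logic is applied repeatedly to the remaining alternatives until the full ranking is exhausted.
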
Fig. 1 Example choice question

The payment mechanism specified was a one-time payment collected on the respondent’s state tax return filed the following year. It was stipulated that the tax revenue would partially cover the cost of an implemented program with the remainder of funds coming from existing tax dollars.13 It was explained that construction would commence the following year and take five years to complete. It was stated that the expected benefits—the provided ecosystem services—were expected to last 30 years after completion.

To increase the perception that their responses would be meaningful in the sense that they could actually influence future policy (Carson and Groves 2007), respondents were told at the beginning of the survey that a large number of taxpayers would be taking the survey and that their responses would be shared with policy-makers and could affect how much they pay in taxes in the future. Respondents were then given some information about their assigned habitat including an explanation of some of the ecosystem services it provides. Then it was explained that policy-makers were considering implementing a habitat construction program and details were given about how such a program would be implemented, including how many acres of habitat would be created and when, where, and by whom they would be created. Respondents were shown maps of candidate locations within each water body of where habitat could potentially be constructed, and where existing habitat is located already.

The survey was administered by GfK Custom Research. In April 2013, an initial pretest of the survey was administered to 25 respondents to make sure the online survey was working properly and to elicit open-ended feedback about respondent understanding and ease of completion. The final survey was administered in May and June 2013. Respondents were randomly assigned to some combination of habitat and elicitation format.

The first sample comes from a state-level survey administered to Louisiana households regarding a hypothetical restoration of oyster reefs in Barataria-Terrebonne Bay, Louisiana. The SMC treatment has 494 respondents providing 494 choice observations; the RMC treatment has 145 respondents providing 579 choice observations; and the SBW treatment has 226 respondents providing 452 choice observations. The second sample comes from a second state-level survey administered to Louisiana households, but focuses on a salt marsh habitat restoration. The SMC treatment has 518 respondents providing 518 choice observations; and the RMC treatment has 134 respondents providing 536 choice observations; this sample did not include an SBW treatment. The third sample comes from a Gulf of Mexico regional survey administered to households across the five Gulf states that valued ecosystem services derived from a multi-state oyster-reef restoration project. The SMC treatment has 459 respondents providing 459 choice observations; the RMC treatment has 117 respondents providing 467 choice observations; and the SBW treatment has 237 respondents providing 473 choice observations.

Attitudinal and demographic indicators were compared across elicitation treatments. Pearson Chi-square tests were used to test for significant differences in categorical variables across elicitation treatments for each sample, and t-tests were used for age, household size, and income category. With very few exceptions—noted by asterisks in the tables (See tables A1–A3 of the Appendix in ESM)—we found no significant differences in the indicators across treatments, evidence that the independent treatment samples are statistically similar in terms of attitudes and demographics.
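
As a point of reference, a minimal sketch of these balance checks, assuming the data for two treatments sit in pandas DataFrames with hypothetical column names, is:

```python
# Illustrative sketch of the treatment-balance tests described above; the
# DataFrame and column names are hypothetical, not those of the actual survey.
import pandas as pd
from scipy import stats

def balance_tests(df_a, df_b, categorical, continuous):
    """Pearson chi-square tests for categorical indicators, t-tests otherwise."""
    results = {}
    labels = ["A"] * len(df_a) + ["B"] * len(df_b)
    for col in categorical:
        combined = pd.concat([df_a[col], df_b[col]], ignore_index=True)
        table = pd.crosstab(combined, pd.Series(labels))
        chi2, p, _, _ = stats.chi2_contingency(table)
        results[col] = ("chi-square", chi2, p)
    for col in continuous:
        t, p = stats.ttest_ind(df_a[col].dropna(), df_b[col].dropna())
        results[col] = ("t-test", t, p)
    return results

# e.g., balance_tests(smc, rmc, categorical=["gender", "education"],
#                     continuous=["age", "hh_size", "income_cat"])
```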

3 Econometric Model Specification

The general empirical specifications of utility for the three elicitation-format models are, respectively:
$$\begin{aligned} U_{nj}^{{\textit{SMC}}}= & {} \left( \alpha ^{{\textit{SMC}}}\cdot \textit{action}_j+{{\varvec{\upbeta }}}^{{\textit{SMC}}'}\mathbf{x}_{nj}\right) +\left( \mu _n^{{\textit{SMC}}}\cdot \textit{action}_j+{{\varvec{\upsigma }}}_n^{{\textit{SMC}}'}\mathbf{x}_{nj}+\varepsilon _{nj}^{{\textit{SMC}}}\right) \nonumber \\ U_{nj}^{{\textit{RMC}}}= & {} \left( \alpha ^{{\textit{RMC}}}\cdot \textit{action}_j+{{{\varvec{\upgamma }}}'{} \mathbf{z}} +{{\varvec{\upbeta }}}^{{\textit{RMC}}'}\mathbf{x}_{nj}\right) +\left( \mu _n^{{\textit{RMC}}}\cdot \textit{action}_{j} +{{\varvec{\upsigma }}}_n^{{\textit{RMC}}'}{} \mathbf{x}_{nj} +\varepsilon _{nj}^{{\textit{RMC}}}\right) \nonumber \\ U_{nj}^{{\textit{SBW}}}= & {} \left( \alpha ^{{\textit{SBW}}}\cdot \textit{action}_j+{{\varvec{\upbeta }}}^{{\textit{SBW}}'} \mathbf{x}_{nj}\right) +\left( \mu _n^{{\textit{SBW}}}\cdot \textit{action}_{j}+{{\varvec{\upsigma }}}_n^{{\textit{SBW}}'}\mathbf{x}_{nj}+\varepsilon _{nj}^{{\textit{SBW}}}\right) \end{aligned}$$
(1)
where, following Train’s (2009) notation, action is a binary indicator for whether alternative j is one of the proposed “action” scenarios (as opposed to the “no action” status-quo alternative), \(\mathbf{x}\) is a vector of ecosystem service attribute levels for alternative j presented to respondent \(n\), \(\mathbf{z}\) is a vector of binary indicators for each of the subsequent choice sets (the first choice set serves as the omitted base; relevant to the RMC treatment only); \(\alpha \) is a fixed coefficient associated with action, \({\varvec{\upbeta }}\) is a vector of fixed coefficients associated with the ecosystem service attributes, \({\varvec{\upgamma }}\) is a vector of fixed coefficients associated with the subsequent choice sets in the RMC treatment, \(\mu \) is a random term associated with action, \({{\varvec{\upsigma }}}\) is a vector of random terms associated with the ecosystem service attributes that captures preference heterogeneity, and \(\varepsilon \) is iid extreme value.14
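
To fix ideas, the following stylized simulation traces the SMC specification in (1): a fixed part \(\alpha \cdot \textit{action}+{{\varvec{\upbeta }}}'\mathbf{x}\), respondent-specific random components, and the resulting simulated logit choice probabilities. The parameter values are invented for the sketch and the price term is omitted for brevity.

```python
# Stylized simulation of the SMC utility in Eq. (1); all parameter values are
# invented for illustration and the price term is omitted.
import numpy as np

rng = np.random.default_rng(0)

alpha = 1.0                                    # fixed action coefficient
beta = np.array([0.35, 0.25, 0.30, 0.30])      # fixed attribute coefficients
sd_action = 1.5                                # std. dev. of random action term (mu)
sd_attr = np.array([0.5, 0.3, 0.2, 0.4])       # std. devs. of random attribute terms (sigma)

def simulated_choice_probs(X, action, n_draws=2000):
    """Average logit probabilities over draws of the random components."""
    probs = np.zeros(X.shape[0])
    for _ in range(n_draws):
        mu = rng.normal(0.0, sd_action)        # respondent-specific action term
        sigma = rng.normal(0.0, sd_attr)       # respondent-specific attribute terms
        v = alpha * action + X @ beta + mu * action + X @ sigma
        expv = np.exp(v - v.max())             # numerically stable logit
        probs += expv / expv.sum()
    return probs / n_draws

# Two action scenarios plus the status quo (all attributes at zero).
X = np.array([[1.0, 2.0, 0.5, 1.0], [2.0, 1.0, 1.0, 2.0], [0.0, 0.0, 0.0, 0.0]])
action = np.array([1.0, 1.0, 0.0])
print(simulated_choice_probs(X, action))
```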

3.1 Attribute Parameter Estimates and Scale Factors

Hypothesis tests presented here focus on testing the equivalence of particular subsets of the above parameters. We follow the approach of Swait and Louviere (1993) and Blamey et al. (2002), except that, instead of their grid-search approach, we use the random-parameters approach of controlling for scale differences as in Train (2009). Tests of attribute coefficient equivalence focus on testing the null hypothesis that, in the case of comparing SMC to RMC, \({{\varvec{\upbeta }}}^{{\textit{SMC}}}={{\varvec{\upbeta }}}^{{\textit{RMC}}} ={{\varvec{\upbeta }}}^{Pool}\) and \({{\varvec{\upsigma }}}^{{\textit{SMC}}}={{\varvec{\upsigma }}}^{{\textit{RMC}}} ={{\varvec{\upsigma }}}^{Pool}\). These hypotheses are referred to as H1A, following the notation of Swait and Louviere (1993). Each test of these hypotheses requires the construction of a constrained (i.e., “pooled”) model. Following our example of the case of testing SMC against RMC, we have:
$$\begin{aligned} U_{nj}^{Pool}= & {} \alpha ^{Pool}\cdot \textit{action}_j +\delta \cdot {\textit{SMC}}\cdot \textit{action}_j+{{\varvec{\upgamma }}}'{} \mathbf{z}_{j} +{{\varvec{\upbeta }}}^{Pool'}{} \mathbf{x}_{nj} \nonumber \\&+\,\lambda _n \cdot {\textit{RMC}}+\mu _n^{Pool} \cdot \textit{action}_j +{{\varvec{\upsigma }}}_n^{Pool'}{} \mathbf{x}_{nj} +\varepsilon _{nj}^{Pool} \end{aligned}$$
(2)
where \(\delta \) is a fixed coefficient on the interaction between SMC and action, and \(\lambda \) is a zero-mean (to prevent it from interfering with the action and repeated-choice question-order indicators; see Train 2009) random term associated with RMC observations. The former allows for status-quo effect differences, and the latter for scale differences, across elicitation types. The effect of including these two additional terms is to limit the model restrictions to equality of the attribute parameter vectors \({{\varvec{\upbeta }}}\) and \({{\varvec{\upsigma }}}\). If the null for H1A is rejected, then it is concluded that attribute parameter estimates are statistically different across elicitation formats. If it is not rejected, a second hypothesis test is constructed, which is Swait and Louviere’s hypothesis “H1B” that \(\lambda =0\), i.e., that there are no scale differences across elicitation types. If the null for H1B is rejected, then one concludes that scale differs, but attribute parameter estimates do not, across elicitation formats. If H1B is not rejected, then one cannot reject the null hypothesis of no differences in either parameters or scale across elicitation formats. We specify two different models to test equivalence of attribute parameter estimates. The first specification constrains the vector of random terms associated with the attribute vector to zero (\({{\varvec{\upsigma }}}=0\)) and amounts to an error-components logit model. The second specification allows for preference heterogeneity by estimating \({{\varvec{\upsigma }}}\) freely; we refer to this model as the “random-parameters logit”. These same models are also used to construct the two sets of welfare estimates used to compare welfare differences across elicitation formats.
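
The tests themselves are standard likelihood-ratio tests. As a minimal sketch, the fragment below reproduces the H1A and H1B statistics for the Louisiana–Oyster SMC versus RMC comparison from the log-likelihood values reported in Table 2; the same function, with one degree of freedom, applies to the status-quo-effect tests of Sect. 3.2.

```python
# Likelihood-ratio tests for H1A and H1B, using log-likelihoods from Table 2
# (Louisiana-Oyster, SMC vs RMC); small discrepancies are rounding.
from scipy.stats import chi2

def lr_test(ll_unrestricted, ll_restricted, df):
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    return stat, chi2.sf(stat, df)

ll_smc, ll_rmc = -430.85, -402.19           # separate models
ll_pooled_scaled = -836.83                  # pooled model with scale term
ll_pooled = -837.33                         # pooled model without scale term

stat_a, p_a = lr_test(ll_smc + ll_rmc, ll_pooled_scaled, df=5)   # approx. 7.57 in Table 2
stat_b, p_b = lr_test(ll_pooled_scaled, ll_pooled, df=1)         # approx. 0.99 in Table 2
print(stat_a, p_a, stat_b, p_b)
```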

3.2 Status-Quo Effects

Status-quo effects are inherent preferences for the status-quo, or no-action, alternative, and are generally indicated by a negative coefficient on the constant term for an action alternative (after controlling for all other attribute effects). Action effects, on the other hand, are the opposite of status-quo effects. Tests for differences in status-quo effects across elicitation formats rely on a modified specification of (2) above:
$$\begin{aligned} U_{nj}^{Pool}= & {} \alpha ^{Pool}\cdot \textit{action}_j +\delta \cdot {\textit{SMC}}\cdot \textit{action}_{j}+{{\varvec{\upgamma }}}'{} \mathbf{z}_{j} +{{\varvec{\upbeta }}}^{Pool'}{} \mathbf{x}_{nj} +{{\varvec{\uptau }}}^{\prime }{\textit{RMC}}\cdot \mathbf{x}_{nj} \nonumber \\&+\,\lambda _n \cdot {\textit{RMC}}+\mu _n^{Pool} \cdot \textit{action}_j +\varepsilon _{nj}^{Pool} \end{aligned}$$
(3)
where \({{\varvec{\uptau }}}\) is a vector of fixed coefficients on the interaction of RMC and \(\mathbf{x}\). This interaction allows for differences in attribute coefficients across elicitation type. In this case, the null hypothesis is that \(\delta =0\), i.e., that there is no significant difference in status-quo effects across elicitation types.15

3.3 Attribute Non-attendance

Attribute non-attendance has become an area of great interest in the choice modelling literature. Efforts to identify this and related behavior include eliciting attribute attendance behavior directly from respondents and then controlling for it during modeling (e.g., Alemu et al. 2013; Scarpa et al. 2010; Campbell et al. 2008). Other papers focus on uncovering attribute non-attendance via the modeling process, with recent literature focusing on the use of latent-class models. Within this modeling framework, there are a variety of specifications, including the “\(2^{\mathrm{K}}\) model” which specifies a distinct class for every possible combination of attribute attendance (Hensher et al. 2012), a simplified version of the “\(2^{\mathrm{K}}\)” that focuses on a subset of the possible classes (Hensher and Greene 2010; Campbell et al. 2011); preference parameters constrained within classes (aggregation of common metric attributes, see Hensher et al. 2013; Hensher and Greene 2010); preference parameters constrained across classes (Hensher et al. 2012); correlated non-attendance across attributes (Collins et al. 2013); and most recently, a random-parameters specification within the latent classes (Hess et al. 2013; Hensher et al. 2013). Both of these latter papers find that adding the random-parameters specification increases the probability of membership to the full attribute attendance class, i.e., reduces the assignment of respondents with merely weak preferences to the non-attendance class, although Hensher et al. (2013) find that the addition of the random-parameters component may add only marginal improvements in model fit and may serve as a confounding effect. Our survey of this literature indicates that the best model specification is ultimately an empirical question, and specific to the data at hand.

We estimated a variety of models based on the above-cited literature and found that a model specifying three attribute attendance classes worked best for our data: those that attended to all attributes, those that did not attend to the price attribute, and those that did not attend to the non-price attributes. Given the findings of Hess et al. (2013) and Hensher et al. (2013), we estimated these models with both fixed- and random-parameters specifications on the main attributes, but found the fixed-parameters specification to be sufficient.16

Because class shares enter the likelihood function as parameters to be optimized, we can use likelihood ratio tests to test for equivalence of class shares across elicitation types. Because our goal was to compare class shares across elicitation formats, not parameter estimates across classes, we follow the approach of Hensher et al. (2012) and constrain all non-zero parameters to be equal across classes.17 The log-likelihood functions for the individual elicitation-type models can be written as follows:
$$\begin{aligned} \ln L^{{\textit{SMC}}}= & {} \sum _{n=1}^N {\ln \left[ {\sum _{j=1}^3{p_j^{{\textit{SMC}}} f\left( U_{nj}^{{\textit{SMC}}}|class=j\right) }}\right] }\nonumber \\ \ln L^{{\textit{RMC}}}= & {} \sum _{n=1}^N {\ln \left[ {\sum _{j=1}^3{p_j^{{\textit{RMC}}} f\left( U_{nj}^{{\textit{RMC}}}|class=j\right) }}\right] }\nonumber \\ \ln L^{{\textit{SBW}}}= & {} \sum _{n=1}^N {\ln \left[ {\sum _{j=1}^3{p_j^{{\textit{SBW}}} f\left( U_{nj}^{{\textit{SBW}}}|class=j\right) }}\right] } \end{aligned}$$
(4)
where \(p^{{\textit{SMC}}}\), \(p^{{\textit{RMC}}}\), and \(p^{{\textit{SBW}}}\) are the latent attribute non-attendance class shares for the three elicitation-format models, respectively, and the U functions are defined as in (1) above.18 A constrained (i.e., pooled) model is constructed to carry out the test of the null, whose log-likelihood function is:
$$\begin{aligned} \ln L^{Pool}=\sum _{n=1}^N {\ln \left[ {\sum _{j=1}^3 {p_j^{Pool} f\left( U_{nj}^{Pool} |class=j\right) }}\right] } \end{aligned}$$
(5)
where \(U_{nj}^{Pool}\) is defined as in (3) above. As before, interaction terms are added to the pooled models to allow for differences in all other variables other than class shares. The null hypothesis in the case of comparing SMC to RMC is that \(p^{{\textit{SMC}}}=p^{{\textit{RMC}}}=p^{Pool}\).
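
For concreteness, a minimal sketch of the mixture likelihood in (4)–(5) is given below; the class-conditional probabilities are placeholders standing in for the class-specific logit models.

```python
# Sketch of the latent-class log-likelihood in Eqs. (4)-(5): a class-share
# weighted mixture over the three attendance classes. Values are placeholders.
import numpy as np

def latent_class_ll(class_shares, cond_probs):
    """
    class_shares: length-3 array of class probabilities p_j (summing to one).
    cond_probs:   (N, 3) array; entry [n, j] is the probability of respondent
                  n's observed choice(s) conditional on membership in class j.
    """
    mixture = cond_probs @ np.asarray(class_shares)   # sum_j p_j * f(U_n | class j)
    return float(np.log(mixture).sum())

shares = [0.6, 0.25, 0.15]                            # All AT, Price NAT, Non-price NAT
cond = np.array([[0.50, 0.40, 0.30],                  # two hypothetical respondents
                 [0.70, 0.20, 0.45]])
print(latent_class_ll(shares, cond))
```

The pooled-versus-separate comparison of these log-likelihoods is carried out with the same likelihood-ratio machinery sketched in Sect. 3.1.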

3.4 Welfare Estimates

We test whether different elicitation formats yield equivalent welfare estimates at two levels: individual attribute increment values and overall program values. Our null hypotheses are i) attribute increment values are equal across the elicitation formats; and ii) overall program willingness to pay values are equal across the elicitation formats. To test the hypotheses, means tests were conducted using the complete combinatorial approach of Poe et al. (2005), which involves subtracting each element of one simulated willingness to pay distribution from each element of the other simulated willingness to pay distribution and observing the proportion of observations that lie above or below zero. A two-sided test of equality is rejected at the 10, 5, or 1% level if twice the proportion of differences greater than or less than zero is less than 10, 5, or 1%, respectively.
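
A minimal sketch of this complete combinatorial test, applied to two hypothetical vectors of simulated willingness-to-pay draws, is:

```python
# Poe et al. (2005) complete combinatorial test: difference every draw of one
# simulated WTP distribution from every draw of the other and examine the
# share of differences on one side of zero. The draws below are hypothetical.
import numpy as np

def poe_test(wtp_a, wtp_b):
    """Two-sided p-value for H0: the two simulated WTP distributions have equal means."""
    diffs = np.subtract.outer(np.asarray(wtp_a), np.asarray(wtp_b)).ravel()
    one_sided = min((diffs > 0).mean(), (diffs < 0).mean())
    return 2.0 * one_sided        # reject at 10/5/1% if below 0.10/0.05/0.01

rng = np.random.default_rng(1)
wtp_smc = rng.normal(30, 8, size=1000)     # hypothetical simulated draws
wtp_rmc = rng.normal(27, 6, size=1000)
print(poe_test(wtp_smc, wtp_rmc))
```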

All models were estimated using NLOGIT 5.0, using either the “RPLOGIT” or “LCRPLOGIT” routines (Greene 2012). With the exception of price, we specify all random parameters as normally distributed. Although not explicitly shown in the above equations, all models, with the exception of the latent-class models, apply the adjustment to the price parameter suggested by Carson and Czajkowski (2013).19
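
To show how the price adjustment feeds into the simulated welfare measures, the sketch below draws parameter vectors from a hypothetical estimated distribution (the Krinsky and Robb procedure used for the intervals reported later) and exponentiates the log price coefficient before dividing. The point values loosely echo the Louisiana–Oyster SMC flood estimates in Tables 2 and 6, but the covariance matrix is invented, so the interval is purely illustrative.

```python
# Krinsky-and-Robb-style simulation with the Carson-Czajkowski (2013) price
# parameterization: the model estimates ln(beta) on (negative) price, so each
# draw is exponentiated before dividing, keeping the price coefficient
# positive. Point values echo Table 2 (flood, Louisiana-Oyster SMC); the
# covariance matrix is invented.
import numpy as np

rng = np.random.default_rng(42)

est = np.array([0.36, -4.46])                        # [flood coef, ln(beta(neg. price))]
vcov = np.diag([0.11, 0.13]) ** 2                    # illustrative, zero covariance

draws = rng.multivariate_normal(est, vcov, size=5000)
wtp_draws = draws[:, 0] / np.exp(draws[:, 1])        # beta_flood / beta_price

print(wtp_draws.mean(), np.percentile(wtp_draws, [2.5, 97.5]))
```
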
Table 2 Error-components logit regression results (entries are Coef (SE))

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| Action | 1.77 (1.63) | 6.80*** (1.67) | 1.01** (0.43) | 0.82 (1.18) | 4.32*** (1.21) | 1.29 (1.60) | 4.32*** (1.22) | 1.02** (0.42) |
| SD action | 3.26* (1.78) | 6.96*** (1.48) | 1.48*** (0.51) | 2.33* (1.35) | 5.15*** (1.01) | 4.09* (2.45) | 5.50*** (1.12) | 1.92*** (0.48) |
| Ln(β(Neg. price)) | −4.46*** (0.13) | −4.20*** (0.08) | −4.31*** (0.15) | −4.75*** (0.14) | −4.64*** (0.10) | −4.55*** (0.13) | −4.20*** (0.09) | −4.46*** (0.14) |
| RMC-Q2 | | −0.86 (0.75) | | | −1.23* (0.71) | | −0.41 (0.62) | |
| RMC-Q3 | | −2.07*** (0.74) | | | −2.17*** (0.63) | | −1.57** (0.70) | |
| RMC-Q4 | | −1.90** (0.80) | | | −1.33* (0.71) | | −1.34* (0.70) | |
| Flood | 0.36*** (0.11) | 0.40*** (0.09) | 0.30** (0.12) | 0.30*** (0.09) | 0.45*** (0.10) | 0.19* (0.11) | 0.18* (0.10) | −0.08 (0.13) |
| Fish | 0.33*** (0.09) | 0.22** (0.09) | 0.28*** (0.11) | 0.22*** (0.09) | 0.21*** (0.08) | 0.21** (0.10) | 0.15* (0.09) | 0.18* (0.10) |
| Bird | 0.37*** (0.08) | 0.19** (0.08) | 0.17 (0.11) | 0.45*** (0.08) | 0.39*** (0.08) | 0.34*** (0.09) | 0.27*** (0.10) | 0.41*** (0.11) |
| Water | 0.30*** (0.09) | 0.34*** (0.08) | 0.24** (0.11) | 0.59*** (0.09) | 0.31*** (0.08) | 0.60*** (0.11) | 0.53*** (0.11) | 0.58*** (0.13) |
| N (no. resp.) | 494 (494) | 579 (145) | 452 (226) | 518 (518) | 536 (134) | 459 (459) | 467 (117) | 473 (237) |
| LL | −430.9 | −402.2 | −314.2 | −443.4 | −407.1 | −443.0 | −343.7 | −345.4 |

Tests of parameter and scale equivalence:

| | LA–Oyster SMC vs RMC | LA–Oyster SMC vs SBW | LA–Oyster RMC vs SBW | LA–Salt Marsh SMC vs RMC | Gulf–Oyster SMC vs RMC | Gulf–Oyster SMC vs SBW | Gulf–Oyster RMC vs SBW |
|---|---|---|---|---|---|---|---|
| LL1 | −430.85 | −430.85 | −402.19 | −443.37 | −442.90 | −442.90 | −343.67 |
| LL2 | −402.19 | −314.20 | −314.20 | −407.06 | −343.67 | −345.44 | −345.44 |
| LL1 + LL2 | −833.05 | −745.06 | −716.40 | −850.43 | −786.57 | −788.35 | −689.12 |
| LL (pooled and scaled) | −836.83 | −747.46 | −717.09 | −853.73 | −790.17 | −790.68 | −691.84 |
| λ(A) (5 df) | 7.57 | 4.80 | 1.39 | 6.61 | 7.19 | 4.67 | 5.44 |
| Reject H1A? | No | No | No | No | No | No | No |
| LL (pooled) | −837.33 | −748.42 | −730.67 | −854.58 | −790.18 | −791.55 | −698.63 |
| λ(B) (1 df) | 0.99 | 1.92 | 27.14 | 1.69 | 0.02 | 1.74 | 13.58 |
| Reject H1B? | No | No | Yes*** | No | No | No | Yes*** |

LA = Louisiana; Gulf = Gulf of Mexico Region. *, **, *** Statistical difference at the 10, 5, and 1% level

4 Results

4.1 Attribute Parameter Estimates and Scale Factors

Table 2 contains the results of the individual error-components models and the associated likelihood-ratio tests. Although it is not possible to compare individual parameter estimates directly due to possible differences in scale, we can compare signs and significance across the individual models. The SMC results are consistent across the three samples, and indicate no evidence of status-quo effects (i.e., a non-significant action parameter), but the significance of the standard deviation of the action parameter indicates that there are random error components. Both price and non-price attributes are significant with expected signs. The RMC results are fairly consistent across samples, with some differences in significance of the choice-question indicators. In these models, the action parameter is highly significant and positive, indicating evidence of action effects, and the standard deviation on action is also highly significant. Additionally, the parameters for the second, third, and fourth choice questions are all significant and negative, with the exception of the second choice question term in the Louisiana–Oyster and Gulf of Mexico Region–Oyster samples. These indicate weaker action effects (or, said another way, a relatively greater tendency to choose the status quo) relative to the first choice question. Both price and non-price attributes are significant with expected signs. The SBW results are also fairly consistent across samples: both indicate significant action effects, with significant associated error components. The price parameter is highly significant and of expected sign in both cases. For non-price attributes, however, one of the four is not significant, though it is a different attribute that is not significant in the two samples. Those that are significant are of the expected sign.

We now move on to the tests of parameter equivalence. We find no instances of rejection of H1A, i.e., no evidence of differences in parameter estimates across pairs of elicitation types. Further, we find only two instances of rejection of H1B, i.e., differences in scale, and both of these occur when comparing RMC results with SBW results. Thus, based on these results, we find no evidence of parameter differences across elicitation format, and find scale differences to be limited to the case of RMC versus SBW.

Table 3 contains the results of the random-parameters models, which allow for the added dimension of comparing attribute preference heterogeneity across elicitation types. These results are more mixed. Evidence of preference heterogeneity differs across both elicitation types and samples, with only one instance of significant attribute preference heterogeneity across the three SMC models. The RMC models show somewhat more instances of preference heterogeneity, but with no clear pattern across samples. The SBW models also show limited evidence of preference heterogeneity. Turning to the likelihood ratio tests for these models, results are mostly consistent with those of the earlier error-components logit models. One exception is the comparison of SMC to RMC for the Louisiana–Oyster sample. In this instance, H1A is rejected, indicating significant differences in attribute parameter estimates. However, the rejection is marginal (significant only at the 10% level), and this finding does not hold up in the other two samples. As before, tests indicate significant differences in scale between RMC and SBW models only.
Table 3 Random-coefficients logit regression results (entries are Coef (SE))

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| Action | 0.94 (0.75) | 6.83*** (1.73) | 1.42* (0.79) | 0.87 (1.72) | 3.75*** (1.12) | 1.29 (1.61) | 4.31*** (1.37) | 0.89* (0.50) |
| SD action | 0.07 (12.09) | 6.98*** (1.53) | 0.50 (3.20) | 2.95 (2.50) | 4.63*** (1.14) | 4.10* (2.48) | 5.67*** (1.26) | 1.90*** (0.73) |
| Ln(β(Neg. price)) | −4.12*** (0.28) | −4.13*** (0.12) | −3.82*** (0.35) | −4.42*** (0.35) | −4.24*** (0.17) | −4.55*** (0.20) | −4.00*** (0.13) | −4.23*** (0.15) |
| RMC-Q2 | | −0.87 (0.78) | | | −1.31* (0.79) | | −0.30 (0.68) | |
| RMC-Q3 | | −2.13*** (0.77) | | | −2.26*** (0.71) | | −1.69** (0.78) | |
| RMC-Q4 | | −1.95** (0.83) | | | −1.27 (0.84) | | −1.32* (0.76) | |
| Flood | 0.45** (0.20) | 0.42*** (0.12) | 0.88** (0.37) | 0.43* (0.22) | 0.61*** (0.18) | 0.19* (0.11) | 0.18 (0.15) | −0.07 (0.18) |
| SD flood | 1.38*** (0.49) | 0.26 (0.32) | 1.68*** (0.64) | 0.50 (0.76) | 0.89*** (0.27) | 0.01 (9.32) | 0.58* (0.30) | 0.02 (2.58) |
| Fish | 0.45*** (0.15) | 0.24** (0.10) | 0.50* (0.27) | 0.35* (0.19) | 0.30* (0.16) | 0.21** (0.10) | 0.17 (0.14) | 0.34** (0.17) |
| SD fish | 0.33 (0.87) | 0.04 (1.34) | 0.73 (0.52) | 0.13 (2.05) | 0.75*** (0.26) | 0.02 (5.86) | 0.47 (0.29) | 0.59** (0.24) |
| Bird | 0.44*** (0.13) | 0.19** (0.10) | 0.21 (0.21) | 0.68** (0.27) | 0.59*** (0.17) | 0.34*** (0.10) | 0.30** (0.15) | 0.50*** (0.17) |
| SD bird | 0.06 (3.04) | 0.42** (0.21) | 0.03 (3.69) | 0.94 (0.71) | 0.74** (0.31) | 0.11 (1.76) | 0.57** (0.27) | 0.03 (4.24) |
| Water | 0.40** (0.16) | 0.38*** (0.10) | 0.47* (0.28) | 0.90** (0.35) | 0.47*** (0.16) | 0.60*** (0.14) | 0.73*** (0.19) | 0.81*** (0.22) |
| SD water | 0.43 (0.79) | 0.25 (0.33) | 0.94 (0.62) | 1.12 (0.80) | 0.79*** (0.28) | 0.02 (4.81) | 0.59* (0.33) | 0.94** (0.39) |
| N (no. resp.) | 494 (494) | 579 (145) | 452 (226) | 518 (518) | 536 (134) | 459 (459) | 467 (117) | 473 (237) |
| LL | −425.3 | −401.4 | −298.9 | −442.2 | −398.0 | −442.9 | −339.9 | −341.2 |

Tests of parameter and scale equivalence:

| | LA–Oyster SMC vs RMC | LA–Oyster SMC vs SBW | LA–Oyster RMC vs SBW | LA–Salt Marsh SMC vs RMC | Gulf–Oyster SMC vs RMC | Gulf–Oyster SMC vs SBW | Gulf–Oyster RMC vs SBW |
|---|---|---|---|---|---|---|---|
| LL1 | −425.31 | −425.31 | −401.45 | −442.20 | −442.87 | −442.87 | −339.89 |
| LL2 | −401.45 | −298.86 | −298.86 | −397.98 | −339.89 | −341.18 | −341.18 |
| LL1 + LL2 | −826.76 | −724.17 | −700.31 | −840.18 | −782.76 | −784.05 | −681.08 |
| LL (pooled and scaled) | −834.30 | −726.87 | −705.84 | −844.18 | −787.60 | −787.59 | −684.30 |
| λ(A) (9 df) | 15.07 | 5.41 | 11.07 | 8.00 | 9.68 | 7.07 | 6.46 |
| Reject H1A? | Yes* | No | No | No | No | No | No |
| LL (pooled) | | −725.94 | −720.94 | −844.70 | −787.61 | −788.50 | −688.78 |
| λ(B) (1 df) | | −1.86 | 30.20 | 1.04 | 0.02 | 1.83 | 8.94 |
| Reject H1B? | | No | Yes*** | No | No | No | Yes*** |

LA = Louisiana; Gulf = Gulf of Mexico Region. *, **, *** Statistical difference at the 10, 5, and 1% level

Table 4 Error-components logit results for pooled-model status-quo/action bias tests (entries are Coef (SE))

| | LA–Oyster SMC/RMC | LA–Oyster SMC/SBW | LA–Oyster RMC/SBW | LA–Salt Marsh SMC/RMC | Gulf–Oyster SMC/RMC | Gulf–Oyster SMC/SBW | Gulf–Oyster RMC/SBW |
|---|---|---|---|---|---|---|---|
| Action | 7.04*** (1.75) | 1.65 (1.55) | 6.81*** (1.68) | 4.23*** (1.17) | 4.25*** (1.16) | 1.24 (1.54) | 4.30*** (1.16) |
| SD action | 2.99** (1.50) | 1.50*** (0.46) | 1.54*** (0.45) | 2.72** (1.13) | 4.38*** (1.33) | 1.94*** (0.48) | 1.91*** (0.47) |
| **SMC × action** | −5.52*** (1.95) | | | −3.07** (1.39) | −2.77** (1.34) | | |
| **SBW × action** | | −0.64 (1.60) | −5.80*** (1.72) | | | −0.21 (1.59) | −3.29*** (1.22) |
| SD SMC | | 2.73 (2.03) | | | | 3.52 (2.62) | |
| SD RMC | 6.51*** (1.60) | | 6.72*** (1.38) | 4.24*** (1.21) | 3.28* (1.82) | | 5.18*** (1.26) |
| Neg. price | 0.01*** (0.00) | 0.01*** (0.00) | 0.02*** (0.00) | 0.01*** (0.00) | 0.01*** (0.00) | 0.01*** (0.00) | 0.01*** (0.00) |
| Neg. price × RMC | 0.00* (0.00) | | | 0.00 (0.00) | 0.00* (0.00) | | |
| Neg. price × SBW | | 0.00 (0.00) | 0.00 (0.00) | | | 0.00 (0.00) | 0.00 (0.00) |
| RMC-Q2 | −0.86 (0.71) | | −0.85 (0.70) | −1.23* (0.65) | −0.40 (0.64) | | −0.41 (0.64) |
| RMC-Q3 | −2.08*** (0.74) | | −2.06*** (0.74) | −2.14*** (0.69) | −1.56** (0.68) | | −1.56** (0.69) |
| RMC-Q4 | −1.90*** (0.72) | | −1.89*** (0.72) | −1.31** (0.65) | −1.33** (0.65) | | −1.33** (0.66) |
| Flood | 0.35*** (0.11) | 0.36*** (0.11) | 0.40*** (0.10) | 0.30*** (0.09) | 0.19* (0.11) | 0.19* (0.11) | 0.17* (0.10) |
| Flood × RMC | 0.05 (0.15) | | | 0.15 (0.13) | −0.02 (0.15) | | |
| Flood × SBW | | −0.06 (0.16) | −0.10 (0.15) | | | −0.27 (0.17) | −0.25 (0.16) |
| Fish | 0.33*** (0.09) | 0.33*** (0.09) | 0.22*** (0.08) | 0.23*** (0.08) | 0.21** (0.10) | 0.21** (0.10) | 0.15* (0.09) |
| Fish × RMC | −0.11 (0.12) | | | −0.01 (0.12) | −0.06 (0.13) | | |
| Fish × SBW | | −0.05 (0.14) | 0.07 (0.13) | | | −0.04 (0.14) | 0.02 (0.14) |
| Bird | 0.37*** (0.09) | 0.37*** (0.09) | 0.19** (0.08) | 0.45*** (0.09) | 0.34*** (0.09) | 0.34*** (0.09) | 0.27*** (0.10) |
| Bird × RMC | −0.17 (0.12) | | | −0.06 (0.12) | −0.06 (0.13) | | |
| Bird × SBW | | −0.19 (0.14) | −0.02 (0.13) | | | 0.08 (0.14) | 0.14 (0.14) |
| Water | 0.31*** (0.09) | 0.30*** (0.09) | 0.34*** (0.09) | 0.60*** (0.09) | 0.60*** (0.11) | 0.60*** (0.11) | 0.53*** (0.10) |
| Water × RMC | 0.03 (0.13) | | | −0.29** (0.13) | −0.07 (0.15) | | |
| Water × SBW | | −0.06 (0.15) | −0.09 (0.14) | | | −0.01 (0.17) | 0.05 (0.16) |
| N (no. resp.) | 1073 (639) | 946 (720) | 1031 (371) | 1054 (652) | 926 (576) | 932 (696) | 940 (354) |
| LL | −833.1 | −745.1 | −716.51 | −850.5 | −786.5 | −788.4 | −689 |
| Restricted LL | −835.0 | −745.19 | −728.7 | −851.6 | −789.0 | −788.4 | −694.1 |
| λ (1 df) | 3.76 | 0.22 | 24.30 | 2.24 | 4.99 | 0.02 | 10.04 |
| Reject? | Yes* | No | Yes*** | No | Yes** | No | Yes*** |

LA = Louisiana; Gulf = Gulf of Mexico Region. Bold marks the elicitation-format × action interaction terms discussed in the text. *, **, *** Statistical difference at the 10, 5, and 1% level

Table 5 Attribute non-attendance latent-class logit regression results

Class shares:

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| All AT | 0.63 | 0.64 | 0.27 | 0.48 | 0.46 | 0.66 | 0.52 | 0.45 |
| Price NAT | 0.00 | 0.19 | 0.73 | 0.31 | 0.38 | 0.10 | 0.22 | 0.55 |
| Non-price NAT | 0.37 | 0.17 | 0.00 | 0.21 | 0.16 | 0.24 | 0.26 | 0.00 |

Coefficient estimates (Coef (SE)):

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| Action | 0.62 (0.39) | 0.56 (0.49) | −0.05 (0.35) | 0.23 (0.57) | 0.43 (0.46) | −0.35 (0.43) | 0.54 (0.41) | −0.76** (0.31) |
| SD action | 0.00 (0.15) | 0.00 (45.16) | 0.00 (0.19) | 0.00 (0.15) | 0.00 (47.14) | 0.00 (36.45) | 0.00 (59.32) | 0.00 (0.15) |
| Neg. price | 0.01*** (0.00) | 0.03*** (0.00) | 0.07*** (0.01) | 0.02*** (0.01) | 0.03*** (0.00) | 0.01*** (0.00) | 0.03*** (0.00) | 0.03*** (0.00) |
| RMC-Q2 | | −0.94 (0.76) | | | | | −0.40 (0.60) | |
| RMC-Q3 | | −0.47 (0.60) | | | | | 0.00 (0.58) | |
| RMC-Q4 | | −1.22 (0.78) | | | | | −0.82 (0.65) | |
| Flood | 0.71** (0.29) | 0.79*** (0.10) | 1.05*** (0.18) | 0.42*** (0.16) | 0.82*** (0.11) | 0.23* (0.13) | 0.59*** (0.13) | 0.27* (0.15) |
| Fish | 0.51*** (0.16) | 0.61*** (0.10) | 0.66*** (0.14) | 0.29** (0.12) | 0.50*** (0.09) | 0.30** (0.12) | 0.60*** (0.11) | 0.79*** (0.15) |
| Bird | 0.59*** (0.18) | 0.36*** (0.09) | −0.06 (0.13) | 0.58*** (0.16) | 0.50*** (0.10) | 0.43*** (0.14) | 0.32*** (0.12) | 0.44*** (0.14) |
| Water | 0.47** (0.19) | 0.42*** (0.09) | −0.15 (0.15) | 0.77*** (0.21) | 0.34*** (0.10) | 0.67*** (0.20) | 0.63*** (0.12) | 0.85*** (0.17) |
| N (no. resp.) | 494 (494) | 579 (145) | 452 (226) | 518 (518) | 536 (134) | 459 (459) | 467 (117) | 473 (237) |
| LL | −429.49 | −435.28 | −268.08 | −441.32 | −409.01 | −445.12 | −361.16 | −302.52 |

Tests of class-share equivalence:

| | LA–Oyster SMC vs RMC | LA–Oyster SMC vs SBW | LA–Oyster RMC vs SBW | LA–Salt Marsh SMC vs RMC | Gulf–Oyster SMC vs RMC | Gulf–Oyster SMC vs SBW | Gulf–Oyster RMC vs SBW |
|---|---|---|---|---|---|---|---|
| LL1 | −429.49 | −429.49 | −435.28 | −441.32 | −445.12 | −445.12 | −361.16 |
| LL2 | −435.28 | −268.08 | −268.08 | −409.01 | −361.16 | −302.52 | −302.52 |
| LL1 + LL2 | −864.77 | −697.57 | −703.35 | −850.32 | −806.28 | −747.64 | −663.68 |
| LL (pooled and scaled) | −867.64 | −714.95 | −720.76 | −850.52 | −806.52 | −751.11 | −685.36 |
| λ(A) (3 df) | 5.75 | 34.77 | 34.80 | 0.39 | 0.49 | 6.94 | 43.36 |
| Reject H1A? | No | Yes*** | Yes*** | No | No | Yes* | Yes*** |
| LL (pooled) | 0.00 | | | −850.52 | −806.52 | | |
| λ(B) (1 df) | 0.00 | | | 0.39 | 0.00 | | |
| Reject H1B? | No | | | No | No | | |

LA = Louisiana; Gulf = Gulf of Mexico Region. All AT = all attributes attended to; Price NAT = price not attended to; Non-price NAT = non-price attributes not attended to. *, **, *** Statistical difference at the 10, 5, and 1% level

4.2 Status-Quo Effects

As noted earlier, additional terms in which price and all non-price attributes are interacted with elicitation formats were included in the models to maximize parameter freedom and isolate the status-quo effects alone. Additionally, when models included RMC observations, binary indicator variables were included for the second, third, and fourth choice questions. Thus, the terms interacting elicitation format with action capture the pure difference in status-quo effects due to elicitation treatment, and significance of these interaction terms is an indication of differences in status-quo effects between elicitation types. For completeness, we also estimated the models omitting these interaction terms and constructed likelihood-ratio statistics to test the effect of constraining these terms to equal zero at the model level.

Table 4 contains the results of the pooled error-components logit models and likelihood ratio tests used to test for differences in status-quo effects. The relevant parameter estimates for these particular comparisons are highlighted in bold for convenience. Results comparing SMC to RMC are consistent across samples: the SMC \(\times \) action interaction term is significant and, in this case, negative, indicating a higher proportion of action choices under the RMC elicitation format relative to SMC. Note well that the relevant comparison is the first question of the RMC survey compared to the SMC survey, since we also include question-order indicators for the second, third, and fourth RMC questions. The coefficients on these question-order indicators are negative in all cases, with the third and fourth significant in all cases, indicating relatively weaker action effects in subsequent questions. However, the coefficients on these subsequent-question indicators are generally half the magnitude of the SMC \(\times \) action interaction term, implying that although action effects are weaker in subsequent RMC questions relative to the first RMC question (which is consistent with the findings in the literature), action effects in these subsequent questions are still relatively stronger than under the SMC format. These findings are supported by the likelihood-ratio test of the unconstrained model against the status-quo-effects-constrained model, in which the null hypothesis is rejected in all cases except the Louisiana–Salt Marsh sample.

Results comparing SMC to SBW are also consistent across samples, and indicate no significant difference in status-quo effects between these two elicitation types; these findings are supported by non-significant likelihood-ratio tests. Results comparing RMC to SBW are also consistent across samples; the interaction term (SBW \(\times \) action) is significant and negative, indicating a higher proportion of action choices under the RMC elicitation format relative to SBW. These findings are also supported by significant likelihood-ratio tests. Additionally, the comparison to the RMC question-order effects is similar to that found above: even though subsequent RMC questions result in weaker action effects relative to the first RMC question, the difference is still smaller than the difference between the first RMC question and the SBW format. Thus, these results indicate significant differences in status-quo effects using the RMC elicitation format relative to SMC and SBW, specifically that the RMC elicitation format results in increased probabilities of “action” votes, although this result is somewhat mitigated in subsequent RMC questions.

4.3 Attribute Non-attendance

Table 5 contains the results of the attribute non-attendance latent-class logit models and associated likelihood ratio tests. As noted earlier, because we wish only to constrain class shares in the pooled models for the tests, we add interaction terms between all variables and elicitation type to introduce freedom in these parameter estimates. Before we compare results across elicitation treatments, we think it prudent first to compare results within treatments but across samples, to discern whether class shares are consistent within a given elicitation format. For SMC models, the class shares for the Louisiana–Oyster and Gulf of Mexico Regional–Oyster models are consistent, attributing 63–66% of the population to the “all attended to” (All AT) class, 24–37% to the “non-price attributes not attended to” (Non-Price NAT) class, and the lowest share (0–10%) to the “price not-attended-to” (Price NAT) class. The class shares for the Louisiana–Salt Marsh model are somewhat different, attributing lower shares to the All AT and non-price NAT classes and more to the price NAT class.

Among RMC models, class shares for the All AT class range from a high of 64% to a low of 46%, but in all cases, this is the dominant class. Shares for the remaining two classes vary, but tend to be fairly equally split, with no clear pattern for dominance. Among SBW models, the price NAT class dominates, from a high of 73% to a low of 55%, with the second-highest class share being the All AT class. Both SBW models attribute a zero share to the non-price NAT class.

Turning to the comparisons across elicitation formats and likelihood ratio tests, we find a rejection of the null hypothesis of model equivalence when SMC and RMC are compared to SBW. Tests are not rejected for SMC versus RMC. Thus, results indicate significant differences in estimated attribute non-attendance patterns for SBW, but similar patterns for SMC and RMC.20 However, we would add a word of caution to these results given that, as noted earlier, we observe variation in class shares across samples even within the same elicitation type, although those differences are more subtle compared to those observed across elicitation types.

4.4 Welfare Estimates

Two sets of welfare estimates were constructed from the error-components and random-coefficients logit model results reported in Tables 2 and 3. Table 6 displays the estimated attribute increment values and pair-wise tests of equality of mean attribute increment values between elicitation types. Confidence intervals were estimated using the Krinsky and Robb bootstrapping approach (see Haab and McConnell 2002).21 After exponentiating the price coefficient, one can straightforwardly employ the Krinsky and Robb technique, as the moments of the willingness to pay distribution are then well defined (Carson and Czajkowski 2013).
Table 6 Mean attribute increment value estimates (95% confidence intervals in parentheses), and tests of equality of means

Error-components logit models:

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| Increased flood protection | $31 ($13, $49) | $27 ($15, $38) | $22 ($5, $40) | $34 ($14, $57) | $46 ($28, $63) | $18 ($−2, $39) | $12 ($−2, $25) | −$7 ($−32, $15) |
| Improved fish productivity | $29 ($14, $46) | $14 ($3, $25) | $21 ($5, $38) | $26 ($6, $47) | $22 ($7, $38) | $20 ($3, $40) | $10 ($−1, $23) | $15 ($−3, $33) |
| Increased bird habitat | $32 ($17, $49) | $13 ($3, $23) | $13 ($−3, $31) | $52 ($32, $75) | $40 ($24, $58) | $32 ($15, $51) | $18 ($6, $30) | $36 ($17, $56) |
| Improved water quality | $26 ($11, $44) | $22 ($11, $33) | $18 ($2, $34) | $68 ($47, $93) | $32 ($14, $52) | $57 ($37, $78) | $35 ($22, $49) | $50 ($27, $76) |

Random-coefficients logit models:

| | LA–Oyster SMC | LA–Oyster RMC | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC | Gulf–Oyster SMC | Gulf–Oyster RMC | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|
| Increased flood protection | $28 ($15, $44) | $26 ($13, $38) | $40 ($11, $64) | $36 ($−1, $58) | $43 ($19, $66) | $18 ($−3, $39) | $10 ($−7, $26) | −$5 ($−32, $18) |
| Improved fish productivity | $28 ($15, $44) | $15 ($3, $26) | $23 ($−3, $42) | $29 ($−1, $53) | $21 ($−2, $41) | $20 ($1, $41) | $9 ($−6, $24) | $23 ($0, $45) |
| Increased bird habitat | $27 ($15, $42) | $12 ($0, $25) | $9 ($−11, $31) | $56 ($20, $83) | $41 ($20, $63) | $32 ($14, $52) | $16 ($0, $33) | $35 ($12, $60) |
| Improved water quality | $25 ($9, $37) | $24 ($12, $35) | $21 ($−6, $40) | $75 ($31, $102) | $33 ($11, $58) | $57 ($36, $79) | $40 ($22, $58) | $56 ($28, $85) |

Equality of mean attribute increment value estimate test results, error-components logit models:

| | LA–Oyster SMC/RMC | LA–Oyster SMC/SBW | LA–Oyster RMC/SBW | LA–Salt Marsh SMC/RMC | Gulf–Oyster SMC/RMC | Gulf–Oyster SMC/SBW | Gulf–Oyster RMC/SBW |
|---|---|---|---|---|---|---|---|
| Increased flood protection | = | = | = | = | = | = | = |
| Improved fish productivity | = | = | = | = | = | = | = |
| Increased bird habitat | ** | = | = | = | = | = | = |
| Improved water quality | = | = | = | ** | * | = | = |

Equality of mean attribute increment value estimate test results, random-coefficients logit models:

| | LA–Oyster SMC/RMC | LA–Oyster SMC/SBW | LA–Oyster RMC/SBW | LA–Salt Marsh SMC/RMC | Gulf–Oyster SMC/RMC | Gulf–Oyster SMC/SBW | Gulf–Oyster RMC/SBW |
|---|---|---|---|---|---|---|---|
| Increased flood protection | = | = | = | = | = | = | = |
| Improved fish productivity | = | = | = | = | = | = | = |
| Increased bird habitat | * | = | = | = | = | = | = |
| Improved water quality | = | = | = | * | = | = | = |

LA = Louisiana; Gulf = Gulf of Mexico Region. *, **, *** Statistical difference at the 10, 5, and 1% level; = indicates failure to reject statistical equality of 2-sided test

Table 7 Mean program-level welfare estimates (95% confidence intervals in parentheses), and tests of equality of means

| | LA–Oyster SMC | LA–Oyster RMC-Q1 | LA–Oyster RMC-Avg | LA–Oyster SBW | LA–Salt Marsh SMC | LA–Salt Marsh RMC-Q1 | LA–Salt Marsh RMC-Avg | Gulf–Oyster SMC | Gulf–Oyster RMC-Q1 | Gulf–Oyster RMC-Avg | Gulf–Oyster SBW |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Error-components logit models | $331 ($56, $602) | $568 ($352, $799) | $515 ($314, $729) | $193 ($157, $231) | $335 ($67, $593) | $656 ($417, $910) | $574 ($360, $805) | $288 ($−15, $578) | $386 ($239, $541) | $349 ($216, $489) | $190 ($145, $242) |
| Random-coefficients logit models | $222 ($23, $421) | $544 ($332, $784) | $492 ($295, $715) | $221 ($128, $259) | $334 ($−43, $657) | $460 ($300, $651) | $404 ($261, $570) | $288 ($−24, $586) | $331 ($192, $474) | $301 ($175, $429) | $188 ($142, $238) |

Equality of mean welfare estimate test results (EC = error-components logit; RC = random-coefficients logit):

| | LA–Oyster EC | LA–Oyster RC | LA–Salt Marsh EC | LA–Salt Marsh RC | Gulf–Oyster EC | Gulf–Oyster RC |
|---|---|---|---|---|---|---|
| SMC/RMC-Q1 | = | ** | * | = | = | = |
| SMC/RMC-Avg | = | * | = | = | = | = |
| SMC/SBW | = | = | | | = | = |
| RMC-Q1/SBW | *** | *** | | | ** | * |
| RMC-Avg/SBW | *** | *** | | | ** | * |

LA = Louisiana; Gulf = Gulf of Mexico Region. *, **, *** Statistical difference at the 10, 5, and 1% level; = indicates failure to reject statistical equality of 2-sided test

Test results following the complete combinatorial approach of Poe et al. (2005) reveal very few differences in incremental values of attributes across elicitation types.22 In fact, out of the 28 tests constructed over the error-components logit results, only three are significant, and for the random-parameters logit results, none is significant. Comparing SMC results to RMC results, only differences in the incremental value of bird habitat are detected for the Louisiana–Oyster sample, and only differences in the incremental value of improved water quality are detected for the Louisiana–Salt Marsh and Gulf of Mexico–Oyster samples. Furthermore, we observe no patterns in differences in precision of these estimates across elicitation treatments (based on the percentage difference between the mean and the upper or lower bound). Thus, our results indicate almost no differences in attribute increment values across elicitation types, and this finding is robust across the three samples tested.

We also constructed program-level welfare estimates, i.e., estimates of the mean maximum willingness to pay for a complete program that delivers a specific suite of ecosystem services. Here, we fix all service attributes at the intermediate level. These estimates account for all model variables, and so capture the effect on welfare of action and question-order effects. For the RMC treatment, a decision must be made on how to handle question-order effects. We specify two ways: the first, labeled “RMC-Q1”, calculates welfare under the counterfactual that all responses were first RMC question responses, i.e., the second, third, and fourth RMC question coefficients are zero-weighted. The second, labeled “RMC-Avg”, calculates welfare values under mean question-order effects, i.e., yields “average” question-order welfare estimates. Table 7 reports the means, confidence intervals, and results of tests of equality across elicitation treatments. The RMC-Q1 treatment yields the highest welfare estimates, followed by the RMC-Avg treatment, then the SMC treatment, with the SBW treatment yielding the lowest welfare estimates. These results are consistent across all samples and both the error-components and random-coefficients model specifications. In terms of precision, we find that the SBW treatment yields the tightest welfare estimates (based on the percentage difference between the mean and the upper or lower bound), where the upper/lower bound represents a 17–27% change relative to the mean across samples and model specifications; the RMC treatment yields the second-tightest welfare estimates (39–44% range), with the SMC treatment yielding, by far, the widest estimates (77–104% range).23
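
As an illustration of the two constructions (a stylized reading that abstracts from the random components), with each service attribute held at its intermediate level \(\bar{\mathbf{x}}\), the implied program-level values can be written as
$$\begin{aligned} \textit{WTP}^{{\textit{RMC-Q1}}}=\frac{\alpha +{{\varvec{\upbeta }}}'\bar{\mathbf{x}}}{\beta _{price}},\qquad \textit{WTP}^{{\textit{RMC-Avg}}}=\frac{\alpha +\bar{\gamma }+{{\varvec{\upbeta }}}'\bar{\mathbf{x}}}{\beta _{price}},\qquad \beta _{price}=\exp \left[ \ln \beta \left( \hbox {Neg. price}\right) \right] \end{aligned}$$
where \(\bar{\gamma }\) is the average of the question-order coefficients (zero for the first question). The actual estimates in Table 7 are constructed from the estimated models and simulated parameter distributions described above rather than from this stylized formula alone.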

Tests of equality indicate that the differences between the SMC and RMC treatments are statistically significant in only one case: under the error-components model specification for the Louisiana–Salt Marsh sample. No significant differences are found between the SMC and SBW treatments. Differences between the RMC treatment and the SBW treatment are statistically significant in both cases under the error-components specification only. Thus, in terms of program-level welfare estimates, the RMC treatments yield the highest means with intermediate precision; the SMC treatment yields intermediate means with the least precision; and the SBW treatment yields the lowest means with the greatest precision.

5 Conclusions

To summarize, after controlling for differences in scale across alternatives (action versus no-action) and elicitation format, we find very limited evidence of any differences in attribute parameter estimates among the three elicitation treatments, and we find significant scale differences only when comparing RMC to SBW. We also find very little evidence of differences in attribute increment values (i.e., ecosystem-service values) across elicitation treatments. Arguably, these are the two areas of greatest concern from a policy perspective regarding the performance of multinomial discrete-choice experiments. The implication, based on our results at least, is that as long as the researcher controls for question-order effects and scale differences, attribute increment values are unaffected by the choice of elicitation format.

We do, however, find significant differences in status-quo effects across elicitation treatments. In the three independent samples we analyzed, the RMC treatments result in greater proportions of “action” votes relative to both the SMC and SBW treatments. What is interesting is that although our results are consistent with the literature showing greater status-quo effects in subsequent RMC questions relative to the first RMC question, these subsequent RMC questions still yield more “action” votes than the other elicitation formats. This effect plays a significant role in the construction of program-level welfare estimates, where we also find significant differences: the RMC treatments yield consistently higher welfare estimates than both the SMC and SBW treatments, although for our samples these differences are not universally statistically significant. This lack of statistical significance in some cases should not be interpreted, however, as a “green light”: in most cases, a researcher will choose one elicitation treatment and take the resulting welfare estimates at face value; and for our samples, at least, the RMC treatments yield welfare estimates that range from 15 to 195% greater than those of the other elicitation treatments, depending on sample and model specification. In terms of precision, however, the RMC treatments fare well, whereas the SMC treatment yields the least precision by a wide margin. Taken together, our results indicate that, in terms of welfare estimation, the choice of elicitation format may have little influence on individual attribute increments (e.g., ecosystem-service valuation), but could have a large influence on program-level welfare estimates, both in magnitude and in precision.

Our results also indicate significant differences in attribute non-attendance behavior for the SBW elicitation format, but no significant differences between the SMC and RMC formats. Although there was little consistency in class shares even within a given elicitation treatment across samples, the good news is that, for our samples, the SMC and RMC treatments yielded a plurality of respondents in the class that attends to all attributes, which is the class researchers generally assume (or, rather, hope) to describe their respondents. The SBW treatments, however, indicate a plurality of respondents in the class that does not attend to the price attribute. Although the larger implications of this finding are not obvious, it does indicate that elicitation treatments may induce different kinds of behavior with regard to how respondents perceive and react to the information provided in the choice sets. Further research on this issue is warranted.

So what have we learned? To our knowledge, no study has compared these elicitation formats directly. This gap in the literature is somewhat surprising, given the intense scrutiny applied to variations on the single-choice referendum format in the contingent-valuation literature. Petrolia and Interis’s (2013) essay called attention to the potential risks of adopting a repeated-choice format and to the behavioral anomalies that could bias responses and, consequently, results and conclusions. What our study finds, however, is that the differences may not be as severe as feared, and they appear to be limited to particular aspects of the results. It is important, however, to keep in mind that any lack of statistical significance reported here may be due to Type II error. Our study did not include a power analysis to determine the sample size necessary to detect significance in our tests across elicitation formats; in other words, it is an open question whether any of our null findings are a result of insufficient power. That said, if we take the results at face value, they imply that if researchers are interested primarily in individual attribute values, as in the case of ecosystem-service valuation, then the choice of elicitation format may not matter, and the RMC format would be the most cost-effective approach. If they are interested in program-level welfare estimates, the decision may require more deliberation, but even here there is no clear winner; unlike the binary-choice format, which is at least in theory incentive-compatible, there is no “standard” format among those we consider here, and so there is no way to discern which estimates are the “right” ones. Thus, the choice of question format should depend on the modeling approach the researcher expects to use and on the desired outputs of the analysis.

Footnotes

  1.

    We adopt the terminology of Carson and Louviere (2011) in which a discrete choice experiment is a survey in which respondents are asked to make a discrete choice from two or more alternatives within a choice set and the choice sets are carefully constructed by the researcher according to an experimental design.

  2.

    This method can go by other names. See Carson and Louviere (2011) and Louviere et al. (2010) for discussions on nomenclature.

  3.

    See Petrolia and Interis (2013) for a detailed discussion. It should be noted that status-quo effects and strategic behavior are not necessarily unique to the repeated multinomial-choice format. Day et al. (2012), e.g., find status-quo effects in a repeated binary-choice format. As for strategic behavior, Samuelson (1954, p. 188) states that “it is in the selfish interest of each person to give false signals, to pretend to have less interest in a given collective consumption activity than he really has.” Following Samuelson’s view, Carson (2012, p. 37) states that “those answering contingent valuation surveys about a public good should follow a free-rider approach of pretending to be less interested, hoping that the costs of providing the public good will fall on others”.

  4.

    This adjustment to the price coefficient ensures that the sampling distribution of the price parameter will be entirely in the negative domain, so that when calculating, for example, willingness-to-pay values there will not be any division by zero or by a positive price parameter.

  5.

    It is worth noting that Holmes and Boyle (2005), Pattison et al. (2011), Day et al. (2012), and Carlsson et al. (2012) indicate that the repeated-response format may allow for learning, implying that initial choice tasks may be less informative than latter ones, or in some cases, should be discarded. Ladenburg and Olsen (2008) cite their results as evidence of this effect. Scheufele and Bennett (2012) point out, however, that it is also possible that respondents to a repeated-choice survey discover the possibility of responding strategically as they progress through the choice tasks, and this “strategic learning” may coincide with learning about the choice task.

  6.

    A related finding is that of Meyerhoff and Liebe (2009) that choice task complexity can lead to an increased probability of status-quo votes.

  7.

    It is important to note that these tests are conducted from a purely empirical basis because there exists no theory that would dictate which elicitation format, in a multinomial-choice setting, is the “standard”. Unlike single binary-choice questions, which have been shown to be incentive-compatible, multinomial-choice questions are not incentive compatible, at least in a field setting (see Carson and Groves 2007 and Petrolia and Interis 2013), and it is in this setting that our interest lies here, given the widespread use of these elicitation methods in the field for policy-relevant valuation. Multinomial-choice questions can be made incentive-compatible in a lab setting (see Taylor et al. 2010).

  8.

    A reviewer pointed out that the water-quality attribute could be a prior for the fisheries and wading bird attributes. Unfortunately, we did not consider this possibility when designing the survey, so we acknowledge this potential weakness.

  9.

    S-efficiency (Bliemer and Rose 2005) was also evaluated for each individual parameter, assuming both fixed- and random-parameters models. S-efficiency provides a lower bound on sample size to obtain significant estimates for each coefficient (Bliemer and Rose 2009). We specified coefficient priors of 0.3 (mean) and 0.15 (standard deviation), normally distributed, on all non-price attributes, and −0.005 on price, with \(t_{0.05} =1.96\), corresponding to a 95% confidence level. Assuming a fixed parameter model, the largest s-value was \(\sim \)6, implying that we would need to replicate the 24-row design a minimum of 6 times (i.e., \(24 \times 6 = 144\) choice observations) to obtain significance on our coefficients. Assuming a random-parameter design, the largest s-value on the mean coefficients was \(\sim \)12, implying a minimum of 288 choice observations. Each of our individual samples contained \(\sim \)500 choice observations, roughly twice the number required for the larger of the two s-values calculated. Note, however, that these efficiency measures do not speak to the question of whether our sample size is large enough to establish sufficient power for tests across elicitation formats. We did not conduct any such power analysis for this purpose.

  10.

    Because we did not randomize the order of presentation of choice sets within blocks, there is some possibility of confounding effects of our order-effect variables used in the regression models.

  11.

    This format may differ somewhat from other studies that have utilized the SBW elicitation format. In those studies, it appears that the choices are sequential, so that the respondent chooses the “best”, then is shown only the remaining alternatives and is asked to indicate the “worst”, etc., until all alternatives have been fully ranked.

  12.

    Let A and B represent a pair of alternatives in a choice set. The second-best case operates under the assumption that the probability of A being chosen as “worst” is equal to the probability of B being chosen as “best”. This rank-order explosion is also known as the Plackett–Luce model (Marden 1995), the choice-based method of conjoint analysis (Hair et al. 2010), and most frequently, rank-ordered logit (StataCorp 2013).

  13.

    A reviewer pointed out that our inclusion of language that the program would be partially funded with existing tax dollars may have introduced a problem with the resulting welfare estimates. If we do not know how respondents interpreted what would happen to the existing tax dollars in the event of no project, then we do not know how much utility loss they would be willing to trade for the utility gain of the described policy. If so, then our estimates may not be a sufficient money metric measure of the utility change. However, this potential flaw should not invalidate the comparisons made here.

  14.

    Under the RMC and SBW cases, models were specified as a panel, such that individual-specific coefficients for random parameters were constrained to be equal across observations for the same respondent.

  15.

    Note that we fix \({{\varvec{\upsigma }}}=0\) in these models, i.e., we do not allow for preference heterogeneity in the attributes when testing for differences in status-quo effects.

  16.

    The random-parameters specification did not significantly improve model fit, did not indicate any significant preference heterogeneity beyond that already captured by the latent classes, and all of the tests of class-share differences across elicitation types were identical to those of the fixed-parameters models. These results are reported in table A7 of the Appendix in ESM.

  17.

    We also estimated the models without the parameter restrictions across classes, and tests of class differences are exactly identical to the main results. In two cases the test of scale parameter differences is significant (Louisiana-Salt SMC vs. RMC and GOM-Oyster SMC vs. RMC). These results are reported in table A6 of the Appendix in ESM.

  18.

    Note that we fix \({{\varvec{\upsigma }}}=0\) in these models as well.

  19.

    This adjustment is not applied to the latent-class logit models due to the difficulty of imposing log-normal distributions in that setting. Because we are not using these results to construct welfare estimates, this omission should not affect the results; the adjustment generally comes into play during the simulation stage of welfare-estimate construction (see Carson and Czajkowski 2013).

  20.

    In an alternative specification that follows the two-class model used by Hess et al. (2013) (all attended to and none attended to), test results were identical with one exception, where we found no significant class-share differences between SMC and SBW for the Louisiana Oyster sample. It should be noted that the 3-class results reported here indicate that the “price NAT” class, which is the class omitted in the 2-class model, has a larger share than one or both of the other classes in 5 out of the 8 models estimated, implying that a model that omits this class may be misspecified to begin with. These results are reported in table A5 of the Appendix in ESM.

  21.

    For the error-components logit models, 10,000 draws were used for each parameter. The Krinsky and Robb approach needs to be adjusted, however, when there are random parameters in order to account for the distribution of the random parameters in the population. For the random-coefficients logit models, we therefore used 5000 draws for each parameter, including the mean and standard deviations of the distributions of random variables. This creates 5000 simulated distributions for each random parameter, each defined by a mean and standard deviation. We then made 5000 draws of a given random parameter from each of these 5000 simulated distributions (see Hensher and Greene 2003). This yields \(5000^{2}\) total random draws for each random variable.

  22.

    These tests across two simulated distributions of length n and m involve the creation of a vector of length \(n\times m\), which exceeds computational capacity for the random coefficients, each of which has a simulated distribution of length \(5000^{2}\). We therefore re-estimated each of the random-coefficients models using 100 draws in each stage and conducted these tests using simulated distributions of length \(100^{2}\). The confidence intervals for the random-coefficients models presented are, however, from the first simulation, with vectors of length \(5000^{2}\). Although these intervals should be more precise than intervals based on vectors of only \(100^{2}\) draws, the tests of equality of means are not materially affected by the number of draws, as the confidence intervals for the random coefficients are clearly very wide and overlap greatly.

  23.

    Precision is also a function of sample size. Although all of our samples were in the neighborhood of 500 observations, there were some minor differences, which could account partially for these differences in precision.
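As a supplement to footnote 21, the following minimal Python sketch illustrates the two-stage simulation for a single normally distributed random coefficient; the estimates, standard errors, and the reduced number of draws are hypothetical and chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical point estimates (and standard errors) for the mean and standard
# deviation of one normally distributed random coefficient.
mean_hat, se_mean = 0.40, 0.05
sd_hat, se_sd = 0.20, 0.04

R = 100  # far fewer than 5000 draws, purely to keep the illustration light

# Stage 1: draw R (mean, sd) pairs from the sampling distributions of the estimates.
means = rng.normal(mean_hat, se_mean, size=R)
sds = np.abs(rng.normal(sd_hat, se_sd, size=R))  # keep the dispersion draws positive

# Stage 2: from each simulated (mean, sd) pair, draw R realizations of the
# coefficient itself, giving R*R draws that reflect both estimation uncertainty
# and preference heterogeneity.
coef_draws = rng.normal(means[:, None], sds[:, None], size=(R, R)).ravel()

# A percentile confidence interval from the simulated distribution.
lower, upper = np.percentile(coef_draws, [2.5, 97.5])
print(round(lower, 3), round(upper, 3))
```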

Notes

Acknowledgements

The authors thank A.A.J. Marley and two anonymous referees for comments that greatly improved the manuscript. This research was conducted under award NA10OAR4170078 to the Mississippi-Alabama Sea Grant Consortium by the NOAA Office of Ocean and Atmospheric Research, U.S. Department of Commerce, and was supported by the USDA Cooperative State Research, Education & Extension Service, Multistate Project W-3133 “Benefits and Costs of Natural Resources Policies Affecting Ecosystem Services on Public and Private Lands” (Hatch # MIS-033130).

Supplementary material

10640_2016_83_MOESM1_ESM.docx (68 kb)
Supplementary material 1 (docx 67 KB)

References

  1. Alemu MH, Mørkbak MR, Olsen SB, Jensen CL (2013) Attending to the reasons for attribute non-attendance in choice experiments. Environ Resour Econ 54:333–359
  2. Arrow K, Solow R, Portney PR, Leamer EE, Radner R, Schuman H (1993) Report of the NOAA panel on contingent valuation. Fed Regist 58:4601–4614
  3. Bateman IJ, Cole M, Cooper P, Georgiou S, Hadley D, Poe GL (2004) On visible choice sets and scope sensitivity. J Environ Econ Manag 47:71–93
  4. Beaumais O, Prunetti D, Casacianca A, Pieri X (2015) Improving solid waste management in the Island of Beauty (Corsica): a latent-class rank-ordered logit approach with observed heterogeneous ranking capabilities. Revue d’economie politique 125(2):209–231
  5. Blamey RK, Bennett JW, Louviere JJ, Morrison MD, Rolfe JC (2002) Attribute causality in environmental choice modelling. Environ Resour Econ 23:167–186
  6. Bliemer MCJ, Rose JM (2009) Efficiency and sample size requirements for stated choice experiments. Transportation Research Board Annual Meeting, Washington, DC
  7. Bliemer MCJ, Rose JM (2005) Efficiency and sample size requirements for stated choice studies. Report ITLS-WP-05-08, Institute of Transport and Logistics Studies, University of Sydney
  8. Campbell D, Hensher DA, Scarpa R (2011) Non-attendance to attributes in environmental choice analysis: a latent class specification. J Environ Plan Manag 54(8):1061–1076
  9. Campbell D, Hutchinson WG, Scarpa R (2008) Incorporating discontinuous preferences into the analysis of discrete choice experiments. Environ Resour Econ 41:401–417
  10. Carlsson F, Mørkbak MR, Olsen SB (2012) The first time is the hardest: a test of ordering effects in choice experiments. J Choice Model 5(2):19–37
  11. Carson RT (2012) Contingent valuation: a practical alternative when prices aren’t available. J Econ Perspect 26(4):27–42
  12. Carson RT (1985) Three essays on contingent valuation. PhD thesis, University of California, Berkeley
  13. Carson RT, Czajkowski M (2013) A new baseline model for estimating willingness to pay from discrete choice models. Presented at the 2013 international choice modelling conference, July. http://www.icmconference.org.uk/index.php/icmc/ICMC2013/paper/view/730. Cited 9 Dec 2014
  14. Carson RT, Groves T (2007) Incentive and informational properties of preference questions. Environ Resour Econ 37:181–210
  15. Carson RT, Louviere JJ (2011) A common nomenclature for stated preference elicitation approaches. Environ Resour Econ 49:539–559
  16. Chapman RG, Staelin R (1982) Exploiting rank ordered choice set data within the stochastic utility model. J Mark Res XIX:288–301
  17. ChoiceMetrics (2011) Ngene 1.1 user manual and reference guide
  18. Collins AT, Rose JM, Hensher DA (2013) Specification issues in a generalized random parameters attribute nonattendance model. Transp Res Part B 56:234–253
  19. Day B, Bateman IJ, Carson RT, Dupont D, Louviere JJ, Morimoto S, Scarpa R, Wang P (2012) Ordering effects and choice set awareness in repeat-response stated preference studies. J Environ Econ Manag 63:73–91
  20. Day B, Prades JLP (2010) Ordering anomalies in choice experiments. J Environ Econ Manag 59:271–285
  21. Flynn TN, Louviere JJ, Peters TJ, Coast J (2007) Best-worst scaling: what it can do for health care research and how to do it. J Health Econ 26:71–89
  22. Flynn T, Marley AJ (2014) Best worst scaling: theory and methods. In: Hess S, Daly A (eds) Handbook of choice modelling. Edward Elgar Publishing, Camberley, pp 178–201
  23. Greene WH (2012) Reference Guide, NLOGIT Version 5.0. Econometric Software, Inc., Plainview, NY
  24. Haab TC, McConnell KE (2002) Valuing environmental and natural resources: the econometrics of non-market valuation. Edward Elgar, Northampton
  25. Hair JF Jr, Black WC, Babin BJ, Anderson RE (2010) Multivariate data analysis, 7th edn. Pearson, Upper Saddle River
  26. Hanemann W (1985) Some issues in continuous- and discrete-response contingent valuation studies. Northeast J Agric Econ 14:5–13
  27. Hensher DA, Collins AT, Greene WH (2013) Accounting for attribute non-attendance and common-metric aggregation in a probabilistic decision process mixed multinomial logit model: a warning on potential confounding. Transportation 40:1003–1020
  28. Hensher DA, Greene WH (2003) The mixed logit model: the state of the practice. Transportation 30:133–176
  29. Hensher DA, Greene WH (2010) Non-attendance and dual processing of common-metric attribute in choice analysis: a latent class specification. Empir Econ 39:413–426
  30. Hensher DA, Rose JM, Greene WH (2012) Inferring attribute non-attendance from stated choice data: implications for willingness to pay estimates and a warning for stated choice experiment design. Transportation 39:235–245
  31. Hess S, Stathopoulos A, Campbell D, O’Neill V, Caussade S (2013) It’s not that I don’t care, I just don’t care very much: confounding between attribute non-attendance and taste heterogeneity. Transportation 40:583–607
  32. Holmes TP, Boyle KJ (2005) Dynamic learning and context-dependence in sequential, attribute-based, stated-preference valuation questions. Land Econ 81:114–126
  33. Interis MG, Petrolia DR (2016) Location, location, habitat: how the value of ecosystem services varies across location and by habitat. Land Econ 92(2):292–307
  34. Ladenburg J, Olsen SB (2008) Gender-specific starting point bias in choice experiments: evidence from an empirical study. J Environ Econ Manag 56:275–285
  35. List JA, Sinha P, Taylor MH (2006) Using choice experiments to value non-market goods and services: evidence from field experiments. B.E. J Econ Anal Policy 5(2):1–37
  36. Louviere JJ, Flynn TN, Carson RT (2010) Discrete choice experiments are not conjoint analysis. J Choice Model 3(3):57–72
  37. Louviere JJ, Flynn TN, Marley AAJ (2015) Best-worst scaling: theory, methods and applications. Cambridge University Press, Cambridge
  38. Marden JI (1995) Analyzing and modeling rank data. Chapman and Hall, London
  39. Marley AAJ, Louviere JJ (2005) Some probabilistic models of best, worst, and best-worst choices. J Math Psychol 49:464–480
  40. McNair B, Bennett J, Hensher D (2011) A comparison of responses to single and repeated discrete choice questions. Resour Energy Econ 33:554–571
  41. McNair B, Hensher D, Bennett B (2012) Modelling heterogeneity in response behavior towards a sequence of discrete choice questions: a probabilistic decision process model. Environ Resour Econ 51:599–616
  42. Meyerhoff J, Liebe U (2009) Status quo effect in choice experiments: empirical evidence on attitudes and choice task complexity. Land Econ 85(3):515–528
  43. Newell LW, Swallow SK (2013) Real-payment choice experiments: valuing forested wetlands and spatial attributes within a landscape context. Ecol Econ 92:37–47
  44. Pattison J, Boxall PC, Adamowicz WL (2011) The economic benefits of wetland retention and restoration in Manitoba. Can J Agric Econ 59:223–244
  45. Petrolia DR, Interis MG (2013) Should we be using repeated-choice surveys to value public goods? Assoc Environ Resour Econ Newsl 33(2):19–25
  46. Petrolia DR, Interis MG, Hwang J (2014) America’s wetland? A national survey of willingness to pay for restoration of Louisiana’s coastal wetlands. Mar Resour Econ 29(1):17–37
  47. Poe G, Giraud K, Loomis J (2005) Computational methods for measuring the difference of empirical distributions. Am J Agric Econ 87(2):353–365
  48. Potoglou D, Burge P, Flynn T, Netten A, Malley J, Forder J, Brazier JE (2011) Best-worst scaling vs. discrete choice experiments: an empirical comparison using social care data. Soc Sci Med 72:1717–1727
  49. Rigby D, Burton M, Lusk JL (2015) Journals, preferences, and publishing in Agricultural and Environmental Economics. Am J Agric Econ 97(2):490–509
  50. Samuelson PA (1954) The pure theory of public expenditure. Rev Econ Stat 36(4):387–389
  51. Scarpa R, Notaro S, Louviere JJ, Raffaelli R (2011) Exploring scale effects of best/worst rank ordered choice data to estimate benefits of tourism in Alpine Grazing Commons. Am J Agric Econ 93(3):813–828
  52. Scarpa R, Thiene M, Hensher DA (2010) Monitoring choice task attribute attendance in nonmarket valuation of multiple park management services: does it matter? Land Econ 86(4):817–839
  53. Scheufele G, Bennett J (2012) Response strategies and learning in discrete choice experiments. Environ Resour Econ 52:435–453
  54. Silz-Carson K, Chilton SM, Hutchinson WG (2010) Bias in choice experiments for public goods. Newcastle discussion papers in Economics, no. 2010/05, Newcastle University Business School
  55. StataCorp (2013) Stata release 13.0 statistical software. StataCorp LP, College Station
  56. Swait J, Louviere J (1993) The role of the scale parameter in the estimation and comparison of multinomial logit models. J Mark Res XXX:305–314
  57. Taylor LO, Morrison MD, Boyle KJ (2010) Exchange rules and the incentive compatibility of choice experiments. Environ Resour Econ 47:197–220
  58. Train KE (2009) Discrete choice methods with simulation, 2nd edn. Cambridge University Press, Cambridge
  59. Vossler CA, Doyon M, Rondeau D (2012) Truth in consequentiality: theory and field evidence on discrete choice experiments. Am Econ J Microecon 4:145–171

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  1. Department of Agricultural Economics, Mississippi State University, Mississippi State, USA
  2. Fish and Wildlife Research Institute, Florida Fish and Wildlife Conservation Commission, Gainesville, USA
